JSON Module rewrite with custom JSONPath implementation #974

Open
wants to merge 23 commits into
base: main
Changes from 5 commits
2 changes: 1 addition & 1 deletion .github/workflows/ci-bdnbenchmark.yml
@@ -42,7 +42,7 @@ jobs:
os: [ ubuntu-latest, windows-latest ]
framework: [ 'net8.0' ]
configuration: [ 'Release' ]
test: [ 'Operations.BasicOperations', 'Operations.ObjectOperations', 'Operations.HashObjectOperations', 'Cluster.ClusterMigrate', 'Cluster.ClusterOperations', 'Lua.LuaScripts', 'Lua.LuaScriptCacheOperations','Lua.LuaRunnerOperations','Operations.CustomOperations', 'Operations.RawStringOperations', 'Operations.ScriptOperations', 'Operations.ModuleOperations', 'Operations.PubSubOperations', 'Network.BasicOperations', 'Network.RawStringOperations' ]
test: [ 'Operations.BasicOperations', 'Operations.ObjectOperations', 'Operations.HashObjectOperations', 'Cluster.ClusterMigrate', 'Cluster.ClusterOperations', 'Lua.LuaScripts', 'Lua.LuaScriptCacheOperations','Lua.LuaRunnerOperations','Operations.CustomOperations', 'Operations.RawStringOperations', 'Operations.ScriptOperations', 'Operations.JsonOperations', 'Operations.ModuleOperations', 'Operations.PubSubOperations', 'Network.BasicOperations', 'Network.RawStringOperations' ]
steps:
- name: Check out code
uses: actions/checkout@v4
74 changes: 74 additions & 0 deletions benchmark/BDN.benchmark/Json/JsonPathQuery.cs
@@ -0,0 +1,74 @@
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Engines;
using GarnetJSON.JSONPath;
using System.Text.Json.Nodes;

namespace BDN.benchmark.Json;

[MemoryDiagnoser]
public class JsonPathQuery
{
private JsonNode _jsonNode;

private readonly Consumer _consumer = new Consumer();

[Params(
Contributor

General Q - where did these test cases come from? Just curious if this is copied from somewhere.

Contributor Author

I asked Copilot to create sample JSONPaths for each type of JSONPath supported by Redis and Newtonsoft.

"$.store.book[0].title",
"$.store.book[*].author",
"$.store.book[?(@.price < 10)].title",
"$.store.bicycle.color",
"$.store.book[*]", // all books
Contributor

nit: indents are wonky, just replace with a single space

Contributor Author

Done

"$.store..price", // all prices using recursive descent
"$..author", // all authors using recursive descent
"$.store.book[?(@.price > 10 && @.price < 20)]", // filtered by price range
"$.store.book[?(@.category == 'fiction')]", // filtered by category
"$.store.book[-1:]", // last book
"$.store.book[:2]", // first two books
"$.store.book[?(@.author =~ /.*Waugh/)]", // regex match on author
"$..book[0,1]", // union of array indices
"$..*", // recursive descent all nodes
"$..['bicycle','price']", // recursive descent, specific nodes matched by name
"$..[?(@.price < 10)]", // recursive descent, specific nodes matched by condition
"$.store.book[?(@.author && @.title)]", // existence check
"$.store.*" // wildcard child
)]
public string JsonPath { get; set; }

[GlobalSetup]
public void Setup()
{
var jsonString = """
{
"store": {
"book": [
{
"category": "reference",
"author": "Nigel Rees",
"title": "Sayings of the Century",
"price": 8.95
},
{
"category": "fiction",
"author": "Evelyn Waugh",
"title": "Sword of Honour",
"price": 12.99
}
],
"bicycle": {
"color": "red",
"price": 19.95
}
}
}
""";

_jsonNode = JsonNode.Parse(jsonString);
}

[Benchmark]
public void SelectNodes()
{
var result = _jsonNode.SelectNodes(JsonPath);
result.Consume(_consumer);
}
}
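
For readers less familiar with the JSONPath patterns benchmarked above, here is a minimal sketch of what two of them select. It is written against plain System.Text.Json.Nodes rather than the PR's SelectNodes extension, so it says nothing about how the PR evaluates paths. The names JsonPathByHand, AuthorsOfAllBooks, and TitlesCheaperThan are hypothetical and exist only for illustration; the JSON is a compacted copy of the benchmark's store document.

```csharp
using System.Collections.Generic;
using System.Text.Json.Nodes;

public static class JsonPathByHand
{
    // Compacted copy of the store document used in the benchmark setup.
    public const string StoreJson = """
        {"store":{"book":[
          {"category":"reference","author":"Nigel Rees","title":"Sayings of the Century","price":8.95},
          {"category":"fiction","author":"Evelyn Waugh","title":"Sword of Honour","price":12.99}
        ],"bicycle":{"color":"red","price":19.95}}}
        """;

    // Hand-written equivalent of "$.store.book[*].author":
    // walk to the book array and collect every author.
    public static List<string> AuthorsOfAllBooks(JsonNode root)
    {
        var result = new List<string>();
        foreach (var book in root["store"]!["book"]!.AsArray())
        {
            result.Add(book!["author"]!.GetValue<string>());
        }
        return result;
    }

    // Hand-written equivalent of "$.store.book[?(@.price < limit)].title":
    // keep only the books whose price is below the limit.
    public static List<string> TitlesCheaperThan(JsonNode root, double limit)
    {
        var result = new List<string>();
        foreach (var book in root["store"]!["book"]!.AsArray())
        {
            if (book!["price"]!.GetValue<double>() < limit)
            {
                result.Add(book["title"]!.GetValue<string>());
            }
        }
        return result;
    }
}
```

With the store document above, AuthorsOfAllBooks returns both authors in array order, and TitlesCheaperThan(root, 10) returns only "Sayings of the Century", since the second book costs 12.99.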
135 changes: 135 additions & 0 deletions benchmark/BDN.benchmark/Operations/JsonOperations.cs
@@ -0,0 +1,135 @@
// Copyright (c) Microsoft Corporation.
// Licensed under the MIT license.

using System.Text;
using BenchmarkDotNet.Attributes;
using Embedded.server;

namespace BDN.benchmark.Operations
{
/// <summary>
/// Benchmark for JsonOperations
/// </summary>
[MemoryDiagnoser]
public class JsonOperations : OperationsBase
{
// Existing commands
Contributor

nit: Existing & new with regards to what? These would all be existing commands once the PR is merged...

Contributor Author

My bad, removed it

static ReadOnlySpan<byte> JSONGETCMD => "*3\r\n$8\r\nJSON.GET\r\n$2\r\nk3\r\n$1\r\n$\r\n"u8;
static ReadOnlySpan<byte> JSONSETCMD => "*4\r\n$8\r\nJSON.SET\r\n$2\r\nk3\r\n$4\r\n$.f2\r\n$1\r\n2\r\n"u8;

// New commands for different JsonPath patterns
static ReadOnlySpan<byte> JSONGET_DEEP => "*3\r\n$8\r\nJSON.GET\r\n$4\r\nbig1\r\n$12\r\n$.data[0].id\r\n"u8;
static ReadOnlySpan<byte> JSONGET_ARRAY => "*3\r\n$8\r\nJSON.GET\r\n$4\r\nbig1\r\n$9\r\n$.data[*]\r\n"u8;
static ReadOnlySpan<byte> JSONGET_ARRAY_ELEMENTS => "*3\r\n$8\r\nJSON.GET\r\n$4\r\nbig1\r\n$12\r\n$.data[*].id\r\n"u8;
static ReadOnlySpan<byte> JSONGET_FILTER => "*3\r\n$8\r\nJSON.GET\r\n$4\r\nbig1\r\n$25\r\n$.data[?(@.active==true)]\r\n"u8;
static ReadOnlySpan<byte> JSONGET_RECURSIVE => "*3\r\n$8\r\nJSON.GET\r\n$4\r\nbig1\r\n$4\r\n$..*\r\n"u8;

Request jsonGetCmd;
Request jsonSetCmd;
Request jsonGetDeepCmd;
Request jsonGetArrayCmd;
Request jsonGetArrayElementsCmd;
Request jsonGetFilterCmd;
Request jsonGetRecursiveCmd;

private static string GenerateLargeJson(int items)
Contributor

nit: Please place the private methods after the public ones

Contributor Author

Done

{
var data = new System.Text.StringBuilder();
data.Append("{\"data\":[");

for (int i = 0; i < items; i++)
{
if (i > 0) data.Append(',');
data.Append($$"""
{
"id": {{i}},
"name": "Item{{i}}",
"active": {{(i % 2 == 0).ToString().ToLower()}},
"value": {{i * 100}},
"nested": {
"level1": {
"level2": {
"value": {{i}}
}
}
}
}
""");
}

data.Append("]}");
return data.ToString();
}

private void RegisterModules()
{
server.Register.NewModule(new NoOpModule.NoOpModule(), [], out _);
Contributor

Why is the No-op module needed here?

Contributor Author

Removed

server.Register.NewModule(new GarnetJSON.Module(), [], out _);
}

public override void GlobalSetup()
{
base.GlobalSetup();
RegisterModules();

SetupOperation(ref jsonGetCmd, JSONGETCMD);
SetupOperation(ref jsonSetCmd, JSONSETCMD);
SetupOperation(ref jsonGetDeepCmd, JSONGET_DEEP);
SetupOperation(ref jsonGetArrayCmd, JSONGET_ARRAY);
SetupOperation(ref jsonGetArrayElementsCmd, JSONGET_ARRAY_ELEMENTS);
SetupOperation(ref jsonGetFilterCmd, JSONGET_FILTER);
SetupOperation(ref jsonGetRecursiveCmd, JSONGET_RECURSIVE);

// Setup test data
var largeJson = GenerateLargeJson(20);
SlowConsumeMessage(Encoding.UTF8.GetBytes($"*4\r\n$8\r\nJSON.SET\r\n$4\r\nbig1\r\n$1\r\n$\r\n${largeJson.Length}\r\n{largeJson}\r\n"));

// Existing setup
SlowConsumeMessage("*4\r\n$8\r\nJSON.SET\r\n$2\r\nk3\r\n$1\r\n$\r\n$14\r\n{\"f1\":{\"a\":1}}\r\n"u8);
SlowConsumeMessage(JSONGETCMD);
SlowConsumeMessage(JSONSETCMD);
}

[Benchmark]
public void ModuleJsonGetCommand()
{
Send(jsonGetCmd);
}

[Benchmark]
public void ModuleJsonSetCommand()
{
Send(jsonSetCmd);
}

[Benchmark]
public void ModuleJsonGetDeepPath()
{
Send(jsonGetDeepCmd);
}

[Benchmark]
public void ModuleJsonGetArrayPath()
{
Send(jsonGetArrayCmd);
}

[Benchmark]
public void ModuleJsonGetArrayElementsPath()
{
Send(jsonGetArrayElementsCmd);
}

[Benchmark]
public void ModuleJsonGetFilterPath()
{
Send(jsonGetFilterCmd);
}

[Benchmark]
public void ModuleJsonGetRecursive()
{
Send(jsonGetRecursiveCmd);
}
}
}
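
In the hand-written RESP commands above, every `$N` header must equal the byte length of the argument that follows it, which is easy to miscount for longer JSONPath arguments. As a sketch only, a small helper could compute those prefixes for setup-time messages; RespCommandBuilder and Encode are hypothetical names and are not part of Garnet or of this PR.

```csharp
using System.Text;

public static class RespCommandBuilder
{
    // Encodes a command as a RESP array of bulk strings:
    //   *<argc>\r\n  followed by  $<byte length>\r\n<arg>\r\n  for each argument.
    // The length prefix is the UTF-8 byte count, not the character count.
    public static byte[] Encode(params string[] args)
    {
        var sb = new StringBuilder();
        sb.Append('*').Append(args.Length).Append("\r\n");
        foreach (var arg in args)
        {
            sb.Append('$').Append(Encoding.UTF8.GetByteCount(arg)).Append("\r\n");
            sb.Append(arg).Append("\r\n");
        }
        return Encoding.UTF8.GetBytes(sb.ToString());
    }
}
```

For example, RespCommandBuilder.Encode("JSON.GET", "big1", "$.data[*]") produces `*3\r\n$8\r\nJSON.GET\r\n$4\r\nbig1\r\n$9\r\n$.data[*]\r\n`. Because it allocates, a helper like this would suit one-off setup messages rather than the benchmarked commands, which use `u8` literals.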
23 changes: 0 additions & 23 deletions benchmark/BDN.benchmark/Operations/ModuleOperations.cs
@@ -30,12 +30,6 @@ public class ModuleOperations : OperationsBase
static ReadOnlySpan<byte> NOOPTXN => "*1\r\n$18\r\nNoOpModule.NOOPTXN\r\n"u8;
Request noOpTxn;

static ReadOnlySpan<byte> JSONGETCMD => "*3\r\n$8\r\nJSON.GET\r\n$2\r\nk3\r\n$1\r\n$\r\n"u8;
Request jsonGetCmd;

static ReadOnlySpan<byte> JSONSETCMD => "*4\r\n$8\r\nJSON.SET\r\n$2\r\nk3\r\n$4\r\n$.f2\r\n$1\r\n2\r\n"u8;
Request jsonSetCmd;

private void RegisterModules()
{
server.Register.NewModule(new NoOpModule.NoOpModule(), [], out _);
@@ -54,9 +48,6 @@ public override void GlobalSetup()
SetupOperation(ref noOpProc, NOOPPROC);
SetupOperation(ref noOpTxn, NOOPTXN);

SetupOperation(ref jsonGetCmd, JSONGETCMD);
SetupOperation(ref jsonSetCmd, JSONSETCMD);

SlowConsumeMessage("*3\r\n$3\r\nSET\r\n$2\r\nk1\r\n$1\r\nc\r\n"u8);
SlowConsumeMessage(NOOPCMDREAD);
SlowConsumeMessage(NOOPCMDRMW);
@@ -65,8 +56,6 @@ public override void GlobalSetup()
SlowConsumeMessage(NOOPPROC);
SlowConsumeMessage(NOOPTXN);
SlowConsumeMessage("*4\r\n$8\r\nJSON.SET\r\n$2\r\nk3\r\n$1\r\n$\r\n$14\r\n{\"f1\":{\"a\":1}}\r\n"u8);
SlowConsumeMessage(JSONGETCMD);
SlowConsumeMessage(JSONSETCMD);
}

[Benchmark]
@@ -104,17 +93,5 @@ public void ModuleNoOpTxn()
{
Send(noOpTxn);
}

[Benchmark]
public void ModuleJsonGetCommand()
{
Send(jsonGetCmd);
}

[Benchmark]
public void ModuleJsonSetCommand()
{
Send(jsonSetCmd);
}
}
}
22 changes: 22 additions & 0 deletions libs/server/Custom/CustomCommandUtils.cs
@@ -93,6 +93,15 @@ public static unsafe void WriteBulkString(ref (IMemoryOwner<byte>, int) output,
public static unsafe void WriteError(ref (IMemoryOwner<byte>, int) output, string errorMessage)
{
var bytes = System.Text.Encoding.ASCII.GetBytes(errorMessage);
WriteError(ref output, bytes);
}

/// <summary>
/// Create output as error message, from given string
Contributor

nit: update the comment

Contributor Author

Done

/// </summary>
public static unsafe void WriteError(ref (IMemoryOwner<byte>, int) output, ReadOnlySpan<byte> errorMessage)
Contributor

General comment: this class should be internal, all calls to these methods should be through CustomObjectFunctions

Contributor Author

Done

Contributor

General comment: while you're updating this file, for some reason there are 2 copyright comments on top, if you could remove the bottom one that'd be awesome :)

Contributor Author

Done

{
var bytes = errorMessage;
// Get space for error
var len = 1 + bytes.Length + 2;
output.Item1 = MemoryPool.Rent(len);
@@ -142,5 +151,18 @@ public static unsafe void WriteSimpleString(ref (IMemoryOwner<byte>, int) output
}
output.Item2 = len;
}

public static unsafe void WriteDirect(ref (IMemoryOwner<byte>, int) output, ReadOnlySpan<byte> bytes)
{
output.Item1 = MemoryPool.Rent(bytes.Length);
fixed (byte* ptr = output.Item1.Memory.Span)
{
var curr = ptr;
// NOTE: Expected to always have enough space to write into pre-allocated buffer
var success = RespWriteUtils.TryWriteDirect(bytes, ref curr, ptr + bytes.Length);
Debug.Assert(success, "Insufficient space in pre-allocated buffer");
}
output.Item2 = bytes.Length;
}
}
}
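
The `1 + bytes.Length + 2` size computed in WriteError above matches the RESP simple-error framing: a leading `-`, the message bytes, and a trailing `\r\n`. The rest of the method body is collapsed in this diff, so the following is only a sketch of that framing using a plain byte array instead of a rented buffer; RespErrorSketch and FrameError are hypothetical names, not Garnet APIs.

```csharp
using System;

public static class RespErrorSketch
{
    // Frames an error message as a RESP simple error: '-' + message + "\r\n".
    public static byte[] FrameError(ReadOnlySpan<byte> errorMessage)
    {
        // 1 byte for '-', the message itself, 2 bytes for the CRLF terminator.
        var buffer = new byte[1 + errorMessage.Length + 2];
        buffer[0] = (byte)'-';
        errorMessage.CopyTo(buffer.AsSpan(1));
        buffer[^2] = (byte)'\r';
        buffer[^1] = (byte)'\n';
        return buffer;
    }
}
```

Accepting ReadOnlySpan<byte> means a caller that already holds the message as bytes, such as a `u8` literal, can skip the ASCII encoding step that the string overload performs.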
7 changes: 7 additions & 0 deletions libs/server/Custom/CustomObjectFunctions.cs
@@ -29,6 +29,13 @@ public abstract class CustomObjectFunctions
/// </summary>
protected static unsafe void WriteError(ref (IMemoryOwner<byte>, int) output, string errorMessage) => CustomCommandUtils.WriteError(ref output, errorMessage);
Contributor

Add WriteError(ref (IMemoryOwner<byte>, int) output, ReadOnlySpan<byte> errorMessage) here as well.

Contributor Author

Done


/// <summary>
/// Writes the specified bytes directly to the output.
/// </summary>
/// <param name="output">The output buffer and its length.</param>
/// <param name="bytes">The bytes to write.</param>
protected static unsafe void WriteDirect(ref (IMemoryOwner<byte>, int) output, ReadOnlySpan<byte> bytes) => CustomCommandUtils.WriteDirect(ref output, bytes);

/// <summary>
/// Get argument from input, at specified offset (starting from 0)
/// </summary>
52 changes: 52 additions & 0 deletions libs/server/Custom/RespCustomObjectOutputWriterExtensions.cs
@@ -0,0 +1,52 @@
// Copyright (c) Microsoft Corporation.
// Licensed under the MIT license.

using System;
using System.Buffers;
using System.Text;

namespace Garnet.server
{
/// <summary>
/// Provides extension methods for handling custom object output writing operations.
/// </summary>
public static class RespCustomObjectOutputWriterExtensions
Contributor

IMO extension methods aren't the right choice here; these can be added to CustomObjectFunctions

Contributor Author

Done

{
/// <summary>
/// Aborts the execution of the current object store command and outputs
/// an error message to indicate a wrong number of arguments for the given command.
/// </summary>
/// <param name="cmdName">Name of the command that caused the error message.</param>
/// <returns>true if the command was completely consumed, false if the input on the receive buffer was incomplete.</returns>
public static bool AbortWithWrongNumberOfArguments(this ref (IMemoryOwner<byte>, int) output, string cmdName)
{
var errorMessage = Encoding.ASCII.GetBytes(string.Format(CmdStrings.GenericErrWrongNumArgs, cmdName));

return output.AbortWithErrorMessage(errorMessage);
}

/// <summary>
/// Aborts the execution of the current object store command and outputs a given error message.
/// </summary>
/// <param name="errorMessage">Error message to print to result stream.</param>
/// <returns>true if the command was completely consumed, false if the input on the receive buffer was incomplete.</returns>
public static bool AbortWithErrorMessage(this ref (IMemoryOwner<byte>, int) output, ReadOnlySpan<byte> errorMessage)
{
CustomCommandUtils.WriteError(ref output, errorMessage);

return true;
}

/// <summary>
/// Aborts the execution of the current object store command and outputs a given error message.
/// </summary>
/// <param name="errorMessage">Error message to print to result stream.</param>
/// <returns>true if the command was completely consumed, false if the input on the receive buffer was incomplete.</returns>
public static bool AbortWithErrorMessage(this ref (IMemoryOwner<byte>, int) output, string errorMessage)
{
CustomCommandUtils.WriteError(ref output, errorMessage);

return true;
}
}
}
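
The review thread on this file weighs `this ref` extension methods on the output tuple against static helpers on CustomObjectFunctions. As a minimal sketch of how the two call shapes differ, the following uses a simplified (string, int) tuple in place of (IMemoryOwner<byte>, int). OutputSketch, AbortWith, and WriteAbort are hypothetical names chosen for illustration and do not reflect the final design.

```csharp
using System;

public static class OutputSketch
{
    // Extension form: 'this ref' on a value tuple lets the method mutate
    // the caller's variable, so it is invoked as output.AbortWith(...).
    public static bool AbortWith(this ref (string Message, int Length) output, string error)
    {
        output.Message = error;
        output.Length = error.Length;
        return true;
    }

    // Plain static form: same effect, but the tuple is passed explicitly by ref,
    // which keeps the helper off the tuple type's member list.
    public static bool WriteAbort(ref (string Message, int Length) output, string error)
    {
        output.Message = error;
        output.Length = error.Length;
        return true;
    }
}
```

Both forms write through to the caller's tuple. The extension form becomes visible on every matching tuple wherever the namespace is imported, whereas a protected static helper on a base class is reachable only from derived command implementations.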
2 changes: 1 addition & 1 deletion libs/server/Resp/CmdStrings.cs
@@ -8,7 +8,7 @@ namespace Garnet.server
/// <summary>
/// Command strings for RESP protocol
/// </summary>
static partial class CmdStrings
public static partial class CmdStrings
Contributor

I am not convinced that we should make CmdStrings & ExistOptions public, I've added comments about alternative solutions in JsonCommands

Contributor Author

Sorry, I didn't understand how adding the ObjectInputExtensions class would avoid making ExistOptions public. Can you expand on that? Also, a general question: what is the long-term goal for CmdStrings? In future PRs for the JSON module I will need to return many of the generic response/error messages that are available in CmdStrings. Do I need to create an extension method (like AbortWithSyntaxError) for each response, or should I create a new CmdStrings class internal to the JSON module and copy over the responses that the module requires?

{
/// <summary>
/// Request strings
2 changes: 1 addition & 1 deletion libs/server/Resp/RespEnums.cs
@@ -16,7 +16,7 @@ internal enum EtagOption : byte
WithETag,
}

internal enum ExistOptions : byte
public enum ExistOptions : byte
{
None,
NX,