Benchmarks have been implemented with BenchmarkDotNet. All CacheManager instances used in the benchmarks have only one cache handle configured: the Dictionary, System.Runtime, Microsoft.Extensions.Caching.Memory (MsMemory), or Redis handle. All benchmarks use the same configuration and run as a single x64 job.
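As a rough sketch of this setup (assuming CacheManager's fluent configuration API; cache names and the Redis endpoint are placeholders, not taken from the actual benchmark sources), a single-handle instance per handle type looks like this:

```csharp
using CacheManager.Core;

// One CacheManager instance per handle type, each with exactly one handle.
var dictionaryCache = CacheFactory.Build<string>(
    settings => settings.WithDictionaryHandle());

var runtimeCache = CacheFactory.Build<string>(
    settings => settings.WithSystemRuntimeCacheHandle());

var msMemoryCache = CacheFactory.Build<string>(
    settings => settings.WithMicrosoftMemoryCacheHandle());

var redisCache = CacheFactory.Build<string>(
    settings => settings
        .WithRedisConfiguration("redis", config =>
            config.WithEndpoint("localhost", 6379)) // placeholder endpoint
        .WithRedisCacheHandle("redis"));
```

The benchmark environment: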
```
BenchmarkDotNet v0.14.0, Windows 11 (10.0.26100.3194)
12th Gen Intel Core i7-12700KF, 1 CPU, 20 logical and 12 physical cores
.NET SDK 9.0.200
  [Host]     : .NET 8.0.13 (8.0.1325.6609), X64 RyuJIT AVX2
  Job-BIBDFC : .NET 8.0.13 (8.0.1325.6609), X64 RyuJIT AVX2

IterationCount=10  LaunchCount=1  WarmupCount=2
```
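The job settings above can be expressed with BenchmarkDotNet attributes, roughly like this (the class name is a placeholder; the actual benchmark project may configure its jobs differently):

```csharp
using BenchmarkDotNet.Attributes;

[SimpleJob(launchCount: 1, warmupCount: 2, iterationCount: 10)]
[MemoryDiagnoser] // enables the Gen0/Gen1/Allocated columns shown below
public class CacheManagerBenchmarks
{
    // benchmark methods go here
}
```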
## Adding one item per run
Redis is a lot slower in this scenario because CacheManager has to wait for the server's response in order to return the bool indicating whether the key was added. In general, it is good to see how fast the Dictionary handle is compared to the System.Runtime one; the Allocated column also shows that the Dictionary handle's memory footprint is much lower.
Method | Mean | Error | StdDev | Ratio | RatioSD | Gen0 | Gen1 | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
Dictionary | 121.0 ns | 2.00 ns | 1.04 ns | 1.00 | 0.01 | 0.0153 | - | 200 B | 1.00 |
Runtime | 637.2 ns | 28.46 ns | 16.94 ns | 5.27 | 0.14 | 0.2384 | 0.0010 | 3120 B | 15.60 |
MsMemory | 198.8 ns | 4.59 ns | 3.04 ns | 1.64 | 0.03 | 0.0260 | - | 340 B | 1.70 |
Redis | 105,395.6 ns | 3,758.09 ns | 2,485.74 ns | 871.20 | 20.86 | - | - | 1256 B | 6.28 |
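For illustration, each Add benchmark method has roughly this shape (a hypothetical sketch, not the actual benchmark source; class and method names are assumed):

```csharp
using System;
using BenchmarkDotNet.Attributes;
using CacheManager.Core;

public class AddSingleItemBenchmark
{
    private readonly ICacheManager<string> _cache =
        CacheFactory.Build<string>(s => s.WithDictionaryHandle());

    [Benchmark(Baseline = true)]
    public bool Dictionary()
    {
        // Add returns false if the key already exists, so each invocation
        // uses a fresh key. This bool return value is what forces the Redis
        // handle to wait for the server's reply.
        return _cache.Add(Guid.NewGuid().ToString("N"), "value");
    }
}
```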
## Adding one item per run, using a region
Method | Mean | Error | StdDev | Ratio | RatioSD | Gen0 | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|
Dictionary | 144.2 ns | 4.65 ns | 3.07 ns | 1.00 | 0.03 | 0.0231 | 304 B | 1.00 |
Runtime | 342.6 ns | 18.01 ns | 11.91 ns | 2.38 | 0.09 | 0.0610 | 800 B | 2.63 |
MsMemory | 164.1 ns | 5.74 ns | 3.80 ns | 1.14 | 0.03 | 0.0231 | 304 B | 1.00 |
Redis | 139,200.0 ns | 4,863.92 ns | 2,894.44 ns | 966.01 | 27.25 | - | 1528 B | 5.03 |
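The region variant uses CacheManager's three-argument `Add` overload. A minimal sketch, with key, value, and region names as placeholders:

```csharp
using CacheManager.Core;

var cache = CacheFactory.Build<string>(s => s.WithDictionaryHandle());

// Add(key, value, region) scopes the item to a region and still returns a
// bool; it returns false if the key already exists within that region.
bool added = cache.Add("item-1", "value", "benchmark-region");

// Regions can also be cleared as a whole:
cache.ClearRegion("benchmark-region");
```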
## Put one item per run

Redis gets much closer to the in-memory handles in this scenario because CacheManager uses fire-and-forget for Put operations; for a Put, it doesn't matter whether the item was added or updated, so there is no need to wait for the response.
Method | Mean | Error | StdDev | Ratio | RatioSD | Gen0 | Gen1 | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
Dictionary | 95.01 ns | 1.888 ns | 1.249 ns | 1.00 | 0.02 | 0.0122 | - | 160 B | 1.00 |
Runtime | 887.35 ns | 19.929 ns | 13.181 ns | 9.34 | 0.18 | 0.4263 | 0.0095 | 5576 B | 34.85 |
MsMemory | 172.40 ns | 5.030 ns | 3.327 ns | 1.81 | 0.04 | 0.0336 | - | 440 B | 2.75 |
Redis | 4,136.37 ns | 301.420 ns | 199.371 ns | 43.54 | 2.07 | 0.0839 | 0.0610 | 1095 B | 6.84 |
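For illustration, Put is a void operation (add or overwrite), which is what allows the fire-and-forget behavior; a minimal sketch:

```csharp
using CacheManager.Core;

var cache = CacheFactory.Build<string>(s => s.WithDictionaryHandle());

// Put stores the value unconditionally and returns nothing, so the Redis
// handle can send the command without awaiting the server's reply.
cache.Put("item-1", "value");
```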
## Get one item per run

With Get operations we can clearly see how much faster an in-memory cache is compared to the distributed variant. That's why it makes so much sense to use CacheManager with a first (in-memory) and a second (distributed) cache layer; a sketch of such a two-layer setup follows the table below.
Method | Mean | Error | StdDev | Ratio | RatioSD | Gen0 | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|
Dictionary | 34.97 ns | 0.532 ns | 0.317 ns | 1.00 | 0.01 | - | - | NA |
Runtime | 179.45 ns | 2.104 ns | 1.392 ns | 5.13 | 0.06 | 0.0153 | 200 B | NA |
MsMemory | 75.12 ns | 0.338 ns | 0.201 ns | 2.15 | 0.02 | - | - | NA |
Redis | 75,534.71 ns | 5,303.351 ns | 3,507.838 ns | 2,160.12 | 97.47 | 0.1221 | 2008 B | NA |
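A sketch of such a two-layer setup (the configuration key and endpoint are placeholders), combining a fast in-process handle with Redis as the distributed back end:

```csharp
using CacheManager.Core;

// First layer: fast in-process dictionary; second layer: distributed Redis.
// Gets are served from the in-memory layer whenever the item is present there.
var cache = CacheFactory.Build<string>(settings =>
    settings
        .WithDictionaryHandle()
        .And
        .WithRedisConfiguration("redis", config =>
            config.WithEndpoint("localhost", 6379))
        .WithRedisCacheHandle("redis"));
```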
## Serialization

For the serializer comparison, I only benchmarked the bare serializers, without the overhead of a cache layer (e.g. Redis) which could otherwise skew the results. Each performance run does 1000 iterations of serializing and deserializing the same object. The object structure was the following:
```json
{
  "L": 1671986962,
  "S": "1625c0a0-86ce-4fd5-9047-cf2fb1d145b2",
  "SList": [
    "98a62a89-f3e9-49d7-93ad-a4295b21c1a1",
    "47a86f42-64b0-4e6d-9f18-ecb20abff2a3",
    "7de26dfc-57a5-4f16-b421-8999b73c9afb",
    "e29a8f8a-feb8-4f3f-9825-78c067215339",
    "5b2e1923-8a76-4f39-9366-4700c7d0d408",
    "febea78f-ca5e-49d6-99c9-18738e4fb36f",
    "7c87b429-e931-4f1a-a59a-433504c87a1c",
    "bf288ff7-e6c0-4df1-bfcf-677ff31cdf45",
    "9b7fcd6c-45ee-4584-98b6-b30d32e52f72",
    "2729610c-d6ce-4960-b83b-b5fd4230cc7e"
  ],
  "OList": [
    { "Id": 1210151618, "Val": "6d2871c9-c5f8-44b1-bad9-4eba68683510" },
    { "Id": 1171177179, "Val": "6b12cd3f-2726-4bf9-a25c-35533de3910c" },
    { "Id": 1676910093, "Val": "66f52534-92f3-4ef4-b555-48a993a9df7a" },
    { "Id": 977965209, "Val": "80a20081-a2a5-4dcc-8d07-162f697588b4" },
    { "Id": 2075961031, "Val": "35f8710a-64e5-481d-9f18-899c65abd675" },
    { "Id": 328057441, "Val": "d17277e2-ca25-42b1-a4b4-efc00deef358" },
    { "Id": 2046696720, "Val": "4fa32d5e-f770-4d44-a55b-f6479633839c" },
    { "Id": 422544189, "Val": "de39c21e-8cb3-4f5c-bf5c-a3d228bc4c25" },
    { "Id": 1887998603, "Val": "22b00459-7820-46a6-8514-10e901810bbd" },
    { "Id": 852015288, "Val": "09cc3bd8-da23-42cb-b700-02ec461beb3f" }
  ]
}
```
The values are randomly generated; the object has one list of strings (GUIDs) and one list of child objects, each with an integer and a string. Pretty simple, but large enough to analyze the performance.
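For reference, a hypothetical C# shape matching this payload (class names are assumed; property names follow the JSON, and the exact types in the benchmark source may differ):

```csharp
using System.Collections.Generic;

public class TestPayload
{
    public long L { get; set; }                  // e.g. 1671986962
    public string S { get; set; }                // a GUID string
    public List<string> SList { get; set; }      // 10 GUID strings
    public List<TestChild> OList { get; set; }   // 10 child objects
}

public class TestChild
{
    public int Id { get; set; }
    public string Val { get; set; }
}
```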
Results:
Method | Mean | Error | StdDev | Ratio | RatioSD | Gen0 | Gen1 | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
JsonSerializer | 83.21 us | 1.163 us | 0.769 us | 1.00 | 0.01 | 14.8926 | 2.4414 | 191.16 KB | 1.00 |
JsonGzSerializer | 346.37 us | 4.785 us | 3.165 us | 4.16 | 0.05 | 21.9727 | 2.9297 | 280.75 KB | 1.47 |
ProtoBufSerializer | 39.18 us | 0.701 us | 0.463 us | 0.47 | 0.01 | 8.6060 | 1.0986 | 110.7 KB | 0.58 |
BondBinarySerializer | 19.03 us | 0.398 us | 0.263 us | 0.23 | 0.00 | 4.7302 | 0.6714 | 60.53 KB | 0.32 |
BondFastBinarySerializer | 19.42 us | 0.431 us | 0.285 us | 0.23 | 0.00 | 4.7607 | 0.7324 | 60.84 KB | 0.32 |
BondSimpleJsonSerializer | 63.56 us | 1.132 us | 0.748 us | 0.76 | 0.01 | 11.9629 | 1.9531 | 153.35 KB | 0.80 |
As expected, the binary serializers outperform the JSON-based ones by a huge margin; notably, the Bond binary serializers are about twice as fast as protobuf and allocate roughly half as much. The compression overhead of the JsonGz serializer is considerable and may have some potential for optimization.