Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[exporter] exporter batcher - byte size based batching #12091

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

sfc-gh-sili
Copy link
Contributor

@sfc-gh-sili sfc-gh-sili commented Jan 15, 2025

Description

This is an POC of serialized size based batching (version 2)

Configuration is supported via an additional field to MaxSizeConfig.

type MaxSizeConfig struct {
	MaxSizeItems int `mapstructure:"max_size_items"`
	MaxSizeBytes int `mapstructure:"max_size_bytes"`
}

We will validate that at most one of the above fields are specified (TODO) and switch between item count-based batching vs. byte size-based batching accordingly.

To get the byte size of otlp protos, this PR updates pdata/internal/cmd/pdatagen/internal/templates/message.go.tmpl to expose an interface Size(). This change will apply to all pdatagen-generated files.


Getting serialized byte size

Getting serialized byte size is an expensive operation that gets repeatedly invoked during bytes based merge-splitting. The POC shows two optimizations that improves by 1. reducing size calculation times 2. reducing size calculation cost

Caching size in Request

We repeated call request.ByteSize() while merg-splitting. By caching previous result in request.ByteSize(), we were able to improve the performance especially for the case where many requests are merged into a single batch.

BenchmarkSplittingBasedOnItemCountManySmallLogs-10                 	     324	   3729469 ns/op
/*Before*/ BenchmarkSplittingBasedOnByteSizeManySmallLogs-10         	       9	 120696384 ns/op
/*After*/ BenchmarkSplittingBasedOnByteSizeManySmallLogs-10                  358	   2906934 ns/op

Proto size caculation

While filling the batch, we repeated evaluate size of new items and the remaining space, and both calculations are expensive. This PR shows an optimization that calculates remaining space iteratively using new item size:

remaning -= (1 + sizeof(newItem) + sov(sizeof(newItem)))

where

func sov(x uint64) (n int) {
	return (math_bits.Len64(x|1) + 6) / 7
}

This helped significantly improve the performance. For benchmark result without this optimization, see #12017.

Performance Benchmark

Benchmark 1 - 1000 small requests merge into one batch

This benchmark merges 1000 requests x 10 logs / request. The accumulated batch is ~1MB

BenchmarkSplittingBasedOnItemCountManySmallLogs-10                 	     328	   3556645 ns/op
BenchmarkSplittingBasedOnByteSizeManySmallLogs-10                  	     447	   2689311 ns/op

Benchmark 2 - Every incoming request causes a split

Case 1: Merging 10 request, where each contains 10001 logs / ~ 1 MB and is slightly above the limit

BenchmarkSplittingBasedOnItemCountManyLogsSlightlyAboveLimit-10    	      45	  24271094 ns/op
BenchmarkSplittingBasedOnByteSizeManyLogsSlightlyAboveLimit-10     	      31	  35001137 ns/op

Case 2: Merging 10 request, where each contains 9999 logs / ~ 1 MB and is slightly below the limit

BenchmarkSplittingBasedOnItemCountManyLogsSlightlyBelowLimit-10    	      52	  21750268 ns/op
BenchmarkSplittingBasedOnByteSizeManyLogsSlightlyBelowLimit-10     	      38	  28904701 ns/op

Benchmark 3 - A huge request splits into 10 batches

This benchmark merge splits a request with 100000 logs / ~10MB in to 10 batches.

BenchmarkSplittingBasedOnItemCountHugeLog-10                       	      43	  29004844 ns/op
BenchmarkSplittingBasedOnByteSizeHugeLog-10                        	      15	  66705669 ns/op

Link to tracking issue

Fixes #3262

Testing

Documentation

TODO: ByteSize() should return int64 instead of int

@sfc-gh-sili sfc-gh-sili changed the title Sili byte size 2 [POC version 2] exporter batcher - byte size based batching Jan 15, 2025
@sfc-gh-sili sfc-gh-sili changed the title [POC version 2] exporter batcher - byte size based batching [POC][NEW] exporter batcher - byte size based batching Jan 15, 2025
Copy link

codecov bot commented Jan 15, 2025

Codecov Report

Attention: Patch coverage is 75.79618% with 38 lines in your changes missing coverage. Please review.

Project coverage is 91.52%. Comparing base (b9d4e39) to head (b7d0c30).
Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
exporter/exporterbatcher/config.go 0.00% 8 Missing and 4 partials ⚠️
exporter/exporterhelper/logs_batch.go 91.81% 6 Missing and 3 partials ⚠️
pdata/plog/pb.go 0.00% 6 Missing ⚠️
exporter/internal/queue/default_batcher.go 75.00% 2 Missing and 1 partial ⚠️
exporter/exporterhelper/internal/request.go 0.00% 2 Missing ⚠️
exporter/exporterhelper/metrics.go 0.00% 2 Missing ⚠️
exporter/exporterhelper/traces.go 0.00% 2 Missing ⚠️
...xporter/exporterhelper/xexporterhelper/profiles.go 0.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #12091      +/-   ##
==========================================
- Coverage   91.60%   91.52%   -0.08%     
==========================================
  Files         456      456              
  Lines       24091    24289     +198     
==========================================
+ Hits        22069    22231     +162     
- Misses       1649     1676      +27     
- Partials      373      382       +9     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@sfc-gh-sili sfc-gh-sili force-pushed the sili-byte-size-2 branch 5 times, most recently from 0b390a8 to b7d0c30 Compare January 16, 2025 01:17
@sfc-gh-sili
Copy link
Contributor Author

TODO: Try if we can get away with not exposing that many size functions. Ideally we can do everything from one API.

@sfc-gh-sili sfc-gh-sili changed the title [POC][NEW] exporter batcher - byte size based batching [exporter] exporter batcher - byte size based batching Jan 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[processors/batch] Size calculations inconsistent, causing unbounded batch size in bytes
1 participant