[exporter] exporter batcher - byte size based batching #12091

sfc-gh-sili · 2025-01-15T02:55:12Z

Description

This is an POC of serialized size based batching (version 2)

Configuration is supported via an additional field to MaxSizeConfig.

type MaxSizeConfig struct {
	MaxSizeItems int `mapstructure:"max_size_items"`
	MaxSizeBytes int `mapstructure:"max_size_bytes"`
}

We will validate that at most one of the above fields are specified (TODO) and switch between item count-based batching vs. byte size-based batching accordingly.

To get the byte size of otlp protos, this PR updates pdata/internal/cmd/pdatagen/internal/templates/message.go.tmpl to expose an interface Size(). This change will apply to all pdatagen-generated files.

Getting serialized byte size

Getting serialized byte size is an expensive operation that gets repeatedly invoked during bytes based merge-splitting. The POC shows two optimizations that improves by 1. reducing size calculation times 2. reducing size calculation cost

Caching size in `Request`

We repeated call request.ByteSize() while merg-splitting. By caching previous result in request.ByteSize(), we were able to improve the performance especially for the case where many requests are merged into a single batch.

BenchmarkSplittingBasedOnItemCountManySmallLogs-10                 	     324	   3729469 ns/op
/*Before*/ BenchmarkSplittingBasedOnByteSizeManySmallLogs-10         	       9	 120696384 ns/op
/*After*/ BenchmarkSplittingBasedOnByteSizeManySmallLogs-10                  358	   2906934 ns/op

Proto size caculation

While filling the batch, we repeated evaluate size of new items and the remaining space, and both calculations are expensive. This PR shows an optimization that calculates remaining space iteratively using new item size:

remaning -= (1 + sizeof(newItem) + sov(sizeof(newItem)))

where

func sov(x uint64) (n int) {
	return (math_bits.Len64(x|1) + 6) / 7
}

This helped significantly improve the performance. For benchmark result without this optimization, see #12017.

Performance Benchmark

Benchmark 1 - 1000 small requests merge into one batch

This benchmark merges 1000 requests x 10 logs / request. The accumulated batch is ~1MB

BenchmarkSplittingBasedOnItemCountManySmallLogs-10                 	     328	   3556645 ns/op
BenchmarkSplittingBasedOnByteSizeManySmallLogs-10                  	     447	   2689311 ns/op

Benchmark 2 - Every incoming request causes a split

Case 1: Merging 10 request, where each contains 10001 logs / ~ 1 MB and is slightly above the limit

BenchmarkSplittingBasedOnItemCountManyLogsSlightlyAboveLimit-10    	      45	  24271094 ns/op
BenchmarkSplittingBasedOnByteSizeManyLogsSlightlyAboveLimit-10     	      31	  35001137 ns/op

Case 2: Merging 10 request, where each contains 9999 logs / ~ 1 MB and is slightly below the limit

BenchmarkSplittingBasedOnItemCountManyLogsSlightlyBelowLimit-10    	      52	  21750268 ns/op
BenchmarkSplittingBasedOnByteSizeManyLogsSlightlyBelowLimit-10     	      38	  28904701 ns/op

Benchmark 3 - A huge request splits into 10 batches

This benchmark merge splits a request with 100000 logs / ~10MB in to 10 batches.

BenchmarkSplittingBasedOnItemCountHugeLog-10                       	      43	  29004844 ns/op
BenchmarkSplittingBasedOnByteSizeHugeLog-10                        	      15	  66705669 ns/op

Link to tracking issue

Fixes #3262

Testing

Documentation

TODO: ByteSize() should return int64 instead of int

codecov · 2025-01-15T03:07:12Z

Codecov Report

Attention: Patch coverage is 75.79618% with 38 lines in your changes missing coverage. Please review.

Project coverage is 91.52%. Comparing base (b9d4e39) to head (b7d0c30).
Report is 5 commits behind head on main.

Files with missing lines	Patch %	Lines
exporter/exporterbatcher/config.go	0.00%	8 Missing and 4 partials ⚠️
exporter/exporterhelper/logs_batch.go	91.81%	6 Missing and 3 partials ⚠️
pdata/plog/pb.go	0.00%	6 Missing ⚠️
exporter/internal/queue/default_batcher.go	75.00%	2 Missing and 1 partial ⚠️
exporter/exporterhelper/internal/request.go	0.00%	2 Missing ⚠️
exporter/exporterhelper/metrics.go	0.00%	2 Missing ⚠️
exporter/exporterhelper/traces.go	0.00%	2 Missing ⚠️
...xporter/exporterhelper/xexporterhelper/profiles.go	0.00%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #12091      +/-   ##
==========================================
- Coverage   91.60%   91.52%   -0.08%     
==========================================
  Files         456      456              
  Lines       24091    24289     +198     
==========================================
+ Hits        22069    22231     +162     
- Misses       1649     1676      +27     
- Partials      373      382       +9

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

sfc-gh-sili · 2025-01-16T22:15:08Z

TODO: Try if we can get away with not exposing that many size functions. Ideally we can do everything from one API.

sfc-gh-sili mentioned this pull request Jan 15, 2025

[POC][DEPRECATED] exporter batcher - byte size based batching #12017

Closed

sfc-gh-sili changed the title ~~Sili byte size 2~~ [POC version 2] exporter batcher - byte size based batching Jan 15, 2025

sfc-gh-sili changed the title ~~[POC version 2] exporter batcher - byte size based batching~~ [POC][NEW] exporter batcher - byte size based batching Jan 15, 2025

sfc-gh-sili force-pushed the sili-byte-size-2 branch 5 times, most recently from 0b390a8 to b7d0c30 Compare January 16, 2025 01:17

byte based batching

15d0004

sfc-gh-sili force-pushed the sili-byte-size-2 branch from b7d0c30 to 15d0004 Compare January 22, 2025 02:46

sfc-gh-sili changed the title ~~[POC][NEW] exporter batcher - byte size based batching~~ [exporter] exporter batcher - byte size based batching Jan 22, 2025

sfc-gh-sili mentioned this pull request Jan 22, 2025

[processors/batch] Size calculations inconsistent, causing unbounded batch size in bytes #3262

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[exporter] exporter batcher - byte size based batching #12091

[exporter] exporter batcher - byte size based batching #12091

sfc-gh-sili commented Jan 15, 2025 •

edited

Loading

codecov bot commented Jan 15, 2025 •

edited

Loading

sfc-gh-sili commented Jan 16, 2025

[exporter] exporter batcher - byte size based batching #12091

Are you sure you want to change the base?

[exporter] exporter batcher - byte size based batching #12091

Conversation

sfc-gh-sili commented Jan 15, 2025 • edited Loading

Description

Getting serialized byte size

Caching size in Request

Proto size caculation

Performance Benchmark

Benchmark 1 - 1000 small requests merge into one batch

Benchmark 2 - Every incoming request causes a split

Benchmark 3 - A huge request splits into 10 batches

Link to tracking issue

Testing

Documentation

codecov bot commented Jan 15, 2025 • edited Loading

Codecov Report

sfc-gh-sili commented Jan 16, 2025

sfc-gh-sili commented Jan 15, 2025 •

edited

Loading

Caching size in `Request`

codecov bot commented Jan 15, 2025 •

edited

Loading