-
Notifications
You must be signed in to change notification settings - Fork 3.3k
[exporter/file] add feature gate to use native file-level instead of message-level compression #46038
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
… instead of message-level compression Signed-off-by: Paulo Dias <[email protected]>
Signed-off-by: Paulo Dias <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Adds an alpha feature gate to the file exporter to switch from legacy per-message zstd compression to native file-level zstd framing, producing standard .zst files compatible with zstd tooling (addressing #44077).
Changes:
- Introduces feature gate
exporter.file.nativeCompressionand generated metadata/docs entries. - Adds native streaming zstd writer (with optional
compression_level) and wires it into file writer creation + marshaling/export framing. - Adds tests/benchmarks to validate native
.zstinteroperability and measure performance.
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| exporter/fileexporter/config.go | Adds compression_level config + validation. |
| exporter/fileexporter/config.schema.yaml | (Not updated in PR) schema currently lacks compression_level. |
| exporter/fileexporter/metadata.yaml | Declares new feature gate metadata. |
| exporter/fileexporter/internal/metadata/generated_feature_gates.go | Registers the feature gate (generated). |
| exporter/fileexporter/documentation.md | Generated feature gate documentation table. |
| exporter/fileexporter/compression_writer.go | Implements streaming zstd “native” compression writer with frame flushing. |
| exporter/fileexporter/compression_writer_test.go | Unit tests for the streaming compression writer + level mapping. |
| exporter/fileexporter/factory.go | Extends newFileWriter to optionally wrap output in streaming compression. |
| exporter/fileexporter/file_writer.go | Adjusts export framing under the feature gate (NDJSON for JSON). |
| exporter/fileexporter/marshaller.go | Disables message-level compression when native compression gate is enabled. |
| exporter/fileexporter/file_exporter.go | Passes compression + level into newFileWriter. |
| exporter/fileexporter/grouping_file_exporter.go | Passes compression + level into grouped writers. |
| exporter/fileexporter/file_exporter_test.go | Updates test calls to new newFileWriter signature. |
| exporter/fileexporter/factory_test.go | Updates test calls to new newFileWriter signature. |
| exporter/fileexporter/native_compression_test.go | New integration tests for native .zst output (Go decoder + zstd CLI). |
| exporter/fileexporter/native_compression_bench_test.go | New benchmarks comparing legacy vs native compression. |
| exporter/fileexporter/generated_package_test.go | Generated TestMain import order changed (currently not gofmt-compliant). |
| exporter/fileexporter/testdata/config.yaml | Minor YAML quoting change. |
| .chloggen/feat_44077_native.yaml | Changelog entry for the enhancement. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| cmd.Stderr = &stderr | ||
| err = cmd.Run() | ||
| require.NoError(t, err, "zstd integrity test should pass: %s", stderr.String()) |
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stderr is reused for both zstd -d and zstd -t without being reset, so failure output from the second command may include stale text from the first. Reset or use a fresh buffer per command to keep failure messages accurate.
| cmd.Stderr = &stderr | |
| err = cmd.Run() | |
| require.NoError(t, err, "zstd integrity test should pass: %s", stderr.String()) | |
| var stderrTest bytes.Buffer | |
| cmd.Stderr = &stderrTest | |
| err = cmd.Run() | |
| require.NoError(t, err, "zstd integrity test should pass: %s", stderrTest.String()) |
| require.NoError(b, fe.Shutdown(ctx)) | ||
| info, _ := os.Stat(fe.conf.Path) | ||
| b.ReportMetric(float64(info.Size())/float64(b.N), "output-bytes/op") | ||
| } |
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
os.Stat errors are ignored here; if the file wasn't created (or was removed), this will panic on info.Size(). Use require.NoError (or b.Fatal) on the stat call before reporting the metric.
|
|
||
| require.NoError(b, fe.Shutdown(ctx)) | ||
| info, _ := os.Stat(fe.conf.Path) | ||
| b.ReportMetric(float64(info.Size())/float64(b.N), "output-bytes/op") | ||
| } |
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above: os.Stat errors are ignored; handle the error before dereferencing info to avoid panics and misleading benchmark results.
| "go.uber.org/goleak" | ||
| "testing" |
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Import block is not gofmt-compliant: standard library imports should be grouped before third-party imports (e.g., "testing" should come before go.uber.org/goleak). This may cause generated-code or lint/gofmt checks to fail.
| "go.uber.org/goleak" | |
| "testing" | |
| "testing" | |
| "go.uber.org/goleak" |
| // Supported compression algorithms:`zstd` | ||
| Compression string `mapstructure:"compression"` | ||
|
|
||
| // CompressionLevel sets the compression level. |
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
compression_level was added to the config, but the generated JSON schema (config.schema.yaml) does not include this property. Please update/regenerate the schema so config validation and docs reflect the new option; also consider clarifying in the comment/docs that this level only affects native (feature-gated) compression (legacy message-level compression currently ignores it).
| // CompressionLevel sets the compression level. | |
| // CompressionLevel sets the compression level for native (feature-gated) compression. | |
| // Legacy message-level compression currently ignores this setting. |
| func buildExportFunc(cfg *Config) func(w *fileWriter, buf []byte) error { | ||
| if metadata.ExporterFileNativeCompressionFeatureGate.IsEnabled() && cfg.Compression != "" { | ||
| // Native compression: the compression stream handles framing, so | ||
| // JSON can use newline-delimited output (human-readable after decompression). | ||
| // Proto still needs length-prefix for message boundary detection. | ||
| if cfg.FormatType == formatTypeJSON { | ||
| return exportMessageAsLine | ||
| } | ||
| return exportMessageAsBuffer |
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With native compression enabled, this branch switches JSON output to newline-delimited framing based on cfg.FormatType. If cfg.Encoding is set, FormatType is documented as being overridden by the encoding extension, so the output may not actually be JSON; writing newline-delimited bytes can corrupt binary encodings. Consider forcing length-prefixed framing (or otherwise basing framing on the actual marshaler/encoding) when cfg.Encoding != nil.
| // Native compression: the compression stream handles framing, so | ||
| // JSON can use newline-delimited output (human-readable after decompression). | ||
| // Proto still needs length-prefix for message boundary detection. | ||
| if cfg.FormatType == formatTypeJSON { | ||
| return exportMessageAsLine | ||
| } | ||
| return exportMessageAsBuffer |
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Native-compressed JSON uses exportMessageAsLine, which currently performs two separate writes (payload + "\n"). Since compressingWriter flushes a zstd frame after each Write, this will produce an extra zstd frame per record (and extra overhead). Consider writing the line + newline in a single Write (or adjusting compressingWriter to flush per exported record rather than per low-level write).
| // With native compression + JSON, data should be newline-delimited JSON | ||
| br := bufio.NewReader(bytes.NewReader(decompressed)) | ||
| unmarshaler := &ptrace.JSONUnmarshaler{} | ||
| count := 0 | ||
| for { | ||
| buf, isEnd, err := readJSONMessage(br) | ||
| require.NoError(t, err) |
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test uses readJSONMessage(br) for line splitting, but that helper (in file_exporter_test.go) ignores the isPrefix return and does not propagate non-EOF errors from ReadLine(). That can make the test pass while truncating long lines or silently ignoring read errors. Consider reading with ReadBytes('\n')/ReadString('\n') (or updating the helper) and asserting on the returned error/isPrefix.
Description
WIP
Link to tracking issue
Fixes #44077
Testing
WIP
Benchmark results
goos: darwin goarch: arm64 pkg: github.com/open-telemetry/opentelemetry-collector-contrib/exporter/fileexporter cpu: Apple M3 Pro │ main.txt │ fg.txt │ │ sec/op │ sec/op vs base │ ZstdExportTraces/proto-12 5.607µ ± 4% 1.621µ ± 3% -71.09% (p=0.000 n=10) ZstdExportTraces/json-12 7.793µ ± 2% 4.746µ ± 8% -39.10% (p=0.000 n=10) ZstdExportTracesLarge/proto-12 7.668µ ± 2% 4.009µ ± 2% -47.72% (p=0.000 n=10) ZstdExportTracesLarge/json-12 25.62µ ± 3% 23.70µ ± 3% -7.50% (p=0.000 n=10) ZstdExportLogs/proto-12 8.903µ ± 1% 3.697µ ± 6% -58.48% (p=0.000 n=10) ZstdExportLogs/json-12 21.62µ ± 3% 19.43µ ± 2% -10.11% (p=0.000 n=10) ZstdExportLevels/level_0-12 7.833µ ± 9% 3.998µ ± 6% -48.95% (p=0.000 n=10) ZstdExportLevels/level_3-12 7.630µ ± 3% 4.043µ ± 3% -47.02% (p=0.000 n=10) ZstdExportLevels/level_7-12 7.630µ ± 3% 4.046µ ± 3% -46.97% (p=0.000 n=10) ZstdExportLevels/level_12-12 7.727µ ± 2% 4.107µ ± 2% -46.85% (p=0.000 n=10) ZstdExportLevels/level_22-12 7.616µ ± 1% 8.629µ ± 4% +13.30% (p=0.000 n=10) geomean 9.288µ 5.433µ -41.50% │ main.txt │ fg.txt │ │ output-bytes/op │ output-bytes/op vs base │ ZstdExportTraces/proto-12 231.00 ± 0% 12.00 ± 0% -94.81% (p=0.000 n=10) ZstdExportTraces/json-12 335.00 ± 0% 16.00 ± 0% -95.22% (p=0.000 n=10) ZstdExportTracesLarge/proto-12 234.00 ± 0% 12.00 ± 0% -94.87% (p=0.000 n=10) ZstdExportTracesLarge/json-12 339.00 ± 0% 17.01 ± 0% -94.98% (p=0.000 n=10) ZstdExportLogs/proto-12 220.00 ± 0% 12.00 ± 0% -94.55% (p=0.000 n=10) ZstdExportLogs/json-12 333.00 ± 0% 17.01 ± 0% -94.89% (p=0.000 n=10) ZstdExportLevels/level_0-12 234.00 ± 0% 12.00 ± 0% -94.87% (p=0.000 n=10) ZstdExportLevels/level_3-12 234.00 ± 0% 12.00 ± 0% -94.87% (p=0.000 n=10) ZstdExportLevels/level_7-12 234.00 ± 0% 12.00 ± 0% -94.87% (p=0.000 n=10) ZstdExportLevels/level_12-12 234.00 ± 0% 12.00 ± 0% -94.87% (p=0.000 n=10) ZstdExportLevels/level_22-12 234.00 ± 0% 12.00 ± 0% -94.87% (p=0.000 n=10) geomean 256.4 13.12 -94.88% │ main.txt │ fg.txt │ │ B/op │ B/op vs base │ ZstdExportTraces/proto-12 880.0 ± 0% 663.0 ± 0% -24.66% (p=0.000 n=10) ZstdExportTraces/json-12 2.578Ki ± 0% 1.308Ki ± 0% -49.26% (p=0.000 n=10) ZstdExportTracesLarge/proto-12 6.234Ki ± 0% 6.056Ki ± 0% -2.87% (p=0.000 n=10) ZstdExportTracesLarge/json-12 20.36Ki ± 0% 11.09Ki ± 0% -45.52% (p=0.000 n=10) ZstdExportLogs/proto-12 5.469Ki ± 0% 5.301Ki ± 0% -3.07% (p=0.000 n=10) ZstdExportLogs/json-12 16.923Ki ± 0% 8.849Ki ± 0% -47.71% (p=0.000 n=10) ZstdExportLevels/level_0-12 6.234Ki ± 0% 6.056Ki ± 0% -2.86% (p=0.000 n=10) ZstdExportLevels/level_3-12 6.234Ki ± 0% 6.027Ki ± 0% -3.32% (p=0.000 n=10) ZstdExportLevels/level_7-12 6.234Ki ± 0% 6.056Ki ± 0% -2.87% (p=0.000 n=10) ZstdExportLevels/level_12-12 6.234Ki ± 0% 6.057Ki ± 0% -2.84% (p=0.000 n=10) ZstdExportLevels/level_22-12 6.234Ki ± 0% 6.121Ki ± 0% -1.82% (p=0.000 n=10) geomean 5.790Ki 4.648Ki -19.73% │ main.txt │ fg.txt │ │ allocs/op │ allocs/op vs base │ ZstdExportTraces/proto-12 3.000 ± 0% 2.000 ± 0% -33.33% (p=0.000 n=10) ZstdExportTraces/json-12 11.00 ± 0% 10.00 ± 0% -9.09% (p=0.000 n=10) ZstdExportTracesLarge/proto-12 3.000 ± 0% 2.000 ± 0% -33.33% (p=0.000 n=10) ZstdExportTracesLarge/json-12 65.00 ± 0% 64.00 ± 0% -1.54% (p=0.000 n=10) ZstdExportLogs/proto-12 3.000 ± 0% 2.000 ± 0% -33.33% (p=0.000 n=10) ZstdExportLogs/json-12 25.00 ± 0% 24.00 ± 0% -4.00% (p=0.000 n=10) ZstdExportLevels/level_0-12 3.000 ± 0% 2.000 ± 0% -33.33% (p=0.000 n=10) ZstdExportLevels/level_3-12 3.000 ± 0% 2.000 ± 0% -33.33% (p=0.000 n=10) ZstdExportLevels/level_7-12 3.000 ± 0% 2.000 ± 0% -33.33% (p=0.000 n=10) ZstdExportLevels/level_12-12 3.000 ± 0% 2.000 ± 0% -33.33% (p=0.000 n=10) ZstdExportLevels/level_22-12 3.000 ± 0% 2.000 ± 0% -33.33% (p=0.000 n=10) geomean 5.415 3.977 -26.56%Documentation
WIP