From 7da986ff0f2cf8eebd6e614a38a2c4e17b59b76c Mon Sep 17 00:00:00 2001
From: Marc Lopez Rubio
Date: Tue, 1 Apr 2025 14:36:28 +0800
Subject: [PATCH] Add APM Server known issue for TBS

Signed-off-by: Marc Lopez Rubio
---
 docs/en/observability/apm/known-issues.asciidoc | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/docs/en/observability/apm/known-issues.asciidoc b/docs/en/observability/apm/known-issues.asciidoc
index 94e8828871..53c85cc0af 100644
--- a/docs/en/observability/apm/known-issues.asciidoc
+++ b/docs/en/observability/apm/known-issues.asciidoc
@@ -21,6 +21,17 @@ _Versions: XX.XX.XX, YY.YY.YY, ZZ.ZZ.ZZ_
 // If applicable, link to fix
 ////
 
+[discrete]
+== Tail Sampling may not compact or expire buffered traces as quickly as desired, causing increased storage usage
+
+_Elastic Stack versions: 8.0.0 and later, before 9.0.0_
+
+Issues in the Tail Sampling implementation in versions 8.0.0 and later, before 9.0.0, may prevent buffered traces from being compacted or expired as quickly as desired. As a result, storage usage can stay elevated for longer than the default 30m TTL.
+
+This may manifest in two ways: increased value log (vlog) file size and increased SST (LSM) file size. LSM growth and late compaction are particularly troublesome given how the underlying K/V database performs compactions on its layers. LSM growth is noticeable for use cases where traces are under 1KB in size, since they are written to the LSM layer directly.
+
+This issue is fixed in 9.0.0 by a re-implementation of how the underlying tail sampling databases are used. The new implementation uses a more efficient partitioning scheme, which allows traces to be expired more efficiently.
+
 [discrete]
 == APM Server v8.6.x and prior with Elasticsearch v8.15.x and later has broken APM UI