6403 influxdb3 perf tuning #6420

jstirnaman · 2025-09-25T20:13:31Z

Closes #6403
This is the "release" branch that started out as PR #6414.
I'm splitting it apart to make it easier to review, but I'll release the related PRs together b/c the pages contain lots of cross-linking.

- Remove --verbose from global flags (it's serve-specific)
- Document --num-io-threads as global-only flag
- Add clear examples showing correct flag positioning
- Update serve.md files with global flag usage notes
- Fix config-options.md to separate Core/Enterprise examples

Resolves incorrect CLI usage patterns that would cause errors. Global flags must go before 'serve', serve-specific flags go after.
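The positioning rule can be illustrated with a minimal Python sketch that assembles an argument list. The helper name `build_influxdb3_argv` and the thread count of 4 are assumptions made for illustration. The result is not a complete `serve` invocation, since a real server start likely needs additional options not covered here.

```python
def build_influxdb3_argv(global_flags, serve_flags):
    """Return an argv list with global flags before `serve` and serve flags after."""
    return ["influxdb3", *global_flags, "serve", *serve_flags]


# --num-io-threads is global (goes before `serve`);
# --verbose is serve-specific (goes after `serve`).
# The value 4 is an arbitrary illustrative thread count.
argv = build_influxdb3_argv(["--num-io-threads", "4"], ["--verbose"])
print(" ".join(argv))
```

Keeping the two flag groups as separate parameters makes it hard to place a flag on the wrong side of the subcommand by accident.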

- Remove detailed Tokio runtime options tables from CLI index pages
- Replace with simplified global options and link to config-options
- Add examples showing correct global flag positioning
- Fix --verbose usage to be serve-specific (after serve command)
- Add --num-io-threads example as global flag (before serve command)

These detailed options are now documented in config-options.md with proper global vs serve-specific categorization.

… reference doc for /metrics output.

- Add monitoring guide:
  - Core and general metrics
  - Enterprise cluster and node-specific metrics
  - Using metrics and relabeling using Prometheus or Telegraf.

…idation

…uthentication

- Add system:metrics:read token creation examples to Enterprise token docs
- Document both CLI and HTTP API approaches for creating metrics tokens
- Clarify that Enterprise supports both admin and fine-grained tokens for /metrics
- Add node identification explanation to monitoring documentation

Addresses PR #6422 review comments about authentication and token permissions.

Test validation: All examples validated with InfluxDB 3.5.0 Enterprise.
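For readers who want to see how a metrics token might be presented to the `/metrics` endpoint, here is a rough sketch using only the Python standard library. The host, port, token placeholder, and the helper name `build_metrics_request` are assumptions, as is the use of an `Authorization: Bearer` header. The request is only constructed here, not sent.

```python
import urllib.request


def build_metrics_request(base_url, token):
    """Build a GET request for /metrics that carries a bearer token."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/metrics",
        headers={"Authorization": f"Bearer {token}"},
    )


# Hypothetical host, port, and token value; substitute your own.
request = build_metrics_request("http://localhost:8181", "METRICS_TOKEN")
# To send it against a running server:
#   body = urllib.request.urlopen(request).read().decode()
```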

@sanderson

Addresses @sanderson's comment: "Do irrelevant metrics still get reported? Do all nodes report the same metric, no matter what mode they're running in?"

## Changes

### Documentation Updates

**content/shared/influxdb3-reference/metrics.md:**
- Add "Metrics reporting across node modes" section under cluster considerations
- Explain that all nodes report the same 120 metrics regardless of mode
- Clarify differences appear in values/labels, not metric availability
- Remove mention of HTTP/gRPC metrics appearing dynamically (less relevant)

**content/shared/influxdb3-admin/monitor-metrics.md:**
- Add Note callout in "Metric categories" section
- Provide same clarifications in more prominent location
- Simplify explanation for better readability

### Testing Configuration

**compose.yaml:**
- Add specialized Enterprise nodes for testing:
  - influxdb3-enterprise-write (mode: ingest, port 8183)
  - influxdb3-enterprise-query (mode: query, port 8184)
- Fix port conflicts between specialized nodes
- Enable validation of metrics behavior across node modes

## Test Results

Validated with running Enterprise nodes in different modes:
- All nodes expose same 120 unique metrics
- Metrics not filtered by node specialization
- Metric values reflect actual node activity
- Confirmed standard Prometheus behavior

See .context/issues/pr-6422-comment-responses.md for detailed test results.
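One way to reproduce this kind of comparison is to collect the metric names from each node's `/metrics` output and compare the sets. The sketch below is not the validation script used for this PR. It assumes the Prometheus text exposition format, assumes each metric family has a `# TYPE` line, and uses made-up sample payloads with hypothetical metric names in place of real node output.

```python
def metric_names(exposition_text):
    """Collect unique metric names from `# TYPE <name> <type>` lines."""
    names = set()
    for line in exposition_text.splitlines():
        parts = line.split()
        if len(parts) >= 4 and parts[0] == "#" and parts[1] == "TYPE":
            names.add(parts[2])
    return names


# Made-up sample payloads standing in for two nodes' /metrics output.
ingest_node = """\
# HELP example_requests_total Hypothetical counter.
# TYPE example_requests_total counter
example_requests_total{path="/write"} 42
# TYPE example_queue_depth gauge
example_queue_depth 3
"""
query_node = """\
# TYPE example_requests_total counter
example_requests_total{path="/query"} 7
# TYPE example_queue_depth gauge
example_queue_depth 0
"""

# Same names on both nodes, even though values and labels differ.
same_names = metric_names(ingest_node) == metric_names(query_node)
print(same_names, len(metric_names(ingest_node)))
```

Against live nodes, the sample strings would be replaced by the response bodies fetched from each node's `/metrics` endpoint.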

@sanderson

…ection

Addresses @sanderson's comment: "Why not suggest users use Telegraf to collect these metrics and store them in another InfluxDB instance rather than Prometheus?"

## Changes

### Enterprise Monitoring Setup

**Before:** Prometheus configuration appeared first
**After:** Telegraf configuration with "(recommended)" label appears first

**New Telegraf section includes:**
- Complete configuration with `outputs.influxdb_v3` for storing in monitoring instance
- `inputs.prometheus` for scraping cluster node metrics
- `processors.regex` for extracting node_name and node_role from URLs
- Start commands for running Telegraf as a service
- SQL query examples for analyzing collected metrics in InfluxDB

**Prometheus section:**
- Moved to "Alternative: Prometheus configuration"
- Retained for users preferring Prometheus ecosystem
- Includes separate "Add node identification with Prometheus" section

### Core Monitoring Setup

**Before:** Only Prometheus configuration shown
**After:** Telegraf appears first with "(recommended)" label

**New sections:**
- "Collect metrics with Telegraf (recommended)" with complete config
- "Alternative: Prometheus configuration" for Prometheus users
- SQL query examples for monitoring InfluxDB 3 Core metrics

## Benefits

1. **InfluxDB-native workflow**: Collect InfluxDB metrics → Store in InfluxDB → Query with SQL
2. **Consistent tooling**: Users already familiar with Telegraf for data collection
3. **SQL queries**: Natural fit for InfluxDB users vs learning PromQL
4. **Centralized monitoring**: Store metrics in separate InfluxDB instance
5. **Platform agnostic**: Telegraf runs anywhere without Prometheus infrastructure

## Documentation Coverage

- ✅ Complete Telegraf configurations for both Core and Enterprise
- ✅ Node identification through processor plugins
- ✅ SQL query examples for common monitoring scenarios
- ✅ Prometheus approach retained as alternative
- ✅ Clear "(recommended)" and "Alternative" labels throughout

Addresses PR #6422 comment 4.
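The idea of deriving node_name and node_role from a scrape URL can be sketched in Python. This is an analog of the technique, not the Telegraf `processors.regex` configuration added in the PR. The URL shape, the port, and the convention that a node's hostname ends in `-<role>` are all assumptions chosen for illustration.

```python
import re

# Assumed URL shape: http(s)://<node_name>[:<port>]/..., where the node name
# ends in "-<role>". This naming convention is hypothetical.
NODE_URL = re.compile(
    r"^https?://(?P<node_name>[^:/]+?-(?P<node_role>[a-z]+))(?::\d+)?/"
)


def node_tags(url):
    """Return node_name and node_role parsed from a scrape URL, or {} if no match."""
    match = NODE_URL.match(url)
    return match.groupdict() if match else {}


tags = node_tags("http://influxdb3-enterprise-query:8181/metrics")
print(tags)
```

URLs that do not follow the assumed naming convention, such as a plain `localhost` address, yield an empty dictionary rather than a guessed role.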

…trics

- Split TOC into separate Core and Enterprise versions using show-in shortcodes
- Core TOC focuses on single-node monitoring workflows
- Enterprise TOC includes cluster-specific and node-specific monitoring sections
- Improves navigation by showing only relevant sections per product
- Fix: Remove duplicate "InfluxDB" word in metrics.md

Influxdb3 monitor metrics

jstirnaman added 2 commits September 25, 2025 13:56

jstirnaman self-assigned this Sep 25, 2025

jstirnaman added the InfluxDB 3 Core and Enterprise label Sep 25, 2025

feat(influxdb3): Core and Ent. metrics reference and monitoring.- Add…

113ac16

… reference doc for /metrics output.

- Add monitoring guide:
  - Core and general metrics
  - Enterprise cluster and node-specific metrics
  - Using metrics and relabeling using Prometheus or Telegraf.

jstirnaman mentioned this pull request Sep 26, 2025

Influxdb3 monitor metrics #6422

Merged

jstirnaman added 9 commits September 29, 2025 14:59

Merge branch 'master' into 6403-influxdb3-perf-tuning

d79a6d7

test: Use custom influxdb3 image - Docker Hub doesn't have latest ARM64

e8ccfb2

test(influxdb3): Add metrics output and Prometheus tests for docs val…

3182f8b

…idation

docs(influxdb3): Provide Telegraf config for Core and Enterprise cluster

7d8f3e6

Merge pull request #6422 from influxdata:influxdb3-monitor-metrics

f14c244

Influxdb3 monitor metrics
