6403 influxdb3 perf tuning #6420
Open
jstirnaman wants to merge 12 commits into master from 6403-influxdb3-perf-tuning
- Remove --verbose from global flags (it's serve-specific)
- Document --num-io-threads as global-only flag
- Add clear examples showing correct flag positioning
- Update serve.md files with global flag usage notes
- Fix config-options.md to separate Core/Enterprise examples

Resolves incorrect CLI usage patterns that would cause errors. Global flags must go before 'serve', serve-specific flags go after.

- Remove detailed Tokio runtime options tables from CLI index pages
- Replace with simplified global options and link to config-options
- Add examples showing correct global flag positioning
- Fix --verbose usage to be serve-specific (after serve command)
- Add --num-io-threads example as global flag (before serve command)

These detailed options are now documented in config-options.md with proper global vs serve-specific categorization.

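The positioning rule from the two commits above (global flags before `serve`, serve-specific flags after) can be sketched as a small helper that assembles the argument list. This is only an illustration: the helper name `build_influxdb3_command` and the value `8` are made up here, and a real `serve` invocation needs further options that are omitted.

```python
import shlex


def build_influxdb3_command(global_flags, serve_flags, binary="influxdb3"):
    """Return an argv list with global flags placed before `serve`
    and serve-specific flags placed after it."""
    return [binary, *global_flags, "serve", *serve_flags]


# The thread count is a placeholder value chosen for illustration.
argv = build_influxdb3_command(
    global_flags=["--num-io-threads", "8"],  # global-only flag: before `serve`
    serve_flags=["--verbose"],               # serve-specific flag: after `serve`
)
print(shlex.join(argv))  # influxdb3 --num-io-threads 8 serve --verbose
```

Keeping the two flag groups as separate arguments makes it hard to place a flag on the wrong side of `serve` by accident.
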
… reference doc for /metrics output.
- Add monitoring guide:
  - Core and general metrics
  - Enterprise cluster and node-specific metrics
  - Using metrics and relabeling using Prometheus or Telegraf.

…uthentication

- Add system:metrics:read token creation examples to Enterprise token docs
- Document both CLI and HTTP API approaches for creating metrics tokens
- Clarify that Enterprise supports both admin and fine-grained tokens for /metrics
- Add node identification explanation to monitoring documentation

Addresses PR #6422 review comments about authentication and token permissions.

Test validation: All examples validated with InfluxDB 3.5.0 Enterprise.

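As a rough sketch of how a token with metrics read permission might be used once created, the snippet below builds (but does not send) an authenticated request for `/metrics`. The base URL, port, and token string are placeholders, and the `Bearer` authorization scheme is an assumption to check against the token docs this commit updates.

```python
import urllib.request


def build_metrics_request(base_url, token):
    """Build a GET request for the /metrics endpoint with a bearer token."""
    url = base_url.rstrip("/") + "/metrics"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


# Placeholder host, port, and token; replace with real values before scraping.
req = build_metrics_request("http://localhost:8181", "YOUR_METRICS_TOKEN")

# To actually scrape the endpoint:
#   body = urllib.request.urlopen(req).read().decode()
```
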
Addresses @sanderson's comment: "Do irrelevant metrics still get reported? Do all nodes report the same metric, no matter what mode they're running in?"

## Changes

### Documentation Updates

**content/shared/influxdb3-reference/metrics.md:**
- Add "Metrics reporting across node modes" section under cluster considerations
- Explain that all nodes report the same 120 metrics regardless of mode
- Clarify differences appear in values/labels, not metric availability
- Remove mention of HTTP/gRPC metrics appearing dynamically (less relevant)

**content/shared/influxdb3-admin/monitor-metrics.md:**
- Add Note callout in "Metric categories" section
- Provide same clarifications in more prominent location
- Simplify explanation for better readability

### Testing Configuration

**compose.yaml:**
- Add specialized Enterprise nodes for testing:
  - influxdb3-enterprise-write (mode: ingest, port 8183)
  - influxdb3-enterprise-query (mode: query, port 8184)
- Fix port conflicts between specialized nodes
- Enable validation of metrics behavior across node modes

## Test Results

Validated with running Enterprise nodes in different modes:
- All nodes expose same 120 unique metrics
- Metrics not filtered by node specialization
- Metric values reflect actual node activity
- Confirmed standard Prometheus behavior

See .context/issues/pr-6422-comment-responses.md for detailed test results.

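One way to repeat the "same metrics on every node" check described above is to compare the metric family names that two nodes expose. The sketch below assumes the count is taken from `# TYPE` lines in the Prometheus text exposition format, which may not be how the 120 figure was derived. The sample payloads and metric names are invented for illustration and are not real InfluxDB 3 metrics.

```python
def metric_families(exposition_text):
    """Collect metric family names from `# TYPE <name> <type>` lines."""
    names = set()
    for line in exposition_text.splitlines():
        parts = line.split()
        if len(parts) >= 4 and parts[0] == "#" and parts[1] == "TYPE":
            names.add(parts[2])
    return names


# Invented sample payloads standing in for two nodes' /metrics output.
ingest_node = """\
# HELP example_requests_total Illustrative counter.
# TYPE example_requests_total counter
example_requests_total{path="/write"} 42
# TYPE example_queries_total counter
example_queries_total 0
"""
query_node = """\
# TYPE example_requests_total counter
example_requests_total{path="/write"} 0
# TYPE example_queries_total counter
example_queries_total 17
"""

same_names = metric_families(ingest_node) == metric_families(query_node)
print(same_names, len(metric_families(ingest_node)))  # True 2
```

In this toy data both nodes expose the same names and differ only in values, which mirrors the behavior the test results report.
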
…ection

Addresses @sanderson's comment: "Why not suggest users use Telegraf to collect these metrics and store them in another InfluxDB instance rather than Prometheus?"

## Changes

### Enterprise Monitoring Setup

**Before:** Prometheus configuration appeared first
**After:** Telegraf configuration with "(recommended)" label appears first

**New Telegraf section includes:**
- Complete configuration with `outputs.influxdb_v3` for storing in monitoring instance
- `inputs.prometheus` for scraping cluster node metrics
- `processors.regex` for extracting node_name and node_role from URLs
- Start commands for running Telegraf as a service
- SQL query examples for analyzing collected metrics in InfluxDB

**Prometheus section:**
- Moved to "Alternative: Prometheus configuration"
- Retained for users preferring Prometheus ecosystem
- Includes separate "Add node identification with Prometheus" section

### Core Monitoring Setup

**Before:** Only Prometheus configuration shown
**After:** Telegraf appears first with "(recommended)" label

**New sections:**
- "Collect metrics with Telegraf (recommended)" with complete config
- "Alternative: Prometheus configuration" for Prometheus users
- SQL query examples for monitoring InfluxDB 3 Core metrics

## Benefits

1. **InfluxDB-native workflow**: Collect InfluxDB metrics → Store in InfluxDB → Query with SQL
2. **Consistent tooling**: Users already familiar with Telegraf for data collection
3. **SQL queries**: Natural fit for InfluxDB users vs learning PromQL
4. **Centralized monitoring**: Store metrics in separate InfluxDB instance
5. **Platform agnostic**: Telegraf runs anywhere without Prometheus infrastructure

## Documentation Coverage

- ✅ Complete Telegraf configurations for both Core and Enterprise
- ✅ Node identification through processor plugins
- ✅ SQL query examples for common monitoring scenarios
- ✅ Prometheus approach retained as alternative
- ✅ Clear "(recommended)" and "Alternative" labels throughout

Addresses PR #6422 comment 4.

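The node identification step (extracting node_name and node_role from scrape URLs) can be illustrated outside Telegraf with an ordinary regular expression. The sketch below assumes a hypothetical hostname convention of `<role>-<number>`, for example `ingest-01`. Neither the convention nor the pattern comes from the documented Telegraf configuration, so adapt both to the real node URLs.

```python
import re

# Hypothetical convention: http://<role>-<number>[:port]/...
URL_PATTERN = re.compile(
    r"^https?://(?P<node_name>(?P<node_role>[a-z]+)-\d+)(?::\d+)?/"
)


def node_tags(url):
    """Return node_name and node_role tags parsed from a scrape URL,
    or an empty dict when the URL does not follow the assumed convention."""
    match = URL_PATTERN.match(url)
    if match is None:
        return {}
    return {
        "node_name": match.group("node_name"),
        "node_role": match.group("node_role"),
    }


print(node_tags("http://ingest-01:8181/metrics"))
# {'node_name': 'ingest-01', 'node_role': 'ingest'}
```

Returning an empty dict for non-matching URLs leaves such series untagged instead of attaching a wrong role.
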
…trics

- Split TOC into separate Core and Enterprise versions using show-in shortcodes
- Core TOC focuses on single-node monitoring workflows
- Enterprise TOC includes cluster-specific and node-specific monitoring sections
- Improves navigation by showing only relevant sections per product
- Fix: Remove duplicate "InfluxDB" word in metrics.md

Influxdb3 monitor metrics
Closes #6403
This is the "release" branch that started out as PR #6414.
I'm splitting it apart to make it easier to review, but I'll release the related PRs together because the pages cross-link to each other heavily.