feat(telemetry): enrich daily ping with gap detection and quality metrics #1001

Open
pszymkowiak wants to merge 3 commits into develop from feat/telemetry-enrichment


pszymkowiak (Collaborator) commented Apr 3, 2026

Summary

Enrich the anonymous daily telemetry ping with product-piloting metrics. All data is aggregate counts or anonymized — no file paths, no arguments, no personal data.

New fields (21 total, up from 10)

| Category | Fields | Purpose |
| --- | --- | --- |
| Quality | passthrough_top, parse_failures_24h, low_savings_commands, avg_savings_per_command | Identify missing filters and weak ones |
| Retention | first_seen_days, active_days_30d, commands_total | Engagement and churn signals |
| Ecosystem | ecosystem_mix | Prioritize filter development (git 45%, cargo 20%, etc.) |
| Economics | tokens_saved_30d, estimated_savings_usd_30d | Quantify value delivered ($5/Mtok estimate) |
| Adoption | hook_type, custom_toml_filters | AI agent coverage, DSL adoption |
| Config | has_config_toml, exclude_commands_count, projects_count | User maturity and customization |
| Features | meta_usage | Which RTK features are used (gain, discover, proxy, etc.) |
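The Economics row turns a token count into a dollar figure at the stated $5/Mtok estimate. A minimal sketch of that arithmetic follows; the helper name `estimated_savings_usd`, the constant name, and the `u64`/`f64` types are assumptions for illustration and are not taken from this PR's code.

```rust
/// Price assumed from the PR description: $5 per million tokens.
const USD_PER_MILLION_TOKENS: f64 = 5.0;

/// Hypothetical helper: converts a 30-day token-savings count into USD.
fn estimated_savings_usd(tokens_saved_30d: u64) -> f64 {
    // Scale to millions of tokens first, then apply the per-million price.
    tokens_saved_30d as f64 / 1_000_000.0 * USD_PER_MILLION_TOKENS
}
```

For instance, 2,000,000 tokens saved would map to an estimated $10 under this assumption.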

Files changed

  • src/core/tracking.rs — 10 new query methods (first_seen_days, active_days_30d, commands_total, ecosystem_mix, tokens_saved_30d, projects_count + previous 4)
  • src/core/telemetry.rs — EnrichedStats struct with all fields, detect_hook_type, detect_has_config, count_exclude_commands, count_custom_toml_filters, build_meta_usage, categorize_command
  • README.md — Updated privacy section with full field-by-field table explaining what is collected and why
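To illustrate how categorize_command and ecosystem_mix might fit together, the sketch below buckets commands by their first word and reports each bucket's percentage share. The category table, the function signatures, and the rounding are all assumptions made for illustration; the actual code in src/core/telemetry.rs and src/core/tracking.rs may differ.

```rust
use std::collections::BTreeMap;

/// Hypothetical categorizer: maps a command's first word to an ecosystem label.
fn categorize_command(command: &str) -> &'static str {
    match command.split_whitespace().next().unwrap_or("") {
        "git" | "gh" => "git",
        "cargo" | "rustc" => "cargo",
        "npm" | "pnpm" | "yarn" => "js",
        _ => "other",
    }
}

/// Percentage share of each category, rounded to the nearest whole percent.
/// Returns an empty map when no commands were recorded.
fn ecosystem_mix(commands: &[&str]) -> BTreeMap<&'static str, u32> {
    let mut counts: BTreeMap<&'static str, u32> = BTreeMap::new();
    for cmd in commands {
        *counts.entry(categorize_command(cmd)).or_insert(0) += 1;
    }
    let total = commands.len() as f64;
    counts
        .into_iter()
        .map(|(category, n)| (category, (n as f64 / total * 100.0).round() as u32))
        .collect()
}
```

Only the category labels and their shares leave this function, which matches the "aggregate counts" principle described in the summary.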

Privacy

README.md now has a detailed table showing every collected field and its purpose. All new fields follow the same principles: aggregate counts, anonymized command names (first 3 words), no file paths, no arguments, no user data.
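The "first 3 words" rule could be sketched as below. The helper name `anonymize_command` is hypothetical, and this is only one plausible reading of the rule, not the PR's implementation.

```rust
/// Hypothetical helper: keeps at most the first three whitespace-separated
/// words of a command line and drops everything after them.
fn anonymize_command(raw: &str) -> String {
    raw.split_whitespace().take(3).collect::<Vec<_>>().join(" ")
}
```

Plain truncation like this keeps whatever happens to sit in the first three positions, so a sketch of this kind would not by itself guarantee that short commands carry no arguments.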

Test plan

  • cargo fmt --all — clean
  • cargo clippy --all-targets — no new warnings
  • cargo test telemetry — 13 tests passed (3 new)
  • cargo test --all — 1256 passed

…rics

Add 6 new fields to the anonymous daily telemetry ping to help identify
which commands need filters and which filters need improvement:

- passthrough_top: top 5 commands with 0% savings (missing filters)
- parse_failures_24h: count of parse failures (filter fragility)
- low_savings_commands: commands averaging <30% savings (weak filters)
- avg_savings_per_command: unweighted average savings
- hook_type: which AI agent hook is installed (claude/gemini/codex/etc)
- custom_toml_filters: count of user-defined TOML filter files

New tracking.rs queries: top_passthrough(), parse_failures_since(),
low_savings_commands(), avg_savings_per_command().

Signed-off-by: Patrick szymkowiak <[email protected]>

Extend the daily anonymous ping with product-piloting metrics:

Retention: first_seen_days, active_days_30d, commands_total
Ecosystem: ecosystem_mix (category distribution percentages)
Economics: tokens_saved_30d, estimated_savings_usd_30d
Config: has_config_toml, exclude_commands_count, projects_count
Features: meta_usage (gain, discover, proxy, verify, learn counts)

Update README.md privacy section with full field-by-field table
explaining what is collected and why it helps improve RTK.

Signed-off-by: Patrick szymkowiak <[email protected]>

Comprehensive telemetry documentation covering:
- Why we collect (roadmap prioritization, filter quality, value measurement)
- How it works (daily ping, background thread, fire-and-forget)
- Every field with example values and purpose
- What is NOT collected (explicit exclusion list)
- Opt-out instructions
- Data handling and privacy guarantees
- Contributor guide for adding new fields

Link added from README.md privacy section.

Signed-off-by: Patrick szymkowiak <[email protected]>

aeppling (Contributor) commented Apr 3, 2026

Could you move the new TELEMETRY.md documentation into the docs/usage/ folder, please?

This will match the docs folder reorganization in PR #978.

aeppling (Contributor) commented Apr 3, 2026

Possible data accuracy issues

1. top_passthrough SQL condition is wrong

WHERE input_tokens = 0 AND output_tokens = 0
Passthrough commands (proxy mode) are tracked with input = output (both non-zero, savings = 0%). This condition instead matches records where token tracking failed entirely, so it will report the wrong commands as "missing filters."
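The distinction can be sketched in Rust with a simplified stand-in for a tracked row. The struct, the enum, and `classify` are hypothetical names used only to illustrate the two conditions; they are not part of the PR.

```rust
/// Simplified stand-in for one tracked command row
/// (field names follow the SQL columns quoted above).
struct TokenRecord {
    input_tokens: u64,
    output_tokens: u64,
}

#[derive(Debug, PartialEq)]
enum RecordKind {
    /// Both counts are zero: nothing was tracked for this row.
    Untracked,
    /// Output equals a non-zero input: 0% savings, i.e. passthrough.
    Passthrough,
    /// Output differs from input, so the token count changed.
    Filtered,
}

fn classify(record: &TokenRecord) -> RecordKind {
    if record.input_tokens == 0 && record.output_tokens == 0 {
        RecordKind::Untracked
    } else if record.input_tokens == record.output_tokens {
        RecordKind::Passthrough
    } else {
        RecordKind::Filtered
    }
}
```

Under this reading, the quoted WHERE clause selects the `Untracked` rows, while the "missing filters" metric would want the `Passthrough` rows.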

2. parse_failures_since assumes a parse_failures table exists

SELECT COUNT(*) FROM parse_failures WHERE timestamp >= ?1
If this table doesn't exist in every deployed database, the query fails but falls back to 0 via unwrap_or(0), so it silently always reports 0 for affected users.

3. projects_count assumes project_path column exists

SELECT COUNT(DISTINCT project_path) FROM commands WHERE project_path != ''
Same pattern: the query fails silently and reports 0 if the column doesn't exist in older database schemas.
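One way to keep "query failed" distinct from "query returned 0" in points 2 and 3 is to carry an `Option` instead of collapsing errors to zero. The sketch below uses a placeholder error type and hypothetical function names; it does not reproduce the PR's query code or any database library's API.

```rust
/// Placeholder for a database error such as "no such table" or "no such column".
#[derive(Debug)]
struct QueryError;

/// Collapses every failure into 0, which is what `unwrap_or(0)` does.
fn count_or_zero(result: Result<u64, QueryError>) -> u64 {
    result.unwrap_or(0)
}

/// Keeps a failed query (`None`) distinct from a successful count of zero (`Some(0)`).
fn count_or_unknown(result: Result<u64, QueryError>) -> Option<u64> {
    result.ok()
}
```

With the second form, a consumer of the ping could tell users on an older schema apart from users who genuinely have zero parse failures or projects.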

aeppling (Contributor) commented Apr 3, 2026

@pszymkowiak
