feat(google-generativeai): Add metrics support #3506
base: main
Conversation
📝 Walkthrough

Adds optional metrics to the Google Generative AI instrumentation: creates token and duration histograms when a meter provider is present and metrics are enabled, threads those histograms through wrappers and handlers, records token counts and request durations, and exposes `is_metrics_enabled()` and a helper to create the metrics.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Instr as Instrumentor
    participant Meter as MeterProvider / Meter
    participant Wrapper as Wrapper (sync/async)
    participant Client as GenAI Client
    participant SpanUtil as Span Utils
    Instr->>Meter: get_meter() / _create_metrics() (if meter provider && metrics enabled)
    Meter-->>Instr: token_histogram, duration_histogram
    Instr->>Wrapper: create wrapper (pass histograms)
    Wrapper->>Wrapper: start timer (if duration_histogram)
    Wrapper->>Client: send request / stream
    Client-->>Wrapper: response / chunks
    Wrapper->>SpanUtil: set_model_response_attributes(span, response, model, token_histogram)
    SpanUtil->>Meter: record prompt/output token counts (if token_histogram)
    Wrapper->>Meter: record elapsed time to duration_histogram (if provided)
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Caution

Changes requested ❌

Reviewed everything up to 012121e in 2 minutes and 29 seconds.
- Reviewed 294 lines of code in 2 files
- Skipped 0 files when reviewing
- Skipped posting 2 draft comments; view those below
1. packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py:308
   - Draft comment: Clarify default behavior: `is_metrics_enabled()` defaults to true if the env var is unset. Confirm whether this is intentional or whether false should be the default.
   - Reason this comment was not posted: comment looked like it was already resolved.
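The default being questioned can be sketched as follows. This is a hypothetical reconstruction based on the review's description (an env var `TRACELOOP_METRICS_ENABLED` that defaults to true when unset), not the PR's actual code:

```python
import os


def is_metrics_enabled() -> bool:
    # Defaults to enabled when TRACELOOP_METRICS_ENABLED is unset,
    # which is exactly the behavior the draft comment asks to confirm.
    return (os.getenv("TRACELOOP_METRICS_ENABLED") or "true").lower() == "true"
```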
2. packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py:481
   - Draft comment: Minor style note: the indentation of the second `token_histogram.record` call is inconsistent with the first call. Consider aligning them for readability.
   - Reason this comment was not posted: confidence of 33% was below the 50% threshold.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (1)
348-378: `NameError` when metrics are disabled.

`token_histogram` and `duration_histogram` are only defined inside the `if is_metrics_enabled():` block at lines 351-352, but `wrapper_args` at line 366 references them unconditionally. When `TRACELOOP_METRICS_ENABLED=false`, these variables will be undefined, causing a `NameError`.

Apply this diff to initialize the histograms to `None` before the conditional:

```diff
 meter_provider = kwargs.get("meter_provider")
 meter = get_meter(__name__, __version__, meter_provider)
+token_histogram = None
+duration_histogram = None
 if is_metrics_enabled():
     token_histogram, duration_histogram = _create_metrics(meter)
```
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (1)
39-39: Unused import `Meter`.

`Meter` is imported but only `get_meter` is used. The type hint on `_create_metrics` could use it, but currently it's just documentation.

```diff
-from opentelemetry.metrics import Meter, get_meter
+from opentelemetry.metrics import get_meter
```

Alternatively, if you want to keep it for the type annotation, that's also fine.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
- packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (12 hunks)
- packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules
Files:
- packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
- packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py
🧬 Code graph analysis (1)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (2)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (1)
- set_model_response_attributes (449-490)

packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/utils.py (2)
- should_emit_events (44-50)
- wrapper (23-33)
⏰ Context from checks skipped due to timeout of 90000ms (5)
- GitHub Check: Test Packages (3.11)
- GitHub Check: Build Packages (3.11)
- GitHub Check: Test Packages (3.12)
- GitHub Check: Test Packages (3.10)
- GitHub Check: Lint
Actionable comments posted: 0
♻️ Duplicate comments (1)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (1)
348-372: Undefined `token_histogram`/`duration_histogram` when metrics are disabled.

If `TRACELOOP_METRICS_ENABLED` is set to something other than `"true"`, `is_metrics_enabled()` returns `False` and the `_create_metrics` call is skipped. In that case, `token_histogram` and `duration_histogram` are never defined, but they are still passed in `wrapper_args`. This will raise an `UnboundLocalError` the first time `_instrument` runs with metrics disabled.

This matches the earlier review feedback and should be fixed by always initializing the variables before the `if is_metrics_enabled()` guard.

A minimal, backward-compatible fix:

```diff
 meter_provider = kwargs.get("meter_provider")
 meter = get_meter(__name__, __version__, meter_provider)
+
+token_histogram = None
+duration_histogram = None
+
 if is_metrics_enabled():
     token_histogram, duration_histogram = _create_metrics(meter)
```

Also, please double-check that `get_meter(__name__, __version__, meter_provider)` matches the OpenTelemetry API version you target; in many setups, a provided meter provider is used via `meter_provider.get_meter(...)`, and the third `get_meter` argument is a schema URL rather than a provider.

Verify the recommended way to obtain a `Meter` in the current OpenTelemetry Python metrics API: should instrumentations call `get_meter(__name__, __version__)` and rely on the global provider, or use a provided `meter_provider.get_meter(__name__, __version__)` when a custom provider is passed in?
🧹 Nitpick comments (2)

packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (1)

449-490: Token metrics integration in `set_model_response_attributes` looks correct.

The new `token_histogram` recording is properly gated by `span.is_recording()` and `hasattr(response, "usage_metadata")`, and it cleanly handles the `None` case for `token_histogram` while preserving existing span attributes and status behavior.

If you want to tidy later, you could avoid repeating `hasattr(response, "usage_metadata")` by grabbing `usage = getattr(response, "usage_metadata", None)` once and using it for both span attributes and histogram recording, but this is purely cosmetic.

packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (1)

308-325: Metrics helpers (`is_metrics_enabled`, `_create_metrics`) align with existing patterns.

The `is_metrics_enabled()` helper matches the existing Bedrock pattern (env var with `"true"` default), and `_create_metrics()` defines `LLM_TOKEN_USAGE` and `LLM_OPERATION_DURATION` histograms with sensible units and descriptions. Once the initialization bug in `_instrument` is fixed, these helpers should behave as intended.

Please just confirm that `Meters.LLM_TOKEN_USAGE` and `Meters.LLM_OPERATION_DURATION` match the names used in your metrics backends and any dashboards, to avoid creating duplicate or unused instruments.
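The `usage_metadata` tidy-up suggested for span_utils above can be sketched as below. The histogram and response classes here are stand-ins, since only the shape of the `getattr` pattern matters; attribute names are illustrative, not the PR's actual constants:

```python
class FakeTokenHistogram:
    """Stand-in for the OpenTelemetry token histogram instrument."""

    def __init__(self):
        self.records = []

    def record(self, value, attributes=None):
        self.records.append((value, attributes))


class FakeUsage:
    prompt_token_count = 7
    candidates_token_count = 3


class FakeResponse:
    usage_metadata = FakeUsage()


def record_token_usage(response, token_histogram):
    # Fetch usage_metadata once instead of repeating hasattr() checks,
    # and stay a no-op when metrics are disabled (histogram is None).
    usage = getattr(response, "usage_metadata", None)
    if usage is None or token_histogram is None:
        return
    token_histogram.record(
        usage.prompt_token_count, attributes={"gen_ai.token.type": "input"}
    )
    token_histogram.record(
        usage.candidates_token_count, attributes={"gen_ai.token.type": "output"}
    )
```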
🧬 Code graph analysis (2)

packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (1)
- packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/span_utils.py (1)
  - set_model_response_attributes (293-313)

packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (3)
- packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (2)
  - set_model_response_attributes (449-490)
  - set_response_attributes (418-446)
- packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/utils.py (1)
  - should_emit_events (44-50)
- packages/opentelemetry-instrumentation-bedrock/opentelemetry/instrumentation/bedrock/__init__.py (2)
  - is_metrics_enabled (111-112)
  - _create_metrics (467-556)
🔇 Additional comments (2)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (2)
207-221: Duration histogram recording is correctly implemented.

Using `time.perf_counter()` to measure durations and recording them only when `duration_histogram` is truthy is a good pattern here. The attributes (`GEN_AI_SYSTEM` and `GEN_AI_RESPONSE_MODEL`) are appropriate and consistent for both async (`_awrap`) and sync (`_wrap`) paths.

Also applies to: 278-291
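The timing pattern described above can be sketched as follows, using a stand-in histogram; the helper name and attribute keys are illustrative, not the PR's actual code:

```python
import time


class FakeDurationHistogram:
    """Stand-in for the OpenTelemetry duration histogram instrument."""

    def __init__(self):
        self.values = []

    def record(self, value, attributes=None):
        self.values.append((value, attributes))


def timed_call(fn, duration_histogram=None, attributes=None):
    # Measure elapsed wall-clock time with perf_counter and record it
    # only when a duration histogram was actually created.
    start = time.perf_counter()
    result = fn()
    if duration_histogram:
        duration_histogram.record(
            time.perf_counter() - start, attributes=attributes or {}
        )
    return result
```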
81-87: Token histogram propagation through streaming and response handlers looks sound.

Passing `token_histogram` into `_build_from_streaming_response`, `_abuild_from_streaming_response`, and `_handle_response`, and then into `set_model_response_attributes`, ensures token metrics are recorded for both streaming and non-streaming responses without changing existing span attribute behavior. The use of `last_chunk or response` / `last_chunk if last_chunk else response` is reasonable for picking up `usage_metadata` on streaming responses.

Also applies to: 97-103, 105-122, 135-143, 221-232, 293-302
✅ GitGuardian found no secrets remaining in this pull request. If previously detected secrets were true positives and are still valid, we highly recommend revoking them.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (1)

128-130: Also filter `x-goog-api-key` in the VCR config.

Given that the cassette for `test_client_spans` includes an `x-goog-api-key` header, this header should also be filtered here to prevent future recordings from capturing real API keys.

Suggested change:

```diff
-    return {"filter_headers": ["authorization"]}
+    return {"filter_headers": ["authorization", "x-goog-api-key"]}
```

This, combined with scrubbing the existing cassette, prevents secret leakage in recorded HTTP fixtures. [As per coding guidelines, …]
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py (1)

36-67: Client span assertions look solid; consider slightly more defensive checks.

The new `test_client_spans` exercises the right path and validates key GenAI + LLM attributes and token usage; this is good coverage for the new metrics/plumbing.

If you want to make the test a bit more robust against potential future schema changes, you could optionally guard direct index lookups with `in attrs` (as you already do for the prompt/completion content keys) before accessing values like `SpanAttributes.LLM_USAGE_TOTAL_TOKENS`, to get clearer assertion errors when attributes go missing; but this is not strictly necessary.
📒 Files selected for processing (4)
- packages/opentelemetry-instrumentation-google-generativeai/tests/cassettes/test_generate_content/test_client_spans.yaml (1 hunk)
- packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (3 hunks)
- packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py (2 hunks)
- packages/opentelemetry-instrumentation-google-generativeai/tests/test_new_library_instrumentation.py (1 hunk)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules
Files:
- packages/opentelemetry-instrumentation-google-generativeai/tests/test_new_library_instrumentation.py
- packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py
- packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
**/cassettes/**/*.{yaml,yml,json}
📄 CodeRabbit inference engine (CLAUDE.md)
Never commit secrets or PII in VCR cassettes; scrub sensitive data
Files:
packages/opentelemetry-instrumentation-google-generativeai/tests/cassettes/test_generate_content/test_client_spans.yaml
🧠 Learnings (3)
📚 Learning: 2025-12-02T21:09:48.690Z
Learnt from: duanyutong
Repo: traceloop/openllmetry PR: 3487
File: packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/utils.py:177-178
Timestamp: 2025-12-02T21:09:48.690Z
Learning: The opentelemetry-instrumentation-openai and opentelemetry-instrumentation-openai-agents packages must remain independent and not share code, so code duplication between them is acceptable.
Applied to files:
packages/opentelemetry-instrumentation-google-generativeai/tests/test_new_library_instrumentation.py
📚 Learning: 2025-08-17T15:06:48.109Z
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: For debugging OpenTelemetry spans, use ConsoleSpanExporter with Traceloop to print spans to console
Applied to files:
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
📚 Learning: 2025-09-02T09:00:53.586Z
Learnt from: nina-kollman
Repo: traceloop/openllmetry PR: 3358
File: packages/sample-app/sample_app/gemini.py:28-31
Timestamp: 2025-09-02T09:00:53.586Z
Learning: In the google.genai package, async operations are accessed through client.aio (e.g., client.aio.models.generate_content), not through a separate AsyncClient class.
Applied to files:
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
🧬 Code graph analysis (2)
packages/opentelemetry-instrumentation-google-generativeai/tests/test_new_library_instrumentation.py (1)
- packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (1)
  - GoogleGenerativeAiInstrumentor (328-396)

packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (2)
- packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (1)
  - GoogleGenerativeAiInstrumentor (328-396)
- packages/opentelemetry-instrumentation-milvus/tests/conftest.py (2)
  - reader (37-41)
  - meter_provider (45-50)
🔇 Additional comments (3)
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (2)
79-94: Metrics test context wiring looks correct; no explicit uninstrument required.

The `metrics_test_context` + `clear_metrics_test_context` pair correctly sets up an `InMemoryMetricReader`, `MeterProvider`, and global meter provider, then shuts them down at the end of the test session. Passing `meter_provider=provider` into `GoogleGenerativeAiInstrumentor().instrument()` aligns with the new metrics API and ensures histograms can be created.

Leaving instrumentation active at session scope is acceptable here since tests that need different instrumentation behavior already manage it via their own instrumentor fixtures.
67-76: Ensure the `tracer_provider` fixture still exists, or update the `instrument_*` fixtures.

`instrument_legacy`, `instrument_with_content`, and `instrument_with_no_content` all depend on a `tracer_provider` fixture, but this file no longer defines one. If there isn't a shared `tracer_provider` fixture elsewhere, these fixtures will cause pytest to fail with "fixture 'tracer_provider' not found".

Two options:

1. If a global `tracer_provider` fixture exists elsewhere: confirm the lifecycle is compatible with the new `exporter` fixture and that it still sets the provider before instrumentation.
2. If no such fixture exists anymore: either reintroduce a local `tracer_provider` fixture, or refactor these fixtures to rely on the session-level setup now done in `exporter` (e.g., drop the `tracer_provider` parameter and let the instrumentor use the global provider).

packages/opentelemetry-instrumentation-google-generativeai/tests/test_new_library_instrumentation.py (1)
2-51: Lifecycle test and `_is_instrumented` helper look correct and valuable.

Using `wrapt` and `__wrapped__` to detect instrumentation is consistent with how OpenTelemetry instrumentations wrap functions, and the lifecycle test thoroughly exercises uninstrument → instrument → idempotent instrument → uninstrument flows for both `Models` and `AsyncModels`. This should catch regressions in `WRAPPED_METHODS` or uninstrument behavior.
Actionable comments posted: 2
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py (1)

130-132: Consider renaming the unused loop variable.

The loop control variable `token_type` is unpacked but not used within the loop body. While the code is functionally correct, renaming it to `_` or `_token_type` would follow Python conventions for intentionally unused variables and silence the static analysis warning.

Apply this diff:

```diff
-    for token_type, dp in token_points_by_type.items():
+    for _, dp in token_points_by_type.items():
         assert dp.count >= 1
         assert dp.sum >= 0
```
📒 Files selected for processing (4)
- packages/opentelemetry-instrumentation-google-generativeai/tests/cassettes/test_generate_content/test_client_spans.yaml (1 hunk)
- packages/opentelemetry-instrumentation-google-generativeai/tests/cassettes/test_generate_content/test_generate_metrics.yaml (1 hunk)
- packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (4 hunks)
- packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py (2 hunks)
✅ Files skipped from review due to trivial changes (1)
- packages/opentelemetry-instrumentation-google-generativeai/tests/cassettes/test_generate_content/test_generate_metrics.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/opentelemetry-instrumentation-google-generativeai/tests/cassettes/test_generate_content/test_client_spans.yaml
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules
Files:
- packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
- packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py
🧠 Learnings (2)
📚 Learning: 2025-08-17T15:06:48.109Z
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: Applies to tests/**/conftest.py : Use VCR filters (e.g., filter_headers, before_record) or framework equivalents to scrub secrets/PII during recording
Applied to files:
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
📚 Learning: 2025-08-17T15:06:48.109Z
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: Applies to tests/**/*.py : Tests that make API calls must utilize VCR cassettes
Applied to files:
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
🧬 Code graph analysis (2)
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (3)
- packages/traceloop-sdk/traceloop/sdk/utils/in_memory_span_exporter.py (2)
  - export (45-51)
  - InMemorySpanExporter (22-61)
- packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (1)
  - GoogleGenerativeAiInstrumentor (328-396)
- packages/opentelemetry-instrumentation-milvus/tests/conftest.py (2)
  - reader (37-41)
  - meter_provider (45-50)

packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py (2)
- packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (1)
  - GoogleGenerativeAiInstrumentor (328-396)
- packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py (2)
  - SpanAttributes (64-245)
  - Meters (36-61)
🪛 Ruff (0.14.8)
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py
- 130-130: Loop control variable `token_type` not used within loop body. Rename unused `token_type` to `_token_type`. (B007)
🔇 Additional comments (8)
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py (3)
1-12: LGTM!

The imports are well-organized and include all necessary modules for testing spans, metrics, and OpenTelemetry instrumentation.

37-68: LGTM!

The test correctly validates span creation, span attributes, and token usage for the Google Generative AI instrumentation. The assertions are comprehensive and cover all critical span attributes.

29-34: Remove the unused `mock_instrumentor` fixture or document its intended purpose.

The fixture mocks the `instrument()` and `uninstrument()` methods but is not referenced by any tests in this file. If it's not used elsewhere in the test suite, remove it to reduce code maintenance overhead. If it's intended for future tests, add a comment explaining its purpose.

packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (5)
13-13: LGTM!

The new imports provide the necessary OpenTelemetry SDK components for metrics and tracing infrastructure in tests.

Also applies to: 20-26, 28-30

63-66: LGTM!

Returning the full `client` object instead of `client.models` provides better flexibility for tests to access different client APIs (e.g., `client.chats`, `client.models`).

130-132: LGTM!

Adding `x-goog-api-key` to the filtered headers is essential for preventing API keys from being recorded in VCR cassettes.

135-138: LGTM!

The `environment` fixture ensures the Google API key is available for all tests, providing a sensible default for test environments.

36-47: Critical: double instrumentation detected.

The `exporter` fixture (line 45) and `metrics_test_context` fixture (line 87) both create separate `GoogleGenerativeAiInstrumentor` instances and call `instrument()`. Since both fixtures are session-scoped, this results in double-wrapping the same methods, which can cause duplicate spans, metrics, and unpredictable behavior.

Recommended solution: create a single session-scoped fixture that sets up both tracing and metrics providers, then instruments once with both:
```python
@pytest.fixture(scope="session")
def instrumentation_setup():
    # Tracing setup
    span_exporter = InMemorySpanExporter()
    span_processor = SimpleSpanProcessor(span_exporter)
    tracer_provider = TracerProvider()
    tracer_provider.add_span_processor(span_processor)
    set_tracer_provider(tracer_provider)

    # Metrics setup
    resource = Resource.create()
    metric_reader = InMemoryMetricReader()
    meter_provider = MeterProvider(metric_readers=[metric_reader], resource=resource)
    metrics.set_meter_provider(meter_provider)

    # Single instrumentation with both providers
    instrumentor = GoogleGenerativeAiInstrumentor()
    instrumentor.instrument(
        tracer_provider=tracer_provider, meter_provider=meter_provider
    )

    yield span_exporter, metric_reader, meter_provider

    metric_reader.shutdown()
    meter_provider.shutdown()
    instrumentor.uninstrument()


@pytest.fixture(scope="session")
def exporter(instrumentation_setup):
    span_exporter, _, _ = instrumentation_setup
    return span_exporter


@pytest.fixture(scope="session")
def metrics_test_context(instrumentation_setup):
    _, metric_reader, meter_provider = instrumentation_setup
    return meter_provider, metric_reader
```

⛔ Skipped due to learnings
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: For debugging OpenTelemetry spans, use ConsoleSpanExporter with Traceloop to print spans to console
Actionable comments posted: 2
♻️ Duplicate comments (2)
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (2)

69-78: This fixture depends on the missing `tracer_provider`.

As noted above, this fixture will fail because `tracer_provider` is no longer defined. See the earlier comment for the fix.

99-128: These fixtures depend on the missing `tracer_provider`.

Both `instrument_with_content` and `instrument_with_no_content` depend on the undefined `tracer_provider` fixture. See the earlier comment for the fix.
📒 Files selected for processing (1)
- packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules
Files:
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
🧠 Learnings (4)
📚 Learning: 2025-08-17T15:06:48.109Z
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: For debugging OpenTelemetry spans, use ConsoleSpanExporter with Traceloop to print spans to console
Applied to files:
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
📚 Learning: 2025-09-02T09:00:53.586Z
Learnt from: nina-kollman
Repo: traceloop/openllmetry PR: 3358
File: packages/sample-app/sample_app/gemini.py:28-31
Timestamp: 2025-09-02T09:00:53.586Z
Learning: In the google.genai package, async operations are accessed through client.aio (e.g., client.aio.models.generate_content), not through a separate AsyncClient class.
Applied to files:
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
📚 Learning: 2025-08-17T15:06:48.109Z
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: Applies to tests/**/conftest.py : Use VCR filters (e.g., filter_headers, before_record) or framework equivalents to scrub secrets/PII during recording
Applied to files:
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
📚 Learning: 2025-08-17T15:06:48.109Z
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: Applies to tests/**/*.py : Tests that make API calls must utilize VCR cassettes
Applied to files:
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
🧬 Code graph analysis (1)
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (2)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (1)
GoogleGenerativeAiInstrumentor (328-396)
packages/opentelemetry-instrumentation-milvus/tests/conftest.py (2)
reader (37-41), meter_provider (45-50)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Test Packages (3.11)
- GitHub Check: Test Packages (3.10)
- GitHub Check: Build Packages (3.11)
- GitHub Check: Test Packages (3.12)
- GitHub Check: Lint
🔇 Additional comments (5)
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (5)
13-30: LGTM! The new imports are correctly structured and necessary for the metrics and tracing infrastructure changes.
63-66: LGTM! The fixture correctly reads the API key from an environment variable, adhering to secure credential handling practices. Based on learnings, this follows the pattern for test fixtures in this repository.
91-96: Previous issue addressed - fixture lifecycle is now correct. The fixture properly yields before performing cleanup, ensuring metrics infrastructure remains available during test execution.
131-133: Good addition of API key header filtering. Filtering `x-goog-api-key` from VCR recordings prevents accidental secret exposure in test cassettes. Based on learnings, this follows the recommended pattern for scrubbing secrets during recording.
136-139: LGTM! The fallback to a placeholder API key enables tests to run without requiring real credentials, which is appropriate for unit tests using VCR cassettes.
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
force-pushed from 16c1899 to 1dcce3e
force-pushed from 038756d to e3bbde2
Actionable comments posted: 0
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (1)
141-144: Consider session scope for environment fixture.

The fixture sets a global environment variable but is function-scoped (default), meaning it executes before each test. Since `GOOGLE_API_KEY` is set once and persists for the session, this is inefficient.

🔎 Proposed refactor

```diff
-@pytest.fixture(autouse=True)
+@pytest.fixture(scope="session", autouse=True)
 def environment():
     if "GOOGLE_API_KEY" not in os.environ:
         os.environ["GOOGLE_API_KEY"] = "test_api_key"
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules
Files:
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
🧠 Learnings (5)
📚 Learning: 2025-12-02T21:09:48.690Z
Learnt from: duanyutong
Repo: traceloop/openllmetry PR: 3487
File: packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/utils.py:177-178
Timestamp: 2025-12-02T21:09:48.690Z
Learning: The opentelemetry-instrumentation-openai and opentelemetry-instrumentation-openai-agents packages must remain independent and not share code, so code duplication between them is acceptable.
Applied to files:
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
📚 Learning: 2025-08-17T15:06:48.109Z
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: For debugging OpenTelemetry spans, use ConsoleSpanExporter with Traceloop to print spans to console
Applied to files:
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
📚 Learning: 2025-09-02T09:00:53.586Z
Learnt from: nina-kollman
Repo: traceloop/openllmetry PR: 3358
File: packages/sample-app/sample_app/gemini.py:28-31
Timestamp: 2025-09-02T09:00:53.586Z
Learning: In the google.genai package, async operations are accessed through client.aio (e.g., client.aio.models.generate_content), not through a separate AsyncClient class.
Applied to files:
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
📚 Learning: 2025-08-17T15:06:48.109Z
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: Applies to tests/**/conftest.py : Use VCR filters (e.g., filter_headers, before_record) or framework equivalents to scrub secrets/PII during recording
Applied to files:
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
📚 Learning: 2025-08-17T15:06:48.109Z
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: Applies to tests/**/*.py : Tests that make API calls must utilize VCR cassettes
Applied to files:
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Test Packages (3.10)
- GitHub Check: Build Packages (3.11)
- GitHub Check: Lint
- GitHub Check: Test Packages (3.12)
🔇 Additional comments (7)
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (7)
13-30: LGTM! Necessary imports for metrics support. The added imports provide the required infrastructure for metrics testing (MeterProvider, InMemoryMetricReader) and tracer setup (Resource, set_tracer_provider, SimpleSpanProcessor, InMemorySpanExporter).
36-48: LGTM! Session-level instrumentation correctly consolidated. The exporter fixture properly sets up session-scoped tracing and metrics infrastructure. Previous critical issues (missing tracer_provider fixture, double instrumentation) have been resolved.
51-54: LGTM! TracerProvider fixture for function-scoped tests. This fixture correctly provides a fresh TracerProvider instance for function-scoped instrumentation fixtures.
96-102: LGTM! Fixture lifecycle corrected. The `yield` statement on line 98 correctly allows tests to run before cleanup. This fixes the critical issue from the previous review.
138-138: LGTM! Proper secret filtering for VCR cassettes. The addition of `"x-goog-api-key"` to the filter ensures API keys are scrubbed from VCR recordings. Based on learnings, VCR filters should be used to scrub secrets/PII during recording.
70-72: LGTM! Client fixture returns the client instance. The fixture correctly returns the `genai.Client` instance, allowing tests to access methods as needed.
87-93: Function-scoped fixtures inherit the global MeterProvider when meter_provider is not explicitly passed.

The session-scoped `metrics_test_context` fixture sets a global MeterProvider (line 93). When function-scoped fixtures (`instrument_legacy`, `instrument_with_content`, `instrument_with_no_content`) call `instrument()` without passing `meter_provider`, the instrumentor's `get_meter()` call falls back to the global provider. This behavior is intentional and not problematic:

- The session fixture properly controls the global MeterProvider for all tests
- `_uninstrument()` correctly unwraps function wrappers; meter cleanup is handled by the session fixture's teardown
- The function-scoped fixtures appear to exist as alternative configurations but are not used in actual tests

If clarity is desired, consider adding a comment to these fixtures explaining they are alternative configurations that inherit the global provider, or remove them if they are not intended for use.
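The explicit-wins / global-fallback resolution described above can be sketched without the real SDK; the names below are illustrative stand-ins, not the instrumentor's internals:

```python
_GLOBAL_METER_PROVIDER = None


def set_meter_provider(provider):
    """Register a process-wide default, like metrics.set_meter_provider."""
    global _GLOBAL_METER_PROVIDER
    _GLOBAL_METER_PROVIDER = provider


def resolve_meter_provider(explicit=None):
    # An explicitly passed provider wins; otherwise fall back to the
    # globally registered one, mirroring OTel's get_meter_provider pattern.
    return explicit if explicit is not None else _GLOBAL_METER_PROVIDER
```

This is why function-scoped fixtures that call `instrument()` without a `meter_provider` still end up using the session fixture's globally registered provider.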
Actionable comments posted: 0
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (1)
142-145: Consider restoring the environment state after tests.

The fixture sets `GOOGLE_API_KEY` but doesn't restore the original environment state. If tests rely on the absence of this variable or if its presence affects subsequent tests, this could cause issues.

🔎 Suggested enhancement to restore environment

```diff
 @pytest.fixture(autouse=True)
 def environment():
+    original_key = os.environ.get("GOOGLE_API_KEY")
     if "GOOGLE_API_KEY" not in os.environ:
         os.environ["GOOGLE_API_KEY"] = "test_api_key"
+    yield
+    # Restore original state
+    if original_key is None:
+        os.environ.pop("GOOGLE_API_KEY", None)
+    else:
+        os.environ["GOOGLE_API_KEY"] = original_key
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules
Files:
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
🧠 Learnings (5)
📚 Learning: 2025-12-02T21:09:48.690Z
Learnt from: duanyutong
Repo: traceloop/openllmetry PR: 3487
File: packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/utils.py:177-178
Timestamp: 2025-12-02T21:09:48.690Z
Learning: The opentelemetry-instrumentation-openai and opentelemetry-instrumentation-openai-agents packages must remain independent and not share code, so code duplication between them is acceptable.
Applied to files:
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
📚 Learning: 2025-08-17T15:06:48.109Z
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: For debugging OpenTelemetry spans, use ConsoleSpanExporter with Traceloop to print spans to console
Applied to files:
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
📚 Learning: 2025-09-02T09:00:53.586Z
Learnt from: nina-kollman
Repo: traceloop/openllmetry PR: 3358
File: packages/sample-app/sample_app/gemini.py:28-31
Timestamp: 2025-09-02T09:00:53.586Z
Learning: In the google.genai package, async operations are accessed through client.aio (e.g., client.aio.models.generate_content), not through a separate AsyncClient class.
Applied to files:
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
📚 Learning: 2025-08-17T15:06:48.109Z
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: Applies to tests/**/conftest.py : Use VCR filters (e.g., filter_headers, before_record) or framework equivalents to scrub secrets/PII during recording
Applied to files:
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
📚 Learning: 2025-08-17T15:06:48.109Z
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: Applies to tests/**/*.py : Tests that make API calls must utilize VCR cassettes
Applied to files:
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Test Packages (3.12)
- GitHub Check: Lint
- GitHub Check: Test Packages (3.10)
- GitHub Check: Test Packages (3.11)
- GitHub Check: Build Packages (3.11)
🔇 Additional comments (1)
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (1)
36-48: No double instrumentation concern exists in the current test structure.

The review comment raises a concern about tests using both the session-scoped `exporter` fixture (which instruments) and function-scoped `instrument_*` fixtures, but actual test usage shows:

- `test_client_spans` uses only `exporter`
- `test_generate_metrics` uses `metrics_test_context` directly, not `exporter`
- Tests in `test_new_library_instrumentation.py` don't use any conftest fixtures

No test has both `exporter` and an `instrument_*` fixture in its parameters, and session-scoped fixtures are only applied to tests that explicitly request them, not implicitly to all tests.

Likely an incorrect or invalid review comment.
galkleinman
left a comment
All in all looks good. Make sure to align the semantic conventions with the latest otel spec, and make use of the otel semconv package constants instead of the local openllmetry branched one.
...umentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py
...umentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py
...entation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
...umentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py
...umentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py
@galkleinman
Just import them from the otel semconv package instead of the local semconv package (if it isn't like that already).
force-pushed from 135c03f to 851e4e0
Actionable comments posted: 0
🧹 Nitpick comments (5)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (1)
449-489: Use semantic convention constants instead of hardcoded attribute strings.

The attribute keys are hardcoded as strings (`"gen_ai.provider.name"`, `"gen_ai.token.type"`, `"gen_ai.response.model"`), but per the PR comments, these should be imported from the OpenTelemetry semantic conventions package for consistency and maintainability. Additionally, there's inconsistent indentation in the second `token_histogram.record()` call (lines 481-488).

♻️ Suggested refactor

```diff
-    if token_histogram and hasattr(response, "usage_metadata"):
-        token_histogram.record(
-            response.usage_metadata.prompt_token_count,
-            attributes={
-                "gen_ai.provider.name": "Google",
-                "gen_ai.token.type": "input",
-                "gen_ai.response.model": llm_model,
-            }
-        )
-        token_histogram.record(
-            response.usage_metadata.candidates_token_count,
-            attributes={
-                "gen_ai.provider.name": "Google",
-                "gen_ai.token.type": "output",
-                "gen_ai.response.model": llm_model,
-            },
-        )
+    if token_histogram and hasattr(response, "usage_metadata"):
+        token_histogram.record(
+            response.usage_metadata.prompt_token_count,
+            attributes={
+                GenAIAttributes.GEN_AI_SYSTEM: "Google",
+                GenAIAttributes.GEN_AI_TOKEN_TYPE: "input",
+                GenAIAttributes.GEN_AI_RESPONSE_MODEL: llm_model,
+            },
+        )
+        token_histogram.record(
+            response.usage_metadata.candidates_token_count,
+            attributes={
+                GenAIAttributes.GEN_AI_SYSTEM: "Google",
+                GenAIAttributes.GEN_AI_TOKEN_TYPE: "output",
+                GenAIAttributes.GEN_AI_RESPONSE_MODEL: llm_model,
+            },
+        )
```

Note: verify that `GenAIAttributes.GEN_AI_SYSTEM` is the correct constant for the provider name attribute. The test file uses `GenAIAttributes.GEN_AI_TOKEN_TYPE` for token type assertions, so constants should be used consistently. Based on learnings, semantic conventions must follow the OpenTelemetry GenAI specification.

packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py (2)
29-34: Unused fixture: `mock_instrumentor`. This fixture is defined but not used by any test in this file. Consider removing it or adding a test that uses it.
130-136: Unused loop variable and hardcoded attribute strings.

The loop variable `token_type` is not used within the loop body. Rename to `_token_type` per Python convention. Consider using semantic convention constants instead of hardcoded strings for consistency (e.g., line 123 uses `GenAIAttributes.GEN_AI_TOKEN_TYPE`).

♻️ Suggested fix

```diff
-    for token_type, dp in token_points_by_type.items():
+    for _token_type, dp in token_points_by_type.items():
         assert dp.count >= 1
         assert dp.sum >= 0
         # Required semantic attributes
-        assert "gen_ai.provider.name" in dp.attributes
-        assert "gen_ai.response.model" in dp.attributes
+        assert GenAIAttributes.GEN_AI_SYSTEM in dp.attributes
+        assert GenAIAttributes.GEN_AI_RESPONSE_MODEL in dp.attributes
```

packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (2)
211-219: Use semantic convention constants for metric attributes.

The duration histogram attributes use hardcoded strings. For consistency with the rest of the codebase and per PR discussion, consider using semantic convention constants.

♻️ Suggested refactor

```diff
 if duration_histogram:
     duration = time.perf_counter() - start_time
     duration_histogram.record(
         duration,
         attributes={
-            "gen_ai.provider.name": "Google",
-            "gen_ai.response.model": llm_model
+            GenAIAttributes.GEN_AI_SYSTEM: "Google",
+            GenAIAttributes.GEN_AI_RESPONSE_MODEL: llm_model,
         },
     )
```

The same change should be applied to the `_wrap` function (lines 282-290).
311-324: Consider using `Meters` constants for histogram names.

The histogram names are hardcoded as `"gen_ai.client.token.usage"` and `"gen_ai.client.operation.duration"`. The test file imports and uses `Meters.LLM_TOKEN_USAGE` and `Meters.LLM_OPERATION_DURATION` constants. Using these constants here would ensure consistency between implementation and tests.

♻️ Suggested refactor

First, add the import:

```python
from opentelemetry.semconv_ai import (
    SUPPRESS_LANGUAGE_MODEL_INSTRUMENTATION_KEY,
    LLMRequestTypeValues,
    SpanAttributes,
    Meters,  # Add this
)
```

Then update `_create_metrics`:

```diff
 def _create_metrics(meter: Meter):
     token_histogram = meter.create_histogram(
-        name="gen_ai.client.token.usage",
+        name=Meters.LLM_TOKEN_USAGE,
         unit="token",
         description="Measures number of input and output tokens used",
     )
     duration_histogram = meter.create_histogram(
-        name="gen_ai.client.operation.duration",
+        name=Meters.LLM_OPERATION_DURATION,
         unit="s",
         description="GenAI operation duration",
     )
     return token_histogram, duration_histogram
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules
Files:
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py
🧠 Learnings (3)
📚 Learning: 2025-12-02T21:09:48.690Z
Learnt from: duanyutong
Repo: traceloop/openllmetry PR: 3487
File: packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/utils.py:177-178
Timestamp: 2025-12-02T21:09:48.690Z
Learning: The opentelemetry-instrumentation-openai and opentelemetry-instrumentation-openai-agents packages must remain independent and not share code, so code duplication between them is acceptable.
Applied to files:
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py
📚 Learning: 2025-08-17T15:06:48.109Z
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: Semantic conventions must follow the OpenTelemetry GenAI specification
Applied to files:
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py
📚 Learning: 2025-08-17T15:06:48.109Z
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: Instrumentation packages must leverage the semantic conventions package and emit OTel-compliant spans
Applied to files:
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py
🧬 Code graph analysis (3)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (1)
packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/span_utils.py (1)
set_model_response_attributes(293-313)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (1)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (2)
set_model_response_attributes (449-490), set_response_attributes (418-446)
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py (5)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (1)
GoogleGenerativeAiInstrumentor (327-395)
packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py (2)
SpanAttributes (64-245), Meters (36-61)
packages/opentelemetry-instrumentation-google-generativeai/tests/conftest.py (3)
exporter (37-48), genai_client (71-73), metrics_test_context (89-94)
packages/traceloop-sdk/traceloop/sdk/utils/in_memory_span_exporter.py (1)
get_finished_spans (40-43)
packages/opentelemetry-instrumentation-milvus/tests/conftest.py (1)
reader(37-41)
🪛 Ruff (0.14.10)
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py
130-130: Loop control variable token_type not used within loop body
Rename unused token_type to _token_type
(B007)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Test Packages (3.10)
- GitHub Check: Test Packages (3.12)
- GitHub Check: Lint
- GitHub Check: Test Packages (3.11)
- GitHub Check: Build Packages (3.11)
🔇 Additional comments (3)
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py (1)
37-68: LGTM! The test correctly validates span attributes, kind, status, and token usage. Good coverage of the expected semantic conventions.
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (2)
307-308: Metrics enabled by default: verify this is intentional.

`is_metrics_enabled()` defaults to `true` when `TRACELOOP_METRICS_ENABLED` is not set. This means metrics collection will be active for all users after this change, which could be a breaking change for users not expecting metrics overhead. Consider whether this should default to `false` for backward compatibility, or document this behavior change prominently.
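For reference, an opt-out flag helper with these semantics can be sketched as follows; this is a minimal sketch assuming the `TRACELOOP_METRICS_ENABLED` variable discussed here, and the package's actual implementation may differ:

```python
import os


def is_metrics_enabled() -> bool:
    # Opt-out semantics: metrics stay enabled unless the env var is
    # explicitly set to something other than "true".
    return (os.getenv("TRACELOOP_METRICS_ENABLED") or "true").lower() == "true"
```

With this shape, an unset variable yields `True`, which is exactly the default-on behavior the comment flags.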
343-386: LGTM! The instrumentation correctly initializes metrics when enabled and properly passes the histogram objects through the wrapper chain. The conditional wrapper selection based on `AsyncModels` is appropriate.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In
@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py:
- Around line 211-219: Replace the hardcoded attribute keys in the
duration_histogram.record call with the semantic convention constants from
GenAIAttributes: use GenAIAttributes.GEN_AI_PROVIDER_NAME for the provider key
and GenAIAttributes.GEN_AI_RESPONSE_MODEL for the model key (keep the values
"Google" and llm_model unchanged); update the attributes dict in the
duration_histogram.record invocation inside the block guarded by
duration_histogram to reference these constants instead of the literal strings.
- Around line 282-290: Replace the hardcoded attribute keys in the sync
wrapper's duration_histogram.record call with the semantic convention constants
used by the async wrapper (e.g., GEN_AI_PROVIDER_NAME and GEN_AI_RESPONSE_MODEL
or the exact constant names exported from your
semantic_conventions/SemanticAttributes module), add the corresponding import at
the top of this module, and use those constants in the attributes dict passed to
duration_histogram.record in the sync wrapper so it matches the async
implementation and avoids duplicate string literals.
In
@packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py:
- Around line 29-34: The fixture mock_instrumentor (which creates a
GoogleGenerativeAiInstrumentor and mocks its instrument/uninstrument methods) is
unused; remove the entire mock_instrumentor fixture definition from
tests/test_generate_content.py to clean up unused test code, or alternatively
add a test that accepts mock_instrumentor as a parameter and exercises its
instrument/uninstrument behavior (referencing GoogleGenerativeAiInstrumentor and
the mock methods instrument/uninstrument) so the fixture is actually utilized.
🧹 Nitpick comments (2)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (1)
177-178: Extraneous blank lines in function signatures.

Lines 177 and 247 have trailing blank lines before the closing parenthesis, which is inconsistent formatting.
♻️ Suggested fix
```diff
 async def _awrap(
     tracer,
     event_logger,
     to_wrap,
     token_histogram,
     duration_histogram,
     wrapped,
     instance,
     args,
     kwargs,
-
 ):
```

```diff
 def _wrap(
     tracer,
     event_logger,
     to_wrap,
     token_histogram,
     duration_histogram,
     wrapped,
     instance,
     args,
     kwargs,
-
 ):
```

Also applies to: 247-248
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py (1)
130-132: Unused loop variable `token_type`.

The loop variable `token_type` is not used within the loop body. Rename to `_token_type` to indicate it's intentionally unused.

♻️ Suggested fix

```diff
-    for token_type, dp in token_points_by_type.items():
+    for _token_type, dp in token_points_by_type.items():
         assert dp.count >= 1
         assert dp.sum >= 0
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules
Files:
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py
🧠 Learnings (2)
📚 Learning: 2025-08-17T15:06:48.109Z
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: Semantic conventions must follow the OpenTelemetry GenAI specification
Applied to files:
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py
📚 Learning: 2025-08-17T15:06:48.109Z
Learnt from: CR
Repo: traceloop/openllmetry PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.109Z
Learning: Instrumentation packages must leverage the semantic conventions package and emit OTel-compliant spans
Applied to files:
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py
🪛 Ruff (0.14.10)
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py
130-130: Loop control variable token_type not used within loop body
Rename unused token_type to _token_type
(B007)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Test Packages (3.11)
- GitHub Check: Test Packages (3.12)
- GitHub Check: Lint
- GitHub Check: Build Packages (3.11)
- GitHub Check: Test Packages (3.10)
🔇 Additional comments (6)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py (4)
206-219: Duration recorded before checking response validity. The duration is recorded at lines 211-219 regardless of whether the response processing succeeds or fails. If an exception occurs after `wrapped(*args, **kwargs)`, the duration is still recorded but the span may not complete properly. Consider whether duration should only be recorded for successful operations, or wrap the call in try/finally to ensure consistency.
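One way to resolve the inconsistency noted above is to record the duration in a `finally` block so it is emitted exactly once whether the call succeeds or raises. This is a sketch, not the PR's actual code: `call_with_duration` and `_FakeHistogram` are hypothetical stand-ins for the wrapper and an OTel histogram.

```python
import time


class _FakeHistogram:
    """Stand-in for an OTel histogram; real code would use the meter's histogram."""

    def __init__(self):
        self.records = []

    def record(self, value, attributes=None):
        self.records.append((value, attributes or {}))


def call_with_duration(wrapped, duration_histogram, model):
    """Hypothetical helper: record elapsed time even when the wrapped call raises."""
    start_time = time.perf_counter()
    try:
        return wrapped()
    finally:
        if duration_histogram:
            duration_histogram.record(
                time.perf_counter() - start_time,
                attributes={"gen_ai.response.model": model},
            )


hist = _FakeHistogram()
try:
    call_with_duration(lambda: 1 / 0, hist, "gemini-pro")
except ZeroDivisionError:
    pass
# The duration was recorded despite the exception.
print(len(hist.records))
```

The same pattern applies symmetrically to the async wrapper.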
311-324: Histogram creation follows OTel GenAI semantic conventions. The metric names `gen_ai.client.token.usage` and `gen_ai.client.operation.duration` align with OpenTelemetry GenAI semantic conventions. Good implementation.
347-354: Metrics creation conditional on `is_metrics_enabled()` is correct. The instrumentation properly gates metric creation behind the feature flag, avoiding unnecessary overhead when metrics are disabled.
307-308: The review comment is incorrect. The opt-out default pattern used in `is_metrics_enabled()` is the project-wide standard applied consistently across all instrumentation packages (traceloop-sdk, openai, anthropic, bedrock, groq, and others). This is not an unexpected behavior or inconsistency unique to the google_generativeai instrumentation; it is the intentional design standard for the entire codebase. No changes needed. Likely an incorrect or invalid review comment.
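For reference, the opt-out pattern discussed above can be sketched as follows. This is an illustration of the convention, not the package's exact implementation, which may parse the flag differently:

```python
import os


def is_metrics_enabled() -> bool:
    # Opt-out default: metrics stay enabled unless explicitly set to "false".
    return (os.getenv("TRACELOOP_METRICS_ENABLED") or "true").lower() == "true"


os.environ["TRACELOOP_METRICS_ENABLED"] = "false"
print(is_metrics_enabled())  # False

del os.environ["TRACELOOP_METRICS_ENABLED"]
print(is_metrics_enabled())  # True (enabled by default)
```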
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py (2)
37-68: Span attribute validation test is comprehensive. The test validates span kind, status, semantic attributes, and token usage fields. Good coverage of the instrumentation output.
71-136: Metrics validation test properly checks semantic conventions. The test validates both required metrics (`gen_ai.client.operation.duration` and `gen_ai.client.token.usage`), their units, data points, and required attributes. The check for both input and output token types at line 128 is thorough.
```python
if duration_histogram:
    duration = time.perf_counter() - start_time
    duration_histogram.record(
        duration,
        attributes={
            "gen_ai.provider.name": "Google",
            "gen_ai.response.model": llm_model
        },
    )
```
🛠️ Refactor suggestion | 🟠 Major
Use semantic convention constants instead of hardcoded strings.
The attributes "gen_ai.provider.name" and "gen_ai.response.model" are hardcoded strings, but the codebase already imports GenAIAttributes from opentelemetry.semconv._incubating.attributes. This is inconsistent with the existing code style and the PR discussion where the reviewer suggested importing standard names from the otel semconv package.
Based on learnings, semantic conventions must follow the OpenTelemetry GenAI specification.
♻️ Suggested refactor to use semantic conventions
```diff
 if duration_histogram:
     duration = time.perf_counter() - start_time
     duration_histogram.record(
         duration,
         attributes={
-            "gen_ai.provider.name": "Google",
-            "gen_ai.response.model": llm_model
+            GenAIAttributes.GEN_AI_SYSTEM: "Google",
+            GenAIAttributes.GEN_AI_RESPONSE_MODEL: llm_model,
         },
     )
```
🤖 Prompt for AI Agents
In
@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py
around lines 211 - 219, Replace the hardcoded attribute keys in the
duration_histogram.record call with the semantic convention constants from
GenAIAttributes: use GenAIAttributes.GEN_AI_PROVIDER_NAME for the provider key
and GenAIAttributes.GEN_AI_RESPONSE_MODEL for the model key (keep the values
"Google" and llm_model unchanged); update the attributes dict in the
duration_histogram.record invocation inside the block guarded by
duration_histogram to reference these constants instead of the literal strings.
```python
if duration_histogram:
    duration = time.perf_counter() - start_time
    duration_histogram.record(
        duration,
        attributes={
            "gen_ai.provider.name": "Google",
            "gen_ai.response.model": llm_model
        },
    )
```
🛠️ Refactor suggestion | 🟠 Major
Duplicate hardcoded attribute strings in sync wrapper.
Same issue as the async wrapper - use semantic convention constants for consistency.
♻️ Suggested refactor
```diff
 if duration_histogram:
     duration = time.perf_counter() - start_time
     duration_histogram.record(
         duration,
         attributes={
-            "gen_ai.provider.name": "Google",
-            "gen_ai.response.model": llm_model
+            GenAIAttributes.GEN_AI_SYSTEM: "Google",
+            GenAIAttributes.GEN_AI_RESPONSE_MODEL: llm_model,
         },
     )
```
🤖 Prompt for AI Agents
In
@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/__init__.py
around lines 282 - 290, Replace the hardcoded attribute keys in the sync
wrapper's duration_histogram.record call with the semantic convention constants
used by the async wrapper (e.g., GEN_AI_PROVIDER_NAME and GEN_AI_RESPONSE_MODEL
or the exact constant names exported from your
semantic_conventions/SemanticAttributes module), add the corresponding import at
the top of this module, and use those constants in the attributes dict passed to
duration_histogram.record in the sync wrapper so it matches the async
implementation and avoids duplicate string literals.
```python
@pytest.fixture
def mock_instrumentor():
    instrumentor = GoogleGenerativeAiInstrumentor()
    instrumentor.instrument = MagicMock()
    instrumentor.uninstrument = MagicMock()
    return instrumentor
```
🧩 Analysis chain
🏁 Script executed:
```shell
rg -n "mock_instrumentor" --type py
```

Repository: traceloop/openllmetry
Length of output: 1215
🏁 Script executed:
```shell
cat -n "packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py" | head -150
```

Repository: traceloop/openllmetry
Length of output: 5405
Remove unused fixture `mock_instrumentor`.
This fixture is defined but not used by any test in this file. Remove it or add a test that requires it.
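If the fixture is kept, a test exercising it might look like the sketch below. The stand-in `GoogleGenerativeAiInstrumentor` class and the `make_mock_instrumentor` helper are hypothetical, used here only so the example is self-contained without the real package:

```python
from unittest.mock import MagicMock


class GoogleGenerativeAiInstrumentor:
    """Stand-in for the real instrumentor class, for illustration only."""

    def instrument(self): ...

    def uninstrument(self): ...


def make_mock_instrumentor():
    # Mirrors the fixture body: replace the real methods with mocks.
    instrumentor = GoogleGenerativeAiInstrumentor()
    instrumentor.instrument = MagicMock()
    instrumentor.uninstrument = MagicMock()
    return instrumentor


def test_instrument_uninstrument_cycle():
    instrumentor = make_mock_instrumentor()
    instrumentor.instrument()
    instrumentor.uninstrument()
    instrumentor.instrument.assert_called_once()
    instrumentor.uninstrument.assert_called_once()


test_instrument_uninstrument_cycle()
```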
🤖 Prompt for AI Agents
In
@packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py
around lines 29 - 34, The fixture mock_instrumentor (which creates a
GoogleGenerativeAiInstrumentor and mocks its instrument/uninstrument methods) is
unused; remove the entire mock_instrumentor fixture definition from
tests/test_generate_content.py to clean up unused test code, or alternatively
add a test that accepts mock_instrumentor as a parameter and exercises its
instrument/uninstrument behavior (referencing GoogleGenerativeAiInstrumentor and
the mock methods instrument/uninstrument) so the fixture is actually utilized.
feat(instrumentation): ... or fix(instrumentation): ...
Adds metrics support to Google Generative AI instrumentation, introducing histograms for token usage and operation duration.
- `token_histogram` and `duration_histogram` in `__init__.py` for tracking token usage and operation duration.
- `is_metrics_enabled()` to check if metrics are enabled via the `TRACELOOP_METRICS_ENABLED` environment variable.
- `_create_metrics()` to create histograms for tokens and duration.
- `_awrap()` and `_wrap()` in `__init__.py` to record operation duration using `duration_histogram`.
- `_build_from_streaming_response()`, `_abuild_from_streaming_response()`, and `_handle_response()` to include `token_histogram`.
- `set_model_response_attributes()` in `span_utils.py` to record token counts using `token_histogram`.

This description was created by
for 012121e. You can customize this summary. It will automatically update as commits are pushed.
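The token-recording path described in the walkthrough can be sketched as below. `record_token_usage` and `_FakeHistogram` are hypothetical stand-ins for what `span_utils.set_model_response_attributes()` does with the histogram it receives; the real code reads counts from the API response's usage metadata:

```python
class _FakeHistogram:
    """Stand-in for an OTel histogram, used only for this sketch."""

    def __init__(self):
        self.records = []

    def record(self, value, attributes=None):
        self.records.append((value, attributes or {}))


def record_token_usage(token_histogram, model, prompt_tokens, completion_tokens):
    """Hypothetical helper: record input and output token counts separately."""
    if token_histogram is None:
        return
    token_histogram.record(
        prompt_tokens,
        attributes={"gen_ai.token.type": "input", "gen_ai.response.model": model},
    )
    token_histogram.record(
        completion_tokens,
        attributes={"gen_ai.token.type": "output", "gen_ai.response.model": model},
    )


hist = _FakeHistogram()
record_token_usage(hist, "gemini-pro", 12, 34)
print([attrs["gen_ai.token.type"] for _, attrs in hist.records])  # ['input', 'output']
```

Recording the two token types as separate data points with a `gen_ai.token.type` attribute is what lets the test at lines 71-136 assert on both input and output usage.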
Summary by CodeRabbit
New Features
Bug Fixes / Improvements
Tests
✏️ Tip: You can customize this high-level summary in your review settings.