feat(bedrock): add event-based instrumentation #2476

58 changes: 57 additions & 1 deletion packages/opentelemetry-instrumentation-bedrock/README.md
@@ -6,6 +6,14 @@

This library allows tracing prompts and completions sent to any of AWS Bedrock's models with [Boto3](https://github.com/boto/boto3).

## Features

- Traces all calls to Bedrock's model endpoints
- Supports both legacy attribute-based and new event-based semantic conventions
- Captures prompts, completions, and token usage metrics
- Supports streaming responses with chunk-by-chunk instrumentation
- Handles multiple model providers (Anthropic, Cohere, AI21, Meta, Amazon)

## Installation

```bash
@@ -17,12 +25,30 @@ pip install opentelemetry-instrumentation-bedrock
```python
from opentelemetry.instrumentation.bedrock import BedrockInstrumentor

# Use legacy attribute-based semantic conventions (default)
BedrockInstrumentor().instrument()

# Or use new event-based semantic conventions
from opentelemetry.instrumentation.bedrock.config import Config
Config.use_legacy_attributes = False
BedrockInstrumentor().instrument()
```

## Configuration

The instrumentation can be configured using the following options:

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `use_legacy_attributes` | bool | `True` | Controls whether to use legacy attribute-based semantic conventions or new event-based semantic conventions. When `True`, prompts and completions are stored as span attributes. When `False`, they are stored as span events following the new OpenTelemetry semantic conventions. |
| `enrich_token_usage` | bool | `False` | When `True`, calculates token usage even when not provided by the API response. |
| `exception_logger` | Callable | `None` | Optional callback for logging exceptions that occur during instrumentation. |
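
For example, the options can be set directly on the `Config` class before calling `instrument()`. This is a minimal sketch based on the fields shown in `config.py`; the `exception_logger` callback here is purely illustrative:

```python
from opentelemetry.instrumentation.bedrock import BedrockInstrumentor
from opentelemetry.instrumentation.bedrock.config import Config

# Switch from the default legacy attributes to event-based semantic conventions
Config.use_legacy_attributes = False
# Estimate token counts when the Bedrock response does not include them
Config.enrich_token_usage = True
# Optional callback invoked when the instrumentation hits an internal error (illustrative)
Config.exception_logger = lambda exc: print(f"bedrock instrumentation error: {exc}")

BedrockInstrumentor().instrument()
```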

## Privacy

**By default, this instrumentation logs prompts, completions, and embeddings**. This gives you clear visibility into how your LLM application is working, and can make it easy to debug and evaluate the quality of the outputs.

The data can be stored either as span attributes (legacy mode) or span events (new mode), controlled by the `use_legacy_attributes` configuration option.

However, you may want to disable this logging for privacy reasons, as prompts and completions may contain highly sensitive data from your users. You may also simply want to reduce the size of your traces.

@@ -31,3 +57,33 @@ To disable logging, set the `TRACELOOP_TRACE_CONTENT` environment variable to `false`
```bash
TRACELOOP_TRACE_CONTENT=false
```

## Semantic Conventions

This instrumentation supports two modes of operation:

### Legacy Attribute-based Mode (Default)

In this mode, prompts and completions are stored as span attributes following the pattern:
- `gen_ai.prompt.{index}.content` - The prompt text
- `gen_ai.prompt.{index}.role` - The role (e.g., "user", "system")
- `gen_ai.completion.{index}.content` - The completion text
- `gen_ai.completion.{index}.role` - The role (e.g., "assistant")
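
For instance, a single-turn request might be recorded with span attributes along these lines (illustrative values; the exact set depends on the model and request):

```python
# Hypothetical legacy-mode span attributes for one prompt/completion pair
legacy_attributes = {
    "gen_ai.prompt.0.role": "user",
    "gen_ai.prompt.0.content": "What is OpenTelemetry?",
    "gen_ai.completion.0.role": "assistant",
    "gen_ai.completion.0.content": "OpenTelemetry is an observability framework for ...",
}
```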

### Event-based Mode

In this mode, prompts and completions are stored as span events following the new OpenTelemetry semantic conventions:
- Prompt events with attributes:
- `llm.prompt.index` - The index of the prompt
- `llm.prompt.type` - The type of prompt (e.g., "chat", "completion")
- `llm.prompt.content` - The prompt text
- `llm.prompt.role` - The role (e.g., "user", "system")
- Completion events with attributes:
- `llm.completion.index` - The index of the completion
- `llm.completion.content` - The completion text
- `llm.completion.role` - The role (e.g., "assistant")
- `llm.completion.stop_reason` - The reason the completion stopped
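
Continuing the illustration from the legacy mode above, the same single-turn request in event-based mode would be recorded as span events carrying roughly these attributes (values are hypothetical):

```python
# Hypothetical event-mode attributes for the same prompt/completion pair
prompt_event_attributes = {
    "llm.prompt.index": 0,
    "llm.prompt.type": "completion",
    "llm.prompt.content": "What is OpenTelemetry?",
    "llm.prompt.role": "user",
}
completion_event_attributes = {
    "llm.completion.index": 0,
    "llm.completion.content": "OpenTelemetry is an observability framework for ...",
    "llm.completion.role": "assistant",
    "llm.completion.stop_reason": "stop",
}
```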

For streaming responses in event-based mode, each chunk generates a `llm.content.completion.chunk` event with the chunk's content.

Token usage metrics and other metadata are recorded in both modes using standard attributes.
@@ -7,6 +7,10 @@
import time
from typing import Collection
from opentelemetry.instrumentation.bedrock.config import Config
from opentelemetry.instrumentation.bedrock.events import (
    emit_prompt_event,
    emit_completion_event,
)
from opentelemetry.instrumentation.bedrock.reusable_streaming_body import (
    ReusableStreamingBody,
)
@@ -353,8 +357,6 @@ def _set_cohere_span_attributes(span, request_body, response_body, metric_params
    input_tokens = response_body.get("token_count", {}).get("prompt_tokens")
    output_tokens = response_body.get("token_count", {}).get("response_tokens")

    if input_tokens is None or output_tokens is None:
        meta = response_body.get("meta", {})
        billed_units = meta.get("billed_units", {})
@@ -370,17 +372,31 @@ def _set_cohere_span_attributes(span, request_body, response_body, metric_params
    )

    if should_send_prompts():
        if Config.use_legacy_attributes:
            _set_span_attribute(
                span, f"{SpanAttributes.LLM_PROMPTS}.0.user", request_body.get("prompt")
            )

            for i, generation in enumerate(response_body.get("generations")):
                _set_span_attribute(
                    span,
                    f"{SpanAttributes.LLM_COMPLETIONS}.{i}.content",
                    generation.get("text"),
                )
        else:
            # Event-based instrumentation
            emit_prompt_event(span, prompt=request_body.get("prompt"))
            for i, generation in enumerate(response_body.get("generations")):
                emit_completion_event(
                    span,
                    {
                        "content": generation.get("text"),
                        "model": metric_params.model,
                    },
                    index=i,
                    is_streaming=metric_params.is_stream,
                )


def _set_anthropic_completion_span_attributes(
    span, request_body, response_body, metric_params
@@ -427,14 +443,27 @@ def _set_anthropic_completion_span_attributes(
    )

    if should_send_prompts():
        if Config.use_legacy_attributes:
            _set_span_attribute(
                span, f"{SpanAttributes.LLM_PROMPTS}.0.user", request_body.get("prompt")
            )
            _set_span_attribute(
                span,
                f"{SpanAttributes.LLM_COMPLETIONS}.0.content",
                response_body.get("completion"),
            )
        else:
            # Event-based instrumentation
            emit_prompt_event(span, prompt=request_body.get("prompt"))
            emit_completion_event(
                span,
                {
                    "content": response_body.get("completion"),
                    "model": metric_params.model,
                    "stop_reason": response_body.get("stop_reason"),
                },
                is_streaming=metric_params.is_stream,
            )


def _set_anthropic_messages_span_attributes(
@@ -487,24 +516,40 @@ def _set_anthropic_messages_span_attributes(
    _record_usage_to_span(span, prompt_tokens, completion_tokens, metric_params)

    if should_send_prompts():
        if Config.use_legacy_attributes:
            for idx, message in enumerate(request_body.get("messages")):
                _set_span_attribute(
                    span,
                    f"{SpanAttributes.LLM_PROMPTS}.{idx}.role",
                    message.get("role"),
                )
                _set_span_attribute(
                    span,
                    f"{SpanAttributes.LLM_PROMPTS}.0.content",
                    json.dumps(message.get("content")),
                )

            _set_span_attribute(
                span, f"{SpanAttributes.LLM_COMPLETIONS}.0.content", "assistant"
            )
            _set_span_attribute(
                span,
                f"{SpanAttributes.LLM_COMPLETIONS}.0.content",
                json.dumps(response_body.get("content")),
            )
        else:
            # Event-based instrumentation
            emit_prompt_event(span, messages=request_body.get("messages"))
            emit_completion_event(
                span,
                {
                    "content": response_body.get("content"),
                    "model": metric_params.model,
                    "role": "assistant",
                    "stop_reason": response_body.get("stop_reason"),
                },
                is_streaming=metric_params.is_stream,
            )


def _count_anthropic_tokens(messages: list[str]):
@@ -536,17 +581,31 @@ def _set_ai21_span_attributes(span, request_body, response_body, metric_params):
    )

    if should_send_prompts():
        if Config.use_legacy_attributes:
            _set_span_attribute(
                span, f"{SpanAttributes.LLM_PROMPTS}.0.user", request_body.get("prompt")
            )

            for i, completion in enumerate(response_body.get("completions")):
                _set_span_attribute(
                    span,
                    f"{SpanAttributes.LLM_COMPLETIONS}.{i}.content",
                    completion.get("data").get("text"),
                )
        else:
            # Event-based instrumentation
            emit_prompt_event(span, prompt=request_body.get("prompt"))
            for i, completion in enumerate(response_body.get("completions")):
                emit_completion_event(
                    span,
                    {
                        "content": completion.get("data").get("text"),
                        "model": metric_params.model,
                    },
                    index=i,
                    is_streaming=metric_params.is_stream,
                )


def _set_llama_span_attributes(span, request_body, response_body, metric_params):
    _set_span_attribute(
@@ -570,28 +629,58 @@ def _set_llama_span_attributes(span, request_body, response_body, metric_params)
    )

    if should_send_prompts():
        if Config.use_legacy_attributes:
            _set_span_attribute(
                span,
                f"{SpanAttributes.LLM_PROMPTS}.0.content",
                request_body.get("prompt"),
            )
            _set_span_attribute(span, f"{SpanAttributes.LLM_PROMPTS}.0.role", "user")

            if response_body.get("generation"):
                _set_span_attribute(
                    span, f"{SpanAttributes.LLM_COMPLETIONS}.0.role", "assistant"
                )
                _set_span_attribute(
                    span,
                    f"{SpanAttributes.LLM_COMPLETIONS}.0.content",
                    response_body.get("generation"),
                )
            else:
                for i, generation in enumerate(response_body.get("generations")):
                    _set_span_attribute(
                        span, f"{SpanAttributes.LLM_COMPLETIONS}.{i}.role", "assistant"
                    )
                    _set_span_attribute(
                        span,
                        f"{SpanAttributes.LLM_COMPLETIONS}.{i}.content",
                        generation,
                    )
        else:
            # Event-based instrumentation
            emit_prompt_event(span, prompt=request_body.get("prompt"))
            if response_body.get("generation"):
                emit_completion_event(
                    span,
                    {
                        "content": response_body.get("generation"),
                        "model": metric_params.model,
                        "role": "assistant",
                    },
                    is_streaming=metric_params.is_stream,
                )
            else:
                for i, generation in enumerate(response_body.get("generations")):
                    emit_completion_event(
                        span,
                        {
                            "content": generation,
                            "model": metric_params.model,
                            "role": "assistant",
                        },
                        index=i,
                        is_streaming=metric_params.is_stream,
                    )


def _set_amazon_span_attributes(span, request_body, response_body, metric_params):
@@ -615,17 +704,33 @@ def _set_amazon_span_attributes(span, request_body, response_body, metric_params
    )

    if should_send_prompts():
        if Config.use_legacy_attributes:
            _set_span_attribute(
                span,
                f"{SpanAttributes.LLM_PROMPTS}.0.user",
                request_body.get("inputText"),
            )

            for i, result in enumerate(response_body.get("results")):
                _set_span_attribute(
                    span,
                    f"{SpanAttributes.LLM_COMPLETIONS}.{i}.content",
                    result.get("outputText"),
                )
        else:
            # Event-based instrumentation
            emit_prompt_event(span, prompt=request_body.get("inputText"))
            for i, result in enumerate(response_body.get("results")):
                emit_completion_event(
                    span,
                    {
                        "content": result.get("outputText"),
                        "model": metric_params.model,
                    },
                    index=i,
                    is_streaming=metric_params.is_stream,
                )


def _create_metrics(meter: Meter):
    token_histogram = meter.create_histogram(
@@ -1,3 +1,6 @@
class Config:
"""Configuration for Bedrock instrumentation."""

enrich_token_usage = False
exception_logger = None
use_legacy_attributes: bool = True # Controls whether to use legacy attributes or new event-based semantic conventions
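
The diff above imports `emit_prompt_event` and `emit_completion_event` from `opentelemetry.instrumentation.bedrock.events`, but that module is not part of the hunks shown here. The sketch below is one plausible shape for those helpers, assuming they wrap the standard `Span.add_event` API with the attribute names described in the README; the event names, signatures, and defaults are inferred from the call sites, not confirmed by this PR:

```python
from opentelemetry.trace import Span


def emit_prompt_event(span: Span, prompt=None, messages=None):
    """Record the request as span events (assumed implementation)."""
    if messages is not None:
        # Chat-style request: one event per message
        for idx, message in enumerate(messages):
            span.add_event(
                "llm.prompt",
                attributes={
                    "llm.prompt.index": idx,
                    "llm.prompt.type": "chat",
                    "llm.prompt.content": str(message.get("content")),
                    "llm.prompt.role": message.get("role", "user"),
                },
            )
    elif prompt is not None:
        # Plain completion-style prompt
        span.add_event(
            "llm.prompt",
            attributes={
                "llm.prompt.index": 0,
                "llm.prompt.type": "completion",
                "llm.prompt.content": prompt,
                "llm.prompt.role": "user",
            },
        )


def emit_completion_event(span: Span, completion: dict, index: int = 0, is_streaming: bool = False):
    """Record a completion as a span event (assumed implementation)."""
    # Streamed chunks could use the chunk event name mentioned in the README
    name = "llm.content.completion.chunk" if is_streaming else "llm.completion"
    attributes = {
        "llm.completion.index": index,
        "llm.completion.content": str(completion.get("content")),
        "llm.completion.role": completion.get("role", "assistant"),
    }
    if completion.get("stop_reason") is not None:
        attributes["llm.completion.stop_reason"] = completion["stop_reason"]
    span.add_event(name, attributes=attributes)
```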