
Conversation


@duncankmckinnon commented Nov 13, 2025

Add OpenInference Instrumentation for Pipecat

This PR implements comprehensive OpenTelemetry tracing for Pipecat voice agents using OpenInference semantic conventions, enabling production-ready observability for voice AI applications.

Overview

Adds automatic instrumentation for Pipecat pipelines that captures:

  • Turn-level spans: Complete conversation exchanges with user input/output
  • Service-level spans: Individual LLM, TTS, and STT operations with proper directionality
  • Flat span hierarchy: All service spans as siblings under turn spans for clear visualization
  • Rich attributes: Model names, providers, token counts, latency metrics, and full conversation history in Arize-compatible format

Key Features

1. Automatic Instrumentation via Observer Pattern

The instrumentor wraps PipelineTask.__init__ to automatically inject an observer into every task:

import os

from openinference.instrumentation.pipecat import PipecatInstrumentor
from arize.otel import register

tracer_provider = register(
    space_id=os.getenv("ARIZE_SPACE_ID"),
    api_key=os.getenv("ARIZE_API_KEY"),
    project_name=os.getenv("ARIZE_PROJECT_NAME"),
)

PipecatInstrumentor().instrument(
    tracer_provider=tracer_provider,
    debug_log_filename="debug.log"  # Optional
)

No code changes are needed in your pipeline: instrument once and every PipelineTask instance gets automatic tracing.
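
Under the hood, this amounts to a wrapt function wrapper. A minimal sketch of the injection, assuming PipelineTask accepts an observers keyword and simplifying the observer construction (both are assumptions about internals, not the package's exact code):

from wrapt import wrap_function_wrapper

def _wrapped_task_init(wrapped, instance, args, kwargs):
    # Append our observer to whatever observers the caller passed, then
    # run the original __init__. The `observers` kwarg is an assumed
    # detail of PipelineTask's signature.
    observer = OpenInferenceObserver(tracer=tracer)  # hypothetical constructor
    kwargs["observers"] = list(kwargs.get("observers") or []) + [observer]
    return wrapped(*args, **kwargs)

wrap_function_wrapper("pipecat.pipeline.task", "PipelineTask.__init__", _wrapped_task_init)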

2. Intelligent Turn Tracking

Implements conversation turn tracking using speaking events to define natural conversation boundaries:

  • Start turn: UserStartedSpeakingFrame or StartFrame (first pipeline frame)
  • End turn: Timeout after BotStoppedSpeakingFrame (configurable, default 2.5s)
  • Interruption handling: New turn starts immediately when user interrupts bot
  • Auto-start: First service activity auto-starts a turn if none active

This approach ensures one turn span per actual conversation exchange with proper handling of multi-part bot responses (e.g., function calls causing multiple TTS segments).
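
A simplified sketch of that state machine (illustrative structure only; the real observer reacts to Pipecat frames rather than plain callbacks):

import asyncio
from typing import Optional

class TurnTracker:
    def __init__(self, end_timeout: float = 2.5):
        self._end_timeout = end_timeout
        self._turn_active = False
        self._end_task: Optional[asyncio.Task] = None

    def on_user_started_speaking(self) -> None:
        if self._end_task is not None:
            # User spoke during the quiet period: cancel the pending end.
            self._end_task.cancel()
            self._end_task = None
        if self._turn_active:
            self._finish_turn(reason="interrupted")
        self._start_turn()

    def on_bot_stopped_speaking(self) -> None:
        # Don't end immediately: a function call may trigger another TTS
        # segment. Only finish if the timeout elapses uninterrupted.
        self._end_task = asyncio.create_task(self._end_after_timeout())

    async def _end_after_timeout(self) -> None:
        await asyncio.sleep(self._end_timeout)
        self._finish_turn(reason="completed")

    def _start_turn(self) -> None:
        self._turn_active = True   # the real observer opens the turn span here

    def _finish_turn(self, reason: str) -> None:
        self._turn_active = False  # ...and ends it here, recording `reason`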

3. Bidirectional Frame Processing with Deduplication

Captures frames both entering (input) and leaving (output) services with intelligent deduplication:

  • Directional filtering: INPUT_VALUE only captured when is_input=True, OUTPUT_VALUE only when is_input=False
  • Streaming accumulation: LLM and TTS streaming chunks accumulated with smart deduplication
  • Special handling for STT: TranscriptionFrame is OUTPUT from STT but recorded as INPUT for observability
  • TTS filtering: Only captures TTS text when going to BaseOutputTransport (final output)

Deduplication handles cumulative chunks (e.g., "Hello" → "Hello world" → "Hello world!") by detecting overlaps and extracting only new content.
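
A sketch of the overlap handling (assumed logic matching the behavior described, not the exact implementation):

def merge_chunk(accumulated: str, chunk: str) -> str:
    # Cumulative chunk: the new chunk restates everything so far.
    if chunk.startswith(accumulated):
        return chunk
    # Partial overlap: find the longest suffix of `accumulated` that is
    # a prefix of `chunk`, and append only the new tail.
    for i in range(min(len(accumulated), len(chunk)), 0, -1):
        if accumulated.endswith(chunk[:i]):
            return accumulated + chunk[i:]
    return accumulated + chunk  # no overlap: plain append

assert merge_chunk("Hello", "Hello world") == "Hello world"
assert merge_chunk("Hello world", "Hello world!") == "Hello world!"
assert merge_chunk("Hello ", "world") == "Hello world"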

4. Multiple LLM Invocations Per Turn

Properly handles multiple LLM calls within a single turn (e.g., function calling flows), as sketched after this list:

  • Detects new invocations via LLMContextFrame
  • Finishes previous LLM span before starting new one
  • Each LLM call gets its own span with full message context
  • Prevents output accumulation across different invocations
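
Roughly, the handler looks like this (illustrative; the attribute and method names are assumptions):

from opentelemetry import trace as trace_api

def _on_llm_context_frame(self, frame):
    # A new LLMContextFrame marks a fresh invocation (e.g., the follow-up
    # call after a tool result). Close the previous span so streamed
    # output doesn't accumulate across invocations.
    if self._llm_span is not None:
        self._llm_span.end()
    self._llm_output = ""
    self._llm_span = self._tracer.start_span(
        "pipecat.llm",
        context=trace_api.set_span_in_context(self._turn_span),
    )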

5. Flattened Message Format for Arize

Message history is exported in the flattened format that Arize expects:

# Instead of a single JSON string:
"llm.input_messages": "[{role: 'user', content: '...'}]"

# We set individual attributes:
"llm.input_messages.0.message.role": "system"
"llm.input_messages.0.message.content": "You are a helpful assistant"
"llm.input_messages.1.message.role": "user"
"llm.input_messages.1.message.content": "What is quantum computing?"
# ... and so on

This enables proper display in Arize's UI with message-level filtering and analysis.
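
The flattening step reduces to a small helper along these lines (the helper name is illustrative; SpanAttributes comes from openinference-semantic-conventions):

from openinference.semconv.trace import SpanAttributes

def set_flattened_messages(span, messages, base=SpanAttributes.LLM_INPUT_MESSAGES):
    # Emit llm.input_messages.{i}.message.{role,content} attributes
    # instead of one opaque JSON string.
    for i, message in enumerate(messages):
        span.set_attribute(f"{base}.{i}.message.role", message["role"])
        content = message.get("content")
        if content:
            span.set_attribute(f"{base}.{i}.message.content", content)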

6. Consistent Time Handling

Uses time.time_ns() (Unix epoch) consistently for all span timestamps:

  • Span start time: Recorded at service span creation
  • Span end time: Calculated from start_time_ns + processing_time_seconds when metrics available
  • Avoids mixing clocks: Does not use Pipecat's monotonic_ns() timestamps, which are relative to the pipeline start

This ensures the end_time >= start_time invariant required by OpenTelemetry.
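
In code terms, the idea is simply this (a sketch; processing_time_seconds stands in for the value reported by a MetricsFrame):

import time

start_time_ns = time.time_ns()   # Unix epoch ns, captured at span creation

# Later, a MetricsFrame reports how long the service actually ran:
processing_time_seconds = 1.57   # example value

# Derive the end timestamp from the same clock so end >= start holds;
# OpenTelemetry spans accept an explicit nanosecond end timestamp.
end_time_ns = start_time_ns + int(processing_time_seconds * 1e9)
# span.end(end_time=end_time_ns)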

7. Multi-Provider Service Detection

Automatically detects and attributes service types and providers:

  • LLM Services: OpenAI, Anthropic (sets llm.provider, llm.model_name)
  • TTS Services: OpenAI, ElevenLabs, Cartesia (sets audio.voice, audio.voice_id)
  • STT Services: OpenAI, Deepgram, Cartesia
  • Generic detection: Works with any service inheriting from Pipecat base classes

Sets service.name to the actual service class name for unique identification.
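
Detection plausibly reduces to isinstance checks against Pipecat's service base classes, roughly as follows (the import path varies across Pipecat versions and is an assumption here):

from pipecat.services.ai_services import LLMService, STTService, TTSService  # assumed path

def detect_service_type(service) -> str:
    # Any subclass of a Pipecat base class is picked up, so custom
    # providers need no special-casing.
    if isinstance(service, LLMService):
        return "llm"
    if isinstance(service, STTService):
        return "stt"
    if isinstance(service, TTSService):
        return "tts"
    return "unknown"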

8. Session Tracking

Automatically extracts conversation_id from PipelineTask and sets it as the session.id attribute on all spans, enabling conversation-level filtering in observability platforms.

Implementation Details

Core Components

PipecatInstrumentor (__init__.py)

  • Wraps PipelineTask.__init__ using wrapt
  • Injects OpenInferenceObserver into each task
  • Supports optional debug_log_filename parameter for detailed frame logging
  • Thread-safe: creates separate observer instance per task

OpenInferenceObserver (_observer.py)

  • Implements Pipecat's BaseObserver interface
  • Listens to on_push_frame events with bidirectional processing (input/output)
  • Creates turn spans and service spans with proper OpenTelemetry context propagation
  • Tracks turn state: active turn, user text, bot text, speaking status
  • Handles frame deduplication to avoid processing propagated frames twice
  • Auto-starts turns when first service activity detected
  • Finishes service spans before turn spans to maintain proper hierarchy

Frame Attribute Extractors (_attributes.py)

  • Extracts OpenInference-compliant attributes from Pipecat frames
  • Handles multiple frame types: TranscriptionFrame, LLMContextFrame, LLMTextFrame, TTSTextFrame, MetricsFrame, etc.
  • Captures: LLM messages (flattened format), audio metadata, token counts, processing times, errors
  • Service attribute extraction for span creation with provider-specific details

Span Hierarchy

pipecat.conversation.turn (trace_id: abc123)
├── pipecat.stt (parent_id: turn_span_id, trace_id: abc123)
├── pipecat.llm (parent_id: turn_span_id, trace_id: abc123)
├── pipecat.llm (parent_id: turn_span_id, trace_id: abc123)  # Second invocation
└── pipecat.tts (parent_id: turn_span_id, trace_id: abc123)

Flat hierarchy: All service spans are siblings under the turn span (no nesting) for clearer visualization in tracing UIs. All spans within a turn share the same trace_id and have session.id attribute set.

Context Propagation

Service spans are created with the turn span's context:

from opentelemetry import trace as trace_api

turn_context = trace_api.set_span_in_context(self._turn_span)
span = self._tracer.start_span(
    name=f"pipecat.{service_type}",
    context=turn_context,  # Links to turn span
)

This ensures proper parent-child relationships and enables distributed tracing.

Testing

Test Coverage

69 tests covering:

  1. Instrumentor Basics (test_instrumentor.py):

    • Initialization, instrumentation, uninstrumentation
    • Observer injection into tasks
    • Singleton behavior
    • Configuration handling
  2. Turn Tracking (test_turn_tracking.py):

    • Turn creation on user/bot speech
    • Multiple sequential turns
    • Turn interruption handling
    • Input/output text capture
    • Session ID attribution
    • Turn span hierarchy
  3. Service Detection (test_service_detection.py):

    • LLM/TTS/STT service type detection
    • Multi-provider detection (OpenAI, Anthropic, ElevenLabs, Deepgram)
    • Metadata extraction (models, voices, providers)
    • Custom service inheritance
  4. Provider Spans (test_provider_spans.py):

    • Span creation for different providers
    • Correct span attributes per service type
    • Input/output capture for each service
    • Mixed provider pipelines
    • Provider-specific attributes (model names, voice IDs)

Mock Infrastructure

Comprehensive mocks in conftest.py:

  • Mock LLM/TTS/STT services with configurable metadata
  • Helper functions for running pipeline tasks
  • Span extraction and assertion utilities
  • Support for multiple provider combinations

All tests use in-memory span exporters for fast, isolated testing.
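
For reference, the standard in-memory setup with the OpenTelemetry SDK looks roughly like this (the final assertion is illustrative):

from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from opentelemetry.sdk.trace.export.in_memory_span_exporter import InMemorySpanExporter

exporter = InMemorySpanExporter()
tracer_provider = TracerProvider()
tracer_provider.add_span_processor(SimpleSpanProcessor(exporter))

# Instrument with PipecatInstrumentor().instrument(tracer_provider=tracer_provider),
# run a mock pipeline task, then inspect the captured spans:
spans = exporter.get_finished_spans()
assert any(span.name == "pipecat.conversation.turn" for span in spans)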

Example Usage

Complete Tracing Example

See examples/trace/001-trace.py for a full working example:

import os
from datetime import datetime

from arize.otel import register
from openinference.instrumentation.pipecat import PipecatInstrumentor
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.task import PipelineParams, PipelineTask

# Generate unique conversation ID
conversation_id = f"conversation-{datetime.now().strftime('%Y%m%d_%H%M%S')}"
debug_log_filename = f"pipecat_frames_{conversation_id}.log"

# Set up tracing
tracer_provider = register(
    space_id=os.getenv("ARIZE_SPACE_ID"),
    api_key=os.getenv("ARIZE_API_KEY"),
    project_name=os.getenv("ARIZE_PROJECT_NAME"),
)

PipecatInstrumentor().instrument(
    tracer_provider=tracer_provider,
    debug_log_filename=debug_log_filename,
)

# Create your pipeline (STT -> LLM -> TTS); stt, llm, tts, transport, and
# runner are set up earlier in the full example
pipeline = Pipeline([stt, llm, tts, transport.output()])

# Create task with conversation ID
task = PipelineTask(
    pipeline,
    conversation_id=conversation_id,
    params=PipelineParams(enable_metrics=True)
)

# Run - tracing happens automatically!
await runner.run(task)

What Gets Traced

For a single exchange in which the user asks a question and the bot responds:

Turn Span (pipecat.conversation.turn):

  • session.id: "conversation-20251113_152502"
  • input.value: "What is quantum computing?"
  • output.value: "Quantum computing is a type of computing that uses quantum mechanics..."
  • conversation.turn_number: 1
  • conversation.turn_duration_seconds: 3.5
  • conversation.end_reason: "completed"

STT Span (pipecat.stt):

  • service.name: "OpenAISTTService"
  • service.type: "stt"
  • llm.provider: "openai"
  • llm.model_name: "gpt-4o-transcribe"
  • input.value: "What is quantum computing?" (transcribed text)
  • audio.transcript: "What is quantum computing?"
  • Duration: 0.78 seconds

LLM Span (pipecat.llm):

  • service.name: "OpenAILLMService"
  • service.type: "llm"
  • llm.provider: "openai"
  • llm.model_name: "gpt-4"
  • input.value: "What is quantum computing?" (last user message)
  • output.value: "Quantum computing is..." (accumulated streaming response)
  • llm.input_messages.0.message.role: "system"
  • llm.input_messages.0.message.content: "You are a helpful assistant"
  • llm.input_messages.1.message.role: "user"
  • llm.input_messages.1.message.content: "What is quantum computing?"
  • llm.output_messages.0.message.role: "assistant"
  • llm.output_messages.0.message.content: "Quantum computing is..."
  • llm.token_count.total: 520
  • llm.token_count.prompt: 380
  • llm.token_count.completion: 140
  • Duration: 2.77 seconds

TTS Span (pipecat.tts):

  • service.name: "OpenAITTSService"
  • service.type: "tts"
  • llm.provider: "openai"
  • llm.model_name: "gpt-4o-mini-tts"
  • audio.voice: "ballad"
  • audio.voice_id: "ballad"
  • output.value: "Quantum computing is..." (synthesized text)
  • service.processing_time_seconds: 1.57
  • Duration: 1.57 seconds

Key Improvements in This PR

  1. Fixed STT input.value capture: Special case handling for TranscriptionFrame as OUTPUT from STT but recorded as INPUT for observability
  2. Fixed span timing: Consistent use of Unix epoch nanoseconds (time.time_ns()) instead of mixing with Pipecat's monotonic clock
  3. Fixed processing_time_seconds: Proper storage and use for calculating span end time
  4. Fixed LLM message format: Flattened attribute format (llm.input_messages.{index}.message.{field}) instead of single JSON string
  5. Added LLM output messages: Flattened format for output messages matching input format
  6. Improved deduplication: Smart handling of cumulative streaming chunks with overlap detection
  7. Multiple LLM invocations: Separate spans for each LLM call within a turn
  8. Bidirectional processing: Proper handling of frames as both inputs and outputs with directional filtering

Note

Adds a Pipecat auto-instrumentation package that converts Pipecat frames into OpenInference/OTel spans with turn tracking, service spans (LLM/TTS/STT), rich attributes, examples, tests, and CI integration.

  • New package python/instrumentation/openinference-instrumentation-pipecat:
    • Instrumentor: PipecatInstrumentor auto-injects an observer by wrapping PipelineTask.__init__.
    • Observer: OpenInferenceObserver creates pipecat.conversation.turn spans and sibling service spans (pipecat.llm, pipecat.tts, pipecat.stt), with session IDs, streaming chunk deduplication, timing, and interruption handling.
    • Attribute extraction: _attributes.py extracts frame/service attributes (flattened LLM messages, tool calls, token/latency metrics, provider/model detection) and sets OpenInference/GenAI semconv fields.
    • Packaging: pyproject.toml, entry points for OTel/OpenInference, version 0.1.0, LICENSE, CHANGELOG, README.
    • Examples: examples/trace/001-trace.py and example.env for Phoenix/OTLP setup.
    • Tests: Comprehensive suites for instrumentor behavior, provider/service detection, span attributes, and turn tracking with mocks.
  • CI/Tooling:
    • python/tox.ini: add pipecat and pipecat-latest envs and install steps.
  • Repo housekeeping:
    • .gitignore: ignore *.code-workspace.

Written by Cursor Bugbot for commit 9cb7459.

@duncankmckinnon requested a review from a team as a code owner November 13, 2025 18:26
@dosubot added the size:XXL label (This PR changes 1000+ lines, ignoring generated files.) Nov 13, 2025
@duncankmckinnon changed the title from "Incorporate pipecat" to "(feat): add auto instrumentor for pipecat" Nov 13, 2025
@duncankmckinnon changed the title from "(feat): add auto instrumentor for pipecat" to "feat(pipecat): add auto instrumentor for pipecat" Nov 13, 2025
"openinference-semantic-conventions>=0.1.21",
"websockets>=13.1,<16.0",
"mypy>=1.18.2",
]

Bug: Missing wrapt dependency in package dependencies

The code imports wrapt in __init__.py with from wrapt import wrap_function_wrapper, but wrapt is not listed in the dependencies section of pyproject.toml. While it's listed in the mypy overrides to ignore missing imports, it's not actually installed as a dependency. This will cause import errors when the package is installed and used.


_attributes.py (excerpt):

elif isinstance(frame, TextFrame):
    # Generic text frame (output)
    results[SpanAttributes.OUTPUT_VALUE] = text
    results["llm.output_messages.0.message.role"] = "user"

Bug: Incorrect message role for generic text output

Generic TextFrame output is assigned message role "user" but should be "assistant" since it represents output from the model. This violates OpenInference semantic conventions where output messages should have role "assistant". The code already correctly sets this as OUTPUT_VALUE on line 205, confirming this is output data that should be attributed to the assistant.

