Add design docs #2657
Merged: cijothomas merged 7 commits into open-telemetry:main from cijothomas:cijothomas/doc-design (Feb 13, 2025)
Commits (7):
- ed14d07 Add design docs (cijothomas)
- fc275c1 more (cijothomas)
- 0cdf9c7 address feedback and add mermaid (cijothomas)
- 8487495 add link to crates (cijothomas)
- 9055377 fix link (cijothomas)
- dd1cf76 Merge branch 'main' into cijothomas/doc-design (cijothomas)
- 55698d3 Merge branch 'main' into cijothomas/doc-design (lalitb)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,302 @@ | ||
# OpenTelemetry Rust Logs Design | ||
|
||
Status: | ||
[Development](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/document-status.md) | ||
|
||
## Overview | ||
|
||
[OpenTelemetry (OTel) | ||
Logs](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/logs/README.md) | ||
support differs from Metrics and Traces as it does not introduce a new logging | ||
API for end users. Instead, OTel recommends leveraging existing logging | ||
libraries such as [log](https://crates.io/crates/log) and | ||
[tracing](https://crates.io/crates/tracing), while providing bridges (appenders) | ||
to route logs through OpenTelemetry. | ||
|
||
OTel took this different approach due to the long history of existing logging | ||
solutions. In Rust, these are [log](https://crates.io/crates/log) and | ||
[tracing](https://crates.io/crates/tracing), both of which have long been | ||
embraced by the community. OTel Rust maintains appenders for these libraries, | ||
allowing users to seamlessly integrate with OpenTelemetry without changing their | ||
existing logging instrumentation. | ||
|
||
The `tracing` appender is particularly optimized for performance due to its | ||
widespread adoption and the fact that `tracing` itself has a bridge from the | ||
`log` crate. Notably, OpenTelemetry Rust itself is instrumented using `tracing` | ||
for internal logs. Additionally, when OTel began supporting logging as a signal, | ||
the `log` crate lacked structured logging support, reinforcing the decision to | ||
prioritize `tracing`. | ||
|
||
## Benefits of OpenTelemetry Logs | ||
|
||
- **Unified configuration** across Traces, Metrics, and Logs. | ||
- **Automatic correlation** with Traces. | ||
- **Consistent Resource attributes** across signals. | ||
- **Multiple destinations support**: Logs can continue flowing to existing | ||
destinations such as stdout while also being sent to an | ||
OpenTelemetry-capable backend, typically via an OTLP Exporter or via exporters | ||
that write to operating-system-native facilities such as `Windows ETW` or | ||
`Linux user_events`. | ||
- **Standalone logging support** for applications that use OpenTelemetry as | ||
their primary logging mechanism. | ||
|
||
## Key Design Principles | ||
|
||
- High performance - no locks/contention in the hot path, with minimal or no | ||
heap allocation where possible. | ||
- Capped resource (memory) usage - well-defined behavior when overloaded. | ||
- Self-observable - exposes telemetry about itself to aid in troubleshooting. | ||
- Robust error handling - returns `Result` where possible instead of panicking. | ||
- Minimal public API - exposes functionality only as needed. | ||
|
||
## Architecture Overview | ||
|
||
```mermaid | ||
graph TD | ||
subgraph Application | ||
A1[Application Code] | ||
end | ||
subgraph Logging Libraries | ||
B1[log crate] | ||
B2[tracing crate] | ||
end | ||
subgraph OpenTelemetry | ||
C1[OpenTelemetry Appender for log] | ||
C2[OpenTelemetry Appender for tracing] | ||
C3[OpenTelemetry Logs API] | ||
C4[OpenTelemetry Logs SDK] | ||
C5[OTLP Exporter] | ||
end | ||
subgraph Observability Backend | ||
D1[OTLP-Compatible Backend] | ||
end | ||
A1 --> |Emits Logs| B1 | ||
A1 --> |Emits Logs| B2 | ||
B1 --> |Bridged by| C1 | ||
B2 --> |Bridged by| C2 | ||
C1 --> |Sends to| C3 | ||
C2 --> |Sends to| C3 | ||
C3 --> |Processes with| C4 | ||
C4 --> |Exports via| C5 | ||
C5 --> |Sends to| D1 | ||
``` | ||
|
||
## Logs API | ||
|
||
The Logs API is part of the [opentelemetry](https://crates.io/crates/opentelemetry) | ||
crate. | ||
|
||
The OTel Logs API is not intended for direct use by end users. Instead, it is | ||
designed for appender/bridge authors to integrate existing logging libraries | ||
with OpenTelemetry. However, nothing prevents end users from calling it | ||
directly. | ||
|
||
### API Components | ||
|
||
1. **Key-Value Structs**: Used in `LogRecord`. The `Key` struct is shared | ||
across signals, but the `Value` struct differs from the one used by Metrics | ||
and Traces, because values in Logs can contain more complex structures than | ||
those in Traces and Metrics. | ||
2. **Traits**: | ||
- `LoggerProvider` - provides methods to obtain Logger. | ||
- `Logger` - provides methods to create LogRecord and emit the created | ||
LogRecord. | ||
- `LogRecord` - provides methods to populate LogRecord. | ||
3. **No-Op Implementations**: By default, the API performs no operations until | ||
an SDK is attached. | ||
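
As an illustration of why log values need a richer type than a flat scalar, here is a minimal sketch of a nested value. The `LogValue` enum, `leaf_count`, and `sample_request_attribute` are hypothetical names invented for this example; they are not the types defined in the `opentelemetry` crate.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a log attribute value (not the crate's real type).
#[derive(Debug, Clone, PartialEq)]
pub enum LogValue {
    Int(i64),
    Double(f64),
    Text(String),
    Bool(bool),
    // Nested shapes: a log value may itself hold lists and maps.
    List(Vec<LogValue>),
    Map(HashMap<String, LogValue>),
}

/// Counts the scalar leaves of a value, walking nested lists and maps.
pub fn leaf_count(value: &LogValue) -> usize {
    match value {
        LogValue::List(items) => items.iter().map(leaf_count).sum(),
        LogValue::Map(entries) => entries.values().map(leaf_count).sum(),
        _ => 1,
    }
}

/// Builds a nested attribute: a map holding another map and a list.
pub fn sample_request_attribute() -> LogValue {
    let mut headers = HashMap::new();
    headers.insert(
        "accept".to_string(),
        LogValue::Text("text/html".to_string()),
    );

    let mut request = HashMap::new();
    request.insert("headers".to_string(), LogValue::Map(headers));
    request.insert(
        "retries".to_string(),
        LogValue::List(vec![LogValue::Int(1), LogValue::Int(2)]),
    );
    LogValue::Map(request)
}
```

A recursive enum like this can carry a whole structured payload in one attribute, which is the kind of shape the paragraph above refers to.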
|
||
### Logs Flow | ||
|
||
1. Obtain a `LoggerProvider` implementation. | ||
2. Use the `LoggerProvider` to create `Logger` instances, specifying a scope | ||
name (module/component emitting logs). Optional attributes and version are | ||
also supported. | ||
3. Use the `Logger` to create an empty `LogRecord` instance. | ||
4. Populate the `LogRecord` with body, timestamp, attributes, etc. | ||
5. Call `Logger.emit(LogRecord)` to process and export the log. | ||
|
||
If only the Logs API is used (without an SDK), all the above steps result in no | ||
operations, following OpenTelemetry’s philosophy of separating API from SDK. The | ||
official Logs SDK provides real implementations to process and export logs. | ||
Users or vendors can also provide alternative SDK implementations. | ||
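
The flow above, together with the no-op default, can be sketched with simplified traits. This is only a sketch under stated assumptions: the trait and method names below (`set_body`, `add_attribute`, `create_log_record`, `logger`, and the `Noop*` types) are hypothetical and simplified, not the actual signatures in the `opentelemetry` crate.

```rust
/// Hypothetical, simplified versions of the three API traits.
pub trait LogRecord {
    fn set_body(&mut self, body: &str);
    fn add_attribute(&mut self, key: &str, value: &str);
}

pub trait Logger {
    type Record: LogRecord;
    fn create_log_record(&self) -> Self::Record;
    fn emit(&self, record: Self::Record);
}

pub trait LoggerProvider {
    type Logger: Logger;
    fn logger(&self, scope_name: &str) -> Self::Logger;
}

// No-op implementations: what the API does when no SDK is attached.
pub struct NoopLogRecord;
impl LogRecord for NoopLogRecord {
    fn set_body(&mut self, _body: &str) {}
    fn add_attribute(&mut self, _key: &str, _value: &str) {}
}

pub struct NoopLogger;
impl Logger for NoopLogger {
    type Record = NoopLogRecord;
    fn create_log_record(&self) -> NoopLogRecord {
        NoopLogRecord
    }
    fn emit(&self, _record: NoopLogRecord) {}
}

pub struct NoopLoggerProvider;
impl LoggerProvider for NoopLoggerProvider {
    type Logger = NoopLogger;
    fn logger(&self, _scope_name: &str) -> NoopLogger {
        NoopLogger
    }
}

/// Steps 2 to 5 of the flow, written once against the traits so that
/// the same code works with a no-op provider or a real SDK provider.
pub fn log_hello<P: LoggerProvider>(provider: &P) {
    let logger = provider.logger("my-component"); // step 2
    let mut record = logger.create_log_record(); // step 3
    record.set_body("hello"); // step 4
    record.add_attribute("user.id", "42");
    logger.emit(record); // step 5
}
```

Because the no-op types carry no state, calling `log_hello(&NoopLoggerProvider)` compiles down to essentially nothing, which is the point of separating the API from the SDK.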
|
||
## Logs SDK | ||
|
||
The Logs SDK is part of the | ||
[opentelemetry_sdk](https://crates.io/crates/opentelemetry_sdk) crate. | ||
|
||
The OpenTelemetry Logs SDK provides an OTel specification-compliant | ||
implementation of the Logs API, handling log processing and export. | ||
|
||
### Core Components | ||
|
||
#### `SdkLoggerProvider` | ||
|
||
This is the implementation of the `LoggerProvider` and deals with concerns such | ||
as processing and exporting Logs. | ||
|
||
- Implements the `LoggerProvider` trait. | ||
- Creates and manages `SdkLogger` instances. | ||
- Holds logging configuration, including `Resource` and processors. | ||
- Does not retain a list of created loggers. Instead, it passes an owned clone | ||
of itself to each logger created. This is done so that loggers get a hold of | ||
the configuration (like which processor to invoke). | ||
- Uses an `Arc<LoggerProviderInner>` and delegates all configuration to | ||
`LoggerProviderInner`. This allows cheap cloning of itself and ensures all | ||
clones point to the same underlying configuration. | ||
- As `SdkLoggerProvider` only holds an `Arc` of its inner, its methods, | ||
including flush and shutdown, can only take `&self`. Mutating its own state | ||
through `&self` would require interior mutability in the provider, which comes | ||
with runtime performance costs. Since methods like shutdown usually need to | ||
mutate interior state, the provider defers to components like the exporter to | ||
use interior mutability to handle shutdown. (More on this in the exporter | ||
section) | ||
- An alternative design was to let `SdkLogger` hold a `Weak` reference to the | ||
`SdkLoggerProvider`. This would require a `Weak` to `Arc` upgrade on every log | ||
emission, significantly affecting throughput. | ||
- `LoggerProviderInner` implements `Drop`, triggering `shutdown()` when no | ||
references remain. However, in practice, loggers are often stored statically | ||
inside appenders (like tracing-appender), so the reference count never reaches | ||
zero and explicit shutdown by the user is required. | ||
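
The ownership arrangement described in this list can be sketched with the standard library alone. All names here (`ProviderSketch`, `ProviderInner`, `ScopedLogger`) are hypothetical. As a simplifying assumption, the sketch keeps a single atomic flag in the shared inner state so that `shutdown(&self)` has something observable to do, whereas the text above says the real provider defers such state to the exporter.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical shared configuration; every clone points at the same instance.
struct ProviderInner {
    resource: String,
    is_shutdown: AtomicBool,
}

/// Cloning only bumps the `Arc` reference count.
#[derive(Clone)]
pub struct ProviderSketch {
    inner: Arc<ProviderInner>,
}

/// Each logger owns a clone of the provider, so emitting a log never
/// needs a `Weak` to `Arc` upgrade.
pub struct ScopedLogger {
    provider: ProviderSketch,
    scope: String,
}

impl ProviderSketch {
    pub fn new(resource: &str) -> Self {
        Self {
            inner: Arc::new(ProviderInner {
                resource: resource.to_string(),
                is_shutdown: AtomicBool::new(false),
            }),
        }
    }

    /// The provider keeps no list of loggers; it just hands out a clone of itself.
    pub fn logger(&self, scope: &str) -> ScopedLogger {
        ScopedLogger {
            provider: self.clone(),
            scope: scope.to_string(),
        }
    }

    /// Takes `&self`; a second call reports an error instead of panicking.
    pub fn shutdown(&self) -> Result<(), &'static str> {
        if self.inner.is_shutdown.swap(true, Ordering::SeqCst) {
            Err("already shut down")
        } else {
            Ok(())
        }
    }

    /// Number of live handles (provider clones plus loggers).
    pub fn handle_count(&self) -> usize {
        Arc::strong_count(&self.inner)
    }
}

impl ScopedLogger {
    pub fn describe(&self) -> String {
        format!("{}@{}", self.scope, self.provider.inner.resource)
    }

    pub fn is_active(&self) -> bool {
        !self.provider.inner.is_shutdown.load(Ordering::SeqCst)
    }
}
```

Since every logger holds a strong reference, a logger stored in a `static` keeps the inner state alive for the life of the process, which is why an explicit `shutdown()` call is needed in that situation.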
|
||
#### `SdkLogger` | ||
|
||
This is an implementation of the `Logger`, and contains functionality to create | ||
and emit logs. | ||
|
||
- Implements the `Logger` trait. | ||
- Creates `SdkLogRecord` instances and emits them. | ||
- Calls `OnEmit()` on all registered processors when emitting logs. | ||
- Passes mutable references to each processor (`&mut log_record`), i.e., | ||
ownership is not passed to the processor. This ensures that the logger avoids | ||
cloning costs. Since a mutable reference is passed, processors can modify the | ||
log, and the change will be visible to the next processor in the chain. | ||
- Since the processor only gets a reference to the log, it cannot store it | ||
beyond the `OnEmit()` call. If a processor needs to buffer logs, it must | ||
explicitly copy them to the heap. | ||
- This design allows for stack-only log processing when exporting to operating | ||
system native facilities like `Windows ETW` or `Linux user_events`. | ||
- OTLP Exporting requires network calls (HTTP/gRPC) and batching of logs for | ||
efficiency purposes. These exporters buffer log records by copying them to the | ||
heap. (More on this in the BatchLogRecordProcessor section) | ||
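
A minimal sketch of this borrowing pattern follows. The `Record`, `Processor`, `Enricher`, and `Buffering` names are hypothetical, and the sketch takes `&mut self` in `on_emit` purely to stay short; it is not meant to mirror the SDK's processor trait.

```rust
/// Hypothetical log record for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct Record {
    pub body: String,
    pub attributes: Vec<(String, String)>,
}

/// Hypothetical processor trait: it only ever sees a borrowed record.
pub trait Processor {
    fn on_emit(&mut self, record: &mut Record);
}

/// Modifies the record in place; later processors see the change.
pub struct Enricher;
impl Processor for Enricher {
    fn on_emit(&mut self, record: &mut Record) {
        record
            .attributes
            .push(("host".to_string(), "node-1".to_string()));
    }
}

/// Keeps logs beyond the call, so it has to pay for an explicit copy.
#[derive(Default)]
pub struct Buffering {
    pub buffered: Vec<Record>,
}
impl Processor for Buffering {
    fn on_emit(&mut self, record: &mut Record) {
        // The borrow ends when this call returns, so clone to keep the data.
        self.buffered.push(record.clone());
    }
}

/// The logger owns the record and lends it to each processor in order.
pub fn emit(processors: &mut [&mut dyn Processor], mut record: Record) {
    for processor in processors.iter_mut() {
        processor.on_emit(&mut record);
    }
}
```

Only `Buffering` allocates a copy here. A processor that hands the record to a synchronous exporter during the call can work from the borrowed record alone.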
|
||
#### `LogRecord` | ||
|
||
- Holds log data, including attributes. | ||
- Uses an inline array for up to 5 attributes to optimize stack usage. | ||
- Falls back to a heap-allocated `Vec` if more attributes are required. | ||
- Inspired by Go’s `slog` library for efficiency. | ||
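
The inline-then-heap idea can be sketched as below. `AttributeStore`, `Attribute`, and `INLINE_CAPACITY` are hypothetical names, and the attribute type is simplified to a `(&'static str, i64)` pair; the SDK's actual container may be implemented differently.

```rust
/// Number of attributes kept inline before spilling to the heap.
const INLINE_CAPACITY: usize = 5;

/// Simplified attribute type for this sketch.
pub type Attribute = (&'static str, i64);

/// Hypothetical container: a fixed inline array plus a `Vec` for overflow.
#[derive(Debug, Default)]
pub struct AttributeStore {
    inline: [Option<Attribute>; INLINE_CAPACITY],
    inline_len: usize,
    // An empty `Vec` does not allocate; the heap is touched only on overflow.
    overflow: Vec<Attribute>,
}

impl AttributeStore {
    pub fn push(&mut self, attribute: Attribute) {
        if self.inline_len < INLINE_CAPACITY {
            self.inline[self.inline_len] = Some(attribute);
            self.inline_len += 1;
        } else {
            self.overflow.push(attribute);
        }
    }

    pub fn len(&self) -> usize {
        self.inline_len + self.overflow.len()
    }

    pub fn spilled_to_heap(&self) -> bool {
        !self.overflow.is_empty()
    }

    /// Iterates in insertion order: inline slots first, then overflow.
    pub fn iter(&self) -> impl Iterator<Item = &Attribute> + '_ {
        self.inline.iter().flatten().chain(self.overflow.iter())
    }
}
```

With this layout, a record with five or fewer attributes needs no heap allocation for its attribute storage, and the sixth attribute triggers the first allocation.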
|
||
#### LogRecord Processors | ||
|
||
`SdkLoggerProvider` can be configured with any number of LogProcessors, which | ||
are called in the order of registration. Log records are passed to the | ||
`OnEmit` method of each LogProcessor. LogProcessors can be used to process log | ||
records, enrich them, filter them, and export them to destinations by | ||
leveraging LogRecord Exporters. | ||
|
||
The following built-in log processors are provided in the Logs SDK: | ||
|
||
##### SimpleLogProcessor | ||
|
||
This processor is designed to be used for exporting purposes. Export is handled | ||
by an Exporter (which is a separate component). SimpleLogProcessor is "simple" | ||
in the sense that it does not attempt to do any processing - it just calls the | ||
exporter and passes the log record to it. To comply with OTel specification, it | ||
synchronizes calls to the `Export()` method, i.e., only one `Export()` call will | ||
be in progress at any given time. | ||
|
||
SimpleLogProcessor is intended only for test/learning purposes and is often used | ||
along with a `stdout` exporter. | ||
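
One way to serialize export calls is to put the exporter behind a mutex, as in the sketch below. The `Exporter` trait, `SimpleProcessorSketch`, and `CollectingExporter` are hypothetical names for illustration, and using `std::sync::Mutex` is an assumption of this sketch rather than a statement about how the SDK does it.

```rust
use std::sync::Mutex;

/// Hypothetical exporter trait for this sketch.
pub trait Exporter: Send {
    fn export(&mut self, record: &str);
}

/// Forwards each record straight to the exporter, one call at a time.
pub struct SimpleProcessorSketch<E: Exporter> {
    exporter: Mutex<E>,
}

impl<E: Exporter> SimpleProcessorSketch<E> {
    pub fn new(exporter: E) -> Self {
        Self {
            exporter: Mutex::new(exporter),
        }
    }

    /// Holding the lock for the duration of the call guarantees that
    /// at most one `export` is in progress at any time.
    pub fn on_emit(&self, record: &str) {
        // A poisoned lock is skipped here instead of panicking.
        if let Ok(mut exporter) = self.exporter.lock() {
            exporter.export(record);
        }
    }

    /// Hands the exporter back, for example to inspect it in a test.
    pub fn into_exporter(self) -> E {
        self.exporter
            .into_inner()
            .unwrap_or_else(|poisoned| poisoned.into_inner())
    }
}

/// Test exporter that remembers everything it was given.
#[derive(Default)]
pub struct CollectingExporter {
    pub exported: Vec<String>,
}

impl Exporter for CollectingExporter {
    fn export(&mut self, record: &str) {
        self.exported.push(record.to_string());
    }
}
```

The lock means emitting threads wait on each other while an export is in progress, which is acceptable for tests and learning but is a reason to prefer batching under load.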
|
||
##### BatchLogProcessor | ||
|
||
This is another "exporting" processor. As with SimpleLogProcessor, a different | ||
component named LogExporter handles the actual export logic. BatchLogProcessor | ||
buffers/batches the logs it receives into an in-memory buffer. It invokes the | ||
exporter every 1 second or when 512 items are in the batch (customizable). It | ||
uses a background thread to do the export, and communication between the user | ||
thread (where logs are emitted) and the background thread occurs with `mpsc` | ||
channels. | ||
|
||
The maximum number of items the buffer holds is 2048 (customizable). Once the | ||
limit is reached, any *new* logs are dropped rather than applying back-pressure | ||
to the user thread. | ||
|
||
As with SimpleLogProcessor, this component also ensures only one export is | ||
active at a given time. A modified version of this is required to achieve higher | ||
throughput in some environments. | ||
|
||
In this design, at most 2048+512 logs can be in memory at any given point. In | ||
other words, that many logs can be lost if the app crashes in the middle. | ||
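
The batching behaviour described in this section can be sketched with a bounded standard-library channel and a worker thread. `BatchSketch` and its methods are hypothetical names; the choice of `std::sync::mpsc::sync_channel` with `try_send` is an assumption of this sketch. Instead of calling a real exporter, the worker records the size of each batch it would have exported.

```rust
use std::sync::mpsc::{sync_channel, RecvTimeoutError, SyncSender};
use std::thread::{self, JoinHandle};
use std::time::{Duration, Instant};

/// Hypothetical batching processor; not the SDK's `BatchLogProcessor`.
pub struct BatchSketch {
    sender: Option<SyncSender<String>>,
    // The worker returns the size of every batch it "exported".
    worker: Option<JoinHandle<Vec<usize>>>,
}

impl BatchSketch {
    pub fn new(queue_capacity: usize, batch_size: usize, interval: Duration) -> Self {
        // Bounded queue: this is what caps memory usage.
        let (sender, receiver) = sync_channel::<String>(queue_capacity);
        let worker = thread::spawn(move || {
            let mut exported_batches = Vec::new();
            let mut batch: Vec<String> = Vec::with_capacity(batch_size);
            let mut deadline = Instant::now() + interval;
            loop {
                let wait = deadline.saturating_duration_since(Instant::now());
                match receiver.recv_timeout(wait) {
                    Ok(record) => {
                        batch.push(record);
                        if batch.len() < batch_size {
                            continue; // keep filling the current batch
                        }
                    }
                    Err(RecvTimeoutError::Timeout) => {} // interval elapsed
                    Err(RecvTimeoutError::Disconnected) => {
                        // All senders are gone: flush what is left and stop.
                        if !batch.is_empty() {
                            exported_batches.push(batch.len());
                        }
                        return exported_batches;
                    }
                }
                if !batch.is_empty() {
                    exported_batches.push(batch.len()); // stand-in for an export call
                    batch.clear();
                }
                deadline = Instant::now() + interval;
            }
        });
        Self {
            sender: Some(sender),
            worker: Some(worker),
        }
    }

    /// Never blocks the caller; returns `false` if the record was dropped.
    pub fn on_emit(&self, record: String) -> bool {
        match &self.sender {
            Some(sender) => sender.try_send(record).is_ok(),
            None => false,
        }
    }

    /// Closes the queue and waits for the worker to drain it.
    pub fn shutdown(mut self) -> Vec<usize> {
        drop(self.sender.take());
        match self.worker.take() {
            Some(worker) => worker.join().unwrap_or_default(),
            None => Vec::new(),
        }
    }
}
```

With a queue capacity of 2048 and a batch size of 512, the worst case in this sketch is a full queue plus one batch being assembled, which is the 2048+512 bound mentioned above. `try_send` is what makes the processor drop new records instead of blocking the emitting thread.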
|
||
## LogExporters | ||
|
||
LogExporters are responsible for exporting logs to a destination. Some of them | ||
include: | ||
|
||
1. **InMemoryExporter** - exports to an in-memory list, primarily for | ||
unit-testing. This is used extensively in the repo itself, and external users | ||
are also encouraged to use this. | ||
2. **Stdout exporter** - prints telemetry to stdout. Only for debugging/learning | ||
purposes. The output format is not defined and also is not performance | ||
optimized. A production-recommended version with a standardized output format | ||
is in the plan. | ||
3. **OTLP Exporter** - OTel's official exporter which uses the OTLP protocol | ||
that is designed with the OTel data model in mind. Both HTTP and gRPC-based | ||
exporting is offered. | ||
4. **Exporters to OS Kernel facilities** - These exporters are not maintained in | ||
the core repo but are listed for completeness. They export telemetry to Windows | ||
ETW or Linux user_events. They are designed for high-performance workloads. | ||
Because they export synchronously, they do not require buffering/batching. | ||
This allows logs to be processed entirely on the stack and to scale easily | ||
with the number of CPU cores. (The kernel uses per-CPU buffers for the events, | ||
ensuring no contention) | ||
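
To show how an in-memory exporter helps with unit testing, here is a minimal sketch of the pattern. `InMemorySink` and `handle_login` are hypothetical names; this is not the `InMemoryExporter` shipped in the repo, only the same idea built from the standard library.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical in-memory exporter for tests; clones share one buffer.
#[derive(Clone, Default)]
pub struct InMemorySink {
    records: Arc<Mutex<Vec<String>>>,
}

impl InMemorySink {
    /// "Exports" a batch by appending it to the shared buffer.
    pub fn export(&self, batch: &[String]) {
        if let Ok(mut records) = self.records.lock() {
            records.extend_from_slice(batch);
        }
    }

    /// Returns a copy of everything exported so far.
    pub fn exported(&self) -> Vec<String> {
        match self.records.lock() {
            Ok(records) => records.clone(),
            Err(_) => Vec::new(),
        }
    }
}

/// Stand-in for application code whose logging we want to assert on.
pub fn handle_login(sink: &InMemorySink, user: &str) {
    sink.export(&[format!("user {} logged in", user)]);
}
```

A test keeps one clone of the sink, passes another to the code under test, and then asserts on `exported()`.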
|
||
## `tracing` Log Appender | ||
|
||
The `tracing` appender is part of the | ||
[opentelemetry-appender-tracing](https://crates.io/crates/opentelemetry-appender-tracing) | ||
crate. | ||
|
||
The `tracing` appender bridges `tracing` logs to OpenTelemetry. Logs emitted via | ||
`tracing` macros (`info!`, `warn!`, etc.) are forwarded to OpenTelemetry through | ||
this integration. | ||
|
||
- `tracing` is designed for high performance, using *layers* or *subscribers* to | ||
handle emitted logs (events). | ||
- The appender implements a `Layer`, receiving logs from `tracing`. | ||
- Uses the OTel Logs API to create `LogRecord`, populate it, and emit it via | ||
`Logger.emit(LogRecord)`. | ||
- If no Logs SDK is present, the process is a no-op. | ||
|
||
Note on terminology: Within OpenTelemetry, "tracing" refers to distributed | ||
tracing (i.e., creation of Spans) and not in-process structured logging and | ||
execution traces. The "tracing" crate has a notion of creating Spans as well as | ||
Events. The Events from the "tracing" crate are what get converted to OTel Logs | ||
when using this appender. Spans created using the "tracing" crate are not | ||
handled by this crate. | ||
|
||
## Performance | ||
|
||
// Call out things done specifically for performance | ||
|
||
### Perf test - benchmarks | ||
|
||
// Share ~~ numbers | ||
|
||
### Perf test - stress test | ||
|
||
// Share ~~ numbers | ||
|
||
## Summary | ||
|
||
- OpenTelemetry Logs does not provide a user-facing logging API. | ||
- Instead, it integrates with existing logging libraries (`log`, `tracing`). | ||
- The Logs API defines key traits but performs no operations unless an SDK is | ||
installed. | ||
- The Logs SDK enables log processing, transformation, and export. | ||
- The Logs SDK is performance optimized to minimize copying and heap allocation, | ||
wherever feasible. | ||
- The `tracing` appender efficiently routes logs to OpenTelemetry without | ||
modifying existing logging workflows. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
# OpenTelemetry Rust Metrics Design | ||
|
||
Status: | ||
[Development](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/document-status.md) | ||
|
||
TODO: |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
# OpenTelemetry Rust Traces Design | ||
|
||
Status: | ||
[Development](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/document-status.md) | ||
|
||
TODO: |