@jmacd
Contributor

@jmacd jmacd commented Jan 20, 2026

Part of #1771.

Part of #1736.

Implements 4 of the 5 ProviderMode values.

Uses the ObservedStateStore's associated thread and channel to process console_async messages.

Replaces most of #1771.

Undoes portions of #1818:

  • ObservedEvent is an enum covering Engine and Log events
  • Engine events return to an Option<String> message, with no structured message
  • Removes info_event! and error_event! structured message constructor macros
  • Moves LogRecord::Serialize support to where it's used

Adds a new LoggingProviders selector, admin, to configure how the admin threads use internal logging. The new setting defaults to ConsoleDirect, i.e., the admin components will use synchronous console logging.

Configures the Tokio tracing subscriber globally, in engine threads, and in admin threads according to the ProviderMode.

The asynchronous tracing subscriber (which sends to console_async; will send to ITS in the future) uses the internal provider mode itself as a fallback. However, it does this directly, choosing between the Noop and ConsoleDirect modes; OpenTelemetry mode is not supported here.
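
A rough sketch of that fallback choice follows. The variant names ConsoleAsync and Its, the function name fallback_mode, and the rule that every mode other than ConsoleDirect degrades to Noop are assumptions made for illustration; the actual types in the crate may differ.

```rust
/// Hypothetical mirror of the five provider modes mentioned in this PR.
/// `ConsoleAsync` and `Its` are assumed spellings.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProviderMode {
    Noop,
    ConsoleDirect,
    ConsoleAsync,
    OpenTelemetry,
    Its,
}

/// Picks the mode the asynchronous subscriber falls back to, given the
/// configured `internal` mode. Only Noop and ConsoleDirect can be chosen.
pub fn fallback_mode(internal: ProviderMode) -> ProviderMode {
    match internal {
        // Synchronous console output can be used directly as a fallback.
        ProviderMode::ConsoleDirect => ProviderMode::ConsoleDirect,
        // Assumption: anything else, including OpenTelemetry (which is
        // not supported here), degrades to doing nothing.
        ProviderMode::Noop
        | ProviderMode::ConsoleAsync
        | ProviderMode::OpenTelemetry
        | ProviderMode::Its => ProviderMode::Noop,
    }
}
```

Listing every variant in the match, instead of using a wildcard arm, means a sixth mode added later would force this function to be revisited.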

Resolves a TODO about inconsistency in the otel_xxx! macros. These now support full Tokio syntax following raw_error!
EDIT: portions of this PR were moved into #1843. This PR removes the top-level log dependency.

@github-actions github-actions bot added the rust Pull requests that update Rust code label Jan 20, 2026
@codecov

codecov bot commented Jan 20, 2026

Codecov Report

❌ Patch coverage is 62.86595% with 241 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.71%. Comparing base (0ca8647) to head (3529280).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1841      +/-   ##
==========================================
+ Coverage   84.64%   84.71%   +0.07%     
==========================================
  Files         502      503       +1     
  Lines      148560   149956    +1396     
==========================================
+ Hits       125749   127041    +1292     
- Misses      22277    22381     +104     
  Partials      534      534              
Components              Coverage Δ
otap-dataflow           86.10% <62.86%> (+0.09%) ⬆️
query_abstraction       80.61% <ø> (ø)
query_engine            90.52% <ø> (ø)
syslog_cef_receivers    ∅ <ø> (∅)
otel-arrow-go           53.50% <ø> (ø)
quiver                  90.66% <ø> (ø)

@jmacd jmacd marked this pull request as ready for review January 20, 2026 20:09
@jmacd jmacd requested a review from a team as a code owner January 20, 2026 20:09
jmacd

This comment was marked as outdated.

Contributor Author

@jmacd jmacd left a comment

A typical startup log:

2026-01-20T20:09:00.592Z  INFO  otap-df-otap::receiver.start (crates/otap/src/fake_data_generator.rs:131):  [signals_per_second=100 signals/sec, max_batch_size=100, metrics_per_iteration=0, traces_per_iteration=0, logs_per_iteration=100]

github-merge-queue bot pushed a commit that referenced this pull request Jan 20, 2026
…#1843)

Part of #1771.

Part of #1736.

Overlaps with #1841 by copying the file
crates/telemetry/src/internal_events.rs to extend the otel_xxx macros to
full Tokio syntax, to replace uses of log formatting as needed.

After this, #1841 can remove "log" from the workspace Cargo.toml because
crates/state will have the remaining "log" references fixed there.
}

fn observe(&self, event: ObservedEvent) {
    let sent = self.sender.send_timeout(event, self.timeout);
Contributor

@utpilla utpilla Jan 21, 2026

If the channel is full, would this block the thread until the channel has space or the timeout expires? Since this is called by the engine threads, do we want non-blocking behavior instead?

Contributor

I had also paused on this point. In addition to what I mentioned in another comment, I think we should first use try_send, and if that fails, consult a policy, for example an enum with variants like timeout, drop, or block, and take the appropriate action based on that policy.
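
To illustrate the suggestion, here is a minimal sketch using only the standard library. The names FullChannelPolicy and send_with_policy are hypothetical, and because std's SyncSender has no stable send_timeout, the timeout variant is emulated by retrying try_send until a deadline; the channel type used in this PR may offer a real send_timeout instead.

```rust
use std::sync::mpsc::{SyncSender, TrySendError};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical policy, consulted only after a non-blocking send fails.
#[derive(Debug, Clone, Copy)]
pub enum FullChannelPolicy {
    /// Discard the event immediately.
    Drop,
    /// Wait until the channel has room.
    Block,
    /// Keep retrying until the deadline passes, then discard.
    Timeout(Duration),
}

/// Returns `true` if the event was placed on the channel.
pub fn send_with_policy<T>(tx: &SyncSender<T>, event: T, policy: FullChannelPolicy) -> bool {
    // Fast path: never block when there is room.
    let event = match tx.try_send(event) {
        Ok(()) => return true,
        Err(TrySendError::Disconnected(_)) => return false,
        Err(TrySendError::Full(event)) => event,
    };
    match policy {
        FullChannelPolicy::Drop => false,
        FullChannelPolicy::Block => tx.send(event).is_ok(),
        FullChannelPolicy::Timeout(limit) => {
            let deadline = Instant::now() + limit;
            let mut event = event;
            loop {
                match tx.try_send(event) {
                    Ok(()) => return true,
                    Err(TrySendError::Disconnected(_)) => return false,
                    // Still full, but time remains: take the event back and retry.
                    Err(TrySendError::Full(e)) if Instant::now() < deadline => {
                        event = e;
                        thread::yield_now();
                    }
                    // Still full and the deadline has passed: give up.
                    Err(TrySendError::Full(_)) => return false,
                }
            }
        }
    }
}
```

With this shape, the common case (channel has space) costs one try_send regardless of policy, and the policy only matters under back-pressure.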

Contributor Author

Added a SendPolicy

/// How to act when an asynchronous event can't be sent.
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema)]
pub struct SendPolicy {
    /// If set, wait for a timeout.
    pub blocking_timeout: Option<Duration>,

    /// If failed, issue a raw error to the console.
    pub console_fallback: bool,
}

This lets the existing engine events block up to a timeout and then fall back to console logging, while ordinary logging events never block and are dropped when the channel is full.
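
For readers who want to see how the two fields might drive a send, here is a sketch under stated assumptions. It copies the two SendPolicy fields but omits the Serialize, Deserialize, and JsonSchema derives so it compiles without external crates; the Outcome enum and the free function observe are hypothetical, and the wait is emulated with a try_send retry loop on a std channel rather than a real send_timeout.

```rust
use std::sync::mpsc::{SyncSender, TrySendError};
use std::time::{Duration, Instant};

/// Same two fields as the SendPolicy in this thread, without the derives
/// that need serde and schemars.
#[derive(Debug, Clone)]
pub struct SendPolicy {
    pub blocking_timeout: Option<Duration>,
    pub console_fallback: bool,
}

/// Hypothetical result type, used here only to make the behavior visible.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Sent,
    LoggedToConsole,
    Dropped,
}

/// Tries to deliver `event`, waiting at most `blocking_timeout` if set.
pub fn observe(tx: &SyncSender<String>, event: String, policy: &SendPolicy) -> Outcome {
    // No timeout configured means no deadline, so the first failure is final.
    let deadline = policy.blocking_timeout.map(|t| Instant::now() + t);
    let mut event = event;
    loop {
        match tx.try_send(event) {
            Ok(()) => return Outcome::Sent,
            // Channel full and time remains: take the event back and retry.
            Err(TrySendError::Full(e)) if deadline.is_some_and(|d| Instant::now() < d) => {
                event = e;
                std::thread::yield_now();
            }
            // Out of time, no timeout configured, or receiver gone.
            Err(TrySendError::Full(e)) | Err(TrySendError::Disconnected(e)) => {
                if policy.console_fallback {
                    eprintln!("event not delivered: {e}");
                    return Outcome::LoggedToConsole;
                }
                return Outcome::Dropped;
            }
        }
    }
}
```

With a full channel, a policy of `blocking_timeout: Some(1ms), console_fallback: true` ends in `LoggedToConsole`, while `blocking_timeout: None, console_fallback: false` ends in `Dropped` without waiting.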

Contributor

@lquerel lquerel left a comment

I think there are a few adjustments to make, but overall I'm fine with it. If my comments aren't clear enough, contact me on Slack.
Thanks for the update on the documentation.

Comment on lines +23 to +47
/// How to act when an asynchronous event can't be sent.
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema)]
pub struct SendPolicy {
    /// If set, wait for a timeout.
    pub blocking_timeout: Option<Duration>,

    /// If failed, issue a raw error to the console.
    pub console_fallback: bool,
}

impl Default for ObservedStateSettings {
    fn default() -> Self {
        Self {
            reporting_channel_size: 100,
            reporting_timeout: Duration::from_millis(1),
            engine_events: SendPolicy {
                blocking_timeout: Some(Duration::from_millis(1)),
                console_fallback: true,
            },
            logging_events: SendPolicy {
                blocking_timeout: None,
                console_fallback: false,
            },
        }
    }
}
Contributor Author

FYI

Contributor

@utpilla utpilla left a comment

Left some suggestions to clean up the code.

@jmacd jmacd enabled auto-merge January 21, 2026 20:40
@jmacd jmacd added this pull request to the merge queue Jan 21, 2026
Merged via the queue into open-telemetry:main with commit 9ef8217 Jan 21, 2026
43 of 44 checks passed
@jmacd jmacd deleted the jmacd/internal_logging_7 branch January 21, 2026 21:05
cijothomas added a commit to cijothomas/otel-arrow that referenced this pull request Jan 22, 2026
…y#1841)

Part of open-telemetry#1771.

Part of open-telemetry#1736.

Implements 4 of the 5 ProviderMode values.

Uses the ObservedStateStore's associated thread and channel to process
console_async messages.

Replaces most of open-telemetry#1771.

Undoes portions of open-telemetry#1818:

- ObservedEvent is an enum for Engine, Log events
- Engine events return to `Option<String>` message, no structured
message
- Removes info_event! and error_event! structured message constructor
macros
- Moves LogRecord::Serialize support to where it's used

Adds new LoggingProviders selector `admin` to configure how the admin
threads use internal logging. The new setting defaults to ConsoleDirect,
i.e., the admin components will use synchronous console logging.

Configures the Tokio tracing subscriber globally, in engine threads, and
in admin threads according to the ProviderMode.

The asynchronous tracing subscriber (which sends to console_async; will
send to ITS in the future) uses the `internal` provider mode itself as a
fallback. However, it does this directly, choosing the Noop or
ConsoleDirect modes, OpenTelemetry mode is not supported here.

~Resolves a TODO about inconsistency in the otel_xxx! macros. These now
support full Tokio syntax following raw_error!~
EDIT: portions of this PR were moved into open-telemetry#1843. This PR removes the
top-level `log` dependency.

---------

Co-authored-by: Cijo Thomas <cithomas@microsoft.com>
Co-authored-by: Lalit Kumar Bhasin <lalit_fin@yahoo.com>
Labels

rust Pull requests that update Rust code

Projects

Status: Done

5 participants