
Conversation

@bruceg
Member

@bruceg bruceg commented Jan 9, 2026

Summary

This is a combination of two major changes:

  1. We want to record the time at which events are ingested in order to track
    latency as each travels through the topology. This is currently recorded for log
    events using the Vector namespace in the vector.ingest_timestamp metadata
    field, but we want it to be usable for all event types. As such, we need a new
    field in struct EventMetadata.

This change adds the new field as an Option so as to retain sane semantics for
Default implementations and to avoid extra calls to Utc::now. The
SourceSender::send method sets this if the source doesn't so as to ensure
complete coverage.
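
The "set it only if the source did not" behavior described above can be sketched roughly as follows. This is a simplified illustration, not the PR's code: the `EventMetadata` struct here is a hypothetical stand-in with a single field, `ensure_ingest_timestamp` and `stamp_batch` are made-up helper names, and `std::time::SystemTime` is used in place of the `Utc::now` timestamps the PR mentions so that the example needs only the standard library.

```rust
use std::time::SystemTime;

/// Hypothetical, simplified stand-in for `struct EventMetadata`.
/// Only the field discussed in this PR is modeled.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct EventMetadata {
    /// `None` until a source or the send path stamps the event, which keeps
    /// `Default` cheap and free of clock reads.
    pub ingest_timestamp: Option<SystemTime>,
}

impl EventMetadata {
    /// Stamp the event only if no timestamp is present yet, and return the
    /// timestamp that is in effect afterwards.
    pub fn ensure_ingest_timestamp(&mut self, now: SystemTime) -> SystemTime {
        *self.ingest_timestamp.get_or_insert(now)
    }
}

/// Sketch of a send path (assumed shape): read the clock once per batch and
/// fill in any event the source left unstamped.
pub fn stamp_batch(batch: &mut [EventMetadata]) {
    let now = SystemTime::now();
    for metadata in batch.iter_mut() {
        metadata.ensure_ingest_timestamp(now);
    }
}
```

Events that a source already stamped keep their original value, so a source that sets the timestamp early (for example to populate log namespace metadata) is not overwritten later.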

For backward compatibility, this metadata is still inserted into the Vector log
namespace metadata, taken from this new field. Since this metadata is set up
before passing the events to the SourceSender, the ingest timestamp is set
manually in sources that can create Vector namespace logs.

  2. This adds an optional trait BufferInstrumentation hook to struct
    BufferSender which is called at the very start of the buffer send path. We use
    that hook to take the previously added universal event ingest_timestamp
    metadata and from it calculate the total time spent processing the event,
    including buffering delays. This time is emitted in internal metrics as an
    event_processing_time_seconds histogram and an
    event_processing_time_mean_seconds gauge, the latter using an EWMA to smooth
    the mean over time.
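
As a rough illustration of the hook and the smoothed mean, the sketch below computes elapsed time from each event's ingest timestamp and feeds it to an EWMA. Everything here is an assumption made for the example: the trait method name `on_send` and its signature, the `ProcessingTimeRecorder` type, seeding the mean with the first sample, and the use of a plain `Vec<f64>` as a stand-in for the histogram. The actual `BufferInstrumentation` trait in the PR may look quite different.

```rust
use std::sync::Mutex;
use std::time::SystemTime;

/// Hypothetical hook trait; the real `BufferInstrumentation` signature may differ.
pub trait BufferInstrumentation: Send + Sync {
    /// Called at the start of the send path with the ingest timestamps of the
    /// events about to be buffered.
    fn on_send(&self, ingest_timestamps: &[Option<SystemTime>], now: SystemTime);
}

/// Exponentially weighted moving average:
/// `mean = alpha * sample + (1 - alpha) * mean`, seeded with the first sample.
#[derive(Debug)]
pub struct Ewma {
    alpha: f64,
    mean: Option<f64>,
}

impl Ewma {
    pub fn new(alpha: f64) -> Self {
        Self { alpha, mean: None }
    }

    pub fn update(&mut self, sample: f64) -> f64 {
        let next = match self.mean {
            None => sample,
            Some(mean) => self.alpha * sample + (1.0 - self.alpha) * mean,
        };
        self.mean = Some(next);
        next
    }
}

/// Records per-event processing time in seconds.
pub struct ProcessingTimeRecorder {
    ewma: Mutex<Ewma>,
    /// Stand-in for a histogram: the raw samples in arrival order.
    samples: Mutex<Vec<f64>>,
}

impl ProcessingTimeRecorder {
    pub fn new(alpha: f64) -> Self {
        Self {
            ewma: Mutex::new(Ewma::new(alpha)),
            samples: Mutex::new(Vec::new()),
        }
    }

    pub fn mean_seconds(&self) -> Option<f64> {
        self.ewma.lock().unwrap().mean
    }

    pub fn samples(&self) -> Vec<f64> {
        self.samples.lock().unwrap().clone()
    }
}

impl BufferInstrumentation for ProcessingTimeRecorder {
    fn on_send(&self, ingest_timestamps: &[Option<SystemTime>], now: SystemTime) {
        let mut ewma = self.ewma.lock().unwrap();
        let mut samples = self.samples.lock().unwrap();
        // Events without an ingest timestamp are skipped.
        for ingested in ingest_timestamps.iter().flatten() {
            // `duration_since` fails if the timestamp is in the future
            // (for example after a clock adjustment); skip those too.
            if let Ok(elapsed) = now.duration_since(*ingested) {
                let seconds = elapsed.as_secs_f64();
                samples.push(seconds);
                ewma.update(seconds);
            }
        }
    }
}
```

With `alpha = 0.5`, samples of 4 s and then 2 s give a smoothed mean of 3 s; a smaller `alpha` would weight history more heavily and react more slowly to changes.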

This PR is broken down into 4 incremental changes, so if it is too large to review in one go, I can trivially break it apart.

Vector configuration

The same configuration as #24453, with just internal_metrics, a transform, and console; the new metrics appear in the output.

How did you test this PR?

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the no-changelog label to this PR.

References


bruceg added 2 commits January 9, 2026 15:49
@bruceg bruceg added the type: enhancement A value-adding code change that enhances its existing functionality. label Jan 9, 2026
@bruceg bruceg requested review from a team as code owners January 9, 2026 22:04
@bruceg bruceg added the domain: observability Anything related to monitoring/observing Vector label Jan 9, 2026
@bruceg bruceg requested a review from a team as a code owner January 9, 2026 22:04
@bruceg bruceg added domain: transforms Anything related to Vector's transform components domain: core Anything related to core crates i.e. vector-core, core-common, etc labels Jan 9, 2026
@github-actions github-actions bot added domain: topology Anything related to Vector's topology code domain: sources Anything related to the Vector's sources domain: sinks Anything related to the Vector's sinks domain: external docs Anything related to Vector's external, public documentation and removed domain: transforms Anything related to Vector's transform components labels Jan 9, 2026
@bruceg
Member Author

bruceg commented Jan 9, 2026

One question regarding the scale of the metrics here for use in a histogram: the processing time for each event is typically very small, so all (or almost all) of the counts land in the smallest bucket, which is rather undesirable. Should we scale the numbers to milliseconds (relatively standard) or microseconds (more precise), extend the buckets down to smaller bounds, or something else?
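
To make the bucket concern concrete, here is a small sketch that counts samples per bucket for a given set of upper bounds. The function name `bucket_counts` and all bounds and sample values used with it are invented for illustration; they are not Vector's actual histogram buckets.

```rust
/// Count how many samples fall into each bucket. Bucket `i` holds samples
/// `<= bounds[i]` that did not fit an earlier bucket; the final slot counts
/// samples larger than every bound. `bounds` is assumed to be ascending.
pub fn bucket_counts(samples: &[f64], bounds: &[f64]) -> Vec<usize> {
    let mut counts = vec![0; bounds.len() + 1];
    for &sample in samples {
        let index = bounds
            .iter()
            .position(|&bound| sample <= bound)
            .unwrap_or(bounds.len());
        counts[index] += 1;
    }
    counts
}
```

For sub-millisecond samples such as `[0.00002, 0.0001, 0.0004, 0.003]` seconds, hypothetical bounds of `[0.005, 0.01, 0.1, 1.0]` put every sample in the first bucket, while finer bounds such as `[0.00005, 0.0005, 0.005]` spread them across three buckets. Scaling the values to milliseconds or microseconds has the same effect as scaling the bounds the other way.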

…sing time

@bruceg bruceg force-pushed the bruceg/event-processing-time branch from edcd7d8 to d5e5b7e Compare January 9, 2026 22:16
Contributor

@drichards-87 drichards-87 left a comment


Left a small suggestion on the PR from the Docs Team and approved the PR.


Labels

  • domain: core (Anything related to core crates i.e. vector-core, core-common, etc)
  • domain: external docs (Anything related to Vector's external, public documentation)
  • domain: observability (Anything related to monitoring/observing Vector)
  • domain: sinks (Anything related to the Vector's sinks)
  • domain: sources (Anything related to the Vector's sources)
  • domain: topology (Anything related to Vector's topology code)
  • type: enhancement (A value-adding code change that enhances its existing functionality.)

3 participants