-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
improv(webapp): Add new table for optimized logs search #3036
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 5 commits
d322cec
a614149
5eca0e3
cc6dc02
51d5852
f3534c4
51ceac9
b9324cd
8dd9258
58022cc
96163c6
9a83b02
be9888d
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🚩 CANCELLED status SPANs are unreachable by any level filter The
(Refers to lines 298-301) Was this helpful? React with 👍 or 👎 to provide feedback.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Will be figured out in the future after some alpha testing in prod. This status is not that used and we might not want to filter by it. |
devin-ai-integration[bot] marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,63 @@ | ||
| -- +goose Up | ||
| CREATE TABLE IF NOT EXISTS trigger_dev.task_events_search_v1 | ||
| ( | ||
| environment_id String, | ||
| organization_id String, | ||
| project_id String, | ||
| triggered_timestamp DateTime64(9) CODEC(Delta(8), ZSTD(1)), | ||
| trace_id String CODEC(ZSTD(1)), | ||
| span_id String CODEC(ZSTD(1)), | ||
| run_id String CODEC(ZSTD(1)), | ||
| task_identifier String CODEC(ZSTD(1)), | ||
| start_time DateTime64(9) CODEC(Delta(8), ZSTD(1)), | ||
| inserted_at DateTime64(3), | ||
| message String CODEC(ZSTD(1)), | ||
| kind LowCardinality(String) CODEC(ZSTD(1)), | ||
| status LowCardinality(String) CODEC(ZSTD(1)), | ||
| duration UInt64 CODEC(ZSTD(1)), | ||
| parent_span_id String CODEC(ZSTD(1)), | ||
| attributes_text String CODEC(ZSTD(1)), | ||
|
|
||
| INDEX idx_run_id run_id TYPE bloom_filter(0.001) GRANULARITY 1, | ||
| INDEX idx_message_text_search lower(message) TYPE ngrambf_v1(3, 32768, 2, 0) GRANULARITY 1, | ||
| INDEX idx_attributes_text_search lower(attributes_text) TYPE ngrambf_v1(3, 32768, 2, 0) GRANULARITY 1 | ||
| ) | ||
| ENGINE = MergeTree | ||
| PARTITION BY toDate(triggered_timestamp) | ||
| ORDER BY (organization_id, environment_id, triggered_timestamp, span_id) | ||
| --Right now we have maximum retention of up to 30 days based on plan. | ||
| --We put a logical limit for now, the 90 DAY TTL is just a backup | ||
| --This might need to be updated for longer retention periods | ||
| TTL toDateTime(triggered_timestamp) + INTERVAL 90 DAY | ||
| SETTINGS ttl_only_drop_parts = 1; | ||
|
|
||
| CREATE MATERIALIZED VIEW IF NOT EXISTS trigger_dev.task_events_search_mv_v1 | ||
| TO trigger_dev.task_events_search_v1 AS | ||
| SELECT | ||
| environment_id, | ||
| organization_id, | ||
| project_id, | ||
| trace_id, | ||
| span_id, | ||
| run_id, | ||
| task_identifier, | ||
| start_time, | ||
| inserted_at, | ||
| message, | ||
| kind, | ||
| status, | ||
| duration, | ||
| parent_span_id, | ||
| toJSONString(attributes) AS attributes_text, | ||
| fromUnixTimestamp64Nano(toUnixTimestamp64Nano(start_time) + toInt64(duration)) AS triggered_timestamp | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🚩 triggered_timestamp = start_time + duration means log events and span events sort differently than before The MV computes Was this helpful? React with 👍 or 👎 to provide feedback.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. intended |
||
| FROM trigger_dev.task_events_v2 | ||
| WHERE | ||
| kind != 'DEBUG_EVENT' | ||
| AND status != 'PARTIAL' | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🚩 MV's status != 'PARTIAL' is broader than the old SPAN-only PARTIAL filter The old code had Was this helpful? React with 👍 or 👎 to provide feedback. |
||
| AND NOT (kind = 'SPAN_EVENT' AND attributes_text = '{}') | ||
| AND kind != 'ANCESTOR_OVERRIDE' | ||
| AND message != 'trigger.dev/start'; | ||
mpcgrid marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
|
||
| -- +goose Down | ||
| DROP VIEW IF EXISTS trigger_dev.task_events_search_mv_v1; | ||
| DROP TABLE IF EXISTS trigger_dev.task_events_search_v1; | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚩 Partition pruning with
inserted_atis ineffective on the new search tableThe LogsListPresenter applies
inserted_at >= ...andinserted_at <= ...filters (lines 269-271, 282-284). On the oldtask_events_v2table, these were effective for partition pruning since it usedPARTITION BY toDate(inserted_at)(010_add_task_events_v2.sql:48). However, the newtask_events_search_v1table usesPARTITION BY toDate(triggered_timestamp)(016_add_task_events_search_v1.sql:26). Theinserted_atfilter will still execute correctly but won't help prune partitions, making it dead weight. For effective partition pruning, the time range filter should targettriggered_timestampinstead. Thestart_timefilters that are also applied are close totriggered_timestamp(sincetriggered_timestamp = start_time + duration) but won't trigger partition pruning either since they reference a different column. This won't cause incorrect results but may degrade query performance for large datasets.(Refers to lines 266-289)
Was this helpful? React with 👍 or 👎 to provide feedback.