VaggelisD
Contributor

Follow-up PR on model freshness (Previous PR)

This PR enables freshness checks for a mix of SQLMesh and external (source) models, e.g.:

A: External model 
B: SELECT * FROM A
C: SELECT * FROM A, B

Model C will be evaluated if its upstream models are fresh, which is determined by:

  • C's last_altered_ts compared against the INFORMATION_SCHEMA metadata of the external model (A)
  • Whether parent intervals exist for the SQLMesh models (B), which signals whether any upstream model will be evaluated as part of this run
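
As a rough illustration of the two checks above, here is a minimal sketch. It is not the PR's implementation: `Parent`, `should_evaluate`, and the dictionary arguments are hypothetical names, timestamps are plain integers, and it assumes that a single fresh parent is enough to trigger evaluation.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # (start_ts, end_ts); plain ints for illustration


@dataclass
class Parent:
    # Hypothetical stand-in for a parent snapshot.
    name: str
    is_external: bool


def should_evaluate(
    last_altered_ts: int,
    parents: List[Parent],
    external_last_altered: Dict[str, int],
    parent_intervals: Dict[str, List[Interval]],
) -> bool:
    """Return True if any parent is considered fresh (assumed 'any' semantics)."""
    for parent in parents:
        if parent.is_external:
            # External parent: compare the warehouse-reported timestamp
            # (e.g. from INFORMATION_SCHEMA) with our own last_altered_ts.
            if external_last_altered.get(parent.name, 0) > last_altered_ts:
                return True
        elif parent_intervals.get(parent.name):
            # SQLMesh parent: pending intervals mean it will run in this plan.
            return True
    return False
```

With A external and B a SQLMesh model, C would be evaluated when A's metadata timestamp is newer than C's `last_altered_ts`, or when B has intervals scheduled in this run.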

A new freshness-specific test file is added to break up the original test into separate scenarios.

Comment on lines +316 to +319
# filters:
#   branches:
#     only:
#       - main
Contributor Author

TODO: Reinstate before merging

upstream_parent_snapshots = {p for p in parent_snapshots if not p.is_external}
external_parents = snapshot.node.depends_on - {p.name for p in upstream_parent_snapshots}

if context.parent_intervals:
Member

We should exclude external models from this check. External models can have intervals, for example to run audits, but that doesn't mean the table is actually fresh.
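
To make the concern concrete, here is a small sketch of a check that ignores intervals belonging to external parents. `ParentInfo` and `has_pending_sqlmesh_parents` are hypothetical names used only for illustration, not SQLMesh APIs.

```python
from typing import List, NamedTuple, Tuple

Interval = Tuple[int, int]


class ParentInfo(NamedTuple):
    # Hypothetical record pairing a parent with its pending intervals.
    name: str
    is_external: bool
    intervals: List[Interval]


def has_pending_sqlmesh_parents(parents: List[ParentInfo]) -> bool:
    # Only non-external parents count; an external model may carry
    # intervals purely for audits, which says nothing about freshness.
    return any(p.intervals for p in parents if not p.is_external)
```

A plain `if parent_intervals:` over all parents would be truthy for an external model that only has audit intervals, which is the false positive described above.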

Contributor Author

Right. I can exclude them when parent_intervals is constructed. It's a flat list, so by the time this check runs we've lost the mapping from each Interval back to its snapshot.

Contributor Author

Implemented in 9a8af2e: external snapshots are now excluded from parent_intervals.

if parent.snapshot_id not in snapshot_intervals:
    continue
_, p_intervals = snapshot_intervals[parent.snapshot_id]
parent_intervals.append(p_intervals)
Member

I don't think this is the right place to construct this. The missing intervals will change as we apply signal checks to snapshots. If you set them once here, the parent signals will not be reflected.

Contributor Author

@VaggelisD Oct 9, 2025

Good call.

I guess our best option is to read from snapshot_batches instead (in the same loop), since that is where the signal is applied to filter out intervals that aren't ready.

So, for each snapshot in the DAG, its ExecutionContext::parent_intervals list will be extended with snapshot_batches[parent], provided the parent snapshot is NOT external.
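
A minimal sketch of that idea, under stated assumptions: `collect_parent_intervals` is a hypothetical helper, parents are identified by plain strings, and `snapshot_batches` is modelled as a dict of already signal-filtered intervals rather than the scheduler's real structure.

```python
from typing import Dict, List, Set, Tuple

Interval = Tuple[int, int]


def collect_parent_intervals(
    parents: List[str],
    external: Set[str],
    snapshot_batches: Dict[str, List[Interval]],
) -> List[Interval]:
    """Gather signal-filtered intervals from non-external parents only."""
    parent_intervals: List[Interval] = []
    for parent in parents:
        if parent in external:
            # External parents are skipped; their intervals do not imply freshness.
            continue
        # snapshot_batches is assumed to already reflect the signal checks,
        # so a parent whose signal rejected every interval contributes nothing.
        parent_intervals.extend(snapshot_batches.get(parent, []))
    return parent_intervals
```

Because the intervals are read after the signals have run, a parent whose intervals were all filtered out leaves the list empty for its children.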

Member

Yes, this sounds correct, assuming I understood the idea.

Contributor Author

Implemented in 9a8af2e

@VaggelisD force-pushed the vaggelisd/mixed_model_freshness branch from 46715f5 to 9a8af2e on October 13, 2025 10:50