Conversation

@rubywtl rubywtl commented Sep 5, 2025

What this PR does:
This is the blog post for the distributed query execution project.

Which issue(s) this PR fixes:
Related to distributed query execution proposal

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@rubywtl rubywtl marked this pull request as draft September 5, 2025 23:35

@dsabsay dsabsay left a comment

@rubywtl This is wonderful and ambitious work (the code and the blog post both). Great diagrams, too. I left some suggestions.


Our new distributed query execution approach takes a more sophisticated route by breaking queries down to executable fragments at the expression level. Unlike traditional query-level processing where results are merged only at the query frontend, this fine-grained approach enables dynamic collaboration between execution units during runtime. Each fragment operates independently while maintaining the ability to exchange partial results with other units as needed. This enhanced granularity not only increases opportunities for parallel processing but also enables deeper query optimization. The system ultimately combines these distributed results to produce the final output, achieving better resource utilization and performance compared to conventional query splitting strategies.

![Comparison between previous query splitting strategy v.s. distributed query execution](/images/blog/2025/distributed-exec-splittingStrat.png)

The expression-level splitting part of this diagram may not be obvious to some readers. In particular, it's not clear what is meant by an "operator". Perhaps we could add an example PromQL expression to this diagram to help people see how it gets split?
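
For illustration, one possible example (with hypothetical metric names, not taken from the post): a query such as `sum(rate(http_requests_total[5m])) / sum(rate(http_requests_failed_total[5m]))` could be cut at the division operator, so that each `sum(rate(...))` branch becomes an independent fragment executed by its own querier, while a root fragment applies the division to the partial results it receives from the two children.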


## Results

Distributed query execution has demonstrated significant improvements in query processing by effectively reducing resource contention. This enhancement is achieved by distributing query workloads across multiple queriers, increasing the practical memory limit for high-cardinality queries that previously failed due to memory constraints.

Again, not sure contention is the right word.

@@ -0,0 +1,67 @@
# A First Look at Distributed Query Execution in Cortex

> One of the persistent challenges in Cortex has been dealing with resource contention in a single querier node, which occurs when too much data is pulled. While implementing safeguards through limits has been effective as a temporary solution, we needed to address this issue at its root to allow Cortex to process more data efficiently. This is where distributed query execution comes into play, offering a solution that both expands query limits and improves efficiency through parallel processing and result aggregation.

This is very pedantic of me, but I don't think resource contention is the root problem, but really the memory capacity of single nodes.

Contention implies a competition for resources (e.g. between two concurrently running queries), which we might address with a more sophisticated scheduling algorithm. In this case, we are improving the horizontal scalability of queries. While it might alleviate contention, it doesn't necessarily. I think it would be more precise to reference vertical vs. horizontal scalability. Again, I am being pedantic :)

I would reword this first sentence to say "One of the persistent challenges in Cortex has been vertical scaling limits in individual querier nodes, ..."


### Query Scheduler: Fragment Coordination

The Query Scheduler implements a sophisticated coordination mechanism that orchestrates the distributed execution of query fragments. It performs a bottom-up traversal of the logical plan, identifying cut points at remote nodes to ensure proper execution order of child-to-root. This approach guarantees that child fragments are enqueued and processed before their parent fragments, maintaining data dependency requirements and ensure there are not too many idle querier.

Suggested change
The Query Scheduler implements a sophisticated coordination mechanism that orchestrates the distributed execution of query fragments. It performs a bottom-up traversal of the logical plan, identifying cut points at remote nodes to ensure proper execution order of child-to-root. This approach guarantees that child fragments are enqueued and processed before their parent fragments, maintaining data dependency requirements and ensure there are not too many idle querier.
The Query Scheduler implements a sophisticated coordination mechanism that orchestrates the distributed execution of query fragments. It performs a bottom-up traversal of the logical plan, identifying cut points at remote nodes to ensure proper execution order of child-to-root. This approach guarantees that child fragments are enqueued and processed before their parent fragments, maintaining data dependency requirements and ensure there are not too many idle queriers.
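
To make the child-before-parent ordering concrete, here is a minimal sketch of a bottom-up traversal that enqueues fragments rooted at remote cut points before the fragments that depend on them. The types and names are hypothetical simplifications, not the actual Cortex scheduler code.

```go
// Hypothetical, simplified model of a logical plan tree. A Remote node marks
// a cut point: the subtree rooted there runs as a separate fragment.
package main

import "fmt"

type Node struct {
	Name     string
	Remote   bool
	Children []*Node
}

// collectFragments walks the plan bottom-up, so fragments rooted at deeper
// remote nodes are appended to the queue before the fragments that consume
// their partial results.
func collectFragments(n *Node, queue *[]*Node) {
	for _, c := range n.Children {
		collectFragments(c, queue)
	}
	if n.Remote {
		*queue = append(*queue, n)
	}
}

func main() {
	// Plan for a binary expression like sum(A) / sum(B): both sum branches
	// are remote fragments; the dividing root depends on their results.
	plan := &Node{
		Name:   "div",
		Remote: true,
		Children: []*Node{
			{Name: "sum(A)", Remote: true},
			{Name: "sum(B)", Remote: true},
		},
	}

	var queue []*Node
	collectFragments(plan, &queue)
	for i, f := range queue {
		fmt.Printf("enqueue %d: %s\n", i+1, f.Name) // sum(A), sum(B), then div
	}
}
```

Enqueued this way, no parent fragment sits in the queue ahead of the children whose partial results it needs, which is what keeps queriers from idling while they wait on dependencies.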

While the current implementation of distributed query execution already offers benefits, it represents just the beginning of Cortex's optimization journey. To fully realize its potential, several key enhancements are needed:

### Enhance Distributed Optimizer
Distributed optimizer currently support binary expressions, but in the future it should manage more complex operations, including complicated join functions and advanced query patterns.

Suggested change
Distributed optimizer currently support binary expressions, but in the future it should manage more complex operations, including complicated join functions and advanced query patterns.
Distributed optimizer currently supports binary expressions, but in the future it should manage more complex operations, including complicated join functions and advanced query patterns.

### Storage Layer Sharding
Implementing more sophisticated storage sharding strategies can better distribute data across the cluster. For example, allow max(A) to be split to max(A[shard=0]) and max(A[shard=1]). Initial experiments with ingester storage have already demonstrated impressive results: binary expressions show 1.7-2.8x performance improvements, while multiple binary expressions achieve up to 5x speed.While not included in the initial release, we invite contributors to continue to develop these sharding capabilities for both ingestor and store-gateway components.

Suggested change
Implementing more sophisticated storage sharding strategies can better distribute data across the cluster. For example, allow max(A) to be split to max(A[shard=0]) and max(A[shard=1]). Initial experiments with ingester storage have already demonstrated impressive results: binary expressions show 1.7-2.8x performance improvements, while multiple binary expressions achieve up to 5x speed.While not included in the initial release, we invite contributors to continue to develop these sharding capabilities for both ingestor and store-gateway components.
Implementing more sophisticated storage sharding strategies can better distribute data across the cluster. For example, allow max(A) to be split to max(A[shard=0]) and max(A[shard=1]). Initial experiments with ingester storage have already demonstrated impressive results: binary expressions show 1.7-2.8x performance improvements, while multiple binary expressions achieve up to 5x speed. While not included in the initial release, we invite contributors to continue to develop these sharding capabilities for both ingestor and store-gateway components.
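
In other words, this kind of split is possible because `max` is decomposable: the root fragment only has to take the maximum of the per-shard maxima, so `max(A)` can be recovered exactly as `max(max(A[shard=0]), max(A[shard=1]))`, with each per-shard aggregation running on a different querier.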
