AggregateExec not cancellable #16193

Closed
@pepijnve

Describe the bug

The streams created by AggregateExec consume their input in a loop inside their poll implementations and never explicitly yield. If the input stream never returns Pending (which is often the case for file input), Tokio never gets an opportunity to abort the running task.

This is often hidden by the presence of CoalesceExec in the query plan, which runs the aggregation in a separate task. CoalesceExec uses RecordBatchReceiverStream, which does return Pending and is therefore cancellable. Dropping or aborting the aggregation task will still not stop it immediately, though, because Tokio can only stop a task when it yields.
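To make the mechanism concrete, here is a minimal, standard-library-only sketch. It is not DataFusion or Tokio code: `SumAll`, `budget`, `NoopWake`, and `drive` are hypothetical names invented for illustration. `SumAll` stands in for an aggregation stream whose input is always ready, and `drive` stands in for an executor by counting how often control returns to it. Each return from `poll` is the only point at which a runtime such as Tokio could drop an aborted task.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for an aggregation stream: sums an input that is
/// always ready. `budget` is an assumed knob, not a DataFusion setting.
struct SumAll<I> {
    input: I,
    total: u64,
    /// `None`: never yield. `Some(n)`: yield after `n` items per poll.
    budget: Option<usize>,
}

impl<I: Iterator<Item = u64> + Unpin> Future for SumAll<I> {
    type Output = u64;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u64> {
        let this = self.get_mut();
        let mut processed = 0usize;
        loop {
            match this.input.next() {
                Some(v) => this.total += v,
                None => return Poll::Ready(this.total),
            }
            processed += 1;
            if let Some(budget) = this.budget {
                if processed >= budget {
                    // Ask to be polled again, then hand control back to the caller.
                    cx.waker().wake_by_ref();
                    return Poll::Pending;
                }
            }
        }
    }
}

/// Waker that does nothing; `drive` below simply polls again in a loop.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor: polls until ready and reports how many times `poll` returned,
/// i.e. how many chances a real runtime would have had to cancel the task.
fn drive<F: Future + Unpin>(mut fut: F) -> (F::Output, usize) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = Pin::new(&mut fut).poll(&mut cx) {
            return (out, polls);
        }
    }
}

fn main() {
    let (sum, polls) = drive(SumAll { input: 0..1000u64, total: 0, budget: None });
    println!("no yield:   sum={sum}, polls={polls}");

    let (sum, polls) = drive(SumAll { input: 0..1000u64, total: 0, budget: Some(100) });
    println!("budget 100: sum={sum}, polls={polls}");
}
```

Without a budget, the whole input is consumed inside a single `poll` call, so the caller never regains control before the result is ready. With a budget, `poll` returns Pending periodically, which is the kind of explicit yield the aggregation streams described above lack.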

To Reproduce

  • Start datafusion-cli
  • Execute SET datafusion.execution.target_partitions = 1;
  • Execute SELECT sum(column) from table; on some relatively large table so that the query takes long enough to run
  • Press ctrl-c

Expected behavior

  • The running query is cancelled

Additional context

No response

Labels: bug
