ES|QL: Improve generative tests for FORK [130015] #131206

Open
wants to merge 7 commits into
base: main
from

Conversation

svilen-mihaylov-elastic
Contributor

@svilen-mihaylov-elastic svilen-mihaylov-elastic commented Jul 14, 2025

Addresses #130015

@svilen-mihaylov-elastic svilen-mihaylov-elastic changed the title ForkGen Extend tests for Fork Jul 16, 2025
@svilen-mihaylov-elastic svilen-mihaylov-elastic marked this pull request as ready for review July 16, 2025 17:43
@svilen-mihaylov-elastic svilen-mihaylov-elastic added >test Issues or PRs that are addressing/adding tests :Search Relevance/Search Catch all for Search Relevance labels Jul 16, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch label Jul 16, 2025
@svilen-mihaylov-elastic svilen-mihaylov-elastic changed the title Extend tests for Fork ES|QL: Improve generative tests for FORK https://github.com/elastic/elasticsearch/issues/130015 Jul 16, 2025
@svilen-mihaylov-elastic svilen-mihaylov-elastic changed the title ES|QL: Improve generative tests for FORK https://github.com/elastic/elasticsearch/issues/130015 ES|QL: Improve generative tests for FORK #130015 Jul 16, 2025
@svilen-mihaylov-elastic svilen-mihaylov-elastic changed the title ES|QL: Improve generative tests for FORK #130015 ES|QL: Improve generative tests for FORK [130015] Jul 16, 2025
@ioanatia ioanatia requested a review from luigidellaquila July 16, 2025 17:55
Contributor

@luigidellaquila luigidellaquila left a comment

final String command = current.commandString();

// Try appending new command to parent of Fork. If we successfully execute (without exception) AND still retain the same
// schema (all Fork branches must have the same schema), we append the command.
Contributor

Is this a strict constraint? Isn't it enough that there are no type conflicts between branches?

Contributor

We would generate more interesting commands if we did not have this restriction here.
Right now, AFAICS, when I run this the FORK branches contain mostly WHERE/MV_EXPAND/SORT, and sometimes ENRICH.

FORK branches don't need to have the same schema - we just need to be sure that if a column is present in multiple branches, it has the same data type everywhere.
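
The relaxed rule described above can be sketched as a standalone check. This is only an illustration, not code from the PR: the class name `ForkSchemaCheck`, the method `branchesCompatible`, and the representation of a branch schema as a map from column name to a type-name string are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the Elasticsearch code base.
final class ForkSchemaCheck {

    /**
     * Returns true when every column name that appears in more than one branch
     * has the same type in all of them. Columns that appear in only one branch
     * are allowed, so the schemas do not have to be identical.
     *
     * Each branch schema is modelled (as an assumption) as column name -> type name.
     */
    static boolean branchesCompatible(List<Map<String, String>> branchSchemas) {
        Map<String, String> firstSeenType = new HashMap<>();
        for (Map<String, String> schema : branchSchemas) {
            for (Map.Entry<String, String> column : schema.entrySet()) {
                // putIfAbsent returns the type recorded earlier, or null on first sight.
                String previous = firstSeenType.putIfAbsent(column.getKey(), column.getValue());
                if (previous != null && !previous.equals(column.getValue())) {
                    return false; // same column name, conflicting types
                }
            }
        }
        return true;
    }
}
```

Under this sketch, branches with schemas `{a: integer, b: keyword}` and `{a: integer, c: double}` are accepted, while `{a: integer}` next to `{a: keyword}` is rejected.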

Contributor Author

@svilen-mihaylov-elastic svilen-mihaylov-elastic Jul 17, 2025

Yes, correct: the schemas do not need to be exactly the same; only the overlapping columns need to have the same type.

I spent some time thinking about this and decided to go with the simple solution (for now). I think it will be challenging to allow different schemas with compatible types AND allow for (mostly independent) fork sub-pipelines. Particularly for longer sub-pipelines (> 5 stages), it will be hard to keep adding random stages and checking that the types remain compatible. This may lead to a lot of retries and discarded results, which could make the test run a lot slower.

I will update the comment. In any case, we can iterate on this condition later to balance coverage of Fork sub-pipelines against the performance of the test itself.
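
One way to keep that cost bounded, sketched here purely as an illustration, is to give each branch a fixed attempt budget: try a random stage, keep it if the pipeline is still accepted, discard it otherwise, and stop when the budget runs out. `BoundedBranchGrowth`, `grow`, and its parameters are hypothetical names; the `accepts` predicate stands in for whatever check (execution without exception plus type compatibility) the test would actually use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical sketch, not code from this PR.
final class BoundedBranchGrowth {

    /**
     * Grows a branch pipeline one stage at a time. A candidate stage is kept only
     * if the pipeline including it is accepted; otherwise it is discarded.
     * Stops at targetLength stages or after maxAttempts candidates, whichever
     * comes first, so rejected candidates cannot slow the test without limit.
     */
    static List<String> grow(
        Supplier<String> nextCommand,
        Predicate<List<String>> accepts,
        int targetLength,
        int maxAttempts
    ) {
        List<String> pipeline = new ArrayList<>();
        for (int attempt = 0; attempt < maxAttempts && pipeline.size() < targetLength; attempt++) {
            pipeline.add(nextCommand.get());
            if (!accepts.test(pipeline)) {
                pipeline.remove(pipeline.size() - 1); // discard the rejected stage
            }
        }
        return pipeline;
    }
}
```

The budget makes the coverage/runtime trade-off an explicit parameter: a larger `maxAttempts` allows longer and more varied branches at the price of more discarded executions.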

Labels
:Search Relevance/Search Catch all for Search Relevance Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch >test Issues or PRs that are addressing/adding tests v9.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants