[ML] Resolve duplicate key exception in GetDatafeedRunningStateAction #125477

Open
hye-on wants to merge 1 commit into main


@hye-on hye-on commented Mar 24, 2025

This PR fixes issue #104160 where a duplicate key exception occurs in GetDatafeedRunningStateAction.Response.fromResponses(). The issue happens when a datafeed is force-stopped and restarted before its local task cancellation completes. This creates a situation where two local tasks for the same datafeed temporarily coexist on the ML node (one cancelling, one starting), causing the duplicate key error when both report their state.
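As a rough illustration of the failure mode described above, the following minimal sketch shows how the two-argument `Collectors.toMap` fails when two entries share a key. The `TaskState` record and the `DuplicateKeyDemo` class are hypothetical stand-ins for illustration, not the actual Elasticsearch types.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class DuplicateKeyDemo {
    // Hypothetical stand-in for a per-task response; not the real Elasticsearch type.
    record TaskState(String datafeedId, String phase) {}

    // Collects states keyed by datafeed id using the two-argument toMap,
    // which throws IllegalStateException when two entries share a key.
    static Map<String, String> collectWithoutMerge(List<TaskState> states) {
        return states.stream()
            .collect(Collectors.toMap(TaskState::datafeedId, TaskState::phase));
    }

    public static void main(String[] args) {
        // Two local tasks for the same datafeed: one cancelling, one starting.
        List<TaskState> states = List.of(
            new TaskState("feed-1", "cancelling"),
            new TaskState("feed-1", "starting"));
        try {
            collectWithoutMerge(states);
        } catch (IllegalStateException e) {
            System.out.println("Collector failed: " + e.getMessage());
        }
    }
}
```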

The solution implements a merge function in the toMap collector that selects the most appropriate state when duplicates are found. The selection is based on the searchInterval data:

  1. Prefer state with more recent searchInterval.startMs when both exist
  2. Prefer states with searchInterval over those without
  3. Default to second state when all criteria are equal
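The three rules above could be sketched roughly as follows. This is a simplified illustration, not the code in this PR: `SearchInterval`, `RunningState`, `TaskResponse`, and `MergeSketch` are hypothetical stand-ins defined here, and the `label` field exists only to make the example readable.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class MergeSketch {
    // Hypothetical stand-ins for illustration; not the real Elasticsearch classes.
    record SearchInterval(long startMs, long endMs) {}
    record RunningState(String label, SearchInterval searchInterval) {}
    record TaskResponse(String datafeedId, RunningState state) {}

    // Merge function applied when two states share a datafeed id.
    static RunningState selectState(RunningState first, RunningState second) {
        SearchInterval a = first.searchInterval();
        SearchInterval b = second.searchInterval();
        if (a != null && b != null) {
            // Rule 1: prefer the more recent startMs; a tie falls through to rule 3.
            return a.startMs() > b.startMs() ? first : second;
        }
        if (a != null) {
            // Rule 2: only the first state has a searchInterval.
            return first;
        }
        // Rule 2 (only the second has a searchInterval) or rule 3 (neither has one).
        return second;
    }

    // Three-argument toMap: the merge function resolves duplicate keys
    // instead of throwing IllegalStateException.
    static Map<String, RunningState> collect(List<TaskResponse> responses) {
        return responses.stream()
            .collect(Collectors.toMap(
                TaskResponse::datafeedId,
                TaskResponse::state,
                MergeSketch::selectState));
    }

    public static void main(String[] args) {
        RunningState cancelling = new RunningState("cancelling", new SearchInterval(100L, 200L));
        RunningState starting = new RunningState("starting", new SearchInterval(300L, 400L));
        Map<String, RunningState> merged = collect(List.of(
            new TaskResponse("feed-1", cancelling),
            new TaskResponse("feed-1", starting)));
        System.out.println(merged.get("feed-1").label());
    }
}
```

With this ordering, a tie on `startMs` or two states without a `searchInterval` both resolve to the second state, which matches rule 3. Whether `startMs` is the right signal for "most appropriate" remains the open question raised in this PR.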

Comment

I'm new to Elasticsearch and open source contributions in general. I went with searchInterval.startMs as the selection criteria, but I'd appreciate any feedback on whether there might be a better approach for handling these duplicate states. Thank you for your guidance! :)

Implement merge function for duplicate datafeed states when a datafeed is force-stopped and restarted before cancellation completes. Select the most appropriate state based on:
1. Prefer state with more recent searchInterval.startMs when both exist
2. Prefer states with searchInterval over those without
3. Default to second state when all criteria are equal
@elasticsearchmachine added the needs:triage, v9.1.0, and external-contributor labels on Mar 24, 2025
@hye-on changed the title from "Resolve duplicate key exception in GetDatafeedRunningStateAction" to "[ML] Resolve duplicate key exception in GetDatafeedRunningStateAction" on Mar 26, 2025
@AI-IshanBhatt added the :Search Relevance/Search label on Mar 26, 2025
@elasticsearchmachine added the Team:Search Relevance label on Mar 26, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine removed the needs:triage label on Mar 26, 2025
Labels
external-contributor, :Search Relevance/Search, Team:Search Relevance, v9.1.0

3 participants