[CI] DataStreamLifecycleDownsampleDisruptionIT testDataStreamLifecycleDownsampleRollingRestart failing #123769

Open
elasticsearchmachine opened this issue Feb 28, 2025 · 6 comments · Fixed by #125478
Assignees
Labels
:Data Management/Data streams Data streams and their lifecycles low-risk An open issue or test failure that is a low risk to future releases Team:Data Management Meta label for data/management team >test-failure Triaged test failures from CI

Comments

@elasticsearchmachine
Collaborator

elasticsearchmachine commented Feb 28, 2025

Build Scans:

Reproduction Line:

gradlew ":x-pack:plugin:downsample:internalClusterTest" --tests "org.elasticsearch.xpack.downsample.DataStreamLifecycleDownsampleDisruptionIT.testDataStreamLifecycleDownsampleRollingRestart" -Dtests.seed=EA337873B547979F -Dtests.locale=bs-BA -Dtests.timezone=US/Alaska -Druntime.java=24

Applicable branches:
main

Reproduces locally?:
N/A

Failure History:
See dashboard

Failure Message:

java.lang.AssertionError: safeGet: listener was completed exceptionally

Issue Reasons:

  • [main] 2 consecutive failures in test testDataStreamLifecycleDownsampleRollingRestart
  • [main] 3 failures in test testDataStreamLifecycleDownsampleRollingRestart (1.7% fail rate in 177 executions)

Note:
This issue was created using new test triage automation. Please report issues or feedback to es-delivery.

@elasticsearchmachine elasticsearchmachine added :Data Management/Data streams Data streams and their lifecycles >test-failure Triaged test failures from CI labels Feb 28, 2025
@elasticsearchmachine
Collaborator Author

This has been muted on branch main

Mute Reasons:

  • [main] 3 failures in test testDataStreamLifecycleDownsampleRollingRestart (0.4% fail rate in 844 executions)
  • [main] 2 failures in pipeline elasticsearch-periodic (13.3% fail rate in 15 executions)

Build Scans:

elasticsearchmachine added a commit that referenced this issue Feb 28, 2025
…DisruptionIT testDataStreamLifecycleDownsampleRollingRestart #123769
@elasticsearchmachine elasticsearchmachine added Team:Data Management Meta label for data/management team needs:risk Requires assignment of a risk label (low, medium, blocker) labels Feb 28, 2025
@elasticsearchmachine
Collaborator Author

Pinging @elastic/es-data-management (Team:Data Management)

@dakrone dakrone added low-risk An open issue or test failure that is a low risk to future releases and removed needs:risk Requires assignment of a risk label (low, medium, blocker) labels Mar 4, 2025
@gmarouli
Contributor

This failure appears to be the same as #122056 (comment). ensureGreen(...) also waits for relocating shards to finish, and that is why it timed out: the shard of the downsampled index was still relocating. I cannot reproduce it, but I was able to get the following sneak peek from the cluster state:

global_routing_table{[default=>routing_table:
-- index [[downsample-5m-.ds-metrics-foo-2025.02.28-000001/n91rNoOCQqqIBeWxeZ0rMQ]]
----shard_id [downsample-5m-.ds-metrics-foo-2025.02.28-000001][0]
--------[downsample-5m-.ds-metrics-foo-2025.02.28-000001][0], node[Qo-EHHEPQSGLP_NcfW0Zfw], relocating [Jd4pWAHbTqe9F7QHQ2p51g], [P], s[RELOCATING], a[id=-o1yDP6bRNOTMHKBAFU6_g, rId=ZM6jSVlLQcGDbM5mqxbbvA], failed_attempts[0], expected_shard_size[172100]

-- index [[.ds-metrics-foo-2025.02.28-000002/iQ06TCVuSs-NEUIkeSRf1A]]
----shard_id [.ds-metrics-foo-2025.02.28-000002][0]
--------[.ds-metrics-foo-2025.02.28-000002][0], node[jolO9HlcT42OhfRi99hKMg], [P], s[STARTED], a[id=_z9d4KRoQgmkEFZXzBh8pQ], failed_attempts[0]

]}routing_nodes:
-----node_id[Qo-EHHEPQSGLP_NcfW0Zfw][V]
--------[downsample-5m-.ds-metrics-foo-2025.02.28-000001][0], node[Qo-EHHEPQSGLP_NcfW0Zfw], relocating [Jd4pWAHbTqe9F7QHQ2p51g], [P], s[RELOCATING], a[id=-o1yDP6bRNOTMHKBAFU6_g, rId=ZM6jSVlLQcGDbM5mqxbbvA], failed_attempts[0], expected_shard_size[172100]
-----node_id[jolO9HlcT42OhfRi99hKMg][V]
--------[.ds-metrics-foo-2025.02.28-000002][0], node[jolO9HlcT42OhfRi99hKMg], [P], s[STARTED], a[id=_z9d4KRoQgmkEFZXzBh8pQ], failed_attempts[0]
-----node_id[Jd4pWAHbTqe9F7QHQ2p51g][V]
--------[downsample-5m-.ds-metrics-foo-2025.02.28-000001][0], node[Jd4pWAHbTqe9F7QHQ2p51g], relocating [Qo-EHHEPQSGLP_NcfW0Zfw], [P], recovery_source[peer recovery], s[INITIALIZING], a[id=ZM6jSVlLQcGDbM5mqxbbvA, rId=-o1yDP6bRNOTMHKBAFU6_g], failed_attempts[0], expected_shard_size[172100]
---- unassigned
pending tasks:
{
  "tasks" : [ ]
}
hot threads:
Hot threads at 2025-02-28T22:19:11.877Z, interval=500ms, busiestThreads=9999, ignoreIdleThreads=false:

81.8% [cpu=35.9%, other=45.9%] (409ms out of 500ms) cpu usage by thread 'elasticsearch[node_t1][generic][T#2]'
  4/10 snapshots sharing following 25 elements
    app//org.elasticsearch.index.translog.TranslogWriter.add(TranslogWriter.java:245)
    app//org.elasticsearch.index.translog.Translog.add(Translog.java:633)
    app//org.elasticsearch.index.engine.InternalEngine.index(InternalEngine.java:1229)
    app//org.elasticsearch.index.shard.IndexShard.index(IndexShard.java:1085)
    app//org.elasticsearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:1011)
    app//org.elasticsearch.index.shard.IndexShard.applyTranslogOperation(IndexShard.java:2027)
    app//org.elasticsearch.index.shard.IndexShard.applyTranslogOperation(IndexShard.java:2014)
    app//org.elasticsearch.indices.recovery.RecoveryTarget.lambda$indexTranslogOperations$4(RecoveryTarget.java:454)
    app//org.elasticsearch.indices.recovery.RecoveryTarget$$Lambda/0x00007f5eb8db69e0.get(Unknown Source)
    app//org.elasticsearch.action.ActionListener.completeWith(ActionListener.java:367)
    app//org.elasticsearch.indices.recovery.RecoveryTarget.indexTranslogOperations(RecoveryTarget.java:429)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.performTranslogOps(PeerRecoveryTargetService.java:652)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.handleRequest(PeerRecoveryTargetService.java:597)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.handleRequest(PeerRecoveryTargetService.java:589)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRequestHandler.messageReceived(PeerRecoveryTargetService.java:685)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRequestHandler.messageReceived(PeerRecoveryTargetService.java:672)
    app//org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:90)
    app//org.elasticsearch.transport.InboundHandler.doHandleRequest(InboundHandler.java:289)
    app//org.elasticsearch.transport.InboundHandler$1.doRun(InboundHandler.java:302)
    app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1044)
    app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
    [email protected]/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    [email protected]/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    [email protected]/java.lang.Thread.runWith(Thread.java:1596)
    [email protected]/java.lang.Thread.run(Thread.java:1583)
  2/10 snapshots sharing following 37 elements
    app//org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:456)
    app//org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:502)
    app//org.apache.lucene.index.DocumentsWriter.maybeFlush(DocumentsWriter.java:456)
    app//org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:649)
    app//org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:578)
    app//org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:112)
    app//org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:91)
    app//org.elasticsearch.index.engine.TranslogDirectoryReader.createInMemoryReader(TranslogDirectoryReader.java:204)
    app//org.elasticsearch.index.engine.TranslogDirectoryReader.create(TranslogDirectoryReader.java:103)
    app//org.elasticsearch.index.engine.TranslogOperationAsserter.synthesizeSource(TranslogOperationAsserter.java:55)
    app//org.elasticsearch.index.engine.TranslogOperationAsserter$2.assertSameIndexOperation(TranslogOperationAsserter.java:43)
    app//org.elasticsearch.index.translog.TranslogWriter.assertNoSeqNumberConflict(TranslogWriter.java:287)
    app//org.elasticsearch.index.translog.TranslogWriter.add(TranslogWriter.java:265)
    app//org.elasticsearch.index.translog.Translog.add(Translog.java:633)
    app//org.elasticsearch.index.engine.InternalEngine.index(InternalEngine.java:1229)
    app//org.elasticsearch.index.shard.IndexShard.index(IndexShard.java:1085)
    app//org.elasticsearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:1011)
    app//org.elasticsearch.index.shard.IndexShard.applyTranslogOperation(IndexShard.java:2027)
    app//org.elasticsearch.index.shard.IndexShard.applyTranslogOperation(IndexShard.java:2014)
    app//org.elasticsearch.indices.recovery.RecoveryTarget.lambda$indexTranslogOperations$4(RecoveryTarget.java:454)
    app//org.elasticsearch.indices.recovery.RecoveryTarget$$Lambda/0x00007f5eb8db69e0.get(Unknown Source)
    app//org.elasticsearch.action.ActionListener.completeWith(ActionListener.java:367)
    app//org.elasticsearch.indices.recovery.RecoveryTarget.indexTranslogOperations(RecoveryTarget.java:429)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.performTranslogOps(PeerRecoveryTargetService.java:652)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.handleRequest(PeerRecoveryTargetService.java:597)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.handleRequest(PeerRecoveryTargetService.java:589)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRequestHandler.messageReceived(PeerRecoveryTargetService.java:685)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRequestHandler.messageReceived(PeerRecoveryTargetService.java:672)
    app//org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:90)
    app//org.elasticsearch.transport.InboundHandler.doHandleRequest(InboundHandler.java:289)
    app//org.elasticsearch.transport.InboundHandler$1.doRun(InboundHandler.java:302)
    app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1044)
    app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
    [email protected]/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    [email protected]/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    [email protected]/java.lang.Thread.runWith(Thread.java:1596)
    [email protected]/java.lang.Thread.run(Thread.java:1583)
  3/10 snapshots sharing following 24 elements
    app//org.elasticsearch.index.translog.Translog.add(Translog.java:633)
    app//org.elasticsearch.index.engine.InternalEngine.index(InternalEngine.java:1229)
    app//org.elasticsearch.index.shard.IndexShard.index(IndexShard.java:1085)
    app//org.elasticsearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:1011)
    app//org.elasticsearch.index.shard.IndexShard.applyTranslogOperation(IndexShard.java:2027)
    app//org.elasticsearch.index.shard.IndexShard.applyTranslogOperation(IndexShard.java:2014)
    app//org.elasticsearch.indices.recovery.RecoveryTarget.lambda$indexTranslogOperations$4(RecoveryTarget.java:454)
    app//org.elasticsearch.indices.recovery.RecoveryTarget$$Lambda/0x00007f5eb8db69e0.get(Unknown Source)
    app//org.elasticsearch.action.ActionListener.completeWith(ActionListener.java:367)
    app//org.elasticsearch.indices.recovery.RecoveryTarget.indexTranslogOperations(RecoveryTarget.java:429)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.performTranslogOps(PeerRecoveryTargetService.java:652)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.handleRequest(PeerRecoveryTargetService.java:597)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.handleRequest(PeerRecoveryTargetService.java:589)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRequestHandler.messageReceived(PeerRecoveryTargetService.java:685)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRequestHandler.messageReceived(PeerRecoveryTargetService.java:672)
    app//org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:90)
    app//org.elasticsearch.transport.InboundHandler.doHandleRequest(InboundHandler.java:289)
    app//org.elasticsearch.transport.InboundHandler$1.doRun(InboundHandler.java:302)
    app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1044)
    app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
    [email protected]/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    [email protected]/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    [email protected]/java.lang.Thread.runWith(Thread.java:1596)
    [email protected]/java.lang.Thread.run(Thread.java:1583)
  unique snapshot
    [email protected]/java.lang.ThreadLocal$ThreadLocalMap.getEntryAfterMiss(ThreadLocal.java:527)
    [email protected]/java.lang.ThreadLocal$ThreadLocalMap.getEntry(ThreadLocal.java:504)
    [email protected]/java.lang.ThreadLocal.get(ThreadLocal.java:187)
    [email protected]/java.lang.ThreadLocal.get(ThreadLocal.java:172)
    [email protected]/java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:427)
    [email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1146)
    [email protected]/java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:897)
    app//org.elasticsearch.index.translog.Translog.add(Translog.java:635)
    app//org.elasticsearch.index.engine.InternalEngine.index(InternalEngine.java:1229)
    app//org.elasticsearch.index.shard.IndexShard.index(IndexShard.java:1085)
    app//org.elasticsearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:1011)
    app//org.elasticsearch.index.shard.IndexShard.applyTranslogOperation(IndexShard.java:2027)
    app//org.elasticsearch.index.shard.IndexShard.applyTranslogOperation(IndexShard.java:2014)
    app//org.elasticsearch.indices.recovery.RecoveryTarget.lambda$indexTranslogOperations$4(RecoveryTarget.java:454)
    app//org.elasticsearch.indices.recovery.RecoveryTarget$$Lambda/0x00007f5eb8db69e0.get(Unknown Source)
    app//org.elasticsearch.action.ActionListener.completeWith(ActionListener.java:367)
    app//org.elasticsearch.indices.recovery.RecoveryTarget.indexTranslogOperations(RecoveryTarget.java:429)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.performTranslogOps(PeerRecoveryTargetService.java:652)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.handleRequest(PeerRecoveryTargetService.java:597)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.handleRequest(PeerRecoveryTargetService.java:589)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRequestHandler.messageReceived(PeerRecoveryTargetService.java:685)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRequestHandler.messageReceived(PeerRecoveryTargetService.java:672)
    app//org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:90)
    app//org.elasticsearch.transport.InboundHandler.doHandleRequest(InboundHandler.java:289)
    app//org.elasticsearch.transport.InboundHandler$1.doRun(InboundHandler.java:302)
    app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1044)
    app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
    [email protected]/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    [email protected]/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    [email protected]/java.lang.Thread.runWith(Thread.java:1596)
    [email protected]/java.lang.Thread.run(Thread.java:1583)

81.4% [cpu=39.2%, other=42.2%] (407ms out of 500ms) cpu usage by thread 'elasticsearch[node_t1][generic][T#1]'
  2/10 snapshots sharing following 37 elements
    app//org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:538)
    app//org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:502)
    app//org.apache.lucene.index.DocumentsWriter.maybeFlush(DocumentsWriter.java:456)
    app//org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:649)
    app//org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:578)
    app//org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:112)
    app//org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:91)
    app//org.elasticsearch.index.engine.TranslogDirectoryReader.createInMemoryReader(TranslogDirectoryReader.java:204)
    app//org.elasticsearch.index.engine.TranslogDirectoryReader.create(TranslogDirectoryReader.java:103)
    app//org.elasticsearch.index.engine.TranslogOperationAsserter.synthesizeSource(TranslogOperationAsserter.java:55)
    app//org.elasticsearch.index.engine.TranslogOperationAsserter$2.assertSameIndexOperation(TranslogOperationAsserter.java:43)
    app//org.elasticsearch.index.translog.TranslogWriter.assertNoSeqNumberConflict(TranslogWriter.java:287)
    app//org.elasticsearch.index.translog.TranslogWriter.add(TranslogWriter.java:265)
    app//org.elasticsearch.index.translog.Translog.add(Translog.java:633)
    app//org.elasticsearch.index.engine.InternalEngine.index(InternalEngine.java:1229)
    app//org.elasticsearch.index.shard.IndexShard.index(IndexShard.java:1085)
    app//org.elasticsearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:1011)
    app//org.elasticsearch.index.shard.IndexShard.applyTranslogOperation(IndexShard.java:2027)
    app//org.elasticsearch.index.shard.IndexShard.applyTranslogOperation(IndexShard.java:2014)
    app//org.elasticsearch.indices.recovery.RecoveryTarget.lambda$indexTranslogOperations$4(RecoveryTarget.java:454)
    app//org.elasticsearch.indices.recovery.RecoveryTarget$$Lambda/0x00007f5eb8db69e0.get(Unknown Source)
    app//org.elasticsearch.action.ActionListener.completeWith(ActionListener.java:367)
    app//org.elasticsearch.indices.recovery.RecoveryTarget.indexTranslogOperations(RecoveryTarget.java:429)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.performTranslogOps(PeerRecoveryTargetService.java:652)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.handleRequest(PeerRecoveryTargetService.java:597)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.handleRequest(PeerRecoveryTargetService.java:589)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRequestHandler.messageReceived(PeerRecoveryTargetService.java:685)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRequestHandler.messageReceived(PeerRecoveryTargetService.java:672)
    app//org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:90)
    app//org.elasticsearch.transport.InboundHandler.doHandleRequest(InboundHandler.java:289)
    app//org.elasticsearch.transport.InboundHandler$1.doRun(InboundHandler.java:302)
    app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1044)
    app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
    [email protected]/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    [email protected]/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    [email protected]/java.lang.Thread.runWith(Thread.java:1596)
    [email protected]/java.lang.Thread.run(Thread.java:1583)
  2/10 snapshots sharing following 32 elements
    app//org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:112)
    app//org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:91)
    app//org.elasticsearch.index.engine.TranslogDirectoryReader.createInMemoryReader(TranslogDirectoryReader.java:204)
    app//org.elasticsearch.index.engine.TranslogDirectoryReader.create(TranslogDirectoryReader.java:103)
    app//org.elasticsearch.index.engine.TranslogOperationAsserter.synthesizeSource(TranslogOperationAsserter.java:55)
    app//org.elasticsearch.index.engine.TranslogOperationAsserter$2.assertSameIndexOperation(TranslogOperationAsserter.java:43)
    app//org.elasticsearch.index.translog.TranslogWriter.assertNoSeqNumberConflict(TranslogWriter.java:287)
    app//org.elasticsearch.index.translog.TranslogWriter.add(TranslogWriter.java:265)
    app//org.elasticsearch.index.translog.Translog.add(Translog.java:633)
    app//org.elasticsearch.index.engine.InternalEngine.index(InternalEngine.java:1229)
    app//org.elasticsearch.index.shard.IndexShard.index(IndexShard.java:1085)
    app//org.elasticsearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:1011)
    app//org.elasticsearch.index.shard.IndexShard.applyTranslogOperation(IndexShard.java:2027)
    app//org.elasticsearch.index.shard.IndexShard.applyTranslogOperation(IndexShard.java:2014)
    app//org.elasticsearch.indices.recovery.RecoveryTarget.lambda$indexTranslogOperations$4(RecoveryTarget.java:454)
    app//org.elasticsearch.indices.recovery.RecoveryTarget$$Lambda/0x00007f5eb8db69e0.get(Unknown Source)
    app//org.elasticsearch.action.ActionListener.completeWith(ActionListener.java:367)
    app//org.elasticsearch.indices.recovery.RecoveryTarget.indexTranslogOperations(RecoveryTarget.java:429)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.performTranslogOps(PeerRecoveryTargetService.java:652)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.handleRequest(PeerRecoveryTargetService.java:597)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.handleRequest(PeerRecoveryTargetService.java:589)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRequestHandler.messageReceived(PeerRecoveryTargetService.java:685)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRequestHandler.messageReceived(PeerRecoveryTargetService.java:672)
    app//org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:90)
    app//org.elasticsearch.transport.InboundHandler.doHandleRequest(InboundHandler.java:289)
    app//org.elasticsearch.transport.InboundHandler$1.doRun(InboundHandler.java:302)
    app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1044)
    app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
    [email protected]/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    [email protected]/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    [email protected]/java.lang.Thread.runWith(Thread.java:1596)
    [email protected]/java.lang.Thread.run(Thread.java:1583)
  4/10 snapshots sharing following 25 elements
    app//org.elasticsearch.index.translog.TranslogWriter.add(TranslogWriter.java:245)
    app//org.elasticsearch.index.translog.Translog.add(Translog.java:633)
    app//org.elasticsearch.index.engine.InternalEngine.index(InternalEngine.java:1229)
    app//org.elasticsearch.index.shard.IndexShard.index(IndexShard.java:1085)
    app//org.elasticsearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:1011)
    app//org.elasticsearch.index.shard.IndexShard.applyTranslogOperation(IndexShard.java:2027)
    app//org.elasticsearch.index.shard.IndexShard.applyTranslogOperation(IndexShard.java:2014)
    app//org.elasticsearch.indices.recovery.RecoveryTarget.lambda$indexTranslogOperations$4(RecoveryTarget.java:454)
    app//org.elasticsearch.indices.recovery.RecoveryTarget$$Lambda/0x00007f5eb8db69e0.get(Unknown Source)
    app//org.elasticsearch.action.ActionListener.completeWith(ActionListener.java:367)
    app//org.elasticsearch.indices.recovery.RecoveryTarget.indexTranslogOperations(RecoveryTarget.java:429)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.performTranslogOps(PeerRecoveryTargetService.java:652)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.handleRequest(PeerRecoveryTargetService.java:597)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.handleRequest(PeerRecoveryTargetService.java:589)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRequestHandler.messageReceived(PeerRecoveryTargetService.java:685)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRequestHandler.messageReceived(PeerRecoveryTargetService.java:672)
    app//org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:90)
    app//org.elasticsearch.transport.InboundHandler.doHandleRequest(InboundHandler.java:289)
    app//org.elasticsearch.transport.InboundHandler$1.doRun(InboundHandler.java:302)
    app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1044)
    app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
    [email protected]/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    [email protected]/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    [email protected]/java.lang.Thread.runWith(Thread.java:1596)
    [email protected]/java.lang.Thread.run(Thread.java:1583)
  2/10 snapshots sharing following 27 elements
    app//org.elasticsearch.index.engine.TranslogOperationAsserter$2.assertSameIndexOperation(TranslogOperationAsserter.java:43)
    app//org.elasticsearch.index.translog.TranslogWriter.assertNoSeqNumberConflict(TranslogWriter.java:287)
    app//org.elasticsearch.index.translog.TranslogWriter.add(TranslogWriter.java:265)
    app//org.elasticsearch.index.translog.Translog.add(Translog.java:633)
    app//org.elasticsearch.index.engine.InternalEngine.index(InternalEngine.java:1229)
    app//org.elasticsearch.index.shard.IndexShard.index(IndexShard.java:1085)
    app//org.elasticsearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:1011)
    app//org.elasticsearch.index.shard.IndexShard.applyTranslogOperation(IndexShard.java:2027)
    app//org.elasticsearch.index.shard.IndexShard.applyTranslogOperation(IndexShard.java:2014)
    app//org.elasticsearch.indices.recovery.RecoveryTarget.lambda$indexTranslogOperations$4(RecoveryTarget.java:454)
    app//org.elasticsearch.indices.recovery.RecoveryTarget$$Lambda/0x00007f5eb8db69e0.get(Unknown Source)
    app//org.elasticsearch.action.ActionListener.completeWith(ActionListener.java:367)
    app//org.elasticsearch.indices.recovery.RecoveryTarget.indexTranslogOperations(RecoveryTarget.java:429)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.performTranslogOps(PeerRecoveryTargetService.java:652)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.handleRequest(PeerRecoveryTargetService.java:597)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.handleRequest(PeerRecoveryTargetService.java:589)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRequestHandler.messageReceived(PeerRecoveryTargetService.java:685)
    app//org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRequestHandler.messageReceived(PeerRecoveryTargetService.java:672)
    app//org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:90)
    app//org.elasticsearch.transport.InboundHandler.doHandleRequest(InboundHandler.java:289)
    app//org.elasticsearch.transport.InboundHandler$1.doRun(InboundHandler.java:302)
    app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1044)
    app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
    [email protected]/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    [email protected]/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    [email protected]/java.lang.Thread.runWith(Thread.java:1596)
    [email protected]/java.lang.Thread.run(Thread.java:1583)

I will ask around and if this looks normal, I will probably increase the timeout.
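
To illustrate the point above, here is a minimal plain-Java sketch of a health wait that requires both a green status and zero relocating shards. It is a simplified model, not the Elasticsearch test framework: `RelocationWaitSketch`, `Health`, `waitForGreenNoRelocation` and `relocationFinishingAfter` are hypothetical names made up for this example, and the simulated cluster only tracks one relocation. It shows that a cluster can already be green and the wait can still time out if the relocation takes longer than the timeout.

```java
import java.time.Duration;
import java.util.function.Supplier;

/** Hypothetical, simplified model of a "green and no relocations" wait. Not an Elasticsearch API. */
final class RelocationWaitSketch {

    /** The two health facts that matter for this illustration. */
    record Health(boolean green, int relocatingShards) {}

    /**
     * Polls until the cluster is green AND nothing is relocating, or until the timeout elapses.
     * Returns true if the condition was met in time, false on timeout or interruption.
     */
    static boolean waitForGreenNoRelocation(Supplier<Health> health, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            Health h = health.get();
            if (h.green() && h.relocatingShards() == 0) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // still relocating when the timeout expired
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    /** Simulated cluster: always green, with one shard relocating until relocationTime has passed. */
    static Supplier<Health> relocationFinishingAfter(Duration relocationTime) {
        long done = System.nanoTime() + relocationTime.toNanos();
        return () -> new Health(true, System.nanoTime() - done >= 0 ? 0 : 1);
    }

    public static void main(String[] args) {
        Duration poll = Duration.ofMillis(10);
        // The relocation outlasts the timeout, so the wait gives up even though the cluster is green.
        boolean shortWait = waitForGreenNoRelocation(
            relocationFinishingAfter(Duration.ofSeconds(5)), Duration.ofMillis(100), poll);
        // With a more generous timeout the same kind of relocation is simply waited out.
        boolean longWait = waitForGreenNoRelocation(
            relocationFinishingAfter(Duration.ofMillis(200)), Duration.ofSeconds(5), poll);
        System.out.println("short timeout succeeded: " + shortWait);
        System.out.println("long timeout succeeded: " + longWait);
    }
}
```

In this model, raising the timeout only helps if the relocation is expected to finish eventually; it does not explain why the shard was relocating in the first place.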

gmarouli added a commit to gmarouli/elasticsearch that referenced this issue Mar 28, 2025
…ic#123769 (elastic#125478)

(cherry picked from commit 1943844)

# Conflicts:
#	muted-tests.yml
#	x-pack/plugin/downsample/src/internalClusterTest/java/org/elasticsearch/xpack/downsample/DataStreamLifecycleDownsampleDisruptionIT.java
elasticsearchmachine pushed a commit that referenced this issue Mar 28, 2025
… (#125478) (#125845)

(cherry picked from commit 1943844)

# Conflicts:
#	muted-tests.yml
#	x-pack/plugin/downsample/src/internalClusterTest/java/org/elasticsearch/xpack/downsample/DataStreamLifecycleDownsampleDisruptionIT.java
elasticsearchmachine added a commit that referenced this issue Mar 28, 2025
…DisruptionIT testDataStreamLifecycleDownsampleRollingRestart #123769
@elasticsearchmachine
Collaborator Author

This has been muted on branch 8.x

Mute Reasons:

  • [8.x] 2 failures in test testDataStreamLifecycleDownsampleRollingRestart (1.4% fail rate in 143 executions)

Build Scans:

@elasticsearchmachine
Collaborator Author

This has been muted on branch main

Mute Reasons:

  • [main] 2 failures in test testDataStreamLifecycleDownsampleRollingRestart (1.1% fail rate in 176 executions)

Build Scans:

elasticsearchmachine added a commit that referenced this issue Mar 30, 2025
…DisruptionIT testDataStreamLifecycleDownsampleRollingRestart #123769
@nielsbauman
Contributor

@gmarouli this is failing due to a timeout of 10s:

Caused by: java.util.concurrent.ExecutionException: org.elasticsearch.ElasticsearchTimeoutException: timed out after [10s/10000ms]

which we already add on the temporary cluster state listener here:

listener.addTimeout(ESTestCase.SAFE_AWAIT_TIMEOUT, clusterService.threadPool(), EsExecutors.DIRECT_EXECUTOR_SERVICE);
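
As a rough plain-JDK analogue of the behaviour described above, the sketch below attaches a timeout to a pending future and then waits on it. It is only an illustration under stated assumptions: `CompletableFuture.orTimeout` stands in for the `addTimeout` call quoted above, and `ListenerTimeoutSketch`, `listenerWithTimeout` and `await` are hypothetical names, not part of Elasticsearch. The point is that once the timeout fires, the waiter sees an exceptional completion whose cause is the timeout, which is the same shape as the "listener was completed exceptionally" failure caused by a timeout exception.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical stand-in for a listener guarded by a timeout. Uses only the JDK. */
final class ListenerTimeoutSketch {

    /** Simulates a listener that is completed after delayMillis and guarded by timeoutMillis. */
    static CompletableFuture<String> listenerWithTimeout(long delayMillis, long timeoutMillis) {
        CompletableFuture<String> listener = new CompletableFuture<>();
        // Complete the listener later, as if the awaited cluster state had finally arrived.
        CompletableFuture.delayedExecutor(delayMillis, TimeUnit.MILLISECONDS)
            .execute(() -> listener.complete("completed"));
        // If the timeout elapses first, the future completes exceptionally with a TimeoutException.
        return listener.orTimeout(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    /** Waits for the listener and reports how it finished. */
    static String await(CompletableFuture<String> listener) {
        try {
            return listener.get();
        } catch (ExecutionException e) {
            // The timeout surfaces as the cause of the ExecutionException.
            return e.getCause() instanceof TimeoutException ? "timed out" : "failed";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        // The awaited event arrives after the timeout, so the waiter sees the timeout.
        System.out.println(await(listenerWithTimeout(5_000, 100)));
        // The awaited event arrives well within the timeout.
        System.out.println(await(listenerWithTimeout(10, 5_000)));
    }
}
```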
