Description
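I have two independent Elasticsearch clusters, each with three nodes.
We need to iterate over all of the data, and we currently do this with the scroll API.
Elasticsearch is now reporting errors. The error log is: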
org.elasticsearch.action.ActionListenerImplementations.safeOnFailure(ActionListenerImplementations.java:75) ~[elasticsearch-8.16.4.jar:?]
at org.elasticsearch.action.DelegatingActionListener.onFailure(DelegatingActionListener.java:32) ~[elasticsearch-8.16.4.jar:?]
at org.elasticsearch.action.search.TransportSearchScrollAction$1.onFailure(TransportSearchScrollAction.java:89) ~[elasticsearch-8.16.4.jar:?]
... 145 more
Caused by: java.lang.IllegalStateException: node [EgwGsv2RS66Fr1JMtJzM9A] is not available
at org.elasticsearch.action.search.SearchScrollAsyncAction.run(SearchScrollAsyncAction.java:133) ~[elasticsearch-8.16.4.jar:?]
... 143 more
[2026-04-02T01:00:03,469][WARN ][r.suppressed ] [node-1] path: /_search/scroll, params: {scroll=600000ms, scroll_id=FGluY2x1ZGVfY29udGV4dF91dWlkDXF1ZXJ5QW5kRmV0Y2gBFk5UckpOTHQ3U0J1cnQzMm1YZUdmR1EAAAAAAB0lHRY1WlEtYlJOelRnbUtucDdrRUtpcUln}, status: 500
org.elasticsearch.action.search.SearchPhaseExecutionException: all shards failed
at org.elasticsearch.action.search.SearchScrollAsyncAction.onShardFailure(SearchScrollAsyncAction.java:289) ~[elasticsearch-8.16.4.jar:?]
at org.elasticsearch.action.search.SearchScrollAsyncAction.run(SearchScrollAsyncAction.java:137) ~[elasticsearch-8.16.4.jar:?]
at org.elasticsearch.action.search.SearchScrollAsyncAction.lambda$run$0(SearchScrollAsyncAction.java:90) ~[elasticsearch-8.16.4.jar:?]
at org.elasticsearch.action.ActionListenerImplementations$ResponseWrappingActionListener.onResponse(ActionListenerImplementations.java:247) ~[elasticsearch-8.16.4.jar:?]
at org.elasticsearch.action.search.SearchScrollAsyncAction.collectNodesAndRun(SearchScrollAsyncAction.java:112) ~[elasticsearch-8.16.4.jar:?]
at org.elasticsearch.action.search.SearchScrollAsyncAction.run(SearchScrollAsyncAction.java:86) ~[elasticsearch-8.16.4.jar:?]
at org.elasticsearch.action.search.TransportSearchScrollAction.doExecute(TransportSearchScrollAction.java:116) ~[elasticsearch-8.16.4.jar:?]
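We suspect that the gateway forwards scroll requests to both clusters, while one of the clusters actually has no record of this scroll_id, so the query fails there.
How can we diagnose and fix this?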
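
One possible way to test this suspicion is to check which cluster a given scroll_id belongs to. The scroll_id is base64-encoded and appears to embed the ID of the node holding the search context, which would fit the `node [...] is not available` error when the request lands on a cluster that does not contain that node. The sketch below is a heuristic, not an official API: the scroll_id layout is internal and undocumented, so it only extracts 22-character tokens that look like Elasticsearch UUIDs. The function name `uuid_like_tokens` is made up for illustration.

```python
import base64
import re

# Elasticsearch UUIDs (node IDs, search-context UUIDs) are 22-character
# URL-safe base64 strings. The scroll_id layout is internal, so this is
# an assumption-based heuristic that only matches tokens of that shape.
_UUID_TOKEN = re.compile(
    rb"(?<![A-Za-z0-9_-])[A-Za-z0-9_-]{22}(?![A-Za-z0-9_-])"
)


def uuid_like_tokens(scroll_id: str) -> list[str]:
    """Decode a scroll_id and return the UUID-shaped tokens found in it."""
    # Restore any base64 padding that was stripped from the ID.
    padded = scroll_id + "=" * (-len(scroll_id) % 4)
    raw = base64.urlsafe_b64decode(padded)
    return [m.decode("ascii") for m in _UUID_TOKEN.findall(raw)]


if __name__ == "__main__":
    # scroll_id copied from the WARN log line in this post.
    sample = (
        "FGluY2x1ZGVfY29udGV4dF91dWlkDXF1ZXJ5QW5kRmV0Y2gBFk5UckpOTHQ3"
        "U0J1cnQzMm1YZUdmR1EAAAAAAB0lHRY1WlEtYlJOelRnbUtucDdrRUtpcUln"
    )
    for token in uuid_like_tokens(sample):
        print(token)
```

The printed tokens can then be compared with the node IDs returned by `GET _cat/nodes?h=id,name&full_id=true` on each cluster. If a token matches a node in only one cluster, scroll requests carrying that scroll_id need to be routed to that cluster alone, for example by pinning the whole scroll session to the cluster that served the initial search.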