all shards failed

4/7/2020 EsException

Summary

System: CentOS 7.x
JDK version: 1.8
ES version: 6.5.4

# 1. The exception

[2020-04-03T10:28:18,895][WARN ][r.suppressed             ] [service3] path: /task_center_qa/_search, params: {typed_keys=true, ignore_unavailable=false, index=task_center_qa, type=todo, search_type=query_then_fetch, batched_reduce_size=512}
org.elasticsearch.action.search.SearchPhaseExecutionException: all shards failed
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:293) ~[elasticsearch-6.5.4.jar:6.5.4]
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:133) ~[elasticsearch-6.5.4.jar:6.5.4]
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:254) ~[elasticsearch-6.5.4.jar:6.5.4]
	... ...
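
The server-side trace only names the top-level exception. When the same search is sent over REST, the JSON error body normally carries a per-shard reason as well. The following is a minimal sketch for pulling those reasons out, assuming the body has the usual `error.failed_shards` layout; the function name `shard_failure_reasons` is made up for illustration.

```python
def shard_failure_reasons(error_body: dict) -> list:
    """Return (index, shard, reason type, reason text) for each failed shard.

    Assumes the error body of a failed _search request, where
    error.failed_shards is a list of per-shard failure entries.
    """
    error = error_body.get("error")
    if not isinstance(error, dict):
        # Unexpected shape: nothing we can extract.
        return []
    rows = []
    for entry in error.get("failed_shards", []):
        reason = entry.get("reason") or {}
        rows.append(
            (
                entry.get("index"),
                entry.get("shard"),
                reason.get("type"),
                reason.get("reason"),
            )
        )
    return rows
```

If the list comes back empty, the stack trace and any `Caused by` sections in the node log remain the place to look.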

# 2. Troubleshooting approach

# 2.1 Health check

curl -XGET 'http://service3:9200/_cluster/health?pretty'   
{
    "cluster_name": "agency",
    "status": "green",
    "timed_out": false,
    "number_of_nodes": 3,
    "number_of_data_nodes": 3,
    "active_primary_shards": 73,
    "active_shards": 146,
    "relocating_shards": 0,
    "initializing_shards": 0,
    "unassigned_shards": 0,
    "delayed_unassigned_shards": 0,
    "number_of_pending_tasks": 0,
    "number_of_in_flight_fetch": 0,
    "task_max_waiting_in_queue_millis": 0,
    "active_shards_percent_as_number": 100
}
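
The same health check can be scripted, which helps when it has to run repeatedly or from a monitoring job. The following is a rough sketch using only the Python standard library; the helper names `health_url` and `fetch_cluster_health` are illustrative, and the host `service3` and port 9200 are taken from the curl call above.

```python
import json
from urllib.request import urlopen


def health_url(host: str, port: int = 9200) -> str:
    """Build the URL of the cluster health endpoint queried by the curl call."""
    return f"http://{host}:{port}/_cluster/health"


def fetch_cluster_health(host: str, port: int = 9200, timeout: float = 5.0) -> dict:
    """GET /_cluster/health and return the parsed JSON body as a dict."""
    with urlopen(health_url(host, port), timeout=timeout) as resp:
        return json.load(resp)


# Usage (needs a reachable cluster; replace "service3" with your own host):
#   health = fetch_cluster_health("service3")
#   print(health["status"], health["active_shards_percent_as_number"])
```

This assumes an unsecured HTTP endpoint, as in the curl example; a cluster behind TLS or authentication would need extra handling.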

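# 2.2 Analyzing the result

  1. If status is red, at least one primary shard is unassigned. Searches can still return results from the shards that remain available, but the actual cause has to be tracked down in the logs.
  2. If active_shards_percent_as_number is below 100, some shards in the cluster are unavailable.
  3. If both values look normal, check whether the cluster was restarted recently. Queries that arrived during the restart can fail with this error; once the restart has completed, the error no longer appears.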
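
The three checks above can be written as a small decision function. This is only a sketch of that checklist, not an official diagnostic: the function name `diagnose`, the `restarted_recently` flag, and the returned labels are all made up for illustration.

```python
def diagnose(health: dict, restarted_recently: bool = False) -> str:
    """Apply the checks from section 2.2 to a /_cluster/health response."""
    status = health.get("status")
    percent = health.get("active_shards_percent_as_number", 0)

    if status == "red":
        # Check 1: red status, the cause has to be found in the logs.
        return "check_logs"
    if percent < 100:
        # Check 2: some shards in the cluster are unavailable.
        return "shards_unavailable"
    if restarted_recently:
        # Check 3: queries that arrived during a restart can fail.
        return "failed_during_restart"
    return "no_finding"


# The health response shown in section 2.1, reduced to the two fields used here.
health = {"status": "green", "active_shards_percent_as_number": 100}
print(diagnose(health, restarted_recently=True))  # prints "failed_during_restart"
```

Whether the cluster was restarted is passed in by the caller because the health response does not record it.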


Last updated: 11/3/2021, 10:12:01 PM