Troubleshooting an ELK Setup

Reader submission 1008 2022-10-31

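I recently ran three virtual machines on my computer to experiment with ELK. After the configuration succeeded, odd problems kept appearing whenever I shut the VMs down and started them again, leaving kibana in an unhealthy state, which was a real headache.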

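I. Preface

I followed the two articles below during the setup; their steps are clear and the results are easy to verify.

1. Building an ELK log analysis platform (Part 1): Introduction to ELK and setting up an Elasticsearch distributed cluster

2. Building an ELK log analysis platform (Part 2): Setting up the kibana and logstash servers

The following records the problems I ran into along the way and how I eventually resolved them.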

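II. Unhealthy elasticsearch cluster status

1. Problem description

The elasticsearch (es for short below) cluster is in the yellow or red state: the 2 data nodes have not joined the master node, number_of_nodes is still 1, and the kibana UI returns a 503 error.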

[root@server ~]# curl '192.168.100.15:9200/_cluster/health?pretty'
{
  "cluster_name" : "server-node",
  "status" : "red", # green means healthy; yellow or red means the cluster has a problem
  "timed_out" : false, # whether the request timed out
  "number_of_nodes" : 1, # number of nodes in the cluster
  "number_of_data_nodes" : 0, # number of data nodes in the cluster
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 12,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 0.0 # percentage of active shards; 0 here means unavailable
}
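Reading the whole health document by eye gets tedious when the check is repeated after every reboot. Below is a minimal sketch that pulls individual fields out of the pretty-printed output. The helper names `health_field` and `cluster_is_green` are my own (hypothetical), and the parsing assumes one field per line, which is what `?pretty` produces.

```shell
# Sketch only: health_field and cluster_is_green are hypothetical helpers,
# not part of Elasticsearch or curl. They assume pretty-printed JSON
# (one "key" : value pair per line) on stdin.

# Print the value of one top-level field, e.g. status or number_of_nodes.
health_field() {
  awk -F: -v key="\"$1\"" '
    $1 ~ key {
      v = $2
      gsub(/[ ",]/, "", v)   # strip spaces, quotes and the trailing comma
      print v
      exit
    }'
}

# Succeed only when the health document on stdin reports "green".
cluster_is_green() {
  [ "$(health_field status)" = "green" ]
}

# Possible usage against the master node from this post:
#   curl -s '192.168.100.15:9200/_cluster/health?pretty' | health_field status
#   curl -s '192.168.100.15:9200/_cluster/health?pretty' | health_field number_of_nodes
```

A tool such as jq would be more robust for general JSON; plain awk is used here only to avoid an extra dependency on the VMs.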

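2. Troubleshooting approach

1) Make sure the es master node starts first, then start the data nodes;

2) Set SELinux to permissive (optional) and turn off iptables;

3) Make sure the data nodes' configuration files are correct.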
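For step 3, a quick sanity check of a data node's configuration file might look like the sketch below. The helper name `check_data_node_config` is hypothetical, and the list of keys is my assumption about what a 6.x data node needs (cluster.name, node.master, node.data, discovery.zen.ping.unicast.hosts); adjust it to whatever the setup guide you followed actually configures.

```shell
# Sketch only: check_data_node_config is a hypothetical helper.
# The key list is an assumption based on common Elasticsearch 6.x settings.
# Note: plain POSIX sh has no local variables, so the names below are global.

# $1 = path to elasticsearch.yml, $2 = expected cluster name
check_data_node_config() {
  file=$1
  expected_cluster=$2
  rc=0
  for key in cluster.name node.master node.data discovery.zen.ping.unicast.hosts; do
    # A key counts as present only if an uncommented line starts with it.
    if ! awk -F: -v k="$key" '
          { name = $1; gsub(/^[ \t]+|[ \t]+$/, "", name) }
          name == k { found = 1 }
          END { exit !found }' "$file"; then
      echo "missing: $key"
      rc=1
    fi
  done
  # All nodes must share the same cluster.name to join one cluster.
  actual=$(awk -F: '
      { name = $1; gsub(/^[ \t]+|[ \t]+$/, "", name) }
      name == "cluster.name" { v = $2; gsub(/[ \t"]/, "", v); print v; exit }' "$file")
  if [ "$actual" != "$expected_cluster" ]; then
    echo "cluster.name is \"$actual\", expected \"$expected_cluster\""
    rc=1
  fi
  return "$rc"
}

# Possible usage on a data node (the path is the usual package default,
# assumed here):
#   check_data_node_config /etc/elasticsearch/elasticsearch.yml server-node
```

The function only checks that the keys exist and that the cluster name matches; it does not validate the host list or the network settings.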

3. State after troubleshooting

[root@server ~]# curl '192.168.100.15:9200/_cluster/health?pretty'
{
  "cluster_name" : "server-node",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 5,
  "active_shards" : 10,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

III. Unhealthy index status

1. Problem description

[root@server ~]# curl '192.168.100.15:9200/_cat/indices?v'
health status index                 uuid                   pri rep docs.count docs.deleted store.size pri.store.size
red    open   system-syslog-2018.09 JPDsnK_qSym-sjOiZS9zAw   5   1        548            0    719.6kb        345.3kb

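2. Troubleshooting approach

1) Confirm that logstash started normally and that its ports (9600 and the ports configured for each log index) are listening;

2) Delete the unhealthy index and restart logstash;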

[root@server ~]# curl -XDELETE

3) Check the kibana status

[root@server ~]# systemctl status kibana
● kibana.service - Kibana
   Loaded: loaded (/etc/systemd/system/kibana.service; enabled; vendor preset: disabled)
   Active: active (running) since Sat 2018-09-29 09:01:44 CST; 31min ago
 Main PID: 646 (node)
   CGroup: /system.slice/kibana.service
           └─646 /usr/share/kibana/bin/../node/bin/node --no-warnings /usr/share/kibana/bin/../src/cli -c /etc/kibana/kibana.yml...
Sep 29 09:01:44 server systemd[1]: Started Kibana.
Sep 29 09:01:44 server systemd[1]: Starting Kibana...

4) Check the index status again
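Step 2 above (deleting the unhealthy index) can be sketched as follows. The helper names `red_indices` and `delete_red_indices` are hypothetical, and the script assumes the node answers on a host:port such as the 192.168.100.15:9200 used throughout this post. The delete itself goes through Elasticsearch's delete index API (`DELETE /<index>`), so it is destructive: the documents in that index are gone afterwards.

```shell
# Sketch only: red_indices and delete_red_indices are hypothetical helpers.

# Print the names of indices whose health column is "red".
# Reads the table printed by _cat/indices?v on stdin:
# column 1 is health, column 3 is the index name; line 1 is the header.
red_indices() {
  awk 'NR > 1 && $1 == "red" { print $3 }'
}

# Delete every red index on the given node. DESTRUCTIVE.
# $1 = host:port of an Elasticsearch node (an assumption; adjust as needed)
delete_red_indices() {
  host=$1
  curl -s "$host/_cat/indices?v" | red_indices | while read -r index; do
    echo "deleting $index"
    curl -s -XDELETE "$host/$index"
    echo
  done
}

# Possible usage:
#   curl -s '192.168.100.15:9200/_cat/indices?v' | red_indices   # dry run
#   delete_red_indices 192.168.100.15:9200
#   systemctl restart logstash   # service name assumed
```

Running `red_indices` on its own first is a cheap dry run before anything is deleted.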

3. State after troubleshooting

[root@server ~]# curl '192.168.100.15:9200/_cat/indices?v'
health status index                 uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   system-syslog-2018.09 TR_gdOb8RDSRtHj_g4a4_g   5   1          3            0     62.2kb         31.1kb

IV. Errors in the es node log

1. Problem description

[2018-09-28T21:23:20,487][DEBUG][o.e.a.s.TransportSearchAction] [server] All shards failed for phase: [query]
[2018-09-28T21:23:20,488][WARN ][r.suppressed ] path: /.kibana/_search, params: {ignore_unavailable=true, index=.kibana, filter_path=aggregations.types.buckets}
org.elasticsearch.action.search.SearchPhaseExecutionException: all shards failed
    at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:293) ~[elasticsearch-6.4.1.jar:6.4.1]
    at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:133) ~[elasticsearch-6.4.1.jar:6.4.1]
    at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:254) ~[elasticsearch-6.4.1.jar:6.4.1]
    at org.elasticsearch.action.search.InitialSearchPhase.onShardFailure(InitialSearchPhase.java:101) ~[elasticsearch-6.4.1.jar:6.4.1]
    at org.elasticsearch.action.search.InitialSearchPhase.lambda$performPhaseOnShard$1(InitialSearchPhase.java:210) ~[elasticsearch-6.4.1.jar:6.4.1]
    at org.elasticsearch.action.search.InitialSearchPhase$1.doRun(InitialSearchPhase.java:189) [elasticsearch-6.4.1.jar:6.4.1]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:723) [elasticsearch-6.4.1.jar:6.4.1]
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.4.1.jar:6.4.1]
    at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:41) [elasticsearch-6.4.1.jar:6.4.1]
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.4.1.jar:6.4.1]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_144]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_144]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
[2018-09-28T21:23:20,487][WARN ][r.suppressed ] path: /.kibana/doc/config%3A6.4.1, params: {index=.kibana, id=config:6.4.1, type=doc}
org.elasticsearch.action.NoShardAvailableActionException: No shard available for [get [.kibana][doc][config:6.4.1]: routing [null]]
    at org.elasticsearch.action.support.single.shard.TransportSingleShardAction$AsyncSingleAction.perform(TransportSingleShardAction.java:207) ~[elasticsearch-6.4.1.jar:6.4.1]
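The stack traces are long, but the useful part is the `path:` field on each WARN line, which names the index whose shards are unavailable (here `.kibana`). A small sketch for pulling those names out of a log is shown below; `failing_indices` is a hypothetical helper, and it simply takes the first path segment, so a request path with no index in it (for example `/_search`) would be reported as-is.

```shell
# Sketch only: failing_indices is a hypothetical helper.
# Reads an Elasticsearch log on stdin and prints the distinct first path
# segments found in "path: /<segment>/..." entries, one per line.
failing_indices() {
  grep -o 'path: /[^/, ]*' | sed 's#^path: /##' | sort -u
}

# Possible usage (the log location is an assumption; by default the file
# is named after the cluster and lives under the configured path.logs):
#   failing_indices < /var/log/elasticsearch/server-node.log
```

For the log excerpt above this narrows two multi-line traces down to a single index name, which is the one to inspect or delete.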

2. Troubleshooting approach

1) Delete the index

[root@server ~]# curl -XDELETE

kibana UI opened or closed

3. State after troubleshooting

[root@server ~]# curl '192.168.100.15:9200/_cat/indices?v'
health status index                 uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   system-syslog-2018.09 TR_gdOb8RDSRtHj_g4a4_g   5   1         20            0    240.6kb        120.3kb
green  open   .kibana               GGWwf7gdTwCKMn3BqRaGcQ   1   1          2            0       22kb           11kb

