
elk-log-window

Building an Open-Source Log System with ELK (Windows Edition): An Illustrated Walkthrough

Logs are essential for troubleshooting, but digging through them with Linux commands such as awk, sed, grep, and find is tedious, and doing data analysis on top of that is harder still. The free, open-source ELK stack can support large-scale log search. This article walks step by step through building a Windows version of an ELK log collection system.

Download elasticsearch, logstash, kibana, and filebeat

Note that all components must share the same version to avoid compatibility problems. This article uses version 7.16.0, demonstrated on Windows.

Download elasticsearch

Visit: https://www.elastic.co/cn/downloads/past-releases

Click Download

image-20220418221506034

This takes you to: https://www.elastic.co/cn/downloads/past-releases/elasticsearch-7-16-0

You can choose the Windows or Linux version; this article downloads the Windows version.

image-20220418222252223

Download logstash

Click Download

image-20220418222022349

This takes you to: https://www.elastic.co/cn/downloads/past-releases/logstash-7-16-0

Choose the Windows version

image-20220418222156985

Download kibana

image-20220418222344389

Visit: https://www.elastic.co/cn/downloads/past-releases/kibana-7-16-0

Choose the Windows version

image-20220418222406775

This takes you to: https://www.elastic.co/cn/downloads/past-releases/filebeat-7-16-0

Download filebeat

image-20220418222601130

Choose the Windows version

image-20220418222630758

Download JDK 11

Because version 7.16.0 depends on Java JDK 11, the local Java environment needs to be switched to JDK 11.

Visit: http://www.codebaoku.com/jdk/jdk-oracle-jdk11.html

Click Download

image-20220420200908722

Extract everything once the downloads finish

image-20220420202003124

Install JDK 11, elasticsearch, kibana, logstash, and filebeat

Install JDK 11

Press Win+X and choose Windows Terminal

image-20220420202133767

image-20220420202251028

Enter:

cd F:\soft\elk
dir

image-20220420202824707

Open the Windows search box, search for environment variables, and open Edit the system environment variables

image-20220420203326862

Click Environment Variables

image-20220420203449479

Add the JAVA_HOME path

F:\soft\elk\jdk-11.0.13_windows-x64_bin\jdk-11.0.13

image-20220420205025647

Add the Java executable path to the Path variable by entering

%JAVA_HOME%\bin

(JDK 11 no longer ships a separate jre directory, so a %JAVA_HOME%\jre\bin entry is unnecessary.)

image-20220420203823640

Open a new shell (you must open a fresh shell so the JDK 11 environment variables you just configured are loaded)

Run the command below; seeing the JDK 11 version output means JDK 11 is installed successfully

java -version

image-20220420214822004

Start elasticsearch

Open a new shell and run:

cd F:\soft\elk
.\elasticsearch-7.16.0-windows-x86_64\elasticsearch-7.16.0\bin\elasticsearch.bat

If you only access it locally via localhost, no configuration change is needed; otherwise modify the config as follows

network.host: 0.0.0.0

image-20220423214458423

image-20220420204714883

You can see elasticsearch started successfully

image-20220420215030696

Start kibana

Open a new shell and run:

cd F:\soft\elk
.\kibana-7.16.0-windows-x86_64\kibana-7.16.0-windows-x86_64\bin\kibana.bat

image-20220420215355755

You can see it started successfully

image-20220420215454109

Visit the site below and you can see it is up

http://localhost:5601/app/home#/

image-20220420215557167

Click Explore on my own

image-20220423144541137

Start logstash

Go to

F:\soft\elk\logstash-7.16.0-windows-x86_64\logstash-7.16.0\config

In the logstash config directory, create a new file named log.conf

image-20220420220239744

Its contents are as follows:

The input section receives data from Filebeat on port 5044.

The output section creates an index named test in elasticsearch and writes the data to it.

# Sample Logstash configuration for creating a simple
# Beats -> Logstash -> Elasticsearch pipeline.
input {
  beats {
    port => 5044
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "test"
    #user => "elastic"
    #password => "changeme"
  }
}

Open a new shell and run:

cd F:\soft\elk
.\logstash-7.16.0-windows-x86_64\logstash-7.16.0\bin\logstash.bat -f F:\soft\elk\logstash-7.16.0-windows-x86_64\logstash-7.16.0\config\log.conf

image-20220420220650137

You can see it started successfully

image-20220420220729325

Start filebeat

Go to

F:\soft\elk\log

Create this log folder to hold the test log files

image-20220420221138864

Enter the log folder and create a data.log file

image-20220420221231158

With the following contents

[08/Nov/2019:11:40:24 +0800] tc-com.net - - 192.168.12.58 192.168.12.58 192.168.15.135 80 GET 200 /geccess/services/capability/L6JN4255 ?pageIndex=1&pageSize=2000&vehicleType=0 21067 17 totalTime:54ms
[08/Nov/2019:11:40:24 +0800] tc-com.net - - 192.168.12.58 192.168.12.58 192.168.15.135 80 GET 200 /geccess/services/capability/L6JN4255 ?pageIndex=1&pageSize=2000&vehicleType=0 21067 17 totalTime:63ms
[08/Nov/2019:11:40:24 +0800] tc-com.net - - 192.168.12.58 192.168.12.58 192.168.15.135 80 GET 200 /geccess/services/capability/L6JN4255 ?pageIndex=1&pageSize=2000&vehicleType=0 21067 17 totalTime:75ms

image-20220420221403224

Edit the filebeat configuration

Go to

F:\soft\elk\filebeat-7.16.0-windows-x86_64

image-20220420221433916

Set enabled to true, and set the log path to

F:\soft\elk\log\*.log

image-20220420223015584

Set enabled to true under filebeat.config.modules as well

image-20220423144213743

Comment out the output.elasticsearch section with #, and uncomment the output.logstash section

image-20220420221708300

Open a new shell and run:

cd F:\soft\elk
.\filebeat-7.16.0-windows-x86_64\filebeat.exe -e -c F:\soft\elk\filebeat-7.16.0-windows-x86_64\filebeat.yml

image-20220420222240061

You can see filebeat is running successfully

image-20220423144343293

Simple log queries in kibana

Query the created index with commands

Visit the site and click the menu at the top left

http://localhost:5601/app/home#/

image-20220423144541137

Scroll down the left-hand menu to Management and click Dev Tools

image-20220423144617902

Enter the following command in the Console

GET /_cat/indices?v

Then click the green run button; you can see the test index has been created

image-20220423144938859

Enter the command below to query the test index; you can see the log data was uploaded successfully

GET test/_search
{
  "query": {
    "match_all": {}
  }
}

image-20220423145332557

View and index log data through the UI

Click Stack Management

image-20220423145805250

Select Index Patterns

image-20220423150023952

Enter the index name test; you can see it matches. Choose @timestamp as the time field and click Create index pattern

image-20220423150226384

You can see it was created successfully

image-20220423150259353

Select Discover

image-20220423150342241

You can see test is the default index

image-20220423150404995

Choose a wider time range

image-20220423150440112

Click Update

image-20220423150502069

You can see all the data has been loaded

image-20220423150517409

Search with the keyword totalTime and click Refresh; you can see the matching parts highlighted below

image-20220423150630046

Extract key information from the logs

Log file

Change the contents of data.log to the following:

{"timestamp":"2022-05-06T17:23:25.365+08:00", "message":"tc-com.net - - 192.168.12.58 192.168.12.58 192.168.15.135 80 GET 200 /geccess/services/capability/L6JN4255 ?pageIndex=1&pageSize=2000&vehicleType=0 21067 17 totalTime:54ms"}
{"timestamp":"2022-05-06T18:23:25.365+08:00", "message":"tc-com.net - - 192.168.12.58 192.168.12.58 192.168.15.135 80 GET 200 /geccess/services/capability/L6JN4255 ?pageIndex=1&pageSize=2000&vehicleType=0 21067 17 totalTime:63ms"}
{"timestamp":"2022-05-06T19:23:25.365+08:00", "message":"tc-com.net - - 192.168.12.58 192.168.12.58 192.168.15.135 80 GET 200 /geccess/services/capability/L6JN4255 ?pageIndex=1&pageSize=2000&vehicleType=0 21067 17 totalTime:75ms"}
{"timestamp":"2022-05-07T17:23:25.365+08:00", "message":"tc-com.net - - 192.168.12.58 192.168.12.58 192.168.15.135 80 GET 200 /geccess/services/capability/L6JN4255 ?pageIndex=1&pageSize=2000&vehicleType=0 21067 17 totalTime:54ms"}
{"timestamp":"2022-05-07T18:23:25.365+08:00", "message":"tc-com.net - - 192.168.12.58 192.168.12.58 192.168.15.135 80 GET 200 /geccess/services/capability/L6JN4255 ?pageIndex=1&pageSize=2000&vehicleType=0 21067 17 totalTime:63ms"}
{"timestamp":"2022-05-07T19:23:25.365+08:00", "message":"tc-com.net - - 192.168.12.58 192.168.12.58 192.168.15.135 80 GET 200 /geccess/services/capability/L6JN4255 ?pageIndex=1&pageSize=2000&vehicleType=0 21067 17 totalTime:75ms"}
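Each line above is a standalone JSON object. A minimal Python sketch (a hypothetical helper, not part of the ELK stack; the sample message is shortened for readability) shows how such a line splits into its two fields:

```python
import json

def parse_log_line(line):
    """Split one JSON log line into its timestamp and message fields."""
    record = json.loads(line)
    return record["timestamp"], record["message"]

# Shortened version of one line from data.log
sample = ('{"timestamp":"2022-05-06T17:23:25.365+08:00", '
          '"message":"80 GET 200 /geccess/services/capability/L6JN4255 totalTime:54ms"}')
ts, msg = parse_log_line(sample)
# ts -> "2022-05-06T17:23:25.365+08:00"
```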

image-20220507222926744

logstash log-parsing tests

Go to

F:\soft\elk\logstash-7.16.0-windows-x86_64\logstash-7.16.0\config\

Timestamp parsing test

In the logstash config directory, create a new file named test.conf

image-20220507225305651

test.conf is as follows: the timestamp is captured into a tmpTime field, which is then converted into @timestamp

input {
  stdin {
  }
}
filter {
  grok {
    match => ["message", "%{TIMESTAMP_ISO8601:tmpTime}"]
  }
  date {
    match => ["tmpTime", "ISO8601"]
    target => "@timestamp"
  }
}
output {
  stdout {
  }
}
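The same parse-and-normalize step can be sketched in Python (an illustration of what the date filter does, not logstash itself): fromisoformat handles the ISO8601 stamp, and astimezone converts it to UTC, which is how @timestamp is stored.

```python
from datetime import datetime, timezone

def to_utc(ts):
    """Parse an ISO8601 timestamp and normalize it to UTC,
    mirroring what the date filter does for @timestamp."""
    return datetime.fromisoformat(ts).astimezone(timezone.utc)

stamp = to_utc("2022-05-06T17:23:25.365+08:00")
# 17:23 at +08:00 is 09:23 UTC
```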

Open a new shell and run:

cd F:\soft\elk
.\logstash-7.16.0-windows-x86_64\logstash-7.16.0\bin\logstash.bat -f F:\soft\elk\logstash-7.16.0-windows-x86_64\logstash-7.16.0\config\test.conf

Enter this input

{"createtime" :"2019-08-05T07:16:00.571Z","iphost" :"10.100.4.100:2891","nihao":2021-08-05T07:16:00.571Z}

You can see the timestamp was parsed successfully

image-20220507232202416

Keyword parsing test

Change test.conf to the following:

input {
  stdin {
  }
}

filter {
  grok {
    match => [
      "message", "(?<totaltime>(?<=totalTime).*?(?=ms))"
    ]
  }
  mutate {
    convert => ["totaltime", "integer"]
  }
}

filter {
  grok {
    match => ["message", "%{TIMESTAMP_ISO8601:localtime}"]
  }
  date {
    match => ["localtime", "ISO8601"]
    target => "@timestamp"
  }
}

output {
  stdout {
  }
}
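The grok capture is an ordinary lookbehind/lookahead regex, so it can be tried out in Python before wiring it into logstash. This is a sketch, not logstash itself; note that including the colon in the lookbehind keeps the captured text digits-only, which the integer conversion needs.

```python
import re

# Lookbehind/lookahead, as in the grok pattern: grab the text between
# "totalTime:" and "ms", then convert it to an integer.
PATTERN = re.compile(r"(?<=totalTime:).*?(?=ms)")

def extract_totaltime(message):
    m = PATTERN.search(message)
    return int(m.group()) if m else None

# extract_totaltime("... 21067 17 totalTime:54ms") -> 54
```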

Open a new shell and run:

cd F:\soft\elk
.\logstash-7.16.0-windows-x86_64\logstash-7.16.0\bin\logstash.bat -f F:\soft\elk\logstash-7.16.0-windows-x86_64\logstash-7.16.0\config\test.conf

Enter this input

{"timestamp":"2022-05-06T17:23:25.365+08:00", "message":"tc-com.net - - 192.168.12.58 192.168.12.58 192.168.15.135 80 GET 200 /geccess/services/capability/L6JN4255 ?pageIndex=1&pageSize=2000&vehicleType=0 21067 17 totalTime:54ms"}

You can see the timestamp and the totaltime keyword were parsed successfully

image-20220507232958106

Drop-filter test

Change test.conf to the following:

input {
  stdin {
  }
}

filter {
  grok {
    match => [
      "message", "(?<totaltime>(?<=totalTime).*?(?=ms))"
    ]
  }
  mutate {
    convert => ["totaltime", "integer"]
  }
  if ([message] !~ "atotalTime") {
    drop { }
  }
}

filter {
  grok {
    match => ["message", "%{TIMESTAMP_ISO8601:localtime}"]
  }
  date {
    match => ["localtime", "ISO8601"]
    target => "@timestamp"
  }
}

output {
  stdout {
  }
}
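The conditional above uses a regex non-match (!~) to discard events. A quick Python sketch of the same decision (illustrative only):

```python
import re

def should_drop(message):
    """Mirror the logstash conditional: drop the event when the
    message does not match the atotalTime pattern."""
    return re.search("atotalTime", message) is None
```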

Open a new shell and run:

cd F:\soft\elk
.\logstash-7.16.0-windows-x86_64\logstash-7.16.0\bin\logstash.bat -f F:\soft\elk\logstash-7.16.0-windows-x86_64\logstash-7.16.0\config\test.conf

Enter this input

{"timestamp":"2022-05-06T17:23:25.365+08:00", "message":"tc-com.net - - 192.168.12.58 192.168.12.58 192.168.15.135 80 GET 200 /geccess/services/capability/L6JN4255 ?pageIndex=1&pageSize=2000&vehicleType=0 21067 17 atotalTime:54ms"}

Then enter this input

{"timestamp":"2022-05-06T17:23:25.365+08:00", "message":"tc-com.net - - 192.168.12.58 192.168.12.58 192.168.15.135 80 GET 200 /geccess/services/capability/L6JN4255 ?pageIndex=1&pageSize=2000&vehicleType=0 21067 17 totalTime:54ms"}

image-20220518133552805

You can see that only the message containing the atotalTime keyword is parsed through; the other one is dropped

logstash configuration file

Modify log.conf as follows

# Sample Logstash configuration for creating a simple
# Beats -> Logstash -> Elasticsearch pipeline.

input {
  beats {
    port => 5044
  }
}

filter {
  grok {
    match => [
      "message", "(?<totaltime>(?<=totalTime:).*?(?=ms))"
    ]
  }
  mutate {
    convert => ["totaltime", "integer"]
  }
}

filter {
  grok {
    match => ["message", "%{TIMESTAMP_ISO8601:localtime}"]
  }
  date {
    match => ["localtime", "ISO8601"]
    target => "@timestamp"
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "test-%{+YYYY.MM.dd}"
  }
}
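The index => "test-%{+YYYY.MM.dd}" setting derives one index per day from the event's @timestamp. A sketch of the equivalent naming in Python (strftime stands in for the Joda-style date pattern):

```python
from datetime import datetime

def daily_index(ts, prefix="test-"):
    """One index per day, named after the event timestamp."""
    return prefix + ts.strftime("%Y.%m.%d")

name = daily_index(datetime(2022, 5, 6, 17, 23, 25))
# -> "test-2022.05.06"
```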

Start elasticsearch, kibana, and logstash

Restart elasticsearch, kibana, and logstash separately as described above

Start filebeat

Go to F:\soft\elk\filebeat-7.16.0-windows-x86_64

Delete the data directory so the previous log files will be re-uploaded

image-20220507222216127

Deleted successfully

image-20220507222239315

Then start filebeat as described above

Query the created indices in kibana Dev Tools

Open Dev Tools

image-20220507233859498

Enter the following:

GET test/_search
{
  "query": {
    "match_all": {}
  }
}

GET test-2022.05.06/_search
{
  "query": {
    "match_all": {}
  }
}

GET test-2022.05.07/_search
{
  "query": {
    "match_all": {}
  }
}

Run each query in turn; you can see the logs created two indices by date, and the keyword and timestamp were parsed out correctly

image-20220507233953041

Query the merged date indices in kibana Discover

Select Stack Management

image-20220508090242152

Select Index Patterns

image-20220508090441934

Click Create index pattern

image-20220508090619414

Set Name to test-*, which matches all indices prefixed with test-

Choose @timestamp for the Timestamp field

Click Create index pattern

image-20220508091031023

Select Discover

image-20220508091124757

Switch to the test-* index and choose a wide date range

image-20220508091148579

Click Refresh; you can see both the May 6 and May 7 logs have loaded

image-20220508091440953

Click the plus sign next to message and totaltime

image-20220508091503903

You can see only the message and totaltime data are shown

image-20220508091541788

Enter totaltime>60 to see the records where totaltime is greater than 60

image-20220508091632998

Aggregate totaltime with a kibana Dashboard

Click Dashboard

image-20220508092742871

Click Create new dashboard

image-20220508092802385

Click Create visualization

image-20220508092917293

Switch to the test-* index

image-20220508093041479

Click to choose the horizontal axis

image-20220508093128946

Select @timestamp as the horizontal axis

image-20220508093211860

Close the panel after selecting

image-20220508093249796

Choose the vertical axis

image-20220508093308415

Select totaltime

image-20220508093342155

Select Maximum

image-20220508093427283

Close

image-20220508093452467

You can see the totaltime statistics are displayed

image-20220508093522725

Click Save and return to save the chart

image-20220508093551734

Click Save

image-20220508093620798

Enter totaltime and click Save

image-20220508093704624

You can see it saved successfully

image-20220508093725278

Click totaltime to go straight into the created dashboard

image-20220508093746544

Click Edit lens to reopen the editor

image-20220508093946609

You can see the dashboard in edit mode

image-20220508094009580

Create a Runtime script

If you did not parse out a keyword through logstash at the start, you can still extract it later with a Runtime script. In this section we extract the totalTime keyword again using a Runtime script.

Reference: https://www.elastic.co/guide/en/kibana/7.16/managing-index-patterns.html#runtime-fields

The log message sent is as follows

{"timestamp":"2022-05-06T17:23:25.365+08:00", "message":"tc-com.net - - 192.168.12.58 192.168.12.58 192.168.15.135 80 GET 200 /geccess/services/capability/L6JN4255 ?pageIndex=1&pageSize=2000&vehicleType=0 21067 17 totalTime:54ms"}

Click Stack Management

image-20220508103935969

Click Index Patterns and select the test-* index

image-20220508104017844

Click Add field

image-20220508104121981

Click Set value

image-20220508104155935

You can see the Create field panel as follows

image-20220508104255942

Enter runTime for Name

Enter the following under Define script

def msg = doc['message.keyword'][0];
msg = msg.toString();
int start = msg.indexOf("totalTime:");
int end = msg.indexOf("ms");
def tt = msg.substring(start + 10, end);
emit(tt);
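The Painless script is plain substring arithmetic, so its logic can be checked in Python first. This sketch is a slightly hardened variant that searches for "ms" only after the marker; 10 is the length of "totalTime:".

```python
def run_time(msg):
    """Same steps as the Painless script: locate "totalTime:" and "ms",
    then take the text in between."""
    start = msg.index("totalTime:")
    end = msg.index("ms", start)          # first "ms" after the marker
    return msg[start + len("totalTime:"):end]
```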

On the right you can see the runTime value of 63 computed in real time

Click Save

image-20220508104425576

Search for runTime and you can see the field has been created

image-20220508104457250

Click Discover

image-20220508104537061

You can see runTime is now shown; click the plus sign

image-20220508104630553

You can see runTime under Selected fields, and it is displayed on the right as well

image-20220508104901326

Enter runTime>60 and click Update

image-20220508105000608

You can see it ran successfully

image-20220508105029593

Delete expired indices with curator

Install curator (Windows version)

Visit https://www.elastic.co/guide/en/elasticsearch/client/curator/current/windows-zip.html

Click Download

image-20220508161610835

Extract it

image-20220508161748698

Search for system environment variables

image-20220508161824827

Click Environment Variables

image-20220508161857642

Double-click Path

image-20220508162006255

Add F:\soft\elk\elasticsearch-curator-5.8.4-amd64\curator-5.8.4-amd64

image-20220508162108964

Press Win+X to open a new shell

image-20220508162159009

Enter the curator command; you can see it runs successfully

image-20220508162241489

Configure the curator config files

Reference: https://www.elastic.co/guide/en/elasticsearch/client/curator/current/index.html

Run this command to list the indices created in the local es

curator_cli --host 127.0.0.1 --port 9200 show-indices

image-20220508153518021

Under F:\soft\elk\elasticsearch-curator-5.8.4-amd64\curator-5.8.4-amd64

create action.yml and config.yml files

image-20220508162415972

config.yml contents:

client:
  hosts:
    - 127.0.0.1
  port: 9200
  url_prefix:
  use_ssl: False
  certificate:
  client_cert:
  client_key:
  ssl_no_validate: False
  http_auth:
  timeout: 30
  master_only: False

logging:
  loglevel: INFO
  logfile:
  logformat: default
  blacklist: ['elasticsearch', 'urllib3']

action.yml contents:

unit_count is 1, meaning indices whose names start with test- and are more than 1 day old will be deleted

# Remember, leave a key empty if there is no value.  None will be a string,
# not a Python "NoneType"
#
# Also remember that all examples have 'disable_action' set to True. If you
# want to use this action as a template, be sure to set this to False after
# copying it.
actions:
  1:
    action: delete_indices
    description: >-
      Delete indices older than 1 day (based on index name), for test-
      prefixed indices. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: test-
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 1
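The age filter with source: name parses the date out of the index name itself. A Python sketch of that decision (run_at is a hypothetical parameter standing for the moment the curator job runs):

```python
from datetime import datetime, timedelta

def is_expired(index_name, run_at, days=1):
    """Parse the date embedded in a test-%Y.%m.%d index name and
    check whether it is older than the cutoff."""
    stamp = datetime.strptime(index_name, "test-%Y.%m.%d")
    return stamp < run_at - timedelta(days=days)

# Run on the afternoon of May 8: both May 6 and May 7 count as expired.
run_at = datetime(2022, 5, 8, 16, 0)
```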

Run the following command

curator --config F:\soft\elk\elasticsearch-curator-5.8.4-amd64\curator-5.8.4-amd64\config.yml F:\soft\elk\elasticsearch-curator-5.8.4-amd64\curator-5.8.4-amd64\action.yml

You can see the May 6 and May 7 indices were deleted successfully

image-20220508161231455

Querying the index through kibana shows it no longer exists

GET test-2022.05.06/_search?pretty
{
  "query": {
    "match_all": {}
  }
}

image-20220508161312556

Create a scheduled cleanup task

On a Linux system we can add the command to a cron job

crontab -e
0 4 * * * /usr/bin/curator --config /etc/curator/config.yml /etc/curator/action.yml


Conclusion

We now have a simple log collection system up and running, which we can use to collect and analyze logs. This article only covered the Windows version; a follow-up will build the same setup on Linux, where the steps are largely identical. More practical search features to come.