Chapter 3: Logstash Data Analysis
Logstash processes logs through a pipeline: collect, transform, and output. This is similar to a *NIX shell pipeline such as xxx | ccc | ddd, where the output of xxx is fed into ccc, and ccc's output is fed into ddd.
A Logstash pipeline consists of three stages:
input --> filter (optional) --> output
Each stage is implemented by a variety of plugins, such as file, elasticsearch, redis, and so on.
Each stage can also use several plugins at once; for example, output can write to Elasticsearch and print to stdout on the console at the same time.
Logstash supports multiple inputs and multiple outputs, as sketched below.
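For instance, a minimal sketch of a pipeline with two inputs and two outputs (the addresses and paths here are placeholders; the same plugins are covered in detail later in this chapter):

input {
  file {
    path => ["/tmp/test/*.txt"]
  }
  tcp {
    port => 9999
  }
}
output {
  stdout {}
  elasticsearch {
    hosts => ["192.168.19.101:9200"]
  }
}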
ELFK architecture diagram (figure not reproduced here).
1. Basic Logstash Deployment
- Install the software
[root@host3 ~]# yum install logstash --enablerepo=es -y    # a repo that is only needed occasionally can stay disabled and be enabled on demand
[root@host3 ~]# ln -sv /usr/share/logstash/bin/logstash /usr/local/bin/    # symlink so the command can be used directly
"/usr/local/bin/logstash" -> "/usr/share/logstash/bin/logstash"
- Create the first configuration file
[root@host3 ~]# vim 01-stdin-stdout.conf
input {
  stdin {}
}
output {
  stdout {}
}
- Test the configuration file
[root@host3 ~]# logstash -tf 01-stdin-stdout.conf    # -t only validates the configuration; the output below ends with "Configuration OK"
Using bundled JDK: /usr/share/logstash/jdk
OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults
Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
[INFO ] 2022-09-15 21:49:37.109 [main] runner - Starting Logstash {"logstash.version"=>"7.17.6", "jruby.version"=>"jruby 9.2.20.1 (2.5.8) 2021-11-30 2a2962fbd1 OpenJDK 64-Bit Server VM 11.0.16+8 on 11.0.16+8 +indy +jit [linux-x86_64]"}
[INFO ] 2022-09-15 21:49:37.115 [main] runner - JVM bootstrap flags: [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djdk.io.File.enableADS=true, -Djruby.compile.invokedynamic=true, -Djruby.jit.threshold=0, -Djruby.regexp.interruptible=true, -XX:+HeapDumpOnOutOfMemoryError, -Djava.security.egd=file:/dev/urandom, -Dlog4j2.isThreadContextMapInheritable=true]
[INFO ] 2022-09-15 21:49:37.160 [main] settings - Creating directory {:setting=>"path.queue", :path=>"/usr/share/logstash/data/queue"}
[INFO ] 2022-09-15 21:49:37.174 [main] settings - Creating directory {:setting=>"path.dead_letter_queue", :path=>"/usr/share/logstash/data/dead_letter_queue"}
[WARN ] 2022-09-15 21:49:37.687 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified
[INFO ] 2022-09-15 21:49:38.843 [LogStash::Runner] Reflections - Reflections took 114 ms to scan 1 urls, producing 119 keys and 419 values
[WARN ] 2022-09-15 21:49:39.658 [LogStash::Runner] line - Relying on default value of `pipeline.ecs_compatibility`, which may change in a future major release of Logstash. To avoid unexpected changes when upgrading Logstash, please explicitly declare your desired ECS Compatibility mode.
[WARN ] 2022-09-15 21:49:39.703 [LogStash::Runner] stdin - Relying on default value of `pipeline.ecs_compatibility`, which may change in a future major release of Logstash. To avoid unexpected changes when upgrading Logstash, please explicitly declare your desired ECS Compatibility mode.
Configuration OK
[INFO ] 2022-09-15 21:49:39.917 [LogStash::Runner] runner - Using config.test_and_exit mode. Config Validation Result: OK. Exiting Logstash
- Start it by hand. This approach is usually reserved for lab environments; in production, you normally finalize the configuration and manage the service with systemctl (see the sketch after the run below).
[root@host3 ~]# logstash -f 01-stdin-stdout.conf
Using bundled JDK: /usr/share/logstash/jdk
OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults
Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
[INFO ] 2022-09-15 21:50:25.095 [main] runner - Starting Logstash {"logstash.version"=>"7.17.6", "jruby.version"=>"jruby 9.2.20.1 (2.5.8) 2021-11-30 2a2962fbd1 OpenJDK 64-Bit Server VM 11.0.16+8 on 11.0.16+8 +indy +jit [linux-x86_64]"}
[INFO ] 2022-09-15 21:50:25.103 [main] runner - JVM bootstrap flags: [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djdk.io.File.enableADS=true, -Djruby.compile.invokedynamic=true, -Djruby.jit.threshold=0, -Djruby.regexp.interruptible=true, -XX:+HeapDumpOnOutOfMemoryError, -Djava.security.egd=file:/dev/urandom, -Dlog4j2.isThreadContextMapInheritable=true]
[WARN ] 2022-09-15 21:50:25.523 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified
[INFO ] 2022-09-15 21:50:25.555 [LogStash::Runner] agent - No persistent UUID file found. Generating new UUID {:uuid=>"3fc04af1-7665-466e-839f-1eb42348aeb0", :path=>"/usr/share/logstash/data/uuid"}
[INFO ] 2022-09-15 21:50:27.119 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600, :ssl_enabled=>false}
[INFO ] 2022-09-15 21:50:28.262 [Converge PipelineAction::Create<main>] Reflections - Reflections took 110 ms to scan 1 urls, producing 119 keys and 419 values
[WARN ] 2022-09-15 21:50:29.084 [Converge PipelineAction::Create<main>] line - Relying on default value of `pipeline.ecs_compatibility`, which may change in a future major release of Logstash. To avoid unexpected changes when upgrading Logstash, please explicitly declare your desired ECS Compatibility mode.
[WARN ] 2022-09-15 21:50:29.119 [Converge PipelineAction::Create<main>] stdin - Relying on default value of `pipeline.ecs_compatibility`, which may change in a future major release of Logstash. To avoid unexpected changes when upgrading Logstash, please explicitly declare your desired ECS Compatibility mode.
[INFO ] 2022-09-15 21:50:29.571 [[main]-pipeline-manager] javapipeline - Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>250, "pipeline.sources"=>["/root/01-stdin-stdout.conf"], :thread=>"#<Thread:0x32e464e6 run>"}
[INFO ] 2022-09-15 21:50:30.906 [[main]-pipeline-manager] javapipeline - Pipeline Java execution initialization time {"seconds"=>1.33}
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.jrubystdinchannel.StdinChannelLibrary$Reader (file:/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/jruby-stdin-channel-0.2.0-java/lib/jruby_stdin_channel/jruby_stdin_channel.jar) to field java.io.FilterInputStream.in
WARNING: Please consider reporting this to the maintainers of com.jrubystdinchannel.StdinChannelLibrary$Reader
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
[INFO ] 2022-09-15 21:50:31.128 [[main]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"main"}
The stdin plugin is now waiting for input:
[INFO ] 2022-09-15 21:50:31.270 [Agent thread] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
abc
{
    "message" => " abc",
    "@version" => "1",
    "host" => "host3.test.com",
    "@timestamp" => 2022-09-15T13:52:02.984Z
}
bbb
{
    "message" => "bbb",
    "@version" => "1",
    "host" => "host3.test.com",
    "@timestamp" => 2022-09-15T13:52:06.177Z
}
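To run a pipeline as a service instead, the configuration can be placed where the package expects it; a sketch, assuming the RPM's stock pipelines.yml, which loads /etc/logstash/conf.d/*.conf (a stdin pipeline makes little sense under systemd, so this is meant for real inputs; the config name below is hypothetical):

[root@host3 ~]# cp 02-file-stdout.conf /etc/logstash/conf.d/
[root@host3 ~]# systemctl enable --now logstash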
2. Input Types
In the example above the input type is stdin, i.e., manual input. In production, logs are never produced by typing them in, so stdin is normally only used to verify that the environment was set up correctly. Several common input types are described below.
2.1 file
input {
  file {
    path => ["/tmp/test/*.txt"]
    # Read the file from the beginning (the default is the end). This only takes effect
    # when no read record exists for the file yet: a new file created while the service
    # was stopped is read in full after startup, but previously tracked files are not re-read.
    start_position => "beginning"
  }
}
The read positions are recorded in /usr/share/logstash/data/plugins/inputs/file/.sincedb_3cd99a80ca58225ec14dc0ac340abb80:
[root@host3 ~]# cat /usr/share/logstash/data/plugins/inputs/file/.sincedb_3cd99a80ca58225ec14dc0ac340abb80
5874000 0 64768 4 1663254379.147252 /tmp/test/1.txt
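As I understand the sincedb layout, the columns are: inode, major device number, minor device number, byte offset read so far, timestamp of the last activity, and the last known path:

# inode    maj  min    offset  last activity        path
5874000    0    64768  4       1663254379.147252    /tmp/test/1.txt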
2.2 tcp
Like Filebeat, Logstash can listen on a TCP port to receive logs, and it can listen on several ports at the same time.
This approach is typically used for servers where no agent can be installed.
The HTTP protocol can also be used and is configured much like TCP (a sketch follows the TCP example below).
[root@host3 ~]# vim 03-tcp-stdout.conf
input {
  tcp {
    port => 9999
  }
}
output {
  stdout {}
}
[root@host2 ~]# telnet 192.168.19.103 9999
Trying 192.168.19.103...
Connected to 192.168.19.103.
Escape character is '^]'.
123456
test
hello
{"message" => "123456\r","@version" => "1","@timestamp" => 2022-09-15T15:30:23.123Z,"host" => "host2","port" => 51958
}
{"message" => "test\r","@version" => "1","@timestamp" => 2022-09-15T15:30:24.494Z,"host" => "host2","port" => 51958
}
{"message" => "hello\r","@version" => "1","@timestamp" => 2022-09-15T15:30:26.336Z,"host" => "host2","port" => 51958
}
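A minimal sketch of the HTTP variant mentioned above (the port number is arbitrary):

input {
  http {
    port => 8888
  }
}
output {
  stdout {}
}

It can then be exercised with something like curl -XPOST http://192.168.19.103:8888 -d 'hello'.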
2.3 redis
Logstash can pull data directly from a Redis database. Three Redis data types are supported:
- list, which maps to the Redis command BLPOP: take the first element from the left end of the list, blocking if the list is empty;
- channel, which maps to SUBSCRIBE: receive the latest data published to a Redis channel;
- pattern_channel, which maps to PSUBSCRIBE: match channels with a glob-style pattern and receive the latest data from all of them.
Differences between the types:
- channel vs. pattern_channel: pattern_channel can match multiple channels with a pattern, while channel is a single channel;
- list vs. the two channel types: data published to one channel is received by every subscribed Logstash instance, whereas the elements of one list are never duplicated across multiple Logstash instances; they are distributed among them.
The input configuration is as follows (a channel variant is sketched after it):
input {
  redis {
    data_type => "list"           # data type
    db => 5                       # database number, default 0
    host => "192.168.19.101"      # Redis server IP, default localhost
    port => 6379
    password => "bruce"
    key => "test-list"
  }
}
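For comparison, a channel-based variant would only change data_type and key (a sketch; "test-channel" is an arbitrary name):

input {
  redis {
    data_type => "channel"
    db => 5
    host => "192.168.19.101"
    port => 6379
    password => "bruce"
    key => "test-channel"    # the channel to SUBSCRIBE to
  }
}

Data would then be published with PUBLISH test-channel '...' instead of LPUSH, and every subscribed Logstash instance would receive its own copy.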
Append data in Redis:
[root@host1 ~]# redis-cli -h host1 -a bruce
host1:6379> select 5
OK
host1:6379[5]> lpush test-list bruce
(integer) 1
host1:6379[5]> lrange test-list 0 -1
(empty list or set)
host1:6379[5]> lpush test-list hello
(integer) 1
host1:6379[5]> lrange test-list 0 -1    # once Logstash has consumed the data, the list is empty
(empty list or set)
host1:6379[5]> lpush test-list '{"requestTime":"[12/Sep/2022:23:30:56 +0800]","clientIP":"192.168.19.1","threadID":"http-bio-8080-exec-7","protocol":"HTTP/1.1","requestMethod":"GET / HTTP/1.1","requestStatus":"404","sendBytes":"-","queryString":"","responseTime":"0ms","partner":"-","agentVersion":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36"}'
Logstash receives the data:
{"message" => "bruce","@timestamp" => 2022-09-16T08:17:38.213Z,"@version" => "1","tags" => [[0] "_jsonparsefailure"]
}
# Non-JSON data produces a parse error but is still accepted
[ERROR] 2022-09-16 16:18:21.688 [[main]<redis] json - JSON parse error, original data now in message field {:message=>"Unrecognized token 'hello': was expecting ('true', 'false' or 'null')\n at [Source: (String)\"hello\"; line: 1, column: 11]", :exception=>LogStash::Json::ParserError, :data=>"hello"}
{
    "message" => "hello",
    "@timestamp" => 2022-09-16T08:18:21.689Z,
    "@version" => "1",
    "tags" => [
        [0] "_jsonparsefailure"
    ]
}
# JSON data is parsed into fields automatically
{
    "clientIP" => "192.168.19.1",
    "requestTime" => "[12/Sep/2022:23:30:56 +0800]",
    "queryString" => "",
    "@version" => "1",
    "agentVersion" => "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36",
    "partner" => "-",
    "@timestamp" => 2022-09-16T08:23:10.320Z,
    "protocol" => "HTTP/1.1",
    "requestStatus" => "404",
    "threadID" => "http-bio-8080-exec-7",
    "requestMethod" => "GET / HTTP/1.1",
    "sendBytes" => "-",
    "responseTime" => "0ms"
}
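The automatic parsing (and the _jsonparsefailure tag above) comes from the input's codec, which for the redis input defaults to json as far as I know; to keep raw strings instead, the codec can be overridden, e.g.:

input {
  redis {
    data_type => "list"
    host => "192.168.19.101"
    password => "bruce"
    key => "test-list"
    codec => "plain"    # assumption: keep message as a raw string instead of parsing JSON
  }
}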
2.4 beats
Filebeat has already been configured to ship logs to Logstash; on the Logstash side, all that is needed is to receive the data.
Filebeat configuration:
filebeat.inputs:
- type: log
  paths:
    - /tmp/1.txt
output.logstash:
  hosts: ["192.168.19.103:5044"]
Logstash configuration:
input {
  beats {
    port => 5044
  }
}
After appending 111 to /tmp/1.txt on host2, Logstash outputs:
{"message" => "111","tags" => [[0] "beats_input_codec_plain_applied"],"agent" => {"id" => "76b7876b-051a-4df8-8b13-bd013ac5ec59","version" => "7.17.4","hostname" => "host2.test.com","type" => "filebeat","name" => "host2.test.com","ephemeral_id" => "437ac89f-7dc3-4898-a457-b2452ac4223b"},"input" => {"type" => "log"},"host" => {"name" => "host2.test.com"},"log" => {"offset" => 0,"file" => {"path" => "/tmp/1.txt"}},"@version" => "1","ecs" => {"version" => "1.12.0"},"@timestamp" => 2022-09-16T08:53:20.975Z
}
3. Output Types
3.1 redis
Redis can also serve as an output; the configuration mirrors the input.
output {
  redis {
    data_type => "list"
    db => 6
    host => "192.168.19.101"
    port => 6379
    password => "bruce"
    key => "test-list"
  }
}
Check the Redis database:
[root@host1 ~]# redis-cli -h host1 -a bruce
host1:6379> select 6
OK
host1:6379[6]> lrange test-list 0 -1
1) "{\"message\":\"1111\",\"@version\":\"1\",\"@timestamp\":\"2022-09-16T09:12:29.890Z\",\"host\":\"host3.test.com\"}"
3.2 file
The file output writes events to a file on the local disk.
output {
  file {
    path => "/tmp/test-file.log"
  }
}
3.3 elasticsearch
output {
  elasticsearch {
    hosts => ["192.168.19.101:9200","192.168.19.102:9200","192.168.19.103:9200"]
    index => "centos-logstash-elasticsearh-%{+YYYY.MM.dd}"
  }
}
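If the cluster has security enabled, credentials can be supplied as well; a sketch (the user and password values are placeholders):

output {
  elasticsearch {
    hosts => ["192.168.19.101:9200"]
    index => "centos-logstash-elasticsearh-%{+YYYY.MM.dd}"
    user => "elastic"         # placeholder
    password => "changeme"    # placeholder
  }
}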
4. filter
filter is an optional stage: after events are received, it can parse and reshape them before they are output.
4.1 grok
grok parses arbitrary text and structures it. It is well suited to syslog, Apache, and other web server logs.
① Simple example
input {
  file {
    path => ["/var/log/nginx/access.log*"]
    start_position => "beginning"
  }
}
filter {
  grok {
    match => {
      "message" => "%{COMBINEDAPACHELOG}"
      # "message" => "%{HTTPD_COMMONLOG}"    # newer Logstash releases may use this pattern name
    }
  }
}
output {
  stdout {}
  elasticsearch {
    hosts => ["192.168.19.101:9200","192.168.19.102:9200","192.168.19.103:9200"]
    index => "nginx-logs-es-%{+YYYY.MM.dd}"
  }
}
The parsed result:
{"request" => "/","bytes" => "4833","@version" => "1","auth" => "-","agent" => "\"curl/7.29.0\"","path" => "/var/log/nginx/access.log-20220913","ident" => "-","verb" => "GET","message" => "192.168.19.102 - - [12/Sep/2022:21:48:29 +0800] \"GET / HTTP/1.1\" 200 4833 \"-\" \"curl/7.29.0\" \"-\"","httpversion" => "1.1","host" => "host3.test.com","@timestamp" => 2022-09-16T14:27:43.208Z,"response" => "200","timestamp" => "12/Sep/2022:21:48:29 +0800","referrer" => "\"-\"","clientip" => "192.168.19.102"
}
② Predefined patterns
grok matches with regular expressions; its syntax is %{SYNTAX:SEMANTIC}
- SYNTAX is the name of the pattern that will match your text. These patterns are built in; the official set covers about 120 of them.
- SEMANTIC is the identifier you give to the matched text, i.e., the field name you want the value stored under.
Example:
- Source log line
55.3.244.1 GET /index.html 15824 0.043
- The matching pattern would be
%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}
- Configuration file
input {
  stdin {}
}
filter {
  grok {
    match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" }
  }
}
output {
  stdout {}
}
- The matched result
55.3.244.1 GET /index.html 15824 0.043
{"message" => "55.3.244.1 GET /index.html 15824 0.043","@version" => "1","@timestamp" => 2022-09-16T14:46:46.426Z,"method" => "GET","request" => "/index.html","bytes" => "15824","duration" => "0.043","host" => "host3.test.com","client" => "55.3.244.1"
}
Pattern definitions for various services are listed in the official repository:
https://github.com/logstash-plugins/logstash-patterns-core/tree/master/patterns
③ Custom patterns
當(dāng)預(yù)定義的字段不符合要求時(shí),grok也支持自定義正則表達(dá)式來(lái)匹配日志信息
- First create a directory for custom patterns and write the expression into a file there
[root@host3 ~]# mkdir patterns
[root@host3 ~]# echo "POSTFIX_QUEUEID [0-9A-F]{10,11}" >> ./patterns/1
- Modify the configuration file
input {
  stdin {}
}
filter {
  grok {
    patterns_dir => ["/root/patterns"]    # where the custom patterns live
    # The pattern below mixes predefined and custom expressions; characters outside the
    # braces (such as the colon) are literals and must match exactly.
    match => { "message" => "%{SYSLOGBASE} %{POSTFIX_QUEUEID:queue_id}: %{GREEDYDATA:syslog_message}" }
  }
}
output {
  stdout {}
}
- Run and test
...
The stdin plugin is now waiting for input:
[INFO ] 2022-09-16 23:22:04.511 [Agent thread] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
Jan 1 06:25:43 mailserver14 postfix/cleanup[21403]: BEF25A72965: message-id=<20130101142543.5828399CCAF@mailserver14.example.com>
{"message" => "Jan 1 06:25:43 mailserver14 postfix/cleanup[21403]: BEF25A72965: message-id=<20130101142543.5828399CCAF@mailserver14.example.com>","host" => "host3.test.com","timestamp" => "Jan 1 06:25:43","queue_id" => "BEF25A72965", # 自定義表達(dá)式匹配的字段"logsource" => "mailserver14","@timestamp" => 2022-09-16T15:22:19.516Z,"program" => "postfix/cleanup","pid" => "21403","@version" => "1","syslog_message" => "message-id=<20130101142543.5828399CCAF@mailserver14.example.com>"
}
4.2 Common options
As the name implies, these options can be used in any filter plugin.
- remove_field
filter {
  grok {
    remove_field => ["@version","tag","agent"]
  }
}
- add_field
filter {
  grok {
    add_field => { "new_tag" => "hello world %{+YYYY.MM.dd}" }
  }
}
4.3 date
An event carries two timestamps, timestamp and @timestamp: the time the log line was produced and the time it was collected. The two may disagree.
The date plugin parses a time string from the log record and uses it to set a time field (by default @timestamp). Its match option supports these formats:
- ISO8601
- UNIX
- UNIX_MS
- TAI64N
- a custom Joda-Time pattern such as "dd/MMM/yyyy:HH:mm:ss Z"
input {
  file {
    path => "/var/log/nginx/access.log*"
    start_position => "beginning"
  }
}
filter {
  grok {
    match => { "message" => "%{HTTPD_COMMONLOG}" }
    remove_field => ["message","ident","auth","@version","path"]
  }
  date {
    # timestamp must be an existing field; date corrects the time parsed from it, and the
    # pattern must match the field's original format or parsing fails.
    # The original value here is "17/Sep/2022:18:42:26 +0800", so using ZZZ for the zone
    # keeps failing: ZZZ expects a form like Asia/Shanghai, while Z matches +0800.
    match => [ "timestamp","dd/MMM/yyyy:HH:mm:ss Z" ]
    timezone => "Asia/Shanghai"
  }
}
output {
  stdout {}
}
The output:
{"timestamp" => "17/Sep/2022:18:42:26 +0800", #和@timestamp有8小時(shí)的時(shí)間差,可到Elasticsearch中查看,如果也有時(shí)間差,可以在date中修改timezone"response" => "200","httpversion" => "1.1","clientip" => "192.168.19.102","verb" => "GET","host" => "host3.test.com","request" => "/","@timestamp" => 2022-09-17T10:42:26.000Z,"bytes" => "4833"
}
With target, the parsed time is stored in the named field; if target is not set, it defaults to @timestamp. This field can then be used when creating an index pattern in Kibana.
date {
  match => [ "timestamp","dd/MMM/yyyy:HH:mm:ss Z" ]
  timezone => "Asia/Shanghai"
  target => "logtime"
}
# Result
{
    "timestamp" => "17/Sep/2022:21:15:30 +0800",
    "response" => "200",
    "logtime" => 2022-09-17T13:15:30.000Z,    # when the log line was produced
    "httpversion" => "1.1",
    "clientip" => "192.168.19.102",
    "verb" => "GET",
    "host" => "host3.test.com",
    "request" => "/",
    "@timestamp" => 2022-09-17T13:15:31.357Z,    # when the event was collected; slightly later than the log time
    "bytes" => "4833"
}
4.4 geoip
geoip resolves the location of a client IP. The plugin relies on the GeoLite2 City database, so the results are not always accurate; you can also download a MaxMind-format database yourself and use that (see the sketch after the examples below). The official site documents custom databases.
input {
  file {
    path => "/var/log/nginx/access.log*"
    start_position => "beginning"
  }
}
filter {
  grok {
    match => { "message" => "%{HTTPD_COMMONLOG}" }
    remove_field => ["message","ident","auth","@version","path"]
  }
  geoip {
    source => "clientip"    # take the IP address from the clientip field
    # fields => ["country_name","timezone","city_name"]    # optionally limit the fields shown
  }
}
output {
  stdout {}
}
The results; note that private addresses cannot be resolved:
{"timestamp" => "17/Sep/2022:21:15:30 +0800","response" => "200","geoip" => {},"httpversion" => "1.1","clientip" => "192.168.19.102","verb" => "GET","host" => "host3.test.com","tags" => [[0] "_geoip_lookup_failure" # 私網(wǎng)地址],"request" => "/","@timestamp" => 2022-09-17T13:30:05.178Z,"bytes" => "4833"
}
{"timestamp" => "17/Sep/2022:21:15:30 +0800","response" => "200","geoip" => { # 解析的結(jié)果放在geoip中"country_code2" => "CM","country_code3" => "CM","country_name" => "Cameroon","ip" => "154.72.162.134","timezone" => "Africa/Douala","location" => {"lon" => 12.5,"lat" => 6.0},"continent_code" => "AF","latitude" => 6.0,"longitude" => 12.5},"httpversion" => "1.1","clientip" => "154.72.162.134","verb" => "GET","host" => "host3.test.com","request" => "/","@timestamp" => 2022-09-17T13:30:05.178Z,"bytes" => "4833"
}
4.5 useragent
useragent parses browser information; the prerequisite is that the events contain a user-agent field.
input {
  file {
    path => "/var/log/nginx/access.log*"
    start_position => "beginning"
  }
}
filter {
  grok {
    match => { "message" => "%{HTTPD_COMBINEDLOG}" }    # HTTPD_COMBINEDLOG also captures the user agent
    remove_field => ["message","ident","auth","@version","path"]
  }
  useragent {
    source => "agent"         # the field holding the browser string; it must exist
    target => "agent_test"    # put all parsed fields under this field for easier viewing
  }
}
output {
  stdout {}
}
The results:
{"timestamp" => "17/Sep/2022:23:42:31 +0800","response" => "404","geoip" => {},"httpversion" => "1.1","clientip" => "192.168.19.103","verb" => "GET","agent" => "\"Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0\"","host" => "host3.test.com","request" => "/favicon.ico","referrer" => "\"-\"","@timestamp" => 2022-09-17T15:42:31.927Z,"bytes" => "3650","agent_test" => {"major" => "60","name" => "Firefox","os" => "Linux","os_full" => "Linux","os_name" => "Linux","version" => "60.0","minor" => "0","device" => "Other"}
}
Abridged results from other clients:
{
    ...
    "agent_test" => {
        "os_minor" => "0",
        "os_full" => "iOS 16.0",
        "version" => "16.0",
        "os_major" => "16",
        "device" => "iPhone",
        "major" => "16",
        "name" => "Mobile Safari",
        "os" => "iOS",
        "os_version" => "16.0",
        "os_name" => "iOS",
        "minor" => "0"
    }
}
{
    ...
    "agent_test" => {
        "patch" => "3987",
        "os_full" => "Android 10",
        "version" => "80.0.3987.162",
        "os_major" => "10",
        "device" => "Samsung SM-G981B",
        "major" => "80",
        "name" => "Chrome Mobile",
        "os" => "Android",
        "os_version" => "10",
        "os_name" => "Android",
        "minor" => "0"
    }
}
4.6 mutate
- Split a field
input {
  stdin {}
}
filter {
  mutate {
    split => {
      "message" => " "    # split the message on spaces
    }
    remove_field => ["@version","host"]
    add_field => {
      "tag" => "This a test field from Bruce"
    }
  }
}
output {
  stdout {}
}
111 222 333
{"tag" => "This a test field from Bruce","message" => [[0] "111",[1] "222",[2] "333"],"@timestamp" => 2022-09-18T08:07:36.373Z
}
- Pull the split values out into fields of their own
input {
  stdin {}
}
filter {
  mutate {
    split => {
      "message" => " "    # split the message on spaces
    }
    remove_field => ["@version","host"]
    add_field => {
      "tag" => "This a test field from Bruce"
    }
  }
  mutate {
    add_field => {
      "name" => "%{[message][0]}"
      "age" => "%{[message][1]}"
      "sex" => "%{[message][2]}"
    }
  }
}
output {
  stdout {}
}
bruce 37 male
{"message" => [[0] "bruce",[1] "37",[2] "male"],"age" => "37","@timestamp" => 2022-09-18T08:14:31.230Z,"sex" => "male","tag" => "This a test field from Bruce","name" => "bruce"
}
- convert: casts a field's value to a different type, for example string to integer. If the value is an array, all members are converted; if it is a hash, no action is taken.
filter {
  mutate {
    convert => {
      "age" => "integer"    # turn age into an integer
    }
  }
}
bruce 20 male
{"message" => [[0] "bruce",[1] "20",[2] "male"],"sex" => "male","name" => "bruce","age" => 20, # 沒(méi)有引號(hào),代表已經(jīng)修改成數(shù)字類型了"@timestamp" => 2022-09-18T08:51:07.633Z,"tag" => "This a test field from Bruce"
}
- strip: remove leading and trailing whitespace from fields
filter {
  mutate {
    strip => ["name","sex"]
  }
}
- rename: rename a field
filter {
  mutate {
    rename => { "sex" => "agenda" }
  }
}
- replace: replace a field's content
filter {
  mutate {
    replace => { "tag" => "This is test message" }    # overwrites the content of the tag field
  }
}
- update: same usage as replace, except that the content is only modified if the field exists; if it does not, the operation is skipped.
- uppercase/lowercase: convert a field's content to upper/lower case; capitalize: capitalize the first letter of the content.
filter {
  mutate {
    uppercase => ["tag"]
    capitalize => ["name"]
  }
}
5. Advanced Features
5.1 Conditionals
After tagging events in input, you can branch in output and filter to treat them differently (a filter-side sketch follows the example below).
input {
  beats {
    port => 8888
    type => "nginx-beats"
  }
  tcp {
    port => 9999
    type => "tomcat-tcp"
  }
}
output {
  if [type] == "nginx-beats" {
    elasticsearch {
      hosts => ["192.168.19.101:9200","192.168.19.102:9200","192.168.19.103:9200"]
      index => "nginx-beats-elasticsearh-%{+YYYY.MM.dd}"
    }
  } else {
    elasticsearch {
      hosts => ["192.168.19.101:9200","192.168.19.102:9200","192.168.19.103:9200"]
      index => "tomcat-tcp-elasticsearh-%{+YYYY.MM.dd}"
    }
  }
}
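The same condition also works in filter; a sketch that only runs grok on the Nginx events:

filter {
  if [type] == "nginx-beats" {
    grok {
      match => { "message" => "%{HTTPD_COMBINEDLOG}" }
    }
  }
}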
5.2 Running Multiple Instances
Logstash can run multiple instances, but starting a second one directly fails; it only starts normally when given its own path.data.
[root@host3 ~]# logstash -f 01-stdin-stdout.conf --path.data /tmp/logstash
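Each further instance needs yet another data directory; for example (the config name here is hypothetical):

[root@host3 ~]# logstash -f 03-tcp-stdout.conf --path.data /tmp/logstash2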