Official documentation:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html
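Basic syntax:
%{SYNTAX:SEMANTIC}
SYNTAX: the name of a predefined regular expression (the patterns bundled with the plugin live by default in $HOME/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-2.0.2/patterns)
SEMANTIC: the field name under which the matched text is stored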
grok{
match=>{
"message"=>"%{IP:clientip}"
}
}
Output:
{
"message" => "192.168.1.1 abc",
"@version" => "1",
"@timestamp" => "2016-03-30T02:15:31.242Z",
"host" => "master",
"clientip" => "192.168.1.1"
}
Here clientip is the SEMANTIC.
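To make the SYNTAX/SEMANTIC roles concrete, here is a minimal Python sketch of the idea: each %{SYNTAX:SEMANTIC} reference is replaced by a named capture group before the regular expression is run. This is not how Logstash is implemented. The PATTERNS table, expand and grok_match are hypothetical names, and the IP regex is a simplified IPv4-only stand-in for the pattern shipped with the plugin.

```python
import re

# Hypothetical, heavily simplified pattern table (assumption: IPv4 only,
# no range checks). The bundled IP pattern is more thorough.
PATTERNS = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
}

# Matches one %{SYNTAX:SEMANTIC} reference.
GROK_REF = re.compile(r"%\{(\w+):(\w+)\}")

def expand(grok_expr):
    """Replace each %{SYNTAX:SEMANTIC} with a named capture group."""
    return GROK_REF.sub(
        lambda m: "(?P<%s>%s)" % (m.group(2), PATTERNS[m.group(1)]),
        grok_expr,
    )

def grok_match(grok_expr, message):
    """Return the captured fields, or None if the expression does not match."""
    m = re.search(expand(grok_expr), message)
    return m.groupdict() if m else None

fields = grok_match("%{IP:clientip}", "192.168.1.1 abc")
print(fields)
```

SYNTAX selects which regular expression to use and SEMANTIC becomes the key of the captured value, which mirrors the clientip field in the output above.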
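Each %{IP:clientip} expression matches only the first occurrence in message. To capture several values of the same type, chain the expressions:
%{IP:clientip}\s+%{IP:clientip1}... If the same SEMANTIC name is used more than once, the result is an array, for example: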
{
"message" => "12.12.12.12 32.32.32.32",
"@version" => "1",
"@timestamp" => "2016-03-30T02:26:31.077Z",
"host" => "master",
"clientip" => [
[0] "12.12.12.12",
[1] "32.32.32.32"
]
}
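The merging behaviour shown above can be imitated in a few lines of Python. This is only a sketch of the observed result, not Logstash's own logic. match_fields is a hypothetical helper and IP is a simplified IPv4-only stand-in for grok's IP pattern.

```python
import re

# Simplified stand-in for grok's IP pattern (assumption: IPv4 only, no range checks).
IP = r"\d{1,3}(?:\.\d{1,3}){3}"

def match_fields(captures, separator, message):
    """Match the captures in order and merge values that share a field name.

    captures  -- list of (field_name, regex) pairs, in match order
    separator -- regex expected between consecutive captures
    """
    # Python rejects duplicate group names, so each capture gets a private name.
    groups = ["(?P<g%d>%s)" % (i, rx) for i, (_, rx) in enumerate(captures)]
    m = re.search(separator.join(groups), message)
    if m is None:
        return None
    fields = {}
    for i, (name, _) in enumerate(captures):
        value = m.group("g%d" % i)
        if name not in fields:
            fields[name] = value                  # first value: plain string
        elif isinstance(fields[name], list):
            fields[name].append(value)            # third and later values
        else:
            fields[name] = [fields[name], value]  # second value: promote to list
    return fields

message = "12.12.12.12 32.32.32.32"
same_name = match_fields([("clientip", IP), ("clientip", IP)], r"\s+", message)
distinct = match_fields([("clientip", IP), ("clientip1", IP)], r"\s+", message)
print(same_name)
print(distinct)
```

With a repeated name the two addresses end up in one list under clientip. With distinct names each address gets its own field.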
Custom grok expressions
Syntax: (?<field_name>the pattern here)
Example:
grok{
match=>{
"message"=>"%{IP:clientip}\s+(?<mypattern>[A-Z]+)"
}
}
Result:
{
"message" => "12.12.12.12 ABC",
"@version" => "1",
"@timestamp" => "2016-03-30T03:22:04.466Z",
"host" => "master",
"clientip" => "12.12.12.12",
"mypattern" => "ABC"
}
Creating a custom grok pattern file
Create the file mypatterns_file in /home/hadoop/mylogstash/mypatterns_dir with the following content:
MY_PATTERN [A-Z]+
Save the file.
Then update the filter:
grok{
patterns_dir=>["/home/hadoop/mylogstash/mypatterns_dir"]
match=>{
"message"=>"%{IP:clientip}\s+%{MY_PATTERN:mypattern}"
}
}
The result is the same as above.
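As a rough illustration of what a pattern file contributes, the Python sketch below parses lines of the form "NAME regex" and makes the new name usable inside %{...} references. It assumes one definition per line, with blank lines and lines starting with # ignored. load_patterns and expand are hypothetical helpers, the file content is passed in as a string instead of being read from patterns_dir, and the IP regex is a simplified IPv4-only stand-in.

```python
import re

def load_patterns(text):
    """Parse 'NAME regex' lines into a dict, skipping blanks and # comments."""
    patterns = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The name ends at the first run of whitespace; the rest is the regex.
        name, regex = line.split(None, 1)
        patterns[name] = regex
    return patterns

# Simplified built-in table (assumption: IPv4 only, no range checks).
patterns = {"IP": r"\d{1,3}(?:\.\d{1,3}){3}"}
# Same content as mypatterns_file in the text above.
patterns.update(load_patterns("MY_PATTERN [A-Z]+\n"))

def expand(grok_expr):
    """Replace each %{SYNTAX:SEMANTIC} with a named capture group."""
    return re.sub(
        r"%\{(\w+):(\w+)\}",
        lambda m: "(?P<%s>%s)" % (m.group(2), patterns[m.group(1)]),
        grok_expr,
    )

m = re.search(expand(r"%{IP:clientip}\s+%{MY_PATTERN:mypattern}"), "12.12.12.12 ABC")
fields = m.groupdict()
print(fields)
```

The custom name behaves like any bundled pattern once it is in the table, which is why the filter output matches the inline (?<mypattern>[A-Z]+) version.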