• Logstash configuration file syntax


    Logstash needs a configuration file that manages the input, filter, and output sections. The file takes the following form:

    # input
    input {
      ...
    }
    # filter
    filter {
      ...
    }
    # output
    output {
      ...
    }

    Let's start with a simple stdin-to-stdout pipeline:

    
    

    root@c201b7b32a32# ./logstash -e 'input { stdin{} } output { stdout{} }'
    Sending Logstash's logs to /opt/logstash/logs which is now configured via log4j2.properties
    [2018-04-26T06:47:20,724][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"fb_apache", :directory=>"/opt/logstash/modules/fb_apache/configuration"}
    ……

    [2018-04-26T06:47:24,124][INFO ][logstash.pipeline ] Pipeline started succesfully {:pipeline_id=>"main", :thread=>"#<Thread:0x5fec99f4 run>"}
    The stdin plugin is now waiting for input:
    [2018-04-26T06:47:24,253][INFO ][logstash.agent ] Pipelines running {:count=>1, :pipelines=>["main"]}

    hello         ==> input
    2018-04-26T06:47:31.957Z c201b7b32a32 hello         ==> output
    this is test  ==> input
    2018-04-26T06:50:29.743Z c201b7b32a32 this is test  ==> output

    Use rubydebug for more detailed output; codec specifies a codec (encoder/decoder):

    
    

    # ./logstash -e 'input { stdin{} } output { stdout{ codec => rubydebug} }'

    test2   ==>输入
    {
           "message" => "test2",
        "@timestamp" => 2018-04-26T07:00:00.652Z,
          "@version" => "1",
              "host" => "c201b7b32a32"
    }       ==> output with rubydebug
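
    For comparison (a sketch, not from the original walkthrough), the json codec prints each event as a single JSON line instead of the pretty-printed rubydebug form:

    # Sketch: same pipeline, but each event is emitted as one JSON line
    ./logstash -e 'input { stdin{} } output { stdout{ codec => json } }'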

     input settings

    input {
        # file is a commonly used input plugin; it has many options, choose per your needs
        file {
            path => "/var/log/httpd/access_log"  # file(s) to read; wildcards allowed, e.g. /var/log/nginx/*.log
            exclude => "*.gz"                    # files to exclude
            start_position => "beginning"        # where to start reading the file; default is end
            ignore_older => 0                    # skip files not modified within this many seconds; 0 means no limit
            sincedb_path => "/dev/null"          # where the last read position is recorded; /dev/null re-parses from the first line every time
            add_field => { "test" => "test" }    # add a field
            type => "apache-log"                 # type field, marks the kind of log being imported
        }
    }

     Multiple file blocks can also be used:

    input {
      file {
        path => "/var/log/messages"
        type => "syslog"
      }
      
      file {
        path => "/var/log/apache/access.log"
        type => "apache"
      }
    }

    Paths can also be given as an array, or matched with a wildcard:

    path => ["/var/log/messages","/var/log/*.log"]
    path => ["/data/mysql/mysql.log"]
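
    The two styles combine naturally; for instance (a sketch, the *.gz pattern is illustrative), a wildcard path with rotated archives excluded:

    input {
      file {
        path    => ["/var/log/messages", "/var/log/*.log"]
        exclude => "*.gz"        # skip rotated, compressed logs picked up by the wildcard
        type    => "syslog"
      }
    }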

      filter settings

     The filter is the second of Logstash's three components, the most complex part of the whole tool,
    and arguably also the most useful one.

    1. The grok plugin. grok is extremely powerful and can match virtually any data, but its performance and resource cost are criticized just as often.

    filter{
        grok{
            # First, all text data lands in Logstash's message field; message is what the filter operates on.
            # Only the match option is covered here: it extracts the timestamp from message and assigns it to another field, logdate.
            # Second, be aware that grok is a very resource-hungry plugin.
            # Third, grok ships with a huge number of predefined patterns, far too many to cover here; this article may have the expression you need:
            # http://blog.csdn.net/liukuan73/article/details/52318243
            # Even so, other plugins can often replace grok; for timestamps, though, grok is very convenient.

            match => ['message','%{TIMESTAMP_ISO8601:logdate}']
        }
    }
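
    A common follow-up (a sketch, not part of the original article) is to feed the extracted logdate into the date filter, so the event's @timestamp reflects the log line rather than the ingest time:

    filter {
      grok {
        match => ['message', '%{TIMESTAMP_ISO8601:logdate}']
      }
      date {
        match        => ["logdate", "ISO8601"]  # parse logdate and set it as @timestamp
        remove_field => ["logdate"]             # drop the helper field once parsed
      }
    }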

    Here is another use of match: extract the IP, HTTP method, URL, byte count, and duration from message,
    assigning them to the clientip, method, request, bytes, and duration fields.

    filter {
        grok {
            match => { "message" => "%{IPORHOST:clientip}\s+%{WORD:method}\s+%{URIPATHPARAM:request}\s+%{NUMBER:bytes}\s+%{NUMBER:duration}" }
        }
    }

    Output:

    {  
           "message" => "9.9.8.6   GET /xx.hmtl 343 44",  
          "@version" => "1",  
        "@timestamp" => "2017-01-18T00:12:37.490Z",  
              "path" => "/home/elk/0204/nginx.log",  
              "host" => "db01",  
              "type" => "nginx",  
          "clientip" => "9.9.8.6",  
            "method" => "GET",  
           "request" => "/xx.hmtl",  
             "bytes" => "343",  
          "duration" => "44"  
    }  

    One more change: remove message after the fields have been extracted.

    filter {
        grok {
            match => { "message" => "%{IPORHOST:clientip}\s+%{WORD:method}\s+%{URIPATHPARAM:request}\s+%{NUMBER:bytes}\s+%{NUMBER:duration}" }
            remove_field => ["message"]
        }
    }

    Result:

    {  
          "@version" => "1",  
        "@timestamp" => "2017-01-18T00:15:03.879Z",  
              "path" => "/home/elk/0204/nginx.log",  
              "host" => "db01",  
              "type" => "nginx",  
          "clientip" => "55.9.3.6",  
            "method" => "GET",  
           "request" => "/zz.xml",  
             "bytes" => "3",  
          "duration" => "44"  
    }  
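
    Note that bytes and duration come out as strings. A mutate filter (a sketch, not shown in the original) can convert them to integers so they support numeric queries and aggregations in Elasticsearch:

    filter {
      mutate {
        convert => {
          "bytes"    => "integer"
          "duration" => "integer"
        }
      }
    }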

    A very common pattern is %{COMBINEDAPACHELOG}, a built-in Logstash pattern (a predefined regex) for matching Apache access logs:

    filter {
        grok {
            match => {
                "message" => "%{COMBINEDAPACHELOG}"
            }
            remove_field => "message"
        }
    }

    Result:

    {
      "_index": "logstash-2018.05.03",
      "_type": "apache_logs",
      "_id": "VFHkI2MBPZdRHaSpwnN-",
      "_version": 1,
      "_score": null,
      "_source": {
        "agent": "\"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.75 Safari/537.36 Maxthon/5.1.5.2000\"",
        "path": "/var/log/httpd/access_log",
        "referrer": "\"http://10.10.12.81/cacti/data_sources.php\"",
        "host": "cacti",
        "verb": "GET",
        "clientip": "10.0.7.99",
        "request": "/cacti/graphs.php",
        "auth": "-",
        "@version": "1",
        "ident": "-",
        "httpversion": "1.1",
        "response": "200",
        "bytes": "37138",
        "@timestamp": "2018-05-03T02:46:26.477Z",
        "timestamp": "03/May/2018:10:46:25 +0800"
      },
      "fields": {
        "@timestamp": [
          "2018-05-03T02:46:26.477Z"
        ]
      },
      "sort": [
        1525315586477
      ]
    }
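
    The agent field above is still one raw string. A useragent filter (a sketch, not part of the original walkthrough) can break it into browser, version, and OS fields:

    filter {
      useragent {
        source => "agent"   # the field produced by COMBINEDAPACHELOG
        target => "ua"      # nest the parsed browser/OS fields under ua
      }
    }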

    Other plugins are not covered here for now.

      
    output settings

    Output to elasticsearch:

     elasticsearch{
        hosts => ["10.10.10.11:9200"]          # elasticsearch address and port
        action => "index"                      # index the document
        index => "indextemplate-logstash"      # index name
        #document_type => "%{@type}"
        document_id => "ignore"

        template => "/opt/logstash-conf/es-template.json"  # path to the template file
        template_name => "es-template.json"                # name of the template inside es
        template_overwrite => true                         # overwrite an existing template of the same name
        protocol => "http"         # three protocols are currently supported: node, http, and transport
       }
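
    While developing a pipeline it helps to keep a stdout output next to elasticsearch (a sketch reusing the host from the example above):

    output {
      elasticsearch {
        hosts => ["10.10.10.11:9200"]
        index => "logstash-%{+YYYY.MM.dd}"
      }
      stdout {
        codec => rubydebug   # echo each event to the console for debugging
      }
    }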

     

    A few examples

    1. Configuration file

    input {
        file {
            path => ['/var/log/httpd/access_log']
            start_position => "beginning"
        }
    }
    filter {
        grok {
            match => {
                "message" => "%{COMBINEDAPACHELOG}"
            }
            remove_field => "message"
        }
    }
    output {
        elasticsearch {
            hosts => ["10.10.15.95:9200"]
            index => "12.81-cacti-%{+YYYY.MM.dd}"
            action => "index"
            document_type => "apache_logs"
        }
    }

    Data:

    {
      "_index": "logstash-2018.05.03",
      "_type": "apache_logs",
      "_id": "U1HkI2MBPZdRHaSpMXPM",
      "_version": 1,
      "_score": 1,
      "_source": {
        "agent": "\"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.75 Safari/537.36 Maxthon/5.1.5.2000\"",
        "path": "/var/log/httpd/access_log",
        "referrer": "\"http://10.10.12.81/cacti/include/themes/modern/jquery-ui.css\"",
        "host": "cacti",
        "verb": "GET",
        "clientip": "10.0.7.99",
        "request": "/cacti/include/themes/modern/images/ui-icons_454545_256x240.png",
        "auth": "-",
        "@version": "1",
        "ident": "-",
        "httpversion": "1.1",
        "response": "200",
        "bytes": "6992",
        "@timestamp": "2018-05-03T02:45:49.442Z",
        "timestamp": "03/May/2018:10:45:49 +0800"
      }
    }

    2. Shipping two kinds of logs from one machine

    input {
        file {
            path => "/var/log/messages"
            type => "system"
            start_position => "beginning"
        }
        file {
            path => "/var/log/elasticsearch/chuck-cluster.log"
            type => "es-error"
            start_position => "beginning"
        }
    }
    output {
        if [type] == "system" {
            elasticsearch {
                hosts => ["192.168.56.11:9200"]
                index => "system-%{+YYYY.MM.dd}"
            }
        }
        if [type] == "es-error" {
            elasticsearch {
                hosts => ["192.168.56.11:9200"]
                index => "es-error-%{+YYYY.MM.dd}"
            }
        }
    }
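
    The same [type] conditionals work inside filter, so each log kind can get its own parsing (a sketch; the SYSLOGLINE branch is illustrative):

    filter {
        if [type] == "system" {
            grok {
                match => { "message" => "%{SYSLOGLINE}" }
            }
        }
        # the es-error type could get its own grok or multiline handling here
    }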


  • Original article: https://www.cnblogs.com/centos2017/p/8920519.html