• logstash 学习小记


    logstash 学习小记

    标签(空格分隔): 日志收集


    Introduce

    Logstash is a tool for managing events and logs. You can use it to collect logs, parse them, and store them for
    later use (like, for searching). – http://logstash.net

    自从2013年logstash被ES公司收购之后,ELK stask正式称为官方用语。非常多公司都開始ELK实践。我们也不例外,借用新浪是怎样分析处理32亿条实时日志的?的一张图
    此处输入图片的描写叙述
    这是一个再常见只是的架构了:
    (1)Kafka:接收用户日志的消息队列。
    (2)Logstash:做日志解析,统一成JSON输出给Elasticsearch。
    (3)Elasticsearch:实时日志分析服务的核心技术,一个schemaless,实时的数据存储服务。通过index组织数据,兼具强大的搜索和统计功能。
    (4)Kibana:基于Elasticsearch的数据可视化组件。超强的数据可视化能力是众多公司选择ELK stack的重要原因。

    可是众多log 收集framwork,像flume,scribe。fluent,为什么选用logstash呢?

    原因非常简单:

    1. 部署启动非常easy。仅仅须要有jdk就OK了
    2. 配置简单,无需编码
    3. 支持收集log路径的正則表達式,不像flume那样必须写死要收集的文件名称。logstash不是,像这样

      path => [“/var/log/.log“]

      有个Flume VS Fluentd VS Logstash能够看看

    Logstash Examples

    logstash事件处理流程氛围三个stages:input ,filter,output。input支持非常多。如file。redis,kafka等等,filter主要是对input的log进行自己想要的处理,output则是输出到你要存储log的第三方framework。如kafka,redis,elasticsearch。db什么的。详细的查看官网。
    废话不多说,開始样例:
    1. 最最简单的样例
    input和output都是标准输入输出

    [joeywen@192 logstash]$ bin/logstash -e 'input { stdin { } } output { stdout {}}'
    
    Logstash startup completed
    >hello world  ## 输入的内容
    >2015-08-02T05:26:55.564Z joeywens-MacBook-Pro.local hello world   ## logstash收集的内容
    1. 编写config文件
    input {
      file {
        path => ["/var/log/*.log"]
        type => "syslog"
      }
    }
    
    
    output {
      stdout {  codec => rubydebug }
      #  elasticsearch {
      #      host => 'localhost'
      #      protocol => 'transport'
      #      cluster => 'elasticsearch'
     #       index => 'logstash-joeymac-%{+YYYY.MM.dd}'
     #   }
    }

    输入是file形式,收集系统日志,假设有异常发生,通常异常会多行,这里用codec => multiline 来对出现异常的多行转换为一行输入

    输出就是ES。或者你也能够把stdout作为调试打开看看,输出的是什么内容。执行命令例如以下以及输出

    [joeywen@192 logstash]$ bin/logstash -f sys.conf
    Logstash startup completed
    
    {
        "@timestamp" => "2015-08-02T05:36:08.972Z",
           "message" => "Aug  2 13:36:08 joeywens-MacBook-Pro.local GoogleSoftwareUpdateAgent[1976]: 2015-08-02 13:34:51.764 GoogleSoftwareUpdateAgent[1976/0xb029b000] [lvl=2] -[KSUpdateEngine(PrivateMethods) updateFinish] KSUpdateEngine update processing complete.",
          "@version" => "1",
              "host" => "joeywens-MacBook-Pro.local",
              "path" => "/var/log/system.log",
              "type" => "syslog"
    }
    {
        "@timestamp" => "2015-08-02T05:36:08.973Z",
           "message" => "Aug  2 13:36:08 joeywens-MacBook-Pro.local GoogleSoftwareUpdateAgent[1976]: 2015-08-02 13:36:08.105 GoogleSoftwareUpdateAgent[1976/0xb029b000] [lvl=3] -[KSAgentUploader fetcher:failedWithError:] Failed to upload stats to <NSMutableURLRequest https://tools.google.com/service/update2> with error Error Domain=NSURLErrorDomain Code=-1001 "The request timed out." UserInfo=0x3605f0 {NSErrorFailingURLStringKey=https://tools.google.com/service/update2, _kCFStreamErrorCodeKey=60, NSErrorFailingURLKey=https://tools.google.com/service/update2, NSLocalizedDescription=The request timed out., _kCFStreamErrorDomainKey=1, NSUnderlyingError=0x35fd30 "The request timed out."}",
          "@version" => "1",
              "host" => "joeywens-MacBook-Pro.local",
              "path" => "/var/log/system.log",
              "type" => "syslog"
    }
    {
        "@timestamp" => "2015-08-02T05:36:08.973Z",
           "message" => "Aug  2 13:36:08 joeywens-MacBook-Pro.local GoogleSoftwareUpdateAgent[1976]: 2015-08-02 13:36:08.272 GoogleSoftwareUpdateAgent[1976/0xb029b000] [lvl=3] -[KSAgentApp uploadStats:] Failed to upload stats <KSStatsCollection:0x4323e0 path="/Users/joeywen/Library/Google/GoogleSoftwareUpdate/Stats/Keystone.stats", count=6, stats={",
          "@version" => "1",
              "host" => "joeywens-MacBook-Pro.local",
              "path" => "/var/log/system.log",
              "type" => "syslog"
    }

    假设想加入或删除字段,该怎么办?filter就该登场了

    1. filter
      filter的功能十分强大。能够对input的内容做不论什么更改。input的内容会转换为一个叫event的map,里面存放着key/value对,正如你所示输出一样,@timestamp,type,@version, host,message等等,都是event里面的key,在filter里面你能够启动ruby 编程plugin对其做不论什么更改
      如:
    input {
      file {
        path => ["/var/log/*.log"]
        type => "syslog"
      }
    }
    
    filter {
        multiline {
          pattern => "(^d+serror)|(^.+Exception:.+)|(^s+at .+)|(^s+... d+ more)|(^s*Causedby:.+)"
          what => "previous"
        }
        if [type] =~ /^syslog/ {
            ruby {
                code => "file_name = event['path'].split('/')[-1]
                event['file_name'] = file_name"
            }
        }
    }
    
    output {
      stdout {  codec => rubydebug }
     }

    如上我对type已syslog开头的event做更改,调用ruby编程
    看看输出

    [joeywen@192 logstash]$ bin/logstash -f sys.conf
    Logstash startup completed
    {
        "@timestamp" => "2015-08-02T05:46:52.771Z",
           "message" => "Aug  2 13:46:40 joeywens-MacBook-Pro.local Dock[234]: CGSConnectionByID: 0 is not a valid connection ID.",
          "@version" => "1",
              "host" => "joeywens-MacBook-Pro.local",
              "path" => "/var/log/system.log",
              "type" => "syslog",
         "file_name" => "system.log"
    }

    能够看到多了个file_name的字段,
    假设相对message做解析的话。须要调用grok plugin来做,grok是非常强大插件,比如

    input {
      file {
        path => "/var/log/http.log"
      }
    }
    filter {
      grok {
        patterns_dir => ["/opt/logstash/patterns", "/opt/logstash/extra_patterns"]
        match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" }
      }
    }

    对于message字段调用正则匹配,语法是%{SYNTAX:SEMANTIC}
    第一个SYNTAX是正則表達式名称,第二个是对于匹配成功的字段取名字,这些SYNTAX存在指定的pattern_dir文件夹下的文件,格式是:

    NAME PATTERN
    如 NUMBER d+

    也能够使用mutate来最event的key和value做更改。包含remove,add,update,rename 等等。详细的都能够看看(logstash文档)[https://www.elastic.co/guide/en/logstash/current/plugins-filters-mutate.html]

    这里给个详细的样例吧
    配置:

    input {
      file {
        path => ["/var/log/*.log"]
        type => "syslog"
    
      }
    }
    
    filter {
        multiline {
          pattern => "(^d+serror)|(^.+Exception:.+)|(^s+at .+)|(^s+... d+ more)|(^s*Causedby:.+)"
          what => "previous"
        }
        if [type] =~ /^syslog/ {
            ruby {
                code => "file_name = event['path'].split('/')[-1]
                event['file_name'] = file_name"
            }
            grok {
                patterns_dir => ["./patterns/*"]
                match => {"message" => "%{MAC_BOOK:joeymac}"}
            }
    
            mutate {
                rename => {"file_name" => "fileName"}
                add_field => {"foo_%{joeymac}" => "Hello world, from %{host}"}
            }
        }
    }
    
    output {
      stdout {  codec => rubydebug }
     }

    输出

    [joeywen@192 logstash]$ bin/logstash -f sys.conf
    Logstash startup completed
    {
                      "@timestamp" => "2015-08-02T06:10:13.161Z",
                         "message" => "Aug  2 14:10:12 joeywens-MacBook-Pro com.apple.xpc.launchd[1] (com.apple.quicklook[2206]): Endpoint has been activated through legacy launch(3) APIs. Please switch to XPC or bootstrap_check_in(): com.apple.quicklook",
                        "@version" => "1",
                            "host" => "joeywens-MacBook-Pro.local",
                            "path" => "/var/log/system.log",
                            "type" => "syslog",
                         "joeymac" => "joeywens-MacBook-Pro",
                        "fileName" => "system.log",
        "foo_joeywens-MacBook-Pro" => "Hello world, from joeywens-MacBook-Pro.local"
    }

    转载请注明出处

  • 相关阅读:
    Java进行数据库导出导入 亲测可用
    Linux 进程监控和自动重启的简单实现
    Linux的运行级别详细说明
    设置让程序开机自启动的方法
    Java读书推荐
    谈谈Python、Java与AI
    这是一篇测试随笔。
    雷林鹏分享:Apache POI数据库
    雷林鹏分享:Apache POI打印区域
    雷林鹏分享:Apache POI超链接
  • 原文地址:https://www.cnblogs.com/yxysuanfa/p/7399028.html
Copyright © 2020-2023  润新知