Logstash: Introduction, Configuration, and Collecting Java Logs


    1. Introduction and Workflow

       Logstash is written in Ruby. Like Beats, it is a "data shipper", but Logstash is heavier weight and supports far more functionality.

    1. Introduction

      The official description: transform and stash your data.

      Logstash is a free and open server-side data processing pipeline that ingests data from a multitude of sources, transforms it, and then sends it to your favorite "stash".

      Logstash dynamically ingests, transforms, and ships your data regardless of format or complexity. Derive structure from unstructured data with grok, decipher geo coordinates from IP addresses, anonymize or exclude sensitive fields, and ease overall processing.

    2. Workflow

    1. Input - ingest data of all shapes, sizes, and sources

      Data is often scattered or siloed across many systems in many formats. Logstash supports a variety of inputs that pull in events from a multitude of common sources, all at the same time. Easily ingest from your logs, metrics, web applications, data stores, and various AWS services, all in a continuous, streaming fashion.

      For the supported input plugins, see: input plugins.

    2. Filter - parse and transform data in real time

      As data travels from source to store, Logstash filters parse each event, identify named fields to build structure, and transform them into a common format for more powerful analysis and business value.

      Logstash dynamically transforms and parses data regardless of format or complexity: derive structure from unstructured data with grok, decipher geo coordinates from IP addresses, anonymize PII data or exclude sensitive fields entirely, and ease overall processing, independent of the data source, format, or schema.

      With the rich filter library and the versatile Elastic Common Schema, the possibilities are nearly endless.

    3. Output - choose your stash, ship your data

       Elasticsearch is the preferred output, opening up a world of search and analytics possibilities, but it is not the only one. Logstash offers a variety of outputs, so you can route data wherever you want and flexibly unlock a wealth of downstream use cases.

    2. Download and Install

    1. Download Logstash

     2. The extracted directory layout is as follows:

    3. Inspect the logstash/config directory:

     The bundled sample configuration logstash-sample.conf is shown below; it reads events from Beats on port 5044 and indexes them into Elasticsearch, deriving the index name from the beat name and version carried in @metadata:

    # Sample Logstash configuration for creating a simple
    # Beats -> Logstash -> Elasticsearch pipeline.
    
    input {
      beats {
        port => 5044
      }
    }
    
    output {
      elasticsearch {
        hosts => ["http://localhost:9200"]
        index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
        #user => "elastic"
        #password => "changeme"
      }
    }

     3. Getting Started

    1. Collecting nginx access logs

      Collect to the console first; this makes debugging much easier.

    (1) Inspect two lines of the nginx access log (Git for Windows is installed, so Linux-style commands work on Windows):

    liqiang@root MINGW64 /e/nginx/nginx-1.12.2/logs
    $ pwd
    /e/nginx/nginx-1.12.2/logs
    
    liqiang@root MINGW64 /e/nginx/nginx-1.12.2/logs
    $ head -n 2 ./access.log
    127.0.0.1 - - [09/Mar/2018:17:45:59 +0800] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36"
    127.0.0.1 - - [09/Mar/2018:17:48:00 +0800] "GET /Test.html HTTP/1.1" 200 142 "-" "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36"

    (2) Create logstash_nginx.conf under the $logstash/config/ directory with the following content:

    input {
      stdin { }
    }
    
    filter {
      grok {
        match => {
          "message" => '%{IPORHOST:remote_ip} - %{DATA:user_name} \[%{HTTPDATE:time}\] "%{WORD:request_action} %{DATA:request} HTTP/%{NUMBER:http_version}" %{NUMBER:response} %{NUMBER:bytes} "%{DATA:referrer}" "%{DATA:agent}"'
        }
      }
    
      date {
        match => [ "time", "dd/MMM/YYYY:HH:mm:ss Z" ]
        locale => "en"
      }
    
      geoip {
        source => "remote_ip"
        target => "geoip"
      }
    
      useragent {
        source => "agent"
        target => "user_agent"
      }
    }
    
    output {
      stdout {
        codec => rubydebug
      }
    }

      grok: turns unstructured log text into structured, JSON-like fields.

      date: parses the captured time string into the event's @timestamp.

      geoip: looks up a geolocation for the IP address.

      useragent: extracts the client browser/OS/device from the user-agent string.
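Conceptually, grok is a library of named regular-expression captures applied to the message field. A rough Python sketch of what the nginx pattern above does (the regex here is a hand-written approximation, not the real %{IPORHOST}/%{HTTPDATE} pattern definitions):

```python
import re

# Hand-written approximation of the grok pattern used above; the real
# grok library patterns are more permissive than these character classes.
NGINX_ACCESS = re.compile(
    r'(?P<remote_ip>\S+) - (?P<user_name>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request_action>\S+) (?P<request>\S+) HTTP/(?P<http_version>[\d.]+)" '
    r'(?P<response>\d+) (?P<bytes>\d+) "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_access_line(line: str) -> dict:
    """Return the named fields grok would extract, or {} if no match."""
    m = NGINX_ACCESS.match(line)
    return m.groupdict() if m else {}

line = ('127.0.0.1 - - [09/Mar/2018:17:48:00 +0800] "GET /Test.html HTTP/1.1" '
        '200 142 "-" "Mozilla/5.0"')
fields = parse_access_line(line)
print(fields["remote_ip"], fields["request"], fields["response"])
```

Note that the literal square brackets around the timestamp must be escaped in the regex, just as they are in the grok pattern.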

    (3) Test log collection (in the output below, the geoip lookup fails for the private address 127.0.0.1, hence the _geoip_lookup_failure tag and the empty geoip object):

    liqiang@root MINGW64 /e/ELK/logstash-7.6.2
    $ head -n 2 /e/nginx/nginx-1.12.2/logs/access.log | /e/ELK/logstash-7.6.2/bin/logstash -f ./config/logstash_nginx.conf
    Sending Logstash logs to E:/ELK/logstash-7.6.2/logs which is now configured via log4j2.properties
    [2020-08-23T12:31:18,218][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
    [2020-08-23T12:31:18,857][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.6.2"}
    [2020-08-23T12:31:25,229][INFO ][org.reflections.Reflections] Reflections took 122 ms to scan 1 urls, producing 20 keys and 40 values
    [2020-08-23T12:31:36,465][INFO ][logstash.filters.geoip   ][main] Using geoip database {:path=>"E:/ELK/logstash-7.6.2/vendor/bundle/jruby/2.5.0/gems/logstash-filter-geoip-6.0.3-java/vendor/GeoLite2-City.mmdb"}
    [2020-08-23T12:31:36,994][WARN ][org.logstash.instrument.metrics.gauge.LazyDelegatingGauge][main] A gauge metric of an unknown type (org.jruby.RubyArray) has been created for key: cluster_uuids. This may result in invalid serialization.  It is recommended to log an issue to the responsible developer/development team.
    [2020-08-23T12:31:37,019][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>500, "pipeline.sources"=>["E:/ELK/logstash-7.6.2/config/logstash_nginx.conf"], :thread=>"#<Thread:0x1e1b9b66 run>"}
    [2020-08-23T12:31:40,502][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
    [2020-08-23T12:31:40,731][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
    E:/ELK/logstash-7.6.2/vendor/bundle/jruby/2.5.0/gems/awesome_print-1.7.0/lib/awesome_print/formatters/base_formatter.rb:31: warning: constant ::Fixnum is deprecated
    {
             "user_name" => "-",
              "@version" => "1",
                  "host" => "root",
          "http_version" => "1.1",
                 "bytes" => "142",
                  "tags" => [
            [0] "_geoip_lookup_failure"
        ],
        "request_action" => "GET",
              "referrer" => "-",
                 "agent" => "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36",
              "response" => "200",
                 "geoip" => {},
                  "time" => "09/Mar/2018:17:48:00 +0800",
               "request" => "/Test.html",
             "remote_ip" => "127.0.0.1",
            "@timestamp" => 2018-03-09T09:48:00.000Z,
               "message" => "127.0.0.1 - - [09/Mar/2018:17:48:00 +0800] "GET /Test.html HTTP/1.1" 200 142 "-" "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36"
    ",
            "user_agent" => {
               "minor" => "0",
               "build" => "",
                "name" => "Chrome",
            "os_major" => "8",
            "os_minor" => "1",
              "device" => "Other",
             "os_name" => "Windows",
               "major" => "64",
               "patch" => "3282",
                  "os" => "Windows"
        }
    }
    {
             "user_name" => "-",
              "@version" => "1",
                  "host" => "root",
          "http_version" => "1.1",
                 "bytes" => "612",
                  "tags" => [
            [0] "_geoip_lookup_failure"
        ],
        "request_action" => "GET",
              "referrer" => "-",
                 "agent" => "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36",
              "response" => "200",
                 "geoip" => {},
                  "time" => "09/Mar/2018:17:45:59 +0800",
               "request" => "/",
             "remote_ip" => "127.0.0.1",
            "@timestamp" => 2018-03-09T09:45:59.000Z,
               "message" => "127.0.0.1 - - [09/Mar/2018:17:45:59 +0800] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36"
    ",
            "user_agent" => {
               "minor" => "0",
               "build" => "",
                "name" => "Chrome",
            "os_major" => "8",
            "os_minor" => "1",
              "device" => "Other",
             "os_name" => "Windows",
               "major" => "64",
               "patch" => "3282",
                  "os" => "Windows"
        }
    }
    [2020-08-23T12:31:43,718][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
    [2020-08-23T12:31:44,453][INFO ][logstash.runner          ] Logstash shut down.

    2. Collecting Java logs

      Ship Java logs into Elasticsearch.

     1. A Spring Boot web project shipping logs directly to Logstash via logback

    (1) Configure Logstash to listen on TCP port 4560, then start it

    Create logstash_java.conf under $logstash/config:

    # Logstash configuration for a simple
    # TCP (json_lines) -> Logstash -> Elasticsearch pipeline.

    input {
      tcp {
        mode => "server"
        host => "127.0.0.1"
        port => 4560
        codec => json_lines
      }
    }

    output {
      elasticsearch {
        hosts => "127.0.0.1:9200"
        index => "springboot-logstash-%{+YYYY.MM.dd}"
      }
    }
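The json_lines codec expects newline-delimited JSON: one JSON document per line. That is exactly the framing the logback appender below produces. A minimal Python sketch of such a client (the host, port, and field names mirror the config above; the event shape itself is illustrative):

```python
import json
import socket

def encode_event(event: dict) -> bytes:
    # json_lines framing: one JSON document per line, '\n'-terminated.
    return (json.dumps(event) + "\n").encode("utf-8")

def send_event(event: dict, host: str = "127.0.0.1", port: int = 4560) -> None:
    """Ship a single event to the Logstash tcp input configured above."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(encode_event(event))

# Build (but don't send) one event, to show the wire format:
payload = encode_event({"logLevel": "INFO", "thread": "main", "rest": "started"})
print(payload)
```

Calling send_event obviously requires Logstash to be listening on 4560 first.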

    (2) Start Logstash

    $ /e/ELK/logstash-7.6.2/bin/logstash -f ./config/logstash_java.conf

    (3) Add the dependency to the Spring Boot project's pom.xml

            <!--logStash -->
            <dependency>
                <groupId>net.logstash.logback</groupId>
                <artifactId>logstash-logback-encoder</artifactId>
                <version>5.3</version>
            </dependency>

    (4) Create logback-spring.xml under src/main/resources

    <?xml version="1.0" encoding="UTF-8"?>
    <configuration>
        <include resource="org/springframework/boot/logging/logback/base.xml" />
        <appender name="LOGSTASH"
            class="net.logstash.logback.appender.LogstashTcpSocketAppender">
            <!-- Logstash server address -->
            <destination>127.0.0.1:4560</destination>
            <!-- log output encoder -->
            <encoder charset="UTF-8"
                class="net.logstash.logback.encoder.LoggingEventCompositeJsonEncoder">
                <providers>
                    <timestamp>
                        <timeZone>UTC</timeZone>
                    </timestamp>
                    <pattern>
                        <pattern>
                            {
                            "logLevel": "%level",
                            "serviceName": "${springAppName:-}",
                            "pid": "${PID:-}",
                            "thread": "%thread",
                            "class": "%logger{40}",
                            "rest": "%message"
                            }
                        </pattern>
                    </pattern>
                </providers>
            </encoder>
        </appender>
    
        <root level="DEBUG">
            <appender-ref ref="LOGSTASH" />
            <appender-ref ref="CONSOLE" />
        </root>
    </configuration>

    (5) Start the application and check the logs:

    (6) Create an index pattern in Kibana, then analyze the logs

    Step 1: define an index pattern matching the new index (here springboot-logstash-*).

     Step 2: choose the time field (typically @timestamp).

    View the results in Discover.

     3. Collecting log files generated by Java log4j

    1. The log file format is as follows:

    2020/08/13-13:09:09 [main] INFO  com.zd.ICCApplication.logStarting - Starting ICCApplication on MicroWin10-1535 with PID 12724 (E:\xiangmu\icc-server\trunk\target\classes started by Administrator in E:\xiangmu\icc-server\trunk)
    2020/08/13-13:09:09 [main] DEBUG com.zd.ICCApplication.logStarting - Running with Spring Boot v2.3.1.RELEASE, Spring v5.2.7.RELEASE
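Before writing the Logstash config, the parse can be prototyped in Python. The regex and format string below are hand-written approximations of the grok and date patterns used in the next step; note that the literal square brackets around the thread name must be escaped:

```python
import re
from datetime import datetime

# Approximation of '%{..:time} \[%{WORD:threadName}\] %{WORD:logLevel} %{GREEDYDATA:syslog_message}'.
# Unlike grok's DATESTAMP, this regex pins the year to four digits.
LOG4J_LINE = re.compile(
    r'(?P<time>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}) '
    r'\[(?P<threadName>\w+)\] (?P<logLevel>\w+)\s+(?P<syslog_message>.*)'
)

def parse_log4j_line(line: str) -> dict:
    """Return the named fields, plus a parsed timestamp (like the date filter)."""
    m = LOG4J_LINE.match(line)
    if not m:
        return {}
    fields = m.groupdict()
    # Same role as date { match => ["time", "YYYY/MM/dd-HH:mm:ss"] }:
    fields["timestamp"] = datetime.strptime(fields["time"], "%Y/%m/%d-%H:%M:%S")
    return fields

line = ("2020/08/13-13:09:09 [main] DEBUG com.zd.ICCApplication.logStarting - "
        "Running with Spring Boot v2.3.1.RELEASE")
fields = parse_log4j_line(line)
print(fields["logLevel"], fields["threadName"], fields["timestamp"].year)
```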

    2. Write logstash_file.conf and test via stdin/stdout

    input {
      stdin { }
    }
    
    filter {
      grok {
        match => {
          "message" => '%{DATESTAMP:time} \[%{WORD:threadName}\] %{WORD:logLevel} %{GREEDYDATA:syslog_message}'
        }
      }
      
      date {
        match => [ "time", "YYYY/MM/dd-HH:mm:ss" ]
        locale => "en"
      }  
    }
    
    output {
      stdout {
        codec => rubydebug
      }
    }

    Test as follows. Note that @timestamp comes out as year 0020: grok's DATESTAMP matches the date as day/month/two-digit-year, capturing time as "20/08/13-13:09:09", so the date filter's YYYY reads "20" as the year. A custom capture such as (?<time>%{YEAR}/%{MONTHNUM}/%{MONTHDAY}-%{TIME}) would parse the full four-digit year.

    $ head -n 2 /g/logs/test.log | ./bin/logstash -f ./config/logstash_file.conf
    Sending Logstash logs to E:/ELK/logstash-7.6.2/logs which is now configured via log4j2.properties
    [2020-08-25T20:44:25,769][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
    [2020-08-25T20:44:26,519][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.6.2"}
    [2020-08-25T20:44:33,369][INFO ][org.reflections.Reflections] Reflections took 220 ms to scan 1 urls, producing 20 keys and 40 values
    [2020-08-25T20:44:40,149][WARN ][org.logstash.instrument.metrics.gauge.LazyDelegatingGauge][main] A gauge metric of an unknown type (org.jruby.RubyArray) has been created for key: cluster_uuids. This may result in invalid serialization.  It is recommended to log an issue to the responsible developer/development team.
    [2020-08-25T20:44:40,189][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>500, "pipeline.sources"=>["E:/ELK/logstash-7.6.2/config/logstash_file.conf"], :thread=>"#<Thread:0x5f39f2d0 run>"}
    [2020-08-25T20:44:43,482][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
    [2020-08-25T20:44:43,692][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
    E:/ELK/logstash-7.6.2/vendor/bundle/jruby/2.5.0/gems/awesome_print-1.7.0/lib/awesome_print/formatters/base_formatter.rb:31: warning: constant ::Fixnum is deprecated
    {
            "@timestamp" => 0020-08-13T05:03:26.000Z,
                  "time" => "20/08/13-13:09:09",
        "syslog_message" => " com.zd.ICCApplication.logStarting - Starting ICCApplication on MicroWin10-1535 with PID 12724 (E:\xiangmu\icc-server\trunk\target\classes started by Administrator in E:\xiangmu\icc-server\trunk)
    ",
              "@version" => "1",
               "message" => "2020/08/13-13:09:09 [main] INFO  com.zd.ICCApplication.logStarting - Starting ICCApplication on MicroWin10-1535 with PID 12724 (E:\xiangmu\icc-server\trunk\target\classes started by Administrator in E:\xiangmu\icc-server\trunk)
    ",
            "threadName" => "main",
                  "host" => "root",
              "logLevel" => "INFO"
    }
    {
            "@timestamp" => 0020-08-13T05:03:26.000Z,
                  "time" => "20/08/13-13:09:09",
        "syslog_message" => "com.zd.ICCApplication.logStarting - Running with Spring Boot v2.3.1.RELEASE, Spring v5.2.7.RELEASE
    ",
              "@version" => "1",
               "message" => "2020/08/13-13:09:09 [main] DEBUG com.zd.ICCApplication.logStarting - Running with Spring Boot v2.3.1.RELEASE, Spring v5.2.7.RELEASE
    ",
            "threadName" => "main",
                  "host" => "root",
              "logLevel" => "DEBUG"
    }
    [2020-08-25T20:44:45,492][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
    [2020-08-25T20:44:45,902][INFO ][logstash.runner          ] Logstash shut down.

     3. Modify logstash_file.conf to read the log file and index it into Elasticsearch

    input {
        file {
            path => "G:/logs/test.log"
            type => "testfile"
            start_position => "beginning"
        }
    }
    
    filter {
        grok {
            match => {
            "message" => '%{DATESTAMP:time} \[%{WORD:threadName}\] %{WORD:logLevel} %{GREEDYDATA:syslog_message}'
            }
        }
    
        date {
            match => ["time", "YYYY/MM/dd-HH:mm:ss"]
        locale => "en"
        }
    }
    
    output {
        stdout {
            codec => rubydebug
        }
        elasticsearch {
            hosts => ["127.0.0.1:9200", "127.0.0.1:19200"] 
            index => "testfile-%{+YYYY.MM.dd}"
            template_overwrite => true
        }
    }
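Unlike the stdin test, the file input tracks how far it has read, so a restart does not re-ship old lines. The idea can be sketched as an offset-tracking reader in Python (the offset-store format here is illustrative, not Logstash's actual sincedb format):

```python
import os

def read_new_lines(log_path: str, offset_path: str) -> list[str]:
    """Return lines appended since the last call, persisting the read offset."""
    offset = 0
    if os.path.exists(offset_path):
        with open(offset_path) as f:
            offset = int(f.read() or 0)
    with open(log_path) as f:
        f.seek(offset)           # resume where the previous run stopped
        lines = f.readlines()
        new_offset = f.tell()
    with open(offset_path, "w") as f:
        f.write(str(new_offset)) # remember the position for the next run
    return lines
```

This also explains why start_position => "beginning" appears to be ignored on a second run: once an offset has been recorded, reading resumes from it.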

    Run as follows (note that start_position => "beginning" only applies the first time Logstash sees the file; on later runs it resumes from the offset recorded in its sincedb file):

    liqiang@root MINGW64 /e/ELK/logstash-7.6.2
    $ ./bin/logstash -f ./config/logstash_file.conf

    4. The index field mapping as shown in Kibana:

    {
      "mapping": {
        "_doc": {
          "properties": {
            "@timestamp": {
              "type": "date"
            },
            "@version": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "host": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "logLevel": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "message": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "path": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "syslog_message": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "threadName": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "time": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "type": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            }
          }
        }
      }
    }
    Summary:
      For commonly used grok patterns, see the Alibaba reference. As the mapping above shows, every grok-extracted field is dynamically mapped as text with a keyword sub-field by default; define an index template if you need different field types.

    [When you put your heart into writing each blog post, you will find it even more rewarding than implementing a feature in code!]
Original post: https://www.cnblogs.com/qlqwjy/p/13430563.html