• Elasticsearch basics and building an ELK log query and analysis system


    I. Elasticsearch setup and basic operations
    1. Introduction to Elasticsearch
    Elasticsearch (hereafter "es") needs little introduction; the official tagline "You Know, for Search" already sums up its core purpose. The underlying concepts, indices, clusters, shards, and so on, are not the focus of this article; other resources cover them well.
    Differences between es versions (excerpted from the official es website):
    Elasticsearch 5.6.0: supports multiple types per index; setting index.mapping.single_type: true restricts an index to a single type
    Elasticsearch 6.x: "Indices created in 6.x only allow a single-type per index"
    Elasticsearch 7.x: "Specifying types in requests is deprecated"; the type part of the request path is fixed to _doc. In effect there is one fixed type named _doc, which the docs describe as a virtual type, best understood as an endpoint like _search or _source
    Elasticsearch 8.x: "Specifying types in requests is no longer supported"
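    A quick way to confirm which version a node is running (a minimal sketch, assuming a node is reachable on localhost:9200 as in the installation below):
    curl http://localhost:9200?pretty=true
    The response includes a "version" object whose "number" field reports the running version, e.g. "7.3.0".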

    2. Installation
    a. Pull the image
    docker pull elasticsearch:7.3.0
    b. Run the container
    docker run -d -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" --name es01 elasticsearch:7.3.0
    c. Edit the configuration
    Enter the container and edit /usr/share/elasticsearch/config/elasticsearch.yml:
    network.host: 0.0.0.0

    d. Restart the container
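    A minimal restart-and-verify sketch, reusing the container name es01 from the run command above:
    docker restart es01
    curl http://localhost:9200/_cluster/health?pretty=true
    A "status" of "green" or "yellow" means the node is up ("yellow" is normal for a single node, since replica shards cannot be assigned).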

    3. Creating indices and documents
    REST API syntax
    a. curl -X PUT http://host:port/index?pretty=true  (create an index)
    b. curl -X PUT http://host:port/index/type/id -H 'Content-Type:application/json' -d 'json content'  (for 7.x, use _doc as the type part)
    c. curl -X POST http://host:port/index/type -H 'Content-Type:application/json' -d 'json content'  (for 7.x, use _doc as the type part)
    Note the difference between b and c: the HTTP methods differ, and b specifies an id while c does not. These are the two ways to create a document: with a caller-supplied id, or with an id auto-generated by es, as sketched below.
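    For illustration, a hypothetical sketch of method c (not executed as part of this walkthrough, so it does not appear in the query results below):
    curl -X POST http://localhost:9200/book/_doc?pretty=true -H 'Content-Type:application/json' -d '{"name":"effective java","ver":"3.0"}'
    The response's "result" is "created" and its "_id" field holds an id generated by es.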


    Create a document with id 1 under the book index:
    curl -X PUT http://localhost:9200/book/_doc/1?pretty=true -H 'Content-Type:application/json' -d '{"name":"thinking in java","ver":"1.0"}'

    {
    "_index" : "book",
    "_type" : "_doc",
    "_id" : "1",
    "_version" : 1,
    "result" : "created",
    "_shards" : {
      "total" : 2,
      "successful" : 1,
      "failed" : 0
    },
    "_seq_no" : 0,
    "_primary_term" : 1
    }

    PUTting the same document again performs an update; the result is as follows:
    {
    "_index" : "book",
    "_type" : "_doc",
    "_id" : "1",
    "_version" : 2,
    "result" : "updated",
    "_shards" : {
      "total" : 2,
      "successful" : 1,
      "failed" : 0
    },
    "_seq_no" : 2,
    "_primary_term" : 1
    }

    Next, try to create a document with id 1 under a type named python in the book index:
    curl -X PUT http://localhost:9200/book/python/1?pretty=true -H 'Content-Type:application/json' -d '{"name":"python1","version":"1.0.0","description":"this is for python"}'
    {
    "error" : {
      "root_cause" : [
        {
        "type" : "illegal_argument_exception",
        "reason" : "Rejecting mapping update to [book] as the final mapping would have more than 1 type: [_doc, python]"
        }
      ],
      "type" : "illegal_argument_exception",
      "reason" : "Rejecting mapping update to [book] as the final mapping would have more than 1 type: [_doc, python]"
    },
    "status" : 400
    }
    The creation fails because versions 6.0 and above do not allow more than one type per index.

    4. Querying documents
    REST API syntax
    curl http://host:port/index/_search  (query all documents in an index)
    curl http://host:port/index/type/1  (fetch the document with id=1)

    Query examples
    curl http://localhost:9200/book/_search?pretty=true
    {
      "took" : 3,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 1,
          "relation" : "eq"
        },
        "max_score" : 1.0,
        "hits" : [
            {
              "_index" : "book",
              "_type" : "_doc",
              "_id" : "1",
              "_score" : 1.0,
              "_source" : {
                "name" : "thinking in java",
                "ver" : "1.0"
              }
            }
        ]
      }
    }

    curl http://localhost:9200/book/_doc/1?pretty=true
    {
      "_index" : "book",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 1,
      "_seq_no" : 0,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "name" : "thinking in java",
        "ver" : "1.0"
      }
    }
    Conditional query syntax
    a. Query condition
    "query" : {
      "term" : { "user" : "kimchy" }
    }
    b. Sorting
    "sort" : [
      { "post_date" : { "order" : "asc" } },
      { "name" : "desc" }
    ]
    c. Result window (paging)
    "from" : 0,
    "size" : 10
    d. Field projection
    "_source" : {
      "includes" : [ "field1", "field2" ],
      "excludes" : [ "field3" ]
    }
    These fragments combine into a single request body, as shown in the sketch below.
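    A hedged end-to-end example against the book index created earlier (it assumes the default dynamic mapping, which indexes string fields as text with a .keyword subfield usable for sorting):
    curl http://localhost:9200/book/_search?pretty=true -H 'Content-Type:application/json' -d '
    {
      "query": { "term": { "name": "java" } },
      "sort": [ { "ver.keyword": { "order": "asc" } } ],
      "from": 0,
      "size": 10,
      "_source": { "includes": [ "name", "ver" ] }
    }'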

    5. Deleting indices and documents
    REST API syntax
    curl -X DELETE http://host:port/index  (delete an index)
    curl -X DELETE http://host:port/index/type/record  (delete a document; for 7.x, type is _doc)

    Delete examples
    curl -X DELETE http://localhost:9200/book?pretty=true
    curl -X DELETE http://localhost:9200/book/_doc/1?pretty=true

    6. Updating documents
    REST API syntax
    curl -X PUT http://host:port/index/type/id -H 'Content-Type:application/json' -d 'json content'  (a PUT to an existing document performs an update)
    curl -X POST http://host:port/index/type/id/_update -H 'Content-Type:application/json' -d '{"doc":{"field":"value"}}'  (partial update via the _update endpoint, pre-7.x path)
    curl -X POST http://host:port/index/_update/id -H 'Content-Type:application/json' -d '{"doc":{"field":"value"}}'  (partial update via the _update endpoint, 7.x and later path)

    Update example
    curl http://localhost:9200/book/_update/1?pretty=true -X POST -H 'Content-Type:application/json' -d '{"doc":{"name":"python programming"}}'
    {
      "_index" : "book",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 3,
      "result" : "updated",
      "_shards" : {
        "total" : 2,
        "successful" : 1,
        "failed" : 0
      },
      "_seq_no" : 5,
      "_primary_term" : 1
    }


    II. Building an ELK log collection and analysis system
    1. The log collection and analysis pipeline
    data input (collection) -> data processing -> data output (storage)
    a. Collect log file contents, tailing the files in real time for changes: filebeat
    b. Log parsing and formatting: logstash
    c. Log storage and query: es
    d. Visualization: kibana

    2. Filebeat setup and configuration
    Logstash can itself read directly from files, but we put filebeat in front because it is lightweight and needs no Java runtime, among other advantages.

    a. Pull the image
    docker pull prima/filebeat:6.4.2

    b. Write the configuration file
    filebeat.inputs:
    - type: log
      enabled: true
      paths:
        - /home/shared_disk/tomcat_logs/*.log
    #output.logstash:
    #  enabled: true
    #  hosts: ["localhost:5044"]
    output.console:
      enabled: true
      pretty: true
    A minimal configuration: collected log lines are printed to the console. More elaborate options are not covered here; consult other resources.
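    Once the container is running (next step), the mounted configuration can be sanity-checked with filebeat's test subcommand (a sketch, assuming the filebeat binary is on the PATH inside the prima/filebeat image):
    docker exec filebeat01 filebeat test config -c /filebeat.yml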

    c. Start the container
    docker run -d --name filebeat01 -v /home/shared_disk/filebeat/filebeat.yml:/filebeat.yml -v /home/shared_disk/tomcat_logs:/home/shared_disk/tomcat_logs prima/filebeat:6.4.2
    Pay attention to the mounted directory and file so that filebeat can actually see the log files.
    Now, whenever new lines are appended to the .log files under /home/shared_disk/tomcat_logs/, filebeat's console shows output like the following, confirming that collection works:
    {
      "@timestamp": "2019-08-23T06:51:37.889Z",
      "@metadata": {
        "beat": "filebeat",
        "type": "doc",
        "version": "6.4.2"
      },
      "source": "/home/shared_disk/tomcat_logs/docker-learn-info.log",
      "offset": 47547,
      "message": "2019-08-23 06:51:34,030 [http-nio-8083-exec-9] INFO com.allen.dockerlearn.DockerLearnApplication - this is the message for test",
      "prospector": {
        "type": "log"
      },
      "input": {
      "type": "log"
      },
      "beat": {
        "hostname": "efd575c7aa99",
        "version": "6.4.2",
        "name": "efd575c7aa99"
      },
      "host": {
        "name": "efd575c7aa99"
      }
    }

    3. Logstash setup and configuration
    Logstash formats the incoming data, converts types, adds and removes fields, and so on, producing the desired data shape and writing it to a configured destination.

    a. Pull the image
    docker pull logstash:7.3.0

    b. Start the container
    docker run -d -p 5044:5044 -p 9600:9600 -v /home/shared_disk/tomcat_logs/:/home/shared_disk/tomcat_logs/ --name logstash01 logstash:7.3.0
    Note that logstash's input here is log files, so watch the permissions on the tomcat_logs directory: the container runs as the logstash user, and without permission it cannot access the mounted directory.

    c. Enter the container and edit the configuration under /usr/share/logstash/config
    Edit pipelines.yml:
    - pipeline.id: logstash-1
      path.config: "/usr/share/logstash/config/*.conf"
    Create my-logstash.conf:
    input {
      file {
        path => "/home/shared_disk/tomcat_logs/*.log"
        type => "system"
        start_position => "beginning"
      }
    }

    filter {

    }

    output {
      stdout {
      }

    }

    d. Restart the container; its log then shows output like the following:
    {
    "path" => "/home/shared_disk/tomcat_logs/docker-learn-info.log",
    "@timestamp" => 2019-08-23T09:32:57.085Z,
    "host" => "d677fed8b760",
    "@version" => "1",
    "message" => "2019-08-23 09:32:56,976 [http-nio-8083-exec-3] INFO com.allen.dockerlearn.DockerLearnApplication - this is the message for test",
    "type" => "system"
    }

    4. Kibana setup
    a. Pull the image
    docker pull kibana:7.3.0
    b. Start the container
    docker run -d -p 5601:5601 --name kibana01 kibana:7.3.0
    c. Edit the configuration file
    /usr/share/kibana/config/kibana.yml
    server.host: "0.0.0.0"
    elasticsearch.hosts: [ "http://192.168.16.84:9200" ]
    d. Restart the container

    5. ELK integration
    With the components above in place, completing the ELK integration only requires changing filebeat's output and logstash's input and output.

    a. Filebeat output changes
    output.logstash:
      enabled: true
      hosts: ["192.168.16.84:5044"]

    b. Logstash input changes (this beats input replaces the file input in my-logstash.conf; the fully assembled file appears below)
    beats {
      port => 5044
    }

    c. Logstash output changes (this elasticsearch output replaces the stdout output)
    elasticsearch {
      hosts => "192.168.16.84:9200"
      index => "api-log"
    }

    d. Log formatting configuration
    The events logstash receives from filebeat carry many fields, some added by filebeat and some by logstash itself, but the part we care about most is the message field, i.e. the following:
    "message" => "2019-08-23 09:32:56,976 [http-nio-8083-exec-3] INFO com.allen.dockerlearn.DockerLearnApplication - this is the message for test"
    So the data needs further processing; add the following to logstash's filter section:
    filter {
      grok {
        match => {
          "message" => "%{TIMESTAMP_ISO8601:log_time}s+[%{DATA:thread_name}]s+%{LOGLEVEL:log_level}s+%{DATA:class_name}s+-s+(?<log_info>(w+s*)*)"
        }
      }
      mutate {
        add_field => {
          "host_name" => "%{[host][name]}"
        }
        remove_field => ["source","beat","host","@version","@timestamp","prospector","input","tags","offset","message"]
      }
    }
    This filter extracts parts of message into fields named log_time, thread_name, log_level, class_name, and log_info;
    adds a host_name field whose value is taken from the original host.name field; and removes source, beat, ..., message and the other listed fields. The final data format is as follows:
    {
    "thread_name" => "http-nio-8083-exec-1",
    "log_time" => "2019-08-26 13:31:11,029",
    "host_name" => "efd575c7aa99",
    "log_info" => "this is the message for test",
    "class_name" => "com.allen.dockerlearn.DockerLearnApplication",
    "log_level" => "INFO"
    }
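    For reference, the fully assembled my-logstash.conf, combining the fragments from sections a through d above:
    input {
      beats {
        port => 5044
      }
    }

    filter {
      grok {
        match => {
          "message" => "%{TIMESTAMP_ISO8601:log_time}\s+\[%{DATA:thread_name}\]\s+%{LOGLEVEL:log_level}\s+%{DATA:class_name}\s+-\s+(?<log_info>(\w+\s*)*)"
        }
      }
      mutate {
        add_field => {
          "host_name" => "%{[host][name]}"
        }
        remove_field => ["source","beat","host","@version","@timestamp","prospector","input","tags","offset","message"]
      }
    }

    output {
      elasticsearch {
        hosts => "192.168.16.84:9200"
        index => "api-log"
      }
    }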

    At this point the ELK integration is complete.
    Open the Kibana home page, create an index pattern in the settings, and then use the Discover tab to search and view logs; the Visualize tab provides statistics and analysis. The stored logs can also be queried from es directly, as sketched below.
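    For example, filtering by log level with a match query (a minimal sketch; the field names follow the filter output above):
    curl http://192.168.16.84:9200/api-log/_search?pretty=true -H 'Content-Type:application/json' -d '
    {
      "query": { "match": { "log_level": "INFO" } },
      "size": 5
    }'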

• Original article: https://www.cnblogs.com/xiao-tao/p/11412790.html