• 用ELK分析每天4亿多条腾讯云MySQL审计日志(2)--EQL


        上一篇介绍了用ELK分析4亿多条审计日志过程,现在介绍如何用Python3分析ES的程序

        需要分析的核心库审计数据:

             1,950多张表,几十个账号,

             2,5种操作类型(select,update,insert,delete,replace),14个实例(1主13从库)

       分析说明:

        select汇总: 必须包含from关键字:排除INSERT INTO 表 SELECT '609818','1' 这样情况

        insert汇总: 必须有into关键字: 排除select中有insert关键字

        update汇总: 不包含for 关键字: 排查 for update 的select查询

       delete汇总: 不包含into 关键字: 排除insert等内容里包含delete的数据情况

        replace汇总: 必须有into 关键字: 排除select查询中有replace函数

       分析方法:

       为保证把以上数据都能分析出来, 将950多张表,存放到数据表中,循环950多次表,每个表循环5次类型:  

        sql="select id,name,ea_time from tab order by id desc --查询全部表
        dml={'select','update','insert','delete','replace'} --每个表循环5次类型

        后来研发发现,分析程序表ol_list统计,但"库名.表",如line.ol_list,不会统计出来。后来修改EQL解决,具体代码如下:

        下列是“”select“查询EQL的代码:        

     if op.find('select')>=0:  # select,包含from
         body ={"query":{
            "bool":{ "must":[{
                    "match":{"Sql":'{op}'.format(op=op) }},{
                    "match":{"Sql": 'from'}}],
                "should": [{
                    "term": {"Sql": '{name}'.format(name=tabname)}}, {
                    "term": {"Sql": 'online.{name}'.format(name=tabname)}}],
                "minimum_should_match": 1,
                "filter":{
                    "range":{
                        "Timestamp.keyword":{
                            "lte": "{date}".format(date=end_time),
                            "gte": "{begindate}".format(begindate=begin_time),
                        }
                    } }}},
        "size":0,  "aggs":{ "aggr_mame":{
                "terms":{
                    "field":"User.keyword",
                    "size":2000
                },
                "aggs":{
                    "aggr_der":{
                        "terms":{
                            "field":"PolicyName.keyword"
                        },
                        "aggs":{
                            "top_tag_hits":{
                                "top_hits":{
                                    "size":1
                                }
                            }  }}}}}
           }
    View Code

      说明:

        1,使用:"minimum_should_match": 1,   ,这个是兼容: "表名","库名.表名“

        5种类型的全部EQL:   

     if op.find('select')>=0:  # select,包含from
         body ={"query":{
            "bool":{ "must":[{
                    "match":{"Sql":'{op}'.format(op=op) }},{
                    "match":{"Sql": 'from'}}],
                "should": [{
                    "term": {"Sql": '{name}'.format(name=tabname)}}, {
                    "term": {"Sql": 'online.{name}'.format(name=tabname)}}],
                "minimum_should_match": 1,
                "filter":{
                    "range":{
                        "Timestamp.keyword":{
                            "lte": "{date}".format(date=end_time),
                            "gte": "{begindate}".format(begindate=begin_time),
                        }
                    } }}},
        "size":0,  "aggs":{ "aggr_mame":{
                "terms":{
                    "field":"User.keyword",
                    "size":2000
                },
                "aggs":{
                    "aggr_der":{
                        "terms":{
                            "field":"PolicyName.keyword"
                        },
                        "aggs":{
                            "top_tag_hits":{
                                "top_hits":{
                                    "size":1
                                }
                            }  }}}}}
           }
        elif op.find('update')>=0:  # update 不能有for关键字
          body = {"query": {
             "bool": {"must": [{
                 "match": {"Sql": '{op}'.format(op=op)}},{
                 "match": {"PolicyName.keyword": 'd8t'}}],
                 "must_not": [{"match": {"Sql": "for"}}],
                 "should": [{
                     "term": {"Sql": '{name}'.format(name=tabname)}}, {
                     "term": {"Sql": 'online.{name}'.format(name=tabname)}}],
                 "minimum_should_match": 1,
                 "filter": {
                     "range": {
                         "Timestamp.keyword": {
                             "lte": "{date}".format(date=end_time),
                             "gte": "{begindate}".format(begindate=begin_time),
                         }
                     }}}},
             "size": 0, "aggs": {"aggr_mame": {
                 "terms": {
                     "field": "User.keyword",
                     "size": 2000
                 },
                 "aggs": {
                     "aggr_der": {
                         "terms": {
                             "field": "PolicyName.keyword"
                         },
                         "aggs": {
                             "top_tag_hits": {
                                 "top_hits": {
                                     "size": 1
                                 }
                             }}}}}}
         }
        elif op.find('replace') >= 0:  # replace  必须有into关键字
            body = {"query": {
                "bool": {"must": [{
                    "match": {"Sql": '{op}'.format(op=op)}}, {
                    "match": {"PolicyName.keyword": 'd8t'}},{
                    "match": {"Sql": 'into'}}],
                    "should": [{
                        "term": {"Sql": '{name}'.format(name=tabname)}}, {
                        "term": {"Sql": 'online.{name}'.format(name=tabname)}}],
                    "minimum_should_match": 1,
                    "filter": {
                        "range": {
                            "Timestamp.keyword": {
                                "lte": "{date}".format(date=end_time),
                                "gte": "{begindate}".format(begindate=begin_time),
                            }
                        }}}},
                "size": 0, "aggs": {"aggr_mame": {
                    "terms": {
                        "field": "User.keyword",
                        "size": 2000
                    },
                    "aggs": {
                        "aggr_der": {
                            "terms": {
                                "field": "PolicyName.keyword"
                            },
                            "aggs": {
                                "top_tag_hits": {
                                    "top_hits": {
                                        "size": 1
                                    }
                                }}}}}}
        }
        elif op.find('insert') >= 0:  # insert  必须有into关键字
            body = {"query": {
                "bool": {"must": [{
                    "match": {"Sql": '{op}'.format(op=op)}}, {
                    "match": {"PolicyName.keyword": 'd8t'}},{
                    "match": {"Sql": 'into'}}],
                    "should": [{
                        "term": {"Sql": '{name}'.format(name=tabname)}}, {
                        "term": {"Sql": 'online.{name}'.format(name=tabname)}}],
                    "minimum_should_match":1,
                    "filter": {
                        "range": {
                            "Timestamp.keyword": {
                                "lte": "{date}".format(date=end_time),
                                "gte": "{begindate}".format(begindate=begin_time),
                            }
                        }}}},
                "size": 0, "aggs": {"aggr_mame": {
                    "terms": {
                        "field": "User.keyword",
                        "size": 2000
                    },
                    "aggs": {
                        "aggr_der": {
                            "terms": {
                                "field": "PolicyName.keyword"
                            },
                            "aggs": {
                                "top_tag_hits": {
                                    "top_hits": {
                                        "size": 1
                                    }
                                }}}}}}
        }
        else:  # delete  不能有into关键字
            body = {"query": {
                "bool": {"must": [{
                    "match": {"Sql": '{op}'.format(op=op)}},{
                    "match": {"PolicyName.keyword": 'd8t'}}],
                    "must_not": [{"match": {"Sql": "into"}}],
                    "should": [{
                        "term": {"Sql": '{name}'.format(name=tabname)}}, {
                        "term": {"Sql": 'online.{name}'.format(name=tabname)}}],
                    "minimum_should_match": 1,
                    "filter": {
                        "range": {
                            "Timestamp.keyword": {
                                "lte": "{date}".format(date=end_time),
                                "gte": "{begindate}".format(begindate=begin_time),
                            }
                        }}}},
                "size": 0, "aggs": {"aggr_mame": {
                    "terms": {
                        "field": "User.keyword",
                        "size": 2000
                    },
                    "aggs": {
                        "aggr_der": {
                            "terms": {
                                "field": "PolicyName.keyword"
                            },
                            "aggs": {
                                "top_tag_hits": {
                                    "top_hits": {
                                        "size": 1
                                    }
                                }}}}}}
            }
    View Code

      写入统计数据Py:

        doc = res["aggregations"]["aggr_mame"]['buckets']
        cn = conn()
        cur = cn.cursor()
        if len(doc):
          for item in doc:
            user=item['key']  # 账号
            total=str(item['doc_count'])  # 该账号在全部实例下的调用次数
            if len(item["aggr_der"]["buckets"]):
               for bucket in item["aggr_der"]["buckets"]:
                 server=bucket['key']  # 服务器实例
                 s_total =str(bucket['doc_count'])  # 该服务器实例下的调用次数
                 sql=bucket["top_tag_hits"]["hits"]["hits"][0]["_source"]["Sql"] # 样例Sql
                 sql=emoji.demojize(transferContent(sql))  #转义并去掉表情符号
                 tsql="replace into ea_tj(tab,username,op,num,server,sqltext,dt) "
                   " values ('{0}','{1}','{2}','{3}','{4}','{5}','{6}')".format(tabname,user,op,s_total,server,sql,dt)
                 cur.execute(tsql)
                 cn.commit()
    View Code

    存放分析结果表:

    CREATE TABLE `ea_tj` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `tab` varchar(200) NOT NULL COMMENT '表名',
      `username` varchar(200) NOT NULL COMMENT '账号',
      `op` varchar(50) DEFAULT NULL COMMENT '操作类型',
      `num` bigint(11) NOT NULL COMMENT '次数',
      `server` varchar(200) NOT NULL COMMENT '实例策略名',
      `create_time` datetime DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间',
      `sqltext` text COMMENT '样例SQL',
      `dt` date DEFAULT NULL COMMENT '线上SQL执行日期',
      `update_time` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT '更新时间',
      PRIMARY KEY (`id`) USING BTREE,
      UNIQUE KEY `un` (`tab`,`username`,`op`,`server`,`dt`)
    ) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4
    View Code

       具体汇总数据:

        1,每天13个实例,表,账号,操作之间关系汇总(用来Online表拆分)
        2,每天13个数据库实例账号的连接IP汇总(用来迁移VPC)
        3,统计调用总次数 (用来分析调用次数异常)

        汇总1的结果数据:

       

  • 相关阅读:
    迅为龙芯2K1000开发板虚拟机ubuntu安装软件
    迅为恩智浦IMX6Q开发板系统固件TF卡烧写
    瑞芯微迅为iTOP-3399开发板资料更新啦!最新版本为1.3版本
    迅为-龙芯2K1000开发板虚拟机ubuntu系统开关机
    迅为IMX6ULL开发板Linux驱动初探-最简单的设备驱动-helloworld
    迅为-龙芯2K1000开发板虚拟机ubuntu基础操作
    开始Blog
    16 bit 的灰度图如何显示
    GPU 总结
    同步,异步,多线程和事件总结
  • 原文地址:https://www.cnblogs.com/zping/p/15131861.html
Copyright © 2020-2023  润新知