• Web access log analysis


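    Log records

    In a web log, each entry usually represents one access (one request) made by a user. The following lines, for example, come from an nginx access log: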

    14.23.95.98 - - [17/Mar/2015:22:26:54 -0400] "GET /pmd/phpmyadmin.css.php?token=1013c8e1ea31d0f0340af8de3cf4a0cb&js_frame=left&nocache=2705868602 HTTP/1.1" 200 3970 "http://104.131.67.100/pmd/navigation.php?token=1013c8e1ea31d0f0340af8de3cf4a0cb&db=bl" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.104 Safari/537.36"
    14.23.95.98 - - [17/Mar/2015:22:26:55 -0400] "GET /pmd/js/mootools.js HTTP/1.1" 304 0 "http://104.131.67.100/pmd/db_structure.php?token=1013c8e1ea31d0f0340af8de3cf4a0cb&db=bl" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.104 Safari/537.36"
    
    14.23.95.98 - - [17/Mar/2015:22:26:55 -0400] "GET /pmd/phpmyadmin.css.php?token=1013c8e1ea31d0f0340af8de3cf4a0cb&js_frame=right&nocache=2705868602 HTTP/1.1" 200 21799 "http://104.131.67.100/pmd/db_structure.php?token=1013c8e1ea31d0f0340af8de3cf4a0cb&db=bl" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.104 Safari/537.36"
    
    14.23.95.98 - - [17/Mar/2015:22:26:55 -0400] "GET /pmd/js/tooltip.js HTTP/1.1" 304 0 "http://104.131.67.100/pmd/db_structure.php?token=1013c8e1ea31d0f0340af8de3cf4a0cb&db=bl" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.104 Safari/537.36"
    
    

    Each of these log entries can be broken down into roughly the following 8 variables:

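    • remote_addr

      The client's IP address, e.g. 14.23.95.98

    • remote_user

      The client's user name; it appears as - when no user name was supplied, as in the sample lines

    • time_local

      The access time and time zone, e.g. [17/Mar/2015:22:26:55 -0400]

    • request

      The request method, URL and HTTP protocol version, e.g. "GET /pmd/js/tooltip.js HTTP/1.1"

    • status

      The response status code; 200 means success (the 304 in the sample lines means Not Modified)

    • body_bytes_sent

      The size in bytes of the response body sent to the client, e.g. 21799

    • http_referer

      The page from which the request was linked, e.g. "http://104.131.67.100/pmd/db_structure.php?token=1013c8e1ea31d0f0340af8de3cf4a0cb&db=bl"

    • http_user_agent

      Information about the client's browser, e.g. "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.104 Safari/537.36"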
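
The breakdown into 8 variables can be sketched in Python with a regular expression. This is only a minimal sketch: it assumes the log uses the same layout as the sample lines above, and the names LOG_PATTERN, parse_line and SAMPLE_LINE are made up for this illustration and are not part of nginx.

```python
import re
from datetime import datetime

# One named group per variable in the list above.
# Assumption: every line follows the layout of the sample lines.
LOG_PATTERN = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) '
    r'\[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" '
    r'"(?P<http_user_agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict with the 8 fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    fields["body_bytes_sent"] = int(fields["body_bytes_sent"])
    # "17/Mar/2015:22:26:55 -0400" -> timezone-aware datetime.
    # %b expects English month abbreviations (the default C locale).
    fields["time_local"] = datetime.strptime(
        fields["time_local"], "%d/%b/%Y:%H:%M:%S %z"
    )
    return fields


# One of the sample lines from the log above.
SAMPLE_LINE = (
    '14.23.95.98 - - [17/Mar/2015:22:26:55 -0400] '
    '"GET /pmd/js/tooltip.js HTTP/1.1" 304 0 '
    '"http://104.131.67.100/pmd/db_structure.php'
    '?token=1013c8e1ea31d0f0340af8de3cf4a0cb&db=bl" '
    '"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/38.0.2125.104 Safari/537.36"'
)

record = parse_line(SAMPLE_LINE)
print(record["remote_addr"], record["status"], record["request"])
```

Lines that do not match the pattern return None instead of raising, so a caller can skip or count malformed entries.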

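    Log analysis

    With this information recorded in the log, we can start analyzing it.
    For example, to get the 10 IPs with the most requests from the nginx log: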

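    [root@biby nginx]# awk '{a[$1]++} END {for (b in a) print b"\t"a[b]}' access.log | sort -k2,2nr | head -n 10
    14.23.95.98     121
    211.97.10.56    102
    14.157.210.181  56
    112.64.235.245  3

    sort needs the n flag here so that the counts are compared as numbers rather than as text; with a plain text sort, 56 would rank above 121.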
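
The same per-IP count can be sketched in Python with collections.Counter. This is a rough equivalent of the awk pipeline, not a replacement for it; the function name top_ips and the shortened demo_lines are invented for this example, and only the IP addresses are taken from the output above.

```python
from collections import Counter


def top_ips(lines, n=10):
    """Return the n most frequent client IPs as (ip, count) pairs.

    Like $1 in awk, the IP is taken as the first
    whitespace-separated field of each line.
    """
    counts = Counter()
    for line in lines:
        parts = line.split(None, 1)
        if parts:  # skip blank lines
            counts[parts[0]] += 1
    # most_common sorts by count in descending, numeric order
    return counts.most_common(n)


# Shortened, made-up lines used only to exercise the function.
demo_lines = [
    '14.23.95.98 - - [17/Mar/2015:22:26:54 -0400] "GET /a HTTP/1.1" 200 1 "-" "-"',
    '211.97.10.56 - - [17/Mar/2015:22:26:55 -0400] "GET /b HTTP/1.1" 200 1 "-" "-"',
    '14.23.95.98 - - [17/Mar/2015:22:26:55 -0400] "GET /c HTTP/1.1" 200 1 "-" "-"',
    '112.64.235.245 - - [17/Mar/2015:22:26:56 -0400] "GET /d HTTP/1.1" 200 1 "-" "-"',
    '211.97.10.56 - - [17/Mar/2015:22:26:57 -0400] "GET /e HTTP/1.1" 200 1 "-" "-"',
    '14.23.95.98 - - [17/Mar/2015:22:26:58 -0400] "GET /f HTTP/1.1" 200 1 "-" "-"',
]

for ip, count in top_ips(demo_lines):
    print(f"{ip}\t{count}")

# With a real log file, pass the open file object instead:
#     with open("access.log") as log_file:
#         print(top_ips(log_file))
```

Because a file object yields its lines one at a time, this approach reads the log in a single pass without loading the whole file into memory.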
    
    
  • Original article: https://www.cnblogs.com/biby/p/15217697.html