• 2017-4-21 Text processing of a packet-capture file with Shell + Python


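        For my graduation project these past few days I needed to turn Modbus packets into hexadecimal form, but Wireshark was not much help here, or maybe I just have not found the trick yet. The text processing gave me a hard time: I ran into problems I had never thought about before, and it drove home that you only notice the gaps in your knowledge when you need it. I am writing these notes down for future use.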

    I. Directory layout

    [ root@ssd #] ls /tmp

    1.txt   10_BCD.sh   7.sh    get_final.py    README

    (1)[ root@ssd #] cat 1.txt  ## 1.txt is the raw packet-capture export

    No.     Time           Source                Destination           Protocol Length Info
        246 166.994531     192.168.1.100         192.168.1.101         Modbus/TCP 66        Query: Trans:     0; Unit:   1, Func:   3: Read Holding Registers
    
    Frame 246: 66 bytes on wire (528 bits), 66 bytes captured (528 bits) on interface 0
    Ethernet II, Src: HonHaiPr_65:5d:39 (1c:3e:84:65:5d:39), Dst: AskeyCom_1c:52:1e (e0:ca:94:1c:52:1e)
        Destination: AskeyCom_1c:52:1e (e0:ca:94:1c:52:1e)
            Address: AskeyCom_1c:52:1e (e0:ca:94:1c:52:1e)
            .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
            .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
        Source: HonHaiPr_65:5d:39 (1c:3e:84:65:5d:39)
            Address: HonHaiPr_65:5d:39 (1c:3e:84:65:5d:39)
            .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
            .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
        Type: IPv4 (0x0800)
    Internet Protocol Version 4, Src: 192.168.1.100, Dst: 192.168.1.101
        0100 .... = Version: 4
        .... 0101 = Header Length: 20 bytes (5)
        Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
        Total Length: 52
        Identification: 0x6971 (26993)
        Flags: 0x02 (Don't Fragment)
        Fragment offset: 0
        Time to live: 128
        Protocol: TCP (6)
        Header checksum: 0x0d39 [validation disabled]
        [Header checksum status: Unverified]
        Source: 192.168.1.100
        Destination: 192.168.1.101
        [Source GeoIP: Unknown]
        [Destination GeoIP: Unknown]
    Transmission Control Protocol, Src Port: 58708, Dst Port: 502, Seq: 1, Ack: 1, Len: 12
        Source Port: 58708
        Destination Port: 502
        [Stream index: 1]
        [TCP Segment Len: 12]
        Sequence number: 1    (relative sequence number)
        [Next sequence number: 13    (relative sequence number)]
        Acknowledgment number: 1    (relative ack number)
        Header Length: 20 bytes
        Flags: 0x018 (PSH, ACK)
        Window size value: 16425
        [Calculated window size: 65700]
        [Window size scaling factor: 4]
        Checksum: 0xb0f0 [unverified]
        [Checksum Status: Unverified]
        Urgent pointer: 0
        [SEQ/ACK analysis]
        [PDU Size: 12]
    Modbus/TCP
        Transaction Identifier: 0
        Protocol Identifier: 0
        Length: 6
        Unit Identifier: 1
    Modbus
        .000 0011 = Function Code: Read Holding Registers (3)
        Reference Number: 0
        Word Count: 10
    
    No.     Time           Source                Destination           Protocol Length Info
        247 167.015547     192.168.1.101         192.168.1.100         Modbus/TCP 83     Response: Trans:     0; Unit:   1, Func:   3: Read Holding Registers
    
    Frame 247: 83 bytes on wire (664 bits), 83 bytes captured (664 bits) on interface 0
    Ethernet II, Src: AskeyCom_1c:52:1e (e0:ca:94:1c:52:1e), Dst: HonHaiPr_65:5d:39 (1c:3e:84:65:5d:39)
        Destination: HonHaiPr_65:5d:39 (1c:3e:84:65:5d:39)
            Address: HonHaiPr_65:5d:39 (1c:3e:84:65:5d:39)
            .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
            .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
        Source: AskeyCom_1c:52:1e (e0:ca:94:1c:52:1e)
            Address: AskeyCom_1c:52:1e (e0:ca:94:1c:52:1e)
            .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
            .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
        Type: IPv4 (0x0800)
    Internet Protocol Version 4, Src: 192.168.1.101, Dst: 192.168.1.100
        0100 .... = Version: 4
        .... 0101 = Header Length: 20 bytes (5)
        Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
        Total Length: 69
        Identification: 0x1d8e (7566)
        Flags: 0x02 (Don't Fragment)
        Fragment offset: 0
        Time to live: 64
        Protocol: TCP (6)
        Header checksum: 0x990b [validation disabled]
        [Header checksum status: Unverified]
        Source: 192.168.1.101
        Destination: 192.168.1.100
        [Source GeoIP: Unknown]
        [Destination GeoIP: Unknown]
    Transmission Control Protocol, Src Port: 502, Dst Port: 58708, Seq: 1, Ack: 13, Len: 29
        Source Port: 502
        Destination Port: 58708
        [Stream index: 1]
        [TCP Segment Len: 29]
        Sequence number: 1    (relative sequence number)
        [Next sequence number: 30    (relative sequence number)]
        Acknowledgment number: 13    (relative ack number)
        Header Length: 20 bytes
        Flags: 0x018 (PSH, ACK)
        Window size value: 256
        [Calculated window size: 65536]
        [Window size scaling factor: 256]
        Checksum: 0xdaf5 [unverified]
        [Checksum Status: Unverified]
        Urgent pointer: 0
        [SEQ/ACK analysis]
        [PDU Size: 29]
    Modbus/TCP
        Transaction Identifier: 0
        Protocol Identifier: 0
        Length: 23
        Unit Identifier: 1
    Modbus
        .000 0011 = Function Code: Read Holding Registers (3)
        [Request Frame: 246]
        Byte Count: 20
        Register 0 (UINT16): 0
        Register 1 (UINT16): 0
        Register 2 (UINT16): 0
        Register 3 (UINT16): 1
        Register 4 (UINT16): 0
        Register 5 (UINT16): 0
        Register 6 (UINT16): 0
        Register 7 (UINT16): 0
        Register 8 (UINT16): 0
        Register 9 (UINT16): 0
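
The sizes reported for the two frames above can be cross-checked with a little arithmetic. The sketch below is my own addition in Python 3, and `frame_size` and the constants are names made up for illustration. The header sizes are read off the dump itself (14 bytes of Ethernet II, 20 bytes each of IPv4 and TCP, no options), so it only holds for captures shaped like this one.

```python
# Header sizes as they appear in the capture above (no IP/TCP options present).
ETHERNET_HEADER = 14   # dst MAC (6) + src MAC (6) + EtherType (2)
IPV4_HEADER = 20       # "Header Length: 20 bytes (5)"
TCP_HEADER = 20        # "Header Length: 20 bytes"
MBAP_PREFIX = 6        # transaction ID (2) + protocol ID (2) + length (2)


def frame_size(mbap_length):
    """Total bytes on the wire for a Modbus/TCP frame.

    mbap_length is the "Length" field of the Modbus/TCP header: it counts
    the unit identifier plus the Modbus PDU that follows it.
    """
    return ETHERNET_HEADER + IPV4_HEADER + TCP_HEADER + MBAP_PREFIX + mbap_length


# Query (frame 246): Length 6 = unit ID (1) + function (1) + reference (2) + word count (2)
print(frame_size(6))    # 66
# Response (frame 247): Length 23 = unit ID (1) + function (1) + byte count (1) + 10 registers * 2
print(frame_size(23))   # 83
```

Both totals match the "bytes on wire" values in the dump, and the 12 and 29 bytes after the TCP header match the two "PDU Size" lines.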
    

     

    (2)[ root@ssd #] cat 10_BCD.sh

    #!/bin/bash
    
    if [ ! -d test ];then
            mkdir test 
    fi
    
    grep -iA57 "Modbus/TCP 66 " *.txt |grep -iA8 "^Modbus/TCP" >test/b.txt
    cd test
    yum install dos2unix -y --quiet   ## files that come from Windows carry ^M (carriage return) line endings on Linux; installing dos2unix takes care of that
    dos2unix b.txt 
    
    cat b.txt |grep "Transaction" |awk -F ":" '{print $2}'|sed 's/^[ 	]*//g'> 111
    cat b.txt |grep "Prot" |awk -F ":" '{print $2}'|sed 's/^[ 	]*//g'> 222 
    cat b.txt |grep "Leng" |awk -F ":" '{print $2}'|sed 's/^[ 	]*//g'> 333    
    cat b.txt |grep "Unit Identifier" |awk -F ":" '{print $2}'|sed 's/^[ 	]*//g'> 444
    cat b.txt |grep "Function"|grep "Register" |awk -F ":" '{print $2}'|awk -F "(" '{print $2}'|awk -F ")" '{print $1}'> 555
    cat b.txt |grep "Refe" |awk -F ":" '{print $2}'|sed 's/^[ 	]*//g'> 666
    cat b.txt |grep "Word"|awk -F ":" '{print $2}'|sed 's/^[ 	]*//g'> 777
    
    if [ $? -eq 0 ];then 
        paste -d "," 111 222 333 444 555 666 777 > c.txt  
        sed -i '/,,/d' c.txt     
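        line_number=`cat c.txt | awk -F "," '{if ($NF==NULL)print NR}' `  ## line numbers of the rows whose last field is empty
        arr=($line_number)   ## turn the string into an array; a bare $arr means ${arr[0]}, the first element
        [ -n "$arr" ] && sed -i $arr',$d' c.txt  ## delete from that line to the end of the file (skipped when no such row exists); sed is awkward to drive from shell variables, and this command cost me a lot of time
        cd ..
        echo "==== The decimal results are in c.txt under the test directory ====!"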
    fi 
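
For comparison, here is roughly the same field extraction sketched in Python 3, without the seven temporary files. This is my own addition and not part of the original scripts: `extract_rows` and `FIELD_PATTERNS` are hypothetical names, and the sketch assumes the text export is laid out like `1.txt` above. Unlike the `paste` approach it keeps the fields of each packet together, and leaves a field empty when the packet does not contain it.

```python
import re

# One pattern per column of c.txt, in column order.
FIELD_PATTERNS = [
    re.compile(r"Transaction Identifier:\s*(\d+)"),
    re.compile(r"Protocol Identifier:\s*(\d+)"),
    re.compile(r"^Length:\s*(\d+)"),
    re.compile(r"Unit Identifier:\s*(\d+)"),
    re.compile(r"Function Code:.*\((\d+)\)"),
    re.compile(r"Reference Number:\s*(\d+)"),
    re.compile(r"Word Count:\s*(\d+)"),
]


def extract_rows(text):
    """Return one 7-field row per Modbus/TCP section; missing fields are ''."""
    rows = []
    current = None
    # splitlines() accepts both \n and \r\n, so Windows line endings
    # need no separate conversion step here.
    for raw in text.splitlines():
        line = raw.strip()
        if line == "Modbus/TCP":        # section header: start a new row
            current = [""] * len(FIELD_PATTERNS)
            rows.append(current)
            continue
        if current is None:
            continue
        if line.startswith("No."):      # summary header of the next packet
            current = None
            continue
        for i, pattern in enumerate(FIELD_PATTERNS):
            match = pattern.search(line)
            if match and current[i] == "":
                current[i] = match.group(1)
    return rows


sample = (
    "Modbus/TCP\r\n"
    "    Transaction Identifier: 0\r\n"
    "    Protocol Identifier: 0\r\n"
    "    Length: 6\r\n"
    "    Unit Identifier: 1\r\n"
    "Modbus\r\n"
    "    .000 0011 = Function Code: Read Holding Registers (3)\r\n"
    "    Reference Number: 0\r\n"
    "    Word Count: 10\r\n"
)
print(",".join(extract_rows(sample)[0]))   # 0,0,6,1,3,0,10
```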

    (3)[ root@ssd  # ]  cat get_final.py

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    import os
    import subprocess
    
    ## Step 1: extract the decimal fields into test/c.txt
    subprocess.call("/bin/bash 10_BCD.sh >/dev/null 2>&1", shell=True)
    
    def num_bcd(num):    ## decimal to hex, four digits (two bytes): 25 becomes "00,19,"
        a = '%04x' % num   ## zero-padded, so 10, 16 and 256 also come out right
        return a[:2] + ',' + a[2:] + ','
    
    def fun2(num): ## decimal to hex, two digits (one byte): 10 becomes "0a," rather than "00,0a,"
        return '%02x' % num + ','
    
    f = open('test/c.txt')
    contents = []
    for line in f.readlines():
        b = line.strip().split(",")  ## the line (a string) becomes a list of fields
        if len(b) != 7 or "" in b:  ## an empty or missing field means the frame is incomplete, so skip it
            continue
        for i in range(len(b)):
            b[i] = int(b[i])
            if i == 3 or i == 4: ## the 4th and 5th numbers of the frame (unit ID, function code) keep only 2 digits
                contents.append(fun2(b[i]))
            else:
                contents.append(num_bcd(b[i]))
    f.close()
    
    filename = 'new.ini'
    fobj = open(filename, 'w')
    fobj.writelines(['%s%s' % (eachline, os.linesep) for eachline in contents])  ## one converted field per line
    fobj.close()
    ## Step 3: let 7.sh join every 7 fields into one line of final_Result
    subprocess.call("/bin/bash 7.sh >/dev/null 2>&1", shell=True)
    print("The result is in the final_Result file!")
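
The boundary values are where slicing the output of `hex()` tends to go wrong (16 is `0x10`, 256 is `0x100`), so it is worth checking them explicitly. Below is a minimal sketch in Python 3 of a single helper that covers both the two-byte and the one-byte case. `to_hex_bytes` is a hypothetical name of my own, not something from the scripts above.

```python
def to_hex_bytes(value, width):
    """Format value as `width` big-endian bytes, two hex digits each, comma-terminated.

    to_hex_bytes(25, 2) gives "00,19," and to_hex_bytes(10, 1) gives "0a,".
    """
    if not 0 <= value < 256 ** width:
        raise ValueError("%d does not fit in %d byte(s)" % (value, width))
    digits = "%0*x" % (width * 2, value)      # zero-padded hex, e.g. "0019"
    pairs = [digits[i:i + 2] for i in range(0, len(digits), 2)]
    return ",".join(pairs) + ","


for n in (10, 16, 25, 256):
    print(n, to_hex_bytes(n, 2))
```

The range check makes an oversized value fail loudly instead of silently producing a row with the wrong number of bytes.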

    (4)[ root@ssd  # ]  cat 7.sh

    #!/bin/bash
    
    cat new.ini | awk -F "," '{if (NR%7!=0)ORS=" ";else ORS="\n";print}' >final_Result
    if [ -f new.ini ];then
        rm -f new.ini
    fi
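
What the awk one-liner does is join every 7 consecutive lines of new.ini with spaces and end the group with a newline. The same regrouping can be sketched in Python 3 as follows; `group_fields` is a hypothetical helper name, and this is only an illustration, not a replacement that the scripts above call.

```python
def group_fields(fields, per_row=7):
    """Join every `per_row` consecutive fields into one space-separated line."""
    rows = []
    for start in range(0, len(fields), per_row):
        rows.append(" ".join(fields[start:start + per_row]))
    return rows


fields = ["00,20,", "00,00,", "00,06,", "01,", "03,", "00,00,", "00,0a,"]
print(group_fields(fields)[0])   # 00,20, 00,00, 00,06, 01, 03, 00,00, 00,0a,
```

Doing this step in the Python script would make the intermediate new.ini file and the second shell script unnecessary.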

    (5)[ root@ssd  # ]  cat README

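    =================== Instructions ============================
    The .txt files are the raw packet captures!

    Note: just run python get_final.py; the resulting frames are saved in the final_Result file

    How it works:
    1. Running python get_final.py first calls 10_BCD.sh, which turns the raw capture into decimal values: seven small files are written to the test directory and then merged into c.txt
    2. The Python script itself converts decimal to hexadecimal, but the hexadecimal values of each group of 7 columns end up on separate lines
    3. Finally 7.sh is called to put each group of hexadecimal values on one line, which gives the final result, final_Result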

    II. Results

    [root@ssd modbus]# cat test/c.txt ## this is the format at the start
    32,0,6,1,3,0,10
    32,0,23,1,3,0,10
    33,0,6,1,3,0,10
    33,0,23,1,3,0,10
    34,0,6,1,3,0,10
    35,0,6,1,3,0,10
    36,0,6,1,3,0,10
    37,0,6,1,3,0,10
    34,0,23,1,3,0,10
    38,0,6,1,3,0,10

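    #32,0,6,1,3,0,,  # at first I could not get rid of rows like this one, with two commas and no number in between

    #42,0,6,1,3,0,,   # in the shell script, awk finds the matching line numbers, the result goes into the array arr, and sed deletes from that line to the end of the file: sed -i $arr',$d' c.txt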
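
The sed command cuts the file off at the first incomplete row, so any complete rows after it are lost as well. If only the incomplete rows should go, a per-row filter is one alternative. The following is a minimal sketch in Python 3 under that assumption; `drop_incomplete` is a hypothetical name and the sample rows are taken from the listing above.

```python
def drop_incomplete(rows, expected_fields=7):
    """Keep only CSV rows that have exactly `expected_fields` non-empty fields."""
    kept = []
    for row in rows:
        fields = row.strip().split(",")
        if len(fields) == expected_fields and all(fields):
            kept.append(row.strip())
    return kept


rows = [
    "32,0,6,1,3,0,10",
    "32,0,6,1,3,0,,",     # trailing empty fields: dropped
    "42,0,6,1,3,0,,",     # dropped
    "33,0,23,1,3,0,10",   # complete, kept even though it follows bad rows
]
print(drop_incomplete(rows))
```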

    [root@ssd modbus]# cat  final_Result   ## the result has to be in this hexadecimal form

    00,20, 00,00, 00,06, 01, 03, 00,00, 00,0a,
    00,20, 00,00, 00,17, 01, 03, 00,00, 00,0a,
    00,21, 00,00, 00,06, 01, 03, 00,00, 00,0a,
    00,21, 00,00, 00,17, 01, 03, 00,00, 00,0a,
    00,22, 00,00, 00,06, 01, 03, 00,00, 00,0a,
    00,23, 00,00, 00,06, 01, 03, 00,00, 00,0a,
    00,24, 00,00, 00,06, 01, 03, 00,00, 00,0a,
    00,25, 00,00, 00,06, 01, 03, 00,00, 00,0a,
    00,22, 00,00, 00,17, 01, 03, 00,00, 00,0a,
    00,26, 00,00, 00,06, 01, 03, 00,00, 00,0a,
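
Each line above is really just the 12 bytes of a Modbus/TCP query written out in hex, so the standard `struct` module can produce the same thing from a c.txt row in one call. This is a sketch of my own in Python 3, not the method used by the scripts in this post. `pack_query` and `as_final_line` are hypothetical names, and the field widths (three 16-bit fields, two 8-bit fields, two 16-bit fields, big-endian) are the ones implied by the output format shown here.

```python
import struct


def pack_query(row):
    """Pack one c.txt row into the 12 bytes that follow the TCP header.

    Row layout: transaction ID, protocol ID, length, unit ID, function code,
    reference number, word count. ">" selects big-endian with no padding.
    """
    values = [int(field) for field in row.split(",")]
    return struct.pack(">HHHBBHH", *values)


def as_final_line(payload):
    """Render bytes in the final_Result layout ("hh,hh," or "hh," per field)."""
    hex_pairs = ["%02x" % byte for byte in bytearray(payload)]
    sizes = [2, 2, 2, 1, 1, 2, 2]      # bytes per field, in row order
    parts, pos = [], 0
    for size in sizes:
        parts.append(",".join(hex_pairs[pos:pos + size]) + ",")
        pos += size
    return " ".join(parts)


print(as_final_line(pack_query("32,0,6,1,3,0,10")))
# 00,20, 00,00, 00,06, 01, 03, 00,00, 00,0a,
```

`struct.pack` raises `struct.error` when a value does not fit its field, which catches malformed rows early.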

    Official site: http://www.xiguagongzi.cn/
  • Original post: https://www.cnblogs.com/yue-hong/p/6698561.html