• TCP/IP Checksum 吐槽


    算法原理:

    假定 output[2] 为输出结果,input[n]为待计算校验和的内存块。

    1)所有奇数位[0,2,4……] byte 累加进 结果的奇数位内存 output[0],如果溢出,则进位给偶数位的 output[1];

    2)所有偶数位[1,3,5……] byte 累加进 结果的偶数位内存 output[1],如果溢出,则进位给奇数位的 output[0];

    3)最后对 output[2] 求反码即可

    示例代码

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    import struct
    import sys
    
    def ip_cksum(s):
        
        a = 0
        b = 0
    
        #  偶数序号的  unsigned char 互相累加
        for i in xrange(0, len(s), 2):
            a += struct.unpack('B', s[i])[0]
    
        # 奇数序号的 unsigned char 互相累加
        for i in xrange(1, len(s), 2):
            b += struct.unpack('B', s[i])[0]
    
    
        # 缩小值为 unsigned char
        while a > 256 or b > 256:
            b += a/256 # a 超过 byte 的部分进位给 b
            a = a%256
    
            a += b/256 # b 超过 byte 的部分进位给 a
            b = b%256
    
        # 取反
        a = ~a & 0xff
        b = ~b & 0xff
        
        # 校验和作为字符串
        v = chr(a) + chr(b)
        
        # 校验和作为 unsigned short
        v = struct.unpack('H', v)[0]
        
        return v
    
    
    if __name__ == '__main__':
        for i in sys.argv[1:]:
            print ip_cksum(i)
    View Code

    关于TCP/IP 校验和计算的代码,网上很多,但不少都有些问题,这里作一番简单分析

    1.最尾部 byte 处理依赖机序

    来自 http://locklessinc.com/articles/tcp_checksum/ 的 C 代码片段:

     1 unsigned short checksum1(const char *buf, unsigned size)
     2 {
     3     unsigned sum = 0;
     4     int i;
     5 
     6     /* Accumulate checksum */
     7     for (i = 0; i < size - 1; i += 2)
     8     {
     9         unsigned short word16 = *(unsigned short *) &buf[i];
    10         sum += word16;
    11     }
    12 
    13     /* Handle odd-sized case */
    14     if (size & 1)
    15     {
    16         unsigned short word16 = (unsigned char) buf[i];
    17         sum += word16;
    18     }
    19 
    20     /* Fold to get the ones-complement result */
    21     while (sum >> 16) sum = (sum & 0xFFFF)+(sum >> 16);
    22 
    23     /* Invert to get the negative in ones-complement arithmetic */
    24     return ~sum;
    25 }

    注意第16行,对于buffer 长度非偶数情况的处理, 导致此代码只可在 Little-Endian (如x86) 机器上运行。只需对最后一个 byte 补一个’'的 byte,凑够两个 byte 然后转为 unsinged short 相加即可。

    2.多内存块的计算

    来自 python 网络包创建、解析库 dpkt 的代码 dpkt.py

     1 try:
     2     import dnet
     3     def in_cksum_add(s, buf):
     4         return dnet.ip_cksum_add(buf, s)
     5     def in_cksum_done(s):
     6         return socket.ntohs(dnet.ip_cksum_carry(s))
     7 except ImportError:
     8     import array
     9     def in_cksum_add(s, buf):
    10         n = len(buf)
    11         cnt = (n / 2) * 2
    12         a = array.array('H', buf[:cnt])
    13         if cnt != n:
    14             a.append(struct.unpack('H', buf[-1] + 'x00')[0])
    15         return s + sum(a)
    16     def in_cksum_done(s):
    17         s = (s >> 16) + (s & 0xffff)
    18         s += (s >> 16)
    19         return socket.ntohs(~s & 0xffff)

    它这里会有两个实现,一个是调用dnet库的实现(见2-6行),一个是用python自己实现的版本(见8-19行)。
    dnet 库是 C 实现的一个库,但和 dpkt 库是同一个作者,这里都有一个共同的问题:对于 in_cksum_add 进的内存块,如果为奇数长度,则尾部会追加一个byte 'x00' (见14行),这里就导致了问题。其实呢,尾部的那个 byte 应该留给下一个接下来的内存块一起计算,当且仅当所有的内存块都处理完毕(即 in_cksum_done 时),多余一个 byte 时才该追加 byte 'x00'。

    3.经典的实现

    来自 wireshark 的 in_cksum.c

      1 /*
      2  * Checksum routine for Internet Protocol family headers (Portable Version).
      3  *
      4  * This routine is very heavily used in the network
      5  * code and should be modified for each CPU to be as fast as possible.
      6  */
      7 
      8 #define ADDCARRY(x)  {if ((x) > 65535) (x) -= 65535;}
      9 #define REDUCE {l_util.l = sum; sum = l_util.s[0] + l_util.s[1]; ADDCARRY(sum);}
     10 
     11 int
     12 in_cksum(const vec_t *vec, int veclen)
     13 {
     14     register const guint16 *w;
     15     register int sum = 0;
     16     register int mlen = 0;
     17     int byte_swapped = 0;
     18 
     19     union {
     20         guint8    c[2];
     21         guint16    s;
     22     } s_util;
     23     union {
     24         guint16 s[2];
     25         guint32    l;
     26     } l_util;
     27 
     28     for (; veclen != 0; vec++, veclen--) {
     29         if (vec->len == 0)
     30             continue;
     31         w = (const guint16 *)(const void *)vec->ptr;
     32         if (mlen == -1) {
     33             /*
     34              * The first byte of this chunk is the continuation
     35              * of a word spanning between this chunk and the
     36              * last chunk.
     37              *
     38              * s_util.c[0] is already saved when scanning previous
     39              * chunk.
     40              */
     41             s_util.c[1] = *(const guint8 *)w;
     42             sum += s_util.s;
     43             w = (const guint16 *)(const void *)((const guint8 *)w + 1);
     44             mlen = vec->len - 1;
     45         } else
     46             mlen = vec->len;
     47         /*
     48          * Force to even boundary.
     49          */
     50         if ((1 & (unsigned long) w) && (mlen > 0)) {
     51             REDUCE;
     52             sum <<= 8;
     53             s_util.c[0] = *(const guint8 *)w;
     54             w = (const guint16 *)(const void *)((const guint8 *)w + 1);
     55             mlen--;
     56             byte_swapped = 1;
     57         }
     58         /*
     59          * Unroll the loop to make overhead from
     60          * branches &c small.
     61          */
     62         while ((mlen -= 32) >= 0) {
     63             sum += w[0]; sum += w[1]; sum += w[2]; sum += w[3];
     64             sum += w[4]; sum += w[5]; sum += w[6]; sum += w[7];
     65             sum += w[8]; sum += w[9]; sum += w[10]; sum += w[11];
     66             sum += w[12]; sum += w[13]; sum += w[14]; sum += w[15];
     67             w += 16;
     68         }
     69         mlen += 32;
     70         while ((mlen -= 8) >= 0) {
     71             sum += w[0]; sum += w[1]; sum += w[2]; sum += w[3];
     72             w += 4;
     73         }
     74         mlen += 8;
     75         if (mlen == 0 && byte_swapped == 0)
     76             continue;
     77         REDUCE;
     78         while ((mlen -= 2) >= 0) {
     79             sum += *w++;
     80         }
     81         if (byte_swapped) {
     82             REDUCE;
     83             sum <<= 8;
     84             byte_swapped = 0;
     85             if (mlen == -1) {
     86                 s_util.c[1] = *(const guint8 *)w;
     87                 sum += s_util.s;
     88                 mlen = 0;
     89             } else
     90                 mlen = -1;
     91         } else if (mlen == -1)
     92             s_util.c[0] = *(const guint8 *)w;
     93     }
     94     if (mlen == -1) {
     95         /* The last mbuf has odd # of bytes. Follow the
     96            standard (the odd byte may be shifted left by 8 bits
     97            or not as determined by endian-ness of the machine) */
     98         s_util.c[1] = 0;
     99         sum += s_util.s;
    100     }
    101     REDUCE;
    102     return (~sum & 0xffff);
    103 }

    1)92行是当前内存块还余一个 byte ,则会 s_util 等待下个内存卡再处理——恰当的处理前面提到的第二个问题

    2)94行是所有内存块处理完毕后,对尾部最后一个 byte 的处理 ——恰当的处理了前面提到的第一个问题

    3)看点:指针非对齐的情况下处理

    50行会先将未对其的1个 byte 暂存,这样可迫使指针对齐,但又为了让同奇位、同偶位内存相加,所以使 sum<<8;81行,如果前面sum是已经左移过的,则再次 sum<<8,让sum回归最初的奇偶次序

    注:REDUCE 宏实现的功能是将大于 short 的值(即大于65535)转化为 short 能表示的值.

  • 相关阅读:
    Html 回顾
    Parallel 并行编程
    Task---常用的多线程(基于多线程线程)
    Threadpool 可以对线程加以管理的封装
    AsyncThreads---异步多线程
    Abstract 封装,继承,多态
    IO&&Serize 利用线程Thread.Sleep实现"自动输出"
    Ling && Lambda
    Delegate&&Event
    Delegate &&Lambda
  • 原文地址:https://www.cnblogs.com/JesseFang/p/3150344.html
Copyright © 2020-2023  润新知