• [Python Cookbook] Accessing substrings in Python


    afield = theline[3:8]


    If you also need to think in terms of field lengths, struct.unpack may be a better fit.

    import struct
    # Get a 5-byte string, skip 3 bytes, get two 8-byte strings, then the rest
    baseformat = "5s 3x 8s 8s"
    # How many bytes theline exceeds the length implied by this base format
    numremain = len(theline) - struct.calcsize(baseformat)
    # Complete the format with the appropriate 's' or 'x' field, then unpack
    format = "%s %ds" % (baseformat,numremain)
    l,s1,s2,t = struct.unpack(format,theline)

    >>> theline = "numremain = len(theline) - struct.calcsize(baseformat)"
    >>> numremain = len(theline) - struct.calcsize(baseformat)
    >>> format = "%s %ds" % (baseformat,numremain)
    >>> format
    '5s 3x 8s 8s 30s'
    >>> l,s1,s2,t = struct.unpack(format,theline)
    >>> l
    'numre'
    >>> s1
    'n = len('
    >>> s2
    'theline)'
    >>> t
    ' - struct.calcsize(baseformat)'
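    The session above is Python 2, where str is a byte string. As a rough sketch, assuming Python 3 (where struct only accepts bytes), the same unpacking might look like this; the name fmt is my own choice, used to avoid shadowing the built-in format.

```python
import struct

# Assumption: Python 3, so the input line must be bytes rather than str.
theline = b"numremain = len(theline) - struct.calcsize(baseformat)"

# 5-byte string, skip 3 bytes, two 8-byte strings
baseformat = "5s 3x 8s 8s"
# Bytes left over once the fixed-width fields are accounted for
numremain = len(theline) - struct.calcsize(baseformat)
# Append one final 's' field that swallows the remainder
fmt = "%s %ds" % (baseformat, numremain)

l, s1, s2, t = struct.unpack(fmt, theline)
print(fmt)
print(l, s1, s2, t)
```

    Each field comes back as bytes; call .decode() on a field if you need str.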


    pieces = [theline[k:k+n] for k in xrange(0,len(theline),n)]
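    To see what the list comprehension above produces, here is a small sketch, assuming Python 3 (range instead of xrange). The helper name chunks and the sample string are made up for illustration.

```python
def chunks(theline, n):
    """Split theline into consecutive pieces of n characters each.

    The last piece may be shorter than n if len(theline) is not a multiple of n.
    """
    return [theline[k:k + n] for k in range(0, len(theline), n)]


pieces = chunks("abcdefghij", 4)
print(pieces)  # ['abcd', 'efgh', 'ij']
```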

    If you want to cut the data at specified column positions, slicing inside a list comprehension (LC) is an easy way to do it:

    cuts = [8,14,20,26,30]
    pieces = [ theline[i:j] for i,j in zip([0]+cuts,cuts+[None])]


    zip returns a list of pairs; the first and last pairs are (0,cuts[0]) and (cuts[len(cuts)-1],None).
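    To make the pairing concrete, the following sketch prints the (start, end) pairs and the resulting pieces. It assumes Python 3, where zip returns an iterator (hence list()); the 36-character sample line is made up.

```python
# Hypothetical sample line: 26 letters followed by 10 digits (36 characters)
theline = "abcdefghijklmnopqrstuvwxyz0123456789"
cuts = [8, 14, 20, 26, 30]

# Pair each start offset with the following cut; None means "to the end"
pairs = list(zip([0] + cuts, cuts + [None]))
pieces = [theline[i:j] for i, j in pairs]

print(pairs)
print(pieces)
```

    Because theline[30:None] means the same as theline[30:], the last piece picks up whatever remains after the final cut.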



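    def fields(baseformat,theline,lastfield=False):
        # How many bytes theline exceeds the length implied by this base format
        # (struct.calcsize computes that length)
        numremain = len(theline)-struct.calcsize(baseformat)
        # Complete the format with the appropriate 's' or 'x' field, then unpack
        format = "%s %d%s" % (baseformat,numremain,lastfield and "s" or "x")
        return struct.unpack(format,theline)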


    def fields(baseformat,theline,lastfield=False,_cache={ }):
        key = baseformat,len(theline),lastfield
        format = _cache.get(key)
        if format is None:
            # No cached format string yet: build it and cache it
            numremain = len(theline) - struct.calcsize(baseformat)
            _cache[key] = format = "%s %d%s" % (
                baseformat,numremain,lastfield and "s" or "x")
        return struct.unpack(format,theline)

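    The Cookbook says this version is 30% to 40% faster than the unoptimized one. If this code is not a bottleneck, though, there is no need for this approach.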
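    If you want to check the speedup on your own machine, a rough timing sketch with timeit might look like the following. It assumes Python 3 (bytes input); the name fields_cached, the sample line, and the repeat count are my own choices, and the numbers you get will vary and may not match the Cookbook's figure.

```python
import struct
import timeit


def fields(baseformat, theline, lastfield=False):
    # Rebuild the full format string on every call
    numremain = len(theline) - struct.calcsize(baseformat)
    fmt = "%s %d%s" % (baseformat, numremain, "s" if lastfield else "x")
    return struct.unpack(fmt, theline)


def fields_cached(baseformat, theline, lastfield=False, _cache={}):
    # The mutable default argument is deliberate: it persists across calls
    key = baseformat, len(theline), lastfield
    fmt = _cache.get(key)
    if fmt is None:
        numremain = len(theline) - struct.calcsize(baseformat)
        _cache[key] = fmt = "%s %d%s" % (
            baseformat, numremain, "s" if lastfield else "x")
    return struct.unpack(fmt, theline)


# Hypothetical sample input (Python 3: struct needs bytes)
line = b"numremain = len(theline) - struct.calcsize(baseformat)"
base = "5s 3x 8s 8s"

plain = timeit.timeit(lambda: fields(base, line, True), number=20000)
cached = timeit.timeit(lambda: fields_cached(base, line, True), number=20000)
print("plain: %.4fs  cached: %.4fs" % (plain, cached))
```

    Both functions should return the same tuples; only the cost of building the format string differs.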


    def split_by(theline,n,lastfield=False):
        pieces = [theline[k:k+n] for k in xrange(0,len(theline),n)]
        # Drop the last piece if it is too short and not required
        if not lastfield and len(pieces[-1]) < n:
            pieces.pop()
        return pieces
    def split_at(theline,cuts,lastfield=False):
        pieces = [ theline[i:j] for i,j in zip([0]+cuts,cuts+[None])]
        # Drop the last piece if it is not required
        if not lastfield:
            pieces.pop()
        return pieces


    def split_at(the_line,cuts,lastfield=False):
        last = 0
        for cut in cuts:
            yield the_line[last:cut]
            last = cut
        if lastfield:
            yield the_line[last:]
    def split_by(the_line,n,lastfield=False):
        return split_at(the_line,xrange(n,len(the_line),n),lastfield)
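    A short usage sketch for the generator versions, assuming Python 3 (range in place of xrange). The sample strings are made up. Since the functions return generators, wrap the result in list() to see all the pieces at once.

```python
def split_at(the_line, cuts, lastfield=False):
    """Yield the pieces of the_line between consecutive cut positions."""
    last = 0
    for cut in cuts:
        yield the_line[last:cut]
        last = cut
    if lastfield:
        # Only yield the tail after the final cut when asked to
        yield the_line[last:]


def split_by(the_line, n, lastfield=False):
    """Yield n-character pieces by cutting at n, 2n, 3n, ..."""
    return split_at(the_line, range(n, len(the_line), n), lastfield)


print(list(split_by("abcdefghij", 4)))             # ['abcd', 'efgh']
print(list(split_by("abcdefghij", 4, True)))       # ['abcd', 'efgh', 'ij']
print(list(split_at("abcdefghij", [2, 5], True)))  # ['ab', 'cde', 'fghij']
```

    One difference from the list-based version: here everything after the last cut counts as the last field, so split_by("abcdefgh", 4) yields only 'abcd' unless lastfield is true.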



    This function returns a list of tuples, where the i-th tuple contains the i-th element from each of the argument sequences or iterables. The returned list is truncated in length to the length of the shortest argument sequence. When there are multiple arguments which are all of the same length, zip() is similar to map() with an initial argument of None. With a single sequence argument, it returns a list of 1-tuples. With no arguments, it returns an empty list.

    The left-to-right evaluation order of the iterables is guaranteed. This makes possible an idiom for clustering a data series into n-length groups using zip(*[iter(s)]*n).
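    The clustering idiom is easier to follow with a concrete run. This is a small sketch assuming Python 3, where zip returns an iterator rather than a list (hence the list() call); the sample data is made up.

```python
s = [1, 2, 3, 4, 5, 6, 7]
n = 3

# [iter(s)] * n is a list holding the SAME iterator n times, so zip pulls
# n consecutive items from it to build each tuple.
groups = list(zip(*[iter(s)] * n))

print(groups)  # [(1, 2, 3), (4, 5, 6)]
```

    The trailing 7 is dropped because zip stops as soon as it cannot fill a complete tuple.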

    zip() in conjunction with the * operator can be used to unzip a list:

    >>> x = [1, 2, 3]
    >>> y = [4, 5, 6]
    >>> zipped = zip(x, y)
    >>> zipped
    [(1, 4), (2, 5), (3, 6)]
    >>> x2, y2 = zip(*zipped)
    >>> x == list(x2) and y == list(y2)
    True
    >>> x2
    (1, 2, 3)
    >>> y2
    (4, 5, 6)


    For how generators are used, see this blog post: http://www.cnblogs.com/cacique/archive/2012/02/24/2367183.html

  • Original article: https://www.cnblogs.com/cacique/p/2603640.html