• Python字符串操作


    Python 如何判断一个字符串只包含数字字符

    Q:如何判断一个字符串只包含数字字符

    A:一种方法是 a.isdigit()。但这种方法对于包含正负号的数字字符串无效,因此更为准确的为:

        try:
            x = int(aPossibleInt)
            ... do something with x ...
        except ValueError:
            ... do something else ...

    这样更准确一些,适用性也更广。但如果你已经确信没有正负号,使用字符串的isdigit()方法则更为方便。

    Python 字符串比较

     

    Python 字符串简单比较

    简单比较是用内置函数 cmp() 来比较两个字符串:

    Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)] on
    win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> a = "abc"
    >>> b = "abc"
    >>> cmp(a, b)
    0
    >>> c = "def"
    >>> cmp(a,c)
    -1
    >>>

    Python字符串比较忽略大小写

     

    正则表达式,使用IGNORECASE标志

    >
    >>> import re
    >>> m = re.search('multi', 'A mUltiCased string', re.IGNORECASE)
    >>> bool(m)
    True

    在比较前把2个字符串转换成同样大写

    在比较前把2个字符串转换成同样大写,用upper()方法,或小写,lower()
    >>> s = 'A mUltiCased string'.lower()
    >>> s
    'a multicased string'
    >>> s.find('multi')
    2

    python 字符串高级比较

    使用python库difflib可以实现两个字符串的比较,找到相同的部分

    Python difflib|SequenceMatcher|Differ|HtmlDiff 使用方法

     

    介绍

    difflib 是python提供的比较序列(string list)差异的模块。 
    实现了三个类: 

    • SequenceMatcher 任意类型序列的比较 (可以比较字符串)
    • Differ 对字符串进行比较
    • HtmlDiff 将比较结果输出为html格式

    SequenceMatcher 实例

     

    代码:

    import difflib
    from pprint import pprint
     
    a = 'pythonclub.org is wonderful'
    b = 'Pythonclub.org also wonderful'
    #构造SequenceMatcher类
    s = difflib.SequenceMatcher(None, a, b)
     
    #得到相同的block
    print "s.get_matching_blocks():"
    pprint(s.get_matching_blocks())
    print 
    print "s.get_opcodes():"
    for tag, i1, i2, j1, j2 in s.get_opcodes():
        print ("%7s a[%d:%d] (%s) b[%d:%d] (%s)" %  (tag, i1, i2, a[i1:i2], j1, j2, b[j1:j2]))
        #在此实现你的功能

    输出为:

    s.get_matching_blocks():
    [(1, 1, 14), (16, 17, 1), (17, 19, 10), (27, 29, 0)]
    
    s.get_opcodes():
    replace a[0:1] (p) b[0:1] (P)
      equal a[1:15] (ythonclub.org ) b[1:15] (ythonclub.org )
    replace a[15:16] (i) b[15:17] (al)
      equal a[16:17] (s) b[17:18] (s)
     insert a[17:17] () b[18:19] (o)
      equal a[17:27] ( wonderful) b[19:29] ( wonderful)

    SequenceMatcher find_longest_match BUG

    import difflib
     
    str1 = "Poor Impulse Control: A Good Babysitter Is Hard To Find"
     
    str2 = """     A Good Babysitter Is Hard To Find    This is Frederick
    by Leo Lionni, the first book I picked for myself.
    I was in kindergarten, I believe, which would be either 1968 or 1969.
    Frederick has a specific lesson for children about how art is as
    important in life as bread, but there's a secondary consideration
    I took away: if we pool our talents our lives are immeasurably better.
    Curiously, this book is the story of my life, however one interprets
    those things. I expect Mickey Rooney to show up any time with a barn
    and a plan for a show, though my mom is not making costumes. My sisters
    own a toy store with a fantastic selection of imaginative children's books.
    I try not to open them because I can't close them and put them back.
    My tantrums are setting a bad example for the kids. Anyway, I mention
    this because yesterday was Mr. Rogers' 40th anniversary. I appreciate
    the peaceful gentleman more as time passes, as I play with finger puppets
    in department meetings, as I eye hollow trees for Lady Elaine Fairchild
    infestations. Maybe Pete can build me trolley tracks!Labels: To Take
    Your Heart Away   """
     
    s = difflib.SequenceMatcher(None, str1, str2)
    print len(str1), len(str2)
    star_a, start_b, length = s.find_longest_match(0, len(str1)-1, 0, len(str2)-1)
    print star_a, start_b, length
    print str1[star_a:star_a + length]

    输出结果为:

    55 1116
    0 1048 1
    P
    
    版本为:
    Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)] on
    win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>>

    而最长的应该为 A Good Babysitter Is Hard To Find.

    解决方法

    将 str1 于 str2 交换一下, len(str1) > len(str2). 
    则输出结果是想得到的结果。 

    下面列出了常用的python实现的字符串操作

    1.复制字符串

    #strcpy(sStr1,sStr2)
    sStr1 = 'strcpy'
    sStr2 = sStr1
    sStr1 = 'strcpy2'
    print sStr2

    2.连接字符串

    #strcat(sStr1,sStr2)
    sStr1 = 'strcat'
    sStr2 = 'append'
    sStr1 += sStr2
    print sStr1

    3.查找字符

    #strchr(sStr1,sStr2)
    sStr1 = 'strchr'
    sStr2 = 'r'
    nPos = sStr1.index(sStr2)
    print nPos

    4.比较字符串

    #strcmp(sStr1,sStr2)
    sStr1 = 'strchr'
    sStr2 = 'strch'
    print cmp(sStr1,sStr2)

    5.扫描字符串是否包含指定的字符

    #strspn(sStr1,sStr2)
    sStr1 = '12345678'
    sStr2 = '456'
    #sStr1 and chars both in sStr1 and sStr2
    print len(sStr1 and sStr2)

    6.字符串长度

    #strlen(sStr1)
    sStr1 = 'strlen'
    print len(sStr1)

    7.将字符串中的小写字符转换为大写字符

    #strlwr(sStr1)
    sStr1 = 'JCstrlwr'
    sStr1 = sStr1.upper()
    print sStr1

    8.追加指定长度的字符串

    #strncat(sStr1,sStr2,n)
    sStr1 = '12345'
    sStr2 = 'abcdef'
    n = 3
    sStr1 += sStr2[0:n]
    print sStr1

    9.字符串指定长度比较

    #strncmp(sStr1,sStr2,n)
    sStr1 = '12345'
    sStr2 = '123bc'
    n = 3
    print cmp(sStr1[0:n],sStr2[0:n])

    10.复制指定长度的字符

    #strncpy(sStr1,sStr2,n)
    sStr1 = ''
    sStr2 = '12345'
    n = 3
    sStr1 = sStr2[0:n]
    print sStr1

    11.字符串比较,不区分大小写

    #stricmp(sStr1,sStr2)
    sStr1 = 'abcefg'
    sStr2 = 'ABCEFG'
    print cmp(sStr1.upper(),sStr2.upper())

    12.将字符串前n个字符替换为指定的字符

    #strnset(sStr1,ch,n)
    sStr1 = '12345'
    ch = 'r'
    n = 3
    sStr1 = n * ch + sStr1[3:]
    print sStr1

    13.扫描字符串

    #strpbrk(sStr1,sStr2)
    sStr1 = 'cekjgdklab'
    sStr2 = 'gka'
    nPos = -1
    for c in sStr1:
        if c in sStr2:
            nPos = sStr1.index(c)
            break
    print nPos

    14.翻转字符串

    #strrev(sStr1)
    sStr1 = 'abcdefg'
    sStr1 = sStr1[::-1]
    print sStr1

    15.查找字符串

    python strstr

    #strstr(sStr1,sStr2)
    sStr1 = 'abcdefg'
    sStr2 = 'cde'
    print sStr1.find(sStr2)

    16.分割字符串

    #strtok(sStr1,sStr2)
    sStr1 = 'ab,cde,fgh,ijk'
    sStr2 = ','
    sStr1 = sStr1[sStr1.find(sStr2) + 1:]
    print sStr1
  • 相关阅读:
    sersync+rsync原理及部署
    rsync同步
    zabbix 3.2.2 agent端(源码包)安装部署
    zabbix配置
    Netbackup media server部署报错
    Oracle_rac命令
    Linux系统克隆为iso镜像盘(类似win gost)
    Nebackup清除磁带数据重新使用
    V7000初始化
    【数据案例】服务器崩溃后的数据恢复方法
  • 原文地址:https://www.cnblogs.com/zhongbin/p/3273048.html
Copyright © 2020-2023  润新知