Python built-in modules: re

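To use regular expressions in Python, you first need to import the re module. Regular expressions are a powerful tool that can save a great deal of string-handling work.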

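I. Metacharacters:

Metacharacters are symbols with a special meaning that stand for a particular kind of character or a position in the text.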

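. matches any character except a newline

\w matches a letter, digit, or underscore (with str patterns in Python 3 this also covers Chinese characters and other Unicode word characters)

\W matches any character that \w does not match

\s matches any whitespace character

\d matches a digit

\D matches any non-digit character

\b matches the beginning or end of a word

^ matches the beginning of the string; placed at the start of a character class, as in [^0-9], it negates the class instead

$ matches the end of the string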
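
The metacharacters above can be tried out with a short sketch like the following. The sample strings and variable names are made up for illustration, and the results assume Python 3 with ordinary str patterns.

```python
import re

# Hypothetical sample text: ASCII words, an underscore, digits, a newline, Chinese
text = "Hello world_42\n你好"

any_char = re.findall(r".", "a\nb")              # . skips the newline
digits = re.findall(r"\d", text)                 # every single digit
words = re.findall(r"\w+", text)                 # runs of word characters
spaces = re.findall(r"\s", text)                 # the space and the newline
boundary = re.findall(r"\bwor", "sword world")   # only at the start of a word
starts = re.search(r"^Hello", text) is not None  # anchored at the beginning
ends = re.search(r"你好$", text) is not None      # anchored at the end
non_digits = re.findall(r"[^\d]+", "ab12cd")     # ^ inside [] negates the class

print(any_char)    # ['a', 'b']
print(digits)      # ['4', '2']
print(words)       # ['Hello', 'world_42', '你好']
print(boundary)    # ['wor']
print(non_digits)  # ['ab', 'cd']
```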

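Repetition

* repeats zero or more times

+ repeats one or more times

? repeats zero or one time

{n} repeats exactly n times

{n,} repeats n or more times

{n,m} repeats between n and m times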
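
As a small sketch of how the quantifiers differ, the snippet below applies each of them to the letter b in the same made-up string. The \b anchors are only there so that each word is matched as a whole.

```python
import re

# Hypothetical sample: "a" followed by zero to four "b" characters
s = "a ab abb abbb abbbb"

star = re.findall(r"\bab*\b", s)         # b zero or more times
plus = re.findall(r"\bab+\b", s)         # b one or more times
optional = re.findall(r"\bab?\b", s)     # b zero or one time
exact = re.findall(r"\bab{2}\b", s)      # b exactly twice
at_least = re.findall(r"\bab{2,}\b", s)  # b two or more times
between = re.findall(r"\bab{2,3}\b", s)  # b two to three times

print(star)      # ['a', 'ab', 'abb', 'abbb', 'abbbb']
print(plus)      # ['ab', 'abb', 'abbb', 'abbbb']
print(optional)  # ['a', 'ab']
print(exact)     # ['abb']
print(at_least)  # ['abb', 'abbb', 'abbbb']
print(between)   # ['abb', 'abbb']
```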

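Ranges

[] matches a specified character class. A character class is the set of characters you want to match at one position; the characters in the set are alternatives, so any one of them matches.

[0-9] matches the digits 0 through 9, like \d restricted to ASCII digits

[a-z] matches all lowercase letters

[A-Z] matches all uppercase letters

[a-zA-Z] matches all letters

[a-z0-9A-Z] matches all letters and digits; adding the underscore, [a-zA-Z0-9_], gives the ASCII equivalent of \w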
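
The following sketch runs each character class against one made-up string. It also shows that \w accepts the underscore while [a-z0-9A-Z] does not.

```python
import re

# Hypothetical sample containing letters, digits, an underscore and punctuation
s = "Room_7B, floor 3"

digits = re.findall(r"[0-9]", s)        # single ASCII digits
lower = re.findall(r"[a-z]+", s)        # runs of lowercase letters
upper = re.findall(r"[A-Z]", s)         # single uppercase letters
letters = re.findall(r"[a-zA-Z]+", s)   # runs of letters
alnum = re.findall(r"[a-z0-9A-Z]+", s)  # letters and digits, no underscore
word = re.findall(r"\w+", s)            # \w also matches the underscore

print(digits)   # ['7', '3']
print(lower)    # ['oom', 'floor']
print(upper)    # ['R', 'B']
print(letters)  # ['Room', 'B', 'floor']
print(alnum)    # ['Room', '7B', 'floor', '3']
print(word)     # ['Room_7B', 'floor', '3']
```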

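Escaping

To match a metacharacter itself, or any other character that is special in a regular expression, escape it with a backslash. For example, use \* to match the * character and \\ to match the \ character.

Characters that need escaping: $, (, ), *, +, ., [, ], ?, \, ^, {, }, |

To avoid piling up backslashes, Python provides raw strings: prefix the string with an "r" and the backslashes in it are passed to the regular expression engine as they are, with no second level of escaping. It is therefore a good habit to always put an "r" in front of regular expression strings in Python.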
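
A minimal sketch of escaping, using a made-up sample string. It compares a raw-string pattern with the equivalent ordinary string, and also uses re.escape, a standard helper not mentioned in the text above, which escapes a literal string for you.

```python
import re

# Hypothetical sample; "C:\\temp" in source code is the text C:\temp
s = "cost: 3*4 (approx.) in C:\\temp"

stars = re.findall(r"\*", s)             # literal asterisk
dots = re.findall(r"\.", s)              # literal dot
parens = re.findall(r"\(", s)            # literal opening parenthesis
backslash_raw = re.findall(r"\\", s)     # raw string: two characters in the pattern
backslash_plain = re.findall("\\\\", s)  # ordinary string: four characters needed

# re.escape builds a pattern that matches the given text literally
literal = "1+1=2?"
escaped = re.escape(literal)
found = re.search(escaped, "so 1+1=2? yes")

print(stars, dots, parens)               # ['*'] ['.'] ['(']
print(backslash_raw == backslash_plain)  # True
print(found.group())                     # 1+1=2?
```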

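II. Functions of the re module

1. match

re.match('pattern', 'string') matches only at the beginning of the string. It returns a single match object, or None if the string does not start with a match.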
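
A short sketch of this behavior with made-up inputs: the same pattern matches when the digits come first and fails when they do not.

```python
import re

# Digits at the very start: a match object is returned
m1 = re.match(r"\d+", "123abc")
# Digits later in the string: match() does not scan ahead, so the result is None
m2 = re.match(r"\d+", "abc123")

print(m1.group())  # 123
print(m1.span())   # (0, 3)
print(m2)          # None
```

Because the result can be None, it is safer to test it (for example with "if m1:") before calling group().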

2. search

re.search('pattern', 'string') scans through the string and returns a match object for the first match found, or None if there is no match.
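
For comparison with match, a sketch of search on made-up inputs: it finds the first run of digits wherever it occurs and ignores later ones.

```python
import re

# The first run of digits starts at index 3; the later "456" is not reported
m = re.search(r"\d+", "abc123def456")
# No digits at all: the result is None
missing = re.search(r"\d+", "no digits here")

print(m.group())  # 123
print(m.start())  # 3
print(m.span())   # (3, 6)
print(missing)    # None
```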

3. findall

re.findall('pattern', 'string') scans through the string and returns all matches as a list.

>>> re.findall(r'\d+', 'dsg2335dhreh54623grh46fdh57')

['2335', '54623', '46', '57']

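4. group, groups

group() returns the whole match or a single numbered group of a match object; groups() returns all groups as a tuple.

import re

a = "123abc456"
# Three groups: leading digits, lowercase letters, trailing digits
m = re.search(r"([0-9]*)([a-z]*)([0-9]*)", a)
print(m.group())   # whole match: 123abc456
print(m.group(0))  # same as group(): 123abc456
print(m.group(1))  # first group: 123
print(m.group(2))  # second group: abc
print(m.groups())  # all groups: ('123', 'abc', '456')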

5. sub

re.sub(pattern, repl, string, count=0, flags=0) replaces the substrings that match pattern with repl.

>>> import re
>>> a = 'sfgwg323dgw13'
>>> b = re.sub(r'\d+', '111', a)
>>> b
'sfgwg111dgw111'
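
The count and flags parameters in the signature are not exercised by the example above. A minimal sketch, reusing the same string plus one made-up input:

```python
import re

a = 'sfgwg323dgw13'

# count=1 replaces only the first match; the default 0 replaces every match
first_only = re.sub(r'\d+', '111', a, count=1)

# flags=re.IGNORECASE lets [a-z] match uppercase letters as well
letters_replaced = re.sub(r'[a-z]+', '-', 'abcDEF123', flags=re.IGNORECASE)

print(first_only)        # sfgwg111dgw13
print(letters_replaced)  # -123
```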

6. split(pattern, string, maxsplit=0, flags=0) splits the string at each match of the pattern

import re

content = "'1 - 2 * ((60-30+1*(9-2*5/3+7/3*99/4*2998+10*568/14))-(-4*3)/(16-3*2) )'"
# Escape * so it is treated as a literal asterisk, not as a quantifier
new_content = re.split(r'\*', content)
# new_content = re.split(r'\*', content, maxsplit=1)
print(new_content)

content = "'1 - 2 * ((60-30+1*(9-2*5/3+7/3*99/4*2998+10*568/14))-(-4*3)/(16-3*2) )'"
# Split on runs of operators; - is escaped so it is not read as a range
new_content = re.split(r'[+\-*/]+', content)
# new_content = re.split(r'\*', content, maxsplit=1)
print(new_content)

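inpp = '1-2*((60-30 +(-40-5)*(9-2*5/3 + 7 /3*99/4*2998 +10 * 568/14 )) - (-4*3)/ (16-3*2))'
# Remove all whitespace first
inpp = re.sub(r'\s+', '', inpp)
# Split once around the first parenthesized "number operator number" expression;
# the capturing group keeps the expression itself in the result
new_content = re.split(r'\(([+\-*/]?\d+[+\-*/]?\d+){1}\)', inpp, maxsplit=1)
print(new_content)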
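
The split examples above work on long expressions, so the effect of a capturing group and of maxsplit can be hard to see. A smaller sketch on a made-up expression:

```python
import re

expr = "1+2*3-4"

# Without a capturing group the operators are dropped
parts = re.split(r"[+\-*/]", expr)
# With a capturing group the matched operators are kept in the result
with_ops = re.split(r"([+\-*/])", expr)
# maxsplit=1 stops after the first split
limited = re.split(r"[+\-*/]", expr, maxsplit=1)

print(parts)     # ['1', '2', '3', '4']
print(with_ops)  # ['1', '+', '2', '*', '3', '-', '4']
print(limited)   # ['1', '2*3-4']
```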


Original article: https://www.cnblogs.com/ernest-zhang/p/5634078.html