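Regular expressions are everywhere in everyday life:
Suppose we describe something as: four legs
You might think of four-legged animals, or of tables, chairs and so on
Narrow the description: four legs, alive
Now only four-legged animals are left; each added condition narrows the set of matches
2: Common matching patterns (metacharacters)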
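Run:
import re

print(re.findall(r'\w', 'aAbc123_*()-='))             # \w: letters, digits, underscore
print(re.findall(r'\W', 'aAbc123_*()-= '))            # \W: anything \w does not match
print(re.findall(r'\s', 'aA\nbc\t\n12\f3_*()-= '))    # \s: whitespace characters
print(re.findall(r'\S', 'aA\nbc\t\n12\f3_*()-= '))    # \S: non-whitespace characters
print(re.findall(r'\d', 'aA\nbc\t\n12\f3_*()-= '))    # \d: digits
print(re.findall(r'\D', 'aA\nbc\t\n12\f3_*()-= '))    # \D: non-digits
print(re.findall(r'\D', 'aA\nbc\t\n12\f3_*()-= '))
print(re.findall(r'\Aalex', ' alexis alex sb'))       # \A: only at the very start of the string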
Terminal:
['a', 'A', 'b', 'c', '1', '2', '3', '_']
['*', '(', ')', '-', '=', ' ']
['\n', '\t', '\n', '\x0c', ' ']
['a', 'A', 'b', 'c', '1', '2', '3', '_', '*', '(', ')', '-', '=']
['1', '2', '3']
['a', 'A', '\n', 'b', 'c', '\t', '\n', '\x0c', '_', '*', '(', ')', '-', '=', ' ']
['a', 'A', '\n', 'b', 'c', '\t', '\n', '\x0c', '_', '*', '(', ')', '-', '=', ' ']
[]
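The \A anchor returns an empty list above because the test string starts with a space. As a minimal sketch (the sample strings here are made up for illustration), the following contrasts \A with ^ under re.MULTILINE:

```python
import re

# Hypothetical sample text with two lines, each starting with "alex".
text = 'alex says hi\nalex again'

# \A anchors only at the very start of the whole string.
start_only = re.findall(r'\Aalex', text)

# ^ combined with re.MULTILINE also anchors at the start of every line.
each_line = re.findall(r'^alex', text, re.MULTILINE)

# A leading space defeats \A, so nothing is found.
with_leading_space = re.findall(r'\Aalex', ' alexis alex sb')

print(start_only)          # ['alex']
print(each_line)           # ['alex', 'alex']
print(with_leading_space)  # []
```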
Repetition matching: | . | * | ? | .* | .*? | + | {n,m} |
1. .: matches any single character except the newline \n; it matches a newline only when re.DOTALL is specified
print(re.findall(r'a.b', 'a1b a2b a b abbbb a\nb a\tb a*b'))
# ['a1b', 'a2b', 'a b', 'abb', 'a\tb', 'a*b']
print(re.findall(r'a.b', 'a1b a2b a b abbbb a\nb a\tb a*b', re.DOTALL))
# ['a1b', 'a2b', 'a b', 'abb', 'a\nb', 'a\tb', 'a*b']
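re.DOTALL can also be written as re.S, or switched on inside the pattern with the inline flag (?s). A small sketch, using a made-up test string, that checks the three spellings agree:

```python
import re

# Hypothetical test string: one "a?b" split by a newline, one by a digit.
text = 'a\nb a1b'

default = re.findall(r'a.b', text)             # . does not match '\n'
dotall = re.findall(r'a.b', text, re.DOTALL)   # . matches '\n' too
short = re.findall(r'a.b', text, re.S)         # re.S is an alias of re.DOTALL
inline = re.findall(r'(?s)a.b', text)          # inline flag, same effect

print(default)  # ['a1b']
print(dotall)   # ['a\nb', 'a1b']
print(dotall == short == inline)  # True
```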
2. *: the character to its left repeats zero or more times (no upper bound); greedy
print(re.findall('ab*','a ab abb abbbbbbbb bbbbbbbb'))
# ['a', 'ab', 'abb', 'abbbbbbbb']
3. +: the character to its left repeats one or more times (no upper bound); greedy
print(re.findall('ab+','a ab abb abbbbbbbb bbbbbbbb'))
# ['ab', 'abb', 'abbbbbbbb']
4. ?: the character to its left repeats zero times or once; greedy
print(re.findall('ab?', 'a ab abb abbbbbbbb bbbbbbbb'))
# ['a', 'ab', 'ab', 'ab']
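"Greedy" means a quantifier takes as many characters as it can while still letting the whole pattern match. Appending ? to a quantifier makes it non-greedy, which is the difference between .* and .*? in the list above. A minimal sketch with made-up input strings:

```python
import re

# Hypothetical input with two possible closing 'b' characters.
text = 'a123b456b'

greedy = re.findall(r'a.*b', text)    # runs to the last 'b'
lazy = re.findall(r'a.*?b', text)     # stops at the first 'b'

# The same suffix works on + and ? as well.
plus_greedy = re.findall(r'ab+', 'abbb')
plus_lazy = re.findall(r'ab+?', 'abbb')
opt_lazy = re.findall(r'ab??', 'abbb')

print(greedy)       # ['a123b456b']
print(lazy)         # ['a123b']
print(plus_greedy)  # ['abbb']
print(plus_lazy)    # ['ab']
print(opt_lazy)     # ['a']
```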
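5. {n,m}: the character to its left repeats between n and m times; greedy (you choose the range)
{0,} => *
{1,} => +
{0,1} => ?
{n} with a single n means exactly n repetitions, no more and no fewer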
print(re.findall('ab{2,5}', 'a ab abb abbb abbbb abbbbbbbb bbbbbbbb'))
# ['abb', 'abbb', 'abbbb', 'abbbbb']
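The equivalences listed above can be checked directly. The sketch below reuses the test string from the earlier examples. It also shows one caveat with {n}: findall will still find exactly n repetitions inside a longer run, so re.fullmatch is used here when the whole string must match.

```python
import re

text = 'a ab abb abbbbbbbb bbbbbbbb'

# Each shorthand quantifier paired with its brace form.
pairs = [
    (r'ab*', r'ab{0,}'),
    (r'ab+', r'ab{1,}'),
    (r'ab?', r'ab{0,1}'),
]
results = [re.findall(s, text) == re.findall(b, text) for s, b in pairs]
print(results)  # [True, True, True]

# {2} matches exactly two b's, but it can do so inside a longer run.
exact = re.findall(r'ab{2}', text)
print(exact)  # ['abb', 'abb']

# fullmatch requires the entire string to fit the pattern.
too_many = re.fullmatch(r'ab{2}', 'abbb')
just_right = re.fullmatch(r'ab{2}', 'abb')
print(too_many is None, just_right is not None)  # True True
```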
6. []: matches exactly one character from the specified set
Run:
print(re.findall(r'a\db', 'a1111111b a3b a4b a9b aXb a b a\nb', re.DOTALL))                            # one digit
print(re.findall(r'a[501234]b', 'a1111111b a3b a4b a9b aXb a b a\nb', re.DOTALL))                     # one of the listed digits
print(re.findall(r'a[0-5]b', 'a1111111b a3b a1b a0b a4b a9b aXb a b a\nb', re.DOTALL))                # a range
print(re.findall(r'a[0-9a-zA-Z]b', 'a1111111b axb a3b a1b a0b a4b a9b aXb a b a\nb', re.DOTALL))      # several ranges
print(re.findall(r'a[^0-9a-zA-Z]b', 'a1111111b axb a3b a1b a0b a4b a9b aXb a b a\nb', re.DOTALL))     # ^ negates the set
print(re.findall(r'a-b', 'a-b aXb a b a\nb', re.DOTALL))
print(re.findall(r'a[-0-9\n]b', 'a-b a0b a1b a8b aXb a b a\nb', re.DOTALL))                           # a leading - is literal
Terminal:
['a3b', 'a4b', 'a9b']
['a3b', 'a4b']
['a3b', 'a1b', 'a0b', 'a4b']
['axb', 'a3b', 'a1b', 'a0b', 'a4b', 'a9b', 'aXb']
['a b', 'a\nb']
['a-b']
['a-b', 'a0b', 'a1b', 'a8b', 'a\nb']
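Inside [], the position of - and ^ decides whether they are literal characters or operators. A short sketch with a made-up test string:

```python
import re

# Hypothetical test string covering '-', digits, '^' and a letter.
text = 'a-b a0b a5b a^b aXb'

# A hyphen placed first is a literal '-', followed by the range 0-9.
literal_hyphen = re.findall(r'a[-0-9]b', text)

# An escaped hyphen is literal too, so this set is just '0', '-' and '9'.
escaped_hyphen = re.findall(r'a[0\-9]b', text)

# ^ negates the set only when it is the first character.
negated = re.findall(r'a[^0-9]b', text)

# Anywhere else, ^ is an ordinary character.
literal_caret = re.findall(r'a[0-9^]b', text)

print(literal_hyphen)  # ['a-b', 'a0b', 'a5b']
print(escaped_hyphen)  # ['a-b', 'a0b']
print(negated)         # ['a-b', 'a^b', 'aXb']
print(literal_caret)   # ['a0b', 'a5b', 'a^b']
```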