Python 3 regular expressions: match simple web domain names that start with "www" and end with ".com", for example www.yahoo.com.
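import re
# The pattern means: start with "www"; ".+" in the middle matches one or more arbitrary characters;
# "\.com" escapes the dot so it matches a literal ".com", and "$" requires the string to end there.
patt = r'www.+\.com$'
m = re.match(patt, 'www.yahoo.com')
if m is not None:
    print(m.group())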
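import re
# re.match only tries the pattern at the beginning of the string.
patt = 'www'
m = re.match(patt, 'www.yahoo.com')
print(m)  # a match object covering 'www', span (0, 3)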
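The difference between matching a prefix and matching the whole string can be sketched as follows. This is only an illustration under a few assumptions: the names WWW_COM and is_www_com_domain are hypothetical, and the pattern is a slightly stricter variant that also requires a literal dot right after "www".

```python
import re

# Hypothetical pattern name. Both dots are escaped so they match a literal ".".
WWW_COM = re.compile(r'www\..+\.com')

def is_www_com_domain(name):
    """Return True if the whole string has the form www.<something>.com."""
    # fullmatch requires the pattern to cover the entire string,
    # so no explicit "^" or "$" anchors are needed.
    return WWW_COM.fullmatch(name) is not None

print(is_www_com_domain('www.yahoo.com'))     # True
print(is_www_com_domain('www.yahoo.com.cn'))  # False: does not end with .com
print(is_www_com_domain('wwwXyahooYcom'))     # False: the dots must be literal

# re.match only looks at the start of the string; re.search scans all of it.
print(re.match('yahoo', 'www.yahoo.com'))               # None
print(re.search('yahoo', 'www.yahoo.com') is not None)  # True
```

Using fullmatch avoids the common mistake of forgetting the trailing "$" when calling re.match.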
Check whether a string consists entirely of lowercase letters. Given the strings: s1 = 'adkkdk'
s2 = 'abc123efg'
In[2]: import re
In[3]: s1 = 'adkkdk'
In[4]: s2 = 'abc123efg'
In[6]: an = re.search('^[a-z]+$', s1)
In[7]: if an:
  ...:     print('s1:', an.group(), 'is all lowercase')
  ...: else:
  ...:     print(s1, 'is not all lowercase!')
  ...:
s1: adkkdk is all lowercase
In[8]: an = re.match('[a-z]+$', s2)
In[9]: if an:
  ...:     print('s2:', an.group(), 'is all lowercase')
  ...: else:
  ...:     print(s2, 'is not all lowercase')
  ...:
abc123efg is not all lowercase
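The same check can be wrapped in a small helper. The sketch below is one possible way to do it, not the only one: the function name is_all_lowercase is hypothetical, and it uses re.fullmatch instead of "^...$" anchors. It also prints str.islower() for comparison, because that built-in answers a slightly different question: it ignores digits and other uncased characters.

```python
import re

def is_all_lowercase(s):
    """Return True only if s is non-empty and every character is in a-z."""
    return re.fullmatch(r'[a-z]+', s) is not None

for s in ('adkkdk', 'abc123efg'):
    # Columns: the string, the regex-based check, the built-in str.islower()
    print(s, is_all_lowercase(s), s.islower())
# adkkdk True True
# abc123efg False True

# "$" also matches just before a trailing newline; fullmatch does not allow that.
print(re.match('[a-z]+$', 'abc\n') is not None)  # True
print(is_all_lowercase('abc\n'))                 # False
```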
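In natural language processing, splitting text on punctuation causes real trouble for a number such as 123,000,000: a perfectly good number gets chopped apart by its commas. It therefore helps to clean up the numbers first by removing the commas. Given the string sen = "abc,123,456,789,mnp"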
In[2]: import re
In[3]: sen = "abc,123,456,789,mnp"
In[4]: p = re.compile(r"\d+,\d+?")
In[5]: for com in p.finditer(sen):
  ...:     mm = com.group()
  ...:     print("hi:", mm)
  ...:     print("sen_before:", sen)
  ...:     sen = sen.replace(mm, mm.replace(",", ""))
  ...:     print("sen_back:", sen, ' ')
  ...:
hi: 123,4
sen_before: abc,123,456,789,mnp
sen_back: abc,123456,789,mnp
hi: 56,7
sen_before: abc,123456,789,mnp
sen_back: abc,123456789,mnp
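The loop above rebuilds the string one match at a time. An alternative, shown here only as a sketch and not as the method used above, is a single re.sub call with lookaround assertions, which removes a comma only when digits surround it. The function names strip_digit_commas and strip_thousands_commas are hypothetical; the second one assumes that a thousands separator is always followed by exactly three digits.

```python
import re

def strip_digit_commas(text):
    """Remove every comma that sits directly between two digits."""
    # (?<=\d) and (?=\d) check the neighbours without consuming them.
    return re.sub(r'(?<=\d),(?=\d)', '', text)

def strip_thousands_commas(text):
    """Remove a comma only if a digit precedes it and exactly three digits follow."""
    return re.sub(r'(?<=\d),(?=\d{3}(?!\d))', '', text)

sen = "abc,123,456,789,mnp"
print(strip_digit_commas(sen))          # abc,123456789,mnp
print(strip_thousands_commas(sen))      # abc,123456789,mnp

# The two differ on comma-separated lists of short numbers:
print(strip_digit_commas("1,2,3"))      # 123
print(strip_thousands_commas("1,2,3"))  # 1,2,3
```

The stricter variant is safer on text that may contain lists such as "1,2,3", at the cost of assuming a three-digit grouping convention.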