Please credit the source when reposting: http://www.cnblogs.com/pengwang52/
>>> import re
>>> p = re.compile(r'<div class="comment-content comment-content_new">([^x00-xff]*)</div>')
>>> text = '<div class="comment-content comment-content_new">测试</div> <div class="comment-content comment-content_new">学习正则</div>'
>>> for m in p.finditer(text):
...     print m.group(1)
...
测试
学习正则

With findall, printing the result shows the escaped bytes of the Chinese characters instead of the characters themselves, because print on a list displays the repr of each element:

>>> m = p.findall(text)
>>> print m
['\xe6\xb5\x8b\xe8\xaf\x95', '\xe5\xad\xa6\xe4\xb9\xa0\xe6\xad\xa3\xe5\x88\x99']
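For comparison, here is a minimal sketch of the same extraction in Python 3, where strings are Unicode by default. It assumes the character class was meant to be `[^\x00-\xff]` (any character outside U+0000 to U+00FF); the names `PATTERN`, `from_finditer` and `from_findall` are made up for this illustration and are not from the original session.

```python
import re

# Assumption: the intended class is "any character outside U+0000-U+00FF".
PATTERN = re.compile(
    r'<div class="comment-content comment-content_new">([^\x00-\xff]*)</div>'
)

text = (
    '<div class="comment-content comment-content_new">测试</div> '
    '<div class="comment-content comment-content_new">学习正则</div>'
)

# finditer yields match objects; group(1) is the captured text.
from_finditer = [m.group(1) for m in PATTERN.finditer(text)]

# findall returns the captured strings directly as a list.
from_findall = PATTERN.findall(text)

# Printing each element (not the list) shows the text itself; in Python 2,
# printing the whole list would show each element's escaped repr.
for item in from_findall:
    print(item)
```

Both calls capture the same strings; only the way the result is printed differs. In Python 2, printing the elements one at a time (or joining them into a single string first) should likewise show the characters rather than the byte escapes.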