Error message
TraitError: Could not decode 're.findall("\xe6\x9d\xa5\xe6\xba\x90\xef\xbc\x9a(.*)", web_source_info.encode("utf-8"))' for unicode trait '_i00' of a HistoryManager instance.
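Cause: the page source is not in the encoding the code assumed. The pattern was written as UTF-8 bytes, but the site serves GB2312.
Fix: decode the page source with the site's actual encoding first, then run the match on the decoded text.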
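import re
from scrapy.selector import Selector
# Decode the raw response bytes as GB2312 before building the selector.
root = Selector(type="html", text=response.body.decode('gb2312'))
web_source_info = ""  # placeholder: set this to the text extracted from root
web_source_arr = re.findall(u"来源:(.*)", web_source_info)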
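
For anyone who wants to try the idea without a live Scrapy response, here is a minimal sketch in plain Python 3. The sample HTML, the name `raw_body`, and the helper `extract_source` are all made up for illustration and are not from the original spider. It only shows the order of operations: decode the bytes as GB2312 first, then match with a text pattern.

```python
import re

# Hypothetical page body: the bytes a GB2312-encoded site might send.
raw_body = "<p>来源:新华网</p>".encode("gb2312")


def extract_source(body, encoding="gb2312"):
    """Decode the raw bytes to text, then match with a text pattern."""
    # errors="replace" is an assumption here, so that a stray bad byte
    # does not abort the whole match.
    text = body.decode(encoding, errors="replace")
    # Stop at the next tag so only the source name is captured.
    return re.findall(r"来源:([^<]*)", text)


print(extract_source(raw_body))
```

If the site declares a different charset (for example GBK or GB18030), pass that name as `encoding` instead.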