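Installing the Python BeautifulSoup module
Download page: http://www.crummy.com/software/BeautifulSoup/#Download
Documentation: http://www.crummy.com/software/BeautifulSoup/documentation.html
After downloading, extract the archive, change into the extracted directory, and run: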
python setup.py build
python setup.py install
To import the package, use:
import bs4
from bs4 import BeautifulSoup
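A quick way to confirm the installation worked is to import the package and parse a tiny document. The following is a minimal sketch; the sample markup is made up for illustration, and `html.parser` is Python's built-in parser, chosen here so that no extra dependency is assumed.

```python
import bs4
from bs4 import BeautifulSoup

# Print the installed version to confirm the import resolves to bs4.
print(bs4.__version__)

# Parse a tiny, made-up document with the standard-library parser.
soup = BeautifulSoup('<p>hello</p>', 'html.parser')
greeting = soup.p.get_text()
print(greeting)
```

If the import fails with ImportError, the install step most likely ran under a different Python interpreter than the one being used here.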
Scraping web page content with BeautifulSoup
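# coding=utf-8
from bs4 import BeautifulSoup
from urllib.parse import urlencode
from urllib.request import urlopen
import re

url = 'http://www.baidu.com/s'
values = {'wd': '渗透'}  # search keyword ("penetration")
encoded_param = urlencode(values)  # percent-encodes the keyword as UTF-8
full_url = url + '?' + encoded_param
response = urlopen(full_url)
# Name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(response, 'html.parser')
# Collect <a> tags whose href is absolute (http...) or root-relative (/...)
alinks = soup.find_all('a', href=re.compile('^http|^/'))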
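The listing stops once the matching tags are collected. As a rough sketch of a possible next step, the code below reads the href and the link text from each tag and resolves root-relative links against the page URL. The sample HTML and the helper name `extract_links` are hypothetical and only stand in for a downloaded page, so the example runs without network access.

```python
import re
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for a downloaded page.
SAMPLE_HTML = """
<html><body>
  <a href="http://example.com/a">Absolute</a>
  <a href="/s?wd=test">Root-relative</a>
  <a href="javascript:;">Script</a>
  <a>No href</a>
</body></html>
"""


def extract_links(html, base_url):
    """Return (absolute_url, link_text) pairs for http(s) and root-relative links."""
    soup = BeautifulSoup(html, 'html.parser')
    links = []
    # Same filter as in the listing: href starts with "http" or "/".
    for a in soup.find_all('a', href=re.compile('^http|^/')):
        # urljoin keeps absolute URLs as-is and resolves "/..." against base_url.
        links.append((urljoin(base_url, a['href']), a.get_text(strip=True)))
    return links


for link_url, text in extract_links(SAMPLE_HTML, 'http://www.baidu.com/s'):
    print(link_url, text)
```

Tags without an href attribute and links such as `javascript:;` do not match the pattern, so they are skipped.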