URLs to scrape:
First Affiliated Hospital of Chongqing Medical University (重医附一院)
http://www.hospital-cqmu.com/index.php?file=job&smid=6&page=1
Second Affiliated Hospital of Chongqing Medical University (重医附二院)
http://www.sahcqmu.com/a/rencaizhaopin/
Southwest Hospital (西南医院)
http://web.xnyy.cn/elite/info_list.aspx?type_id=4
Daping Hospital (大坪医院)
http://www.dph-fsi.com/zl/rczp/fl.asp?type_id=6
Bayer (拜耳)
http://jobs.51job.com/all/co100411.html#syzw
GSK
http://jobs.51job.com/all/co2141156.html#syzw
http://jobs.51job.com/all/co2835582.html?#syzw
http://jobs.51job.com/all/co3838952.html?#syzw
http://www.gsk-china.com/cn-cn/careers/hot/
Chia Tai Tianqing (正大天晴)
http://jobs.51job.com/all/co198308.html
Gilead
https://gilead.avature.net/careers/SearchJobs/China%7C%7CShanghai/
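# -*- coding: utf-8 -*-
"""
Created on Sun Mar 20 09:24:28 2016

@author: daxiong
"""
import requests, bs4, openpyxl, time
# In current openpyxl releases these helpers live in openpyxl.utils
# (older releases exposed them from openpyxl.cell).
from openpyxl.utils import get_column_letter, column_index_from_string

charset = "gb2312"
site = "http://jobs.51job.com/all/co198308.html"

# Fetch the page and decode it with the site's charset
res = requests.get(site)
res.encoding = charset
soup1 = bs4.BeautifulSoup(res.text, "lxml")

# Each job posting row carries the CSS class "el"
group = soup1.select('.el')

# Second row: every field is present
group2 = group[1]
group2.getText()
'''
孝感医药代表(学术专员)
大专
南京-玄武区
6000-7999/月
03-18
'''
text = group2.getText()
# Split on newlines: the output below has one list item per line of text
text.split('\n')
'''['', '孝感医药代表(学术专员)', '大专', '南京-玄武区', '6000-7999/月', '03-18', '']'''

# First row: the education field is empty
group1 = group[0]
text1 = group1.getText()
text1.split('\n')
'''['', '孝感医药代表(学术专员)', '', '南京-玄武区', '6000-7999/月', '03-18', '']'''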
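The split lists above always start and end with an empty string, and a missing value (the education level in the first row) shows up as an empty string in the middle. A minimal sketch of turning such row text into a record follows. The names `FIELDS` and `parse_row` are hypothetical and not part of the original script, and the field labels are only a guess at what each column means. It also assumes the row text has exactly the newline layout shown above, which the live page may not match.

```python
# Hypothetical field labels; the original only shows the raw values.
FIELDS = ("title", "education", "location", "salary", "posted")


def parse_row(text):
    """Turn the text of one '.el' row into a dict keyed by FIELDS."""
    parts = text.split("\n")
    # Drop only the empty strings caused by the leading and trailing newline,
    # so an empty field in the middle (e.g. no education level) is preserved.
    if parts and parts[0] == "":
        parts = parts[1:]
    if parts and parts[-1] == "":
        parts = parts[:-1]
    values = [p.strip() for p in parts]
    # Pad short rows so every record has the same keys.
    values += [""] * (len(FIELDS) - len(values))
    return dict(zip(FIELDS, values))


# Row text shaped like the first row in the original output (education missing)
row = parse_row("\n孝感医药代表(学术专员)\n\n南京-玄武区\n6000-7999/月\n03-18\n")
print(row["location"])
```

Stripping only the outer empty strings, rather than filtering out every empty item, keeps the columns aligned when a field is blank. With the script above, `parse_row(group1.getText())` would be the corresponding call.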