fetch: downloads the given URL with the Scrapy downloader and writes the retrieved content to standard output.
scrapy fetch --nolog http://www.23andme.com
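Because fetch writes the page to standard output, it is easy to capture from a script. Here is a minimal sketch, assuming the scrapy executable is on the PATH; the helper names build_fetch_command and fetch_html are made up for illustration, and decoding the output as UTF-8 is an assumption that will not hold for every site.

```python
import subprocess


def build_fetch_command(url, show_log=False):
    """Return the argv list for `scrapy fetch`; --nolog is added unless logs are wanted."""
    command = ["scrapy", "fetch"]
    if not show_log:
        command.append("--nolog")
    command.append(url)
    return command


def fetch_html(url, timeout=60):
    """Run `scrapy fetch` and return whatever it printed to standard output."""
    result = subprocess.run(
        build_fetch_command(url),
        capture_output=True,  # collect stdout/stderr instead of printing them
        timeout=timeout,      # hypothetical limit, in seconds
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    # Assumption: the page is UTF-8; undecodable bytes are replaced, not raised.
    return result.stdout.decode("utf-8", errors="replace")


# Example (needs Scrapy installed and network access):
#     html = fetch_html("http://www.23andme.com")
```

Passing the command as a list rather than a single string avoids shell quoting problems when a URL contains characters such as & or ?.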
view: scrapy view downloads the page to a local file and opens it in the browser. When I tried it, the Taobao and JD pages failed to load.
scrapy view http://www.taobao.com
scrapy view http://www.23mofang.com
scrapy view http://www.jd.com
scrapy view http://www.amazon.cn/
list: lists the spiders available in the current project.
SimilarFacedeMacBook-Pro:spiders similarface$ scrapy list
amazonbook
stackoverflow
taobao
Similar
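The output above is one spider name per line, so it is simple to consume from a script. A small sketch, assuming scrapy is on the PATH and the command is run inside a project directory; parse_spider_names and list_spiders are hypothetical helper names, not part of Scrapy.

```python
import subprocess


def parse_spider_names(output):
    """Turn the text printed by `scrapy list` into a list of spider names."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def list_spiders(project_dir="."):
    """Run `scrapy list` in project_dir (assumed to contain a Scrapy project)."""
    result = subprocess.run(
        ["scrapy", "list"],
        cwd=project_dir,
        capture_output=True,
        text=True,   # decode stdout to str
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return parse_spider_names(result.stdout)


# Parsing the output shown above, without running Scrapy:
sample = "amazonbook\nstackoverflow\ntaobao\nSimilar\n"
print(parse_spider_names(sample))
# → ['amazonbook', 'stackoverflow', 'taobao', 'Similar']
```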
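edit: opens the given spider for editing in the editor named by the EDITOR environment variable or setting; on my machine that is vim.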
SimilarFacedeMacBook-Pro:spiders similarface$ scrapy edit stackoverflow
shell: Scrapy's interactive console.
# Open Qiushibaike and play around
SimilarFacedeMacBook-Pro:spiders similarface$ scrapy shell http://www.qiushibaike.com/
>>> response
<200 http://www.qiushibaike.com/>
>>> response.url
'http://www.qiushibaike.com/'
>>> response.encoding
'utf-8'
>>> response.headers
{'Set-Cookie': ['_qqq_uuid_="2|1:0|10:1453947674|10:_qqq_uuid_|56:MDlhM2ZlODM2N2UxZGE0YmYyNjU4MmExM2Q0OTE3MzU4NTliNzIyMg==|505b66b8fc9bc1936ce339417c5c6be46d0cfc570baa61ce378c033c18af4358"; Domain=.qiushibaike.com; expires=Sat, 27 Feb 2016 02:21:14 GMT; Path=/'], 'Vary': ['User-Agent'], 'Server': ['nginx'], 'Date': ['Thu, 28 Jan 2016 02:21:14 GMT'], 'Content-Type': ['text/html; charset=UTF-8']}
>>> response.meta
{'download_timeout': 180.0, 'handle_httpstatus_all': True, 'download_latency': 0.13596606254577637, 'depth': 0, 'download_slot': 'www.qiushibaike.com'}
>>> response.status
200
>>> dir(response)
['_DEFAULT_ENCODING', '__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__slots__', '__str__', '__subclasshook__', '__weakref__', '_auto_detect_fun', '_body', '_body_declared_encoding', '_body_inferred_encoding', '_cached_benc', '_cached_selector', '_cached_ubody', '_declared_encoding', '_encoding', '_get_body', '_get_url', '_headers_encoding', '_set_body', '_set_url', '_url', 'body', 'body_as_unicode', 'copy', 'css', 'encoding', 'flags', 'headers', 'meta', 'replace', 'request', 'selector', 'status', 'url', 'urljoin', 'xpath']
>>> print(response.body.decode('utf-8'))
...
<div class="content">
我是一个观众,我有话要说,从一个观众的角度,我们喜欢六小龄童老师的孙悟空,陪我们长大。今年是猴年,多希望春晚的舞台上可以有孙悟空。但是,你们选出来的节目,是老百姓喜欢的吗?tfboys 韩国明星,那些来参加合适吗?春晚是全国人的春晚,不是你们自己的春晚!希望做成百姓的春晚,谢谢!
<!--1453944031-->
</div>
...
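The printed body shows that each post sits inside a div with class "content". Inside the shell the usual route would be the response.xpath or response.css methods listed by dir(response); the sketch below is a stand-in that uses only the standard library's html.parser, so it can be tried without Scrapy. ContentDivParser and extract_posts are hypothetical names, and the sketch assumes the page keeps the markup seen in the output above.

```python
from html.parser import HTMLParser


class ContentDivParser(HTMLParser):
    """Collect the text inside every <div class="content"> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # <div> nesting depth inside a content div (0 = outside)
        self.chunks = []  # text fragments of the post being read
        self.posts = []   # finished posts, in document order

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth:
            self.depth += 1  # a nested div inside the post
        elif "content" in (dict(attrs).get("class") or "").split():
            self.depth = 1   # entering a post

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1
            if self.depth == 0:  # the post's own </div>
                self.posts.append("".join(self.chunks).strip())
                self.chunks = []

    def handle_data(self, data):
        # HTML comments such as <!--1453944031--> never arrive here.
        if self.depth:
            self.chunks.append(data)


def extract_posts(html):
    """Return the stripped text of each content div found in html."""
    parser = ContentDivParser()
    parser.feed(html)
    parser.close()
    return parser.posts
```

In the shell session above, this could be called as extract_posts(response.body.decode('utf-8')).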