• Use BeautifulSoup to retrieve all hyperlinks on the www.163.com home page


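    # coding: utf-8
    # Python 2 script (urllib2 and the print statement are Python 2 only)
    import urllib2
    from bs4 import BeautifulSoup

    # Download the home page
    response = urllib2.urlopen("http://www.163.com")
    html_doc = response.read()

    soup = BeautifulSoup(html_doc, 'html.parser', from_encoding='utf-8')

    # Collect every <a> tag in the document
    links = soup.find_all("a")
    print('Printing all links')
    for link in links:
        # .get() returns None instead of raising KeyError when href is missing
        print link.name, link.get('href')
    print len(links)



    Execution result

    Printing all links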
    a http://www.163.com/#f=topnav
    a http://m.163.com/newsapp/#f=topnav
    a http://music.163.com/#f=topnav
    a http://yuedu.163.com/#f=topnav
    a http://note.youdao.com/#f=topnav
    a http://y.163.com/?from=wsdh
    a http://open.163.com/#f=topnav
    a http://caipiao.163.com/mobile/client_cp.jsp#from=yingyong
    a http://cidian.youdao.com/?vendor=topnav
    a http://mail.163.com/client/dl.html?from=mail46
    a http://www.lofter.com/?act=qb163rk_20141031_01
    a http://study.163.com/client/download.htm?from=163app&utm_source=163.com&utm_medium=web_app&utm_campaign=business
    a http://www.163.com/
    a http://reg.163.com/
    a http://reg.163.com/RecoverPassword.shtml?f=www
    a http://mail.163.com/client/dl.html?from=mail46
    a http://reg.email.163.com/mailregAll/reg0.jsp?from=163navi&regPage=163
    a http://reg.vip.163.com/register.m?from=topnav
    a http://reg.163.com/Logout.jsp
    a http://rd.da.netease.com/redirect?t=I4iYc8&p=EA7B9E&target=http%3A%2F%2Fwww.kaola.com%2F
    a http://www.kaola.com/outter/promote/myzq.html
    a http://www.kaola.com/outter/promote/mrcz.html
    a http://www.kaola.com/outter/promote/jjry.html
    a http://www.kaola.com/outter/promote/jkms.html
    a http://www.kaola.com/outter/promote/yybj.html
    a http://www.kaola.com/outter/promote/hwzy.html
    a http://rd.da.netease.com/redirect?t=W1rULs&p=pESsw1&proId=1024&target=http%3A%2F%2Fwww.kaola.com%2Factivity%2Fdetail%2F5288.html%3Ftag%3Dbe3d8d027a530881037ef01d304eb505
    a http://www.kaola.com/outter/promote/khd.html
    a http://email.163.com/#from=163nav_icon
    a http://email.163.com/#f=topnav
    a http://vipmail.163.com/#f=topnav
    a http://qiye.163.com/#f=topnav
    a http://reg.email.163.com/mailregAll/reg0.jsp?from=ntes_nav&regPage=163
    a http://reg.email.163.com/unireg/call.do?cmd=register.entrance&flow=mobile&from=ntes_nav
    a http://mail.163.com/dashi/dlpro.html?from=mail46
    a http://pay.163.com/
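
Some `<a>` tags carry no `href` attribute, and some `href` values are relative paths rather than full URLs. The following is a minimal Python 3 sketch of the same link-listing idea that handles both cases. It deliberately swaps BeautifulSoup for the standard library's `html.parser` so it runs without extra packages; the `LinkCollector` class, the `SAMPLE_HTML` markup, and the `/special/about.html` path are hypothetical names made up for illustration and are not taken from the real 163.com page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:  # skip anchors that have no href (or an empty one)
            self.links.append(urljoin(self.base_url, href))


# Hypothetical sample markup standing in for a downloaded page.
SAMPLE_HTML = """
<div>
  <a href="http://music.163.com/#f=topnav">Music</a>
  <a href="/special/about.html">About</a>
  <a name="top">no href here</a>
</div>
"""

collector = LinkCollector("http://www.163.com/")
collector.feed(SAMPLE_HTML)
for url in collector.links:
    print(url)
print(len(collector.links))
```

To run this against a live page, the HTML string could be obtained with `urllib.request.urlopen(...).read()` and decoded with the page's actual encoding before being passed to `feed()`.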


  • Original article: https://www.cnblogs.com/smallgou/p/5079421.html