• XPath: / and //


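    Difference between / and //:
    
    1. Using a single /
    
    A single / steps down exactly one level, so /html/li matches only the <li> elements that are direct children of <html>. The test file, test01.html: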
    
    <html>
    <li>aaa</li>
    <li>bbb</li>
    <ul>
        <li class="item-0">a01<a href="link1.html">first item</a></li>
        <li class="item-1">b02<a href="link2.html">second item</a></li>
        <li class="item-inactive">c03<a href="link3.html">third item</a></li>
        <li class="item-1">d04<a href="link4.html">fourth item</a></li>
        <li class="item-0">e05<a href="link5.html">fifth item</a></li>
    </ul>
    </html>
    
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    from lxml import etree
    
    # Parse the file into an element tree
    htmlEmt = etree.parse('test01.html')
    # Select only the <li> elements that are direct children of <html>
    result = htmlEmt.xpath('/html/li')
    print(result)
    print(type(result))
    for x in result:
        print(x)
        print(type(x))
        print('-------------------------')
        print(x.text)
    	
    C:\Python27\python.exe C:/Users/TLCB/PycharmProjects/untitled/xpath/l1.py
    [<Element li at 0x268a2d8>, <Element li at 0x268a9b8>]
    <type 'list'>
    <Element li at 0x268a2d8>
    <type 'lxml.etree._Element'>
    -------------------------
    aaa
    <Element li at 0x268a9b8>
    <type 'lxml.etree._Element'>
    -------------------------
    bbb
    
    Process finished with exit code 0
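
The run above returns only aaa and bbb because the nested items sit one level deeper. As a minimal sketch of reaching them with single slashes only, the snippet below inlines the same markup with etree.fromstring instead of reading test01.html (an assumption made so it runs on its own; DOC, root, top_level and nested are illustrative names, not from the original script):

```python
from lxml import etree

# Inline copy of the test01.html structure, so no file on disk is needed.
DOC = """<html>
<li>aaa</li>
<li>bbb</li>
<ul>
  <li class="item-0">a01<a href="link1.html">first item</a></li>
  <li class="item-1">b02<a href="link2.html">second item</a></li>
  <li class="item-inactive">c03<a href="link3.html">third item</a></li>
  <li class="item-1">d04<a href="link4.html">fourth item</a></li>
  <li class="item-0">e05<a href="link5.html">fifth item</a></li>
</ul>
</html>"""

root = etree.fromstring(DOC)

# Each single / moves down exactly one level (child axis).
top_level = root.xpath('/html/li')     # <li> directly under <html>
nested = root.xpath('/html/ul/li')     # <li> directly under <html>/<ul>

top_level_text = [li.text for li in top_level]
nested_text = [li.text for li in nested]
print(top_level_text)
print(nested_text)
```

With single slashes every intermediate element has to be spelled out, which is why /html/li alone never reaches the items inside <ul>.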
    
    
    
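    2. After changing to //
    
    // matches at any depth, so //li returns every <li> in the document, including those nested inside <ul>.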
    
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    from lxml import etree
    
    # Parse the file into an element tree
    htmlEmt = etree.parse('test01.html')
    # Select every <li> element, at any depth in the document
    result = htmlEmt.xpath('//li')
    print(result)
    print(type(result))
    for x in result:
        print(x)
        print(type(x))
        print('-------------------------')
        print(x.text)
    
    C:\Python27\python.exe C:/Users/TLCB/PycharmProjects/untitled/xpath/l1.py
    [<Element li at 0x264a2d8>, <Element li at 0x264a9b8>, <Element li at 0x264a170>, <Element li at 0x264a0a8>, <Element li at 0x264a210>, <Element li at 0x264a418>, <Element li at 0x264a4b8>]
    <type 'list'>
    <Element li at 0x264a2d8>
    <type 'lxml.etree._Element'>
    -------------------------
    aaa
    <Element li at 0x264a9b8>
    <type 'lxml.etree._Element'>
    -------------------------
    bbb
    <Element li at 0x264a170>
    <type 'lxml.etree._Element'>
    -------------------------
    a01
    <Element li at 0x264a0a8>
    <type 'lxml.etree._Element'>
    -------------------------
    b02
    <Element li at 0x264a210>
    <type 'lxml.etree._Element'>
    -------------------------
    c03
    <Element li at 0x264a418>
    <type 'lxml.etree._Element'>
    -------------------------
    d04
    <Element li at 0x264a4b8>
    <type 'lxml.etree._Element'>
    -------------------------
    e05
    
    Process finished with exit code 0
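
// can also appear in the middle of a path, and it behaves differently when called on an element rather than on the whole tree. The following is a rough sketch under the same assumption as the article's test file (markup inlined via etree.fromstring so it runs standalone; DOC, root, ul and the result names are illustrative only):

```python
from lxml import etree

# Inline copy of the test01.html structure, so no file on disk is needed.
DOC = """<html>
<li>aaa</li>
<li>bbb</li>
<ul>
  <li class="item-0">a01<a href="link1.html">first item</a></li>
  <li class="item-1">b02<a href="link2.html">second item</a></li>
  <li class="item-inactive">c03<a href="link3.html">third item</a></li>
  <li class="item-1">d04<a href="link4.html">fourth item</a></li>
  <li class="item-0">e05<a href="link5.html">fifth item</a></li>
</ul>
</html>"""

root = etree.fromstring(DOC)

# // at the start: search the whole document.
all_li = root.xpath('//li')
# // in the middle: any depth below <html>.
below_html = root.xpath('/html//li')
# / after //: <li> elements whose parent is a <ul>, wherever that <ul> sits.
ul_children = root.xpath('//ul/li')

ul = root.xpath('/html/ul')[0]
# A leading // ignores the context node and still searches the whole document,
from_document = ul.xpath('//li')
# while .// searches only below the context node.
from_context = ul.xpath('.//li')

counts = {
    '//li': len(all_li),
    '/html//li': len(below_html),
    '//ul/li': len(ul_children),
    'ul.xpath("//li")': len(from_document),
    'ul.xpath(".//li")': len(from_context),
}
print(counts)
```

When querying relative to an element that was already selected, .// is usually the form that is wanted; a bare // always starts again from the document root.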
  • Original article: https://www.cnblogs.com/hzcya1995/p/13349013.html