• The Python module xml.etree.ElementTree


    xml.etree.ElementTree is used to parse and build XML documents. The examples below use the following sample file:

    <?xml version="1.0"?>
    <data>
        <country name="Liechtenstein">
            <rank>1</rank>
            <year>2008</year>
            <gdppc>141100</gdppc>
            <neighbor name="Austria" direction="E"/>
            <neighbor name="Switzerland" direction="W"/>
        </country>
        <country name="Singapore">
            <rank>4</rank>
            <year>2011</year>
            <gdppc>59900</gdppc>
            <neighbor name="Malaysia" direction="N"/>
        </country>
        <country name="Panama">
            <rank>68</rank>
            <year>2011</year>
            <gdppc>13600</gdppc>
            <neighbor name="Costa Rica" direction="W"/>
            <neighbor name="Colombia" direction="E"/>
        </country>
    </data>

    Parsing an XML file

    The parse() function reads an XML file and returns an ElementTree.

    from xml.etree.ElementTree import parse
    tree = parse('demo.xml')  # get the ElementTree
    root = tree.getroot()     # get the root element
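
    When the XML is already in memory as a string, fromstring() can be used instead of parse(). The following is a minimal sketch; the XML_TEXT constant is a hypothetical, trimmed-down version of the sample file, not part of the original article.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline sample: a shortened version of the sample file.
XML_TEXT = """<data>
    <country name="Liechtenstein"><rank>1</rank></country>
    <country name="Singapore"><rank>4</rank></country>
</data>"""

# fromstring() parses a string and returns the root Element directly,
# so there is no ElementTree object and no getroot() call.
root = ET.fromstring(XML_TEXT)

print(root.tag)    # data
print(len(root))   # 2 (number of direct children)
```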

    Element.tag, Element.attrib and Element.text: the tag name, the attribute dictionary, and the text that precedes the first child element

    In [6]: root.tag
    Out[6]: 'data'
    
    In [7]: root.attrib
    Out[7]: {}
    
    In [25]: root.text
    Out[25]: '\n    '
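
    As a small sketch of how these three attributes behave on different kinds of elements (the inline XML is a hypothetical fragment of the sample file): a leaf element carries its content in text, while an empty element such as neighbor has text set to None.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-country fragment of the sample file.
root = ET.fromstring(
    '<country name="Singapore"><rank>4</rank>'
    '<neighbor name="Malaysia" direction="N"/></country>'
)
rank = root[0]
neighbor = root[1]

print(root.tag, root.attrib)   # country {'name': 'Singapore'}
print(rank.text)               # 4
print(neighbor.text)           # None, because <neighbor .../> has no text
print(neighbor.attrib)         # {'name': 'Malaysia', 'direction': 'N'}
```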
    

    for child in root: iterate over the direct child elements

    In [8]: for child in root:
       ...:     print(child.tag, child.attrib)
       ...:     
    country {'name': 'Liechtenstein'}
    country {'name': 'Singapore'}
    country {'name': 'Panama'}
    

    Element.get(): get the value of an attribute

    In [27]: for child in root:
        ...:     print (child.tag, child.get('name'))
        ...:     
    country Liechtenstein
    country Singapore
    country Panama
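
    get() also accepts a default value that is returned when the attribute is missing (without a default it returns None). A minimal sketch, using a hypothetical document in which one country has no name attribute:

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the second country has no name attribute.
root = ET.fromstring('<data><country name="Panama"/><country/></data>')

# Without a default, a missing attribute yields None.
plain = [child.get("name") for child in root]

# With a default, the fallback value is returned instead.
with_default = [child.get("name", "unknown") for child in root]

print(plain)         # ['Panama', None]
print(with_default)  # ['Panama', 'unknown']
```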

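    root.getchildren(): get the direct child elements. This method has been deprecated since Python 3.2 and was removed in Python 3.9; list(root) returns the same list.

    In [21]: list(root)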
    Out[21]: 
    [<Element 'country' at 0x7f673581c728>,
     <Element 'country' at 0x7f673581ca98>,
     <Element 'country' at 0x7f673581cc28>]
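
    Since an Element is iterable, the direct children can be collected without any dedicated method. A short sketch with a hypothetical three-country document:

```python
import xml.etree.ElementTree as ET

# Hypothetical stripped-down document with three empty country elements.
root = ET.fromstring("<data><country/><country/><country/></data>")

# list(root) collects the direct children; it works on every Python 3 version.
children = list(root)

print(len(children))               # 3
print([c.tag for c in children])   # ['country', 'country', 'country']
```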

    root[0][1]: access child elements by index

    In [9]: root[0][1].text
    Out[9]: '2008'
    
    In [10]: root[1][0].text
    Out[10]: '4'
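
    Indexing follows list-like rules: len() gives the number of direct children, and an out-of-range index raises IndexError. A minimal sketch with a hypothetical one-country document:

```python
import xml.etree.ElementTree as ET

# Hypothetical document with a single country and two children.
root = ET.fromstring(
    "<data><country><rank>1</rank><year>2008</year></country></data>"
)
first_country = root[0]

print(len(first_country))      # 2
print(first_country[1].text)   # 2008

# An index past the last child raises IndexError, as with a list.
try:
    first_country[5]
    out_of_range = False
except IndexError:
    out_of_range = True
print(out_of_range)            # True
```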
    

    root.find(): search the direct children by tag and return the first matching element (None if there is no match)

    In [13]: root.find('country').attrib
    Out[13]: {'name': 'Liechtenstein'}
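
    Because find() returns None when nothing matches, the result should be checked with "is not None" before use. The related findtext() method returns the text of the first match directly. A sketch under the assumption of a hypothetical one-country document:

```python
import xml.etree.ElementTree as ET

# Hypothetical document containing only Panama.
root = ET.fromstring(
    '<data><country name="Panama"><rank>68</rank></country></data>'
)

country = root.find("country")
missing = root.find("city")          # no such child, so this is None

# Compare with None explicitly; do not rely on the element's truth value.
if country is not None:
    rank_text = country.findtext("rank")                  # '68'
    gdp_text = country.findtext("gdppc", default="n/a")   # no gdppc child here

print(missing, rank_text, gdp_text)  # None 68 n/a
```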

    root.findall(): search the direct children by tag and return a list of all matching elements

    In [16]: for country in root.findall('country'):
        ...:     print  (country.attrib)
        ...:     
    {'name': 'Liechtenstein'}
    {'name': 'Singapore'}
    {'name': 'Panama'}
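
    A typical use of findall() is to turn the matched elements into a plain Python structure. The sketch below builds a name-to-rank dictionary; the inline XML is a hypothetical reduction of the sample file.

```python
import xml.etree.ElementTree as ET

# Hypothetical reduction of the sample file: name and rank only.
root = ET.fromstring("""<data>
    <country name="Liechtenstein"><rank>1</rank></country>
    <country name="Singapore"><rank>4</rank></country>
    <country name="Panama"><rank>68</rank></country>
</data>""")

# Map each country's name attribute to its rank as an integer.
ranks = {
    country.get("name"): int(country.findtext("rank"))
    for country in root.findall("country")
}

print(ranks)  # {'Liechtenstein': 1, 'Singapore': 4, 'Panama': 68}
```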
    

    root.iterfind(): search the direct children by tag and return a generator over all matching elements

    In [22]: root.iterfind('country')
    Out[22]: <generator object prepare_child.<locals>.select at 0x7f6736dccfc0> 
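
    The generator yields matches lazily, so it can be consumed one element at a time. A small sketch with a hypothetical two-country document:

```python
import xml.etree.ElementTree as ET

# Hypothetical document with two countries named A and B.
root = ET.fromstring("<data><country name='A'/><country name='B'/></data>")

matches = root.iterfind("country")

# Take the first match, then consume whatever is left.
first = next(matches)
rest = [country.get("name") for country in matches]

print(first.get("name"))  # A
print(rest)               # ['B']
```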

    Supported XPath syntax (XML Path); ElementTree implements only a limited subset of XPath

    In [19]: root.findall('.//rank')  # find rank elements at any depth
    Out[19]: 
    [<Element 'rank' at 0x7f673581c8b8>,
     <Element 'rank' at 0x7f673581c6d8>,
     <Element 'rank' at 0x7f673581cc78>]
    
    In [32]: root.findall('country/*')  # find the grandchildren of root (all children of every country)
    Out[32]: 
    [<Element 'rank' at 0x7f673581c8b8>,
     <Element 'year' at 0x7f673581cbd8>,
     <Element 'gdppc' at 0x7f673581c958>,
     <Element 'neighbor' at 0x7f673581c688>,
     <Element 'neighbor' at 0x7f673581cb38>,
     <Element 'rank' at 0x7f673581c6d8>,
     <Element 'year' at 0x7f673581c5e8>,
     <Element 'gdppc' at 0x7f673581c868>,
     <Element 'neighbor' at 0x7f673581cb88>,
     <Element 'rank' at 0x7f673581cc78>,
     <Element 'year' at 0x7f673581ccc8>,
     <Element 'gdppc' at 0x7f673581cd18>,
     <Element 'neighbor' at 0x7f673581cd68>,
     <Element 'neighbor' at 0x7f673581cdb8>]
    
    In [33]: root.findall('.//rank/..')   # .. selects the parent element
    Out[33]: 
    [<Element 'country' at 0x7f673581c728>,
     <Element 'country' at 0x7f673581ca98>,
     <Element 'country' at 0x7f673581cc28>]
    
    In [34]: root.findall('country[@name]')   # country elements that have a name attribute
    Out[34]: 
    [<Element 'country' at 0x7f673581c728>,
     <Element 'country' at 0x7f673581ca98>,
     <Element 'country' at 0x7f673581cc28>]
    
    In [35]: root.findall('country[@name="Singapore"]')   # country elements whose name attribute is "Singapore"
    Out[35]: [<Element 'country' at 0x7f673581ca98>]
    
    In [36]: root.findall('country[rank]')   # country elements that have a rank child
    Out[36]: 
    [<Element 'country' at 0x7f673581c728>,
     <Element 'country' at 0x7f673581ca98>,
     <Element 'country' at 0x7f673581cc28>]
    
    In [37]: root.findall('country[rank="68"]')   # country elements with a rank child whose text is "68"
    Out[37]: [<Element 'country' at 0x7f673581cc28>]
    
    In [38]: root.findall('country[1]')     # the first country
    Out[38]: [<Element 'country' at 0x7f673581c728>]
    
    In [39]: root.findall('country[last()]')   # the last country
    Out[39]: [<Element 'country' at 0x7f673581cc28>]
    
    In [40]: root.findall('country[last()-1]')    # the second-to-last country
    Out[40]: [<Element 'country' at 0x7f673581ca98>]
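
    The predicates above can be combined with descendant and parent steps. The following sketch runs three such queries against a hypothetical inline copy of the sample file (rank and gdppc omitted for brevity):

```python
import xml.etree.ElementTree as ET

# Hypothetical inline copy of the sample file without rank and gdppc.
root = ET.fromstring("""<data>
    <country name="Liechtenstein">
        <year>2008</year>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <year>2011</year>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <year>2011</year>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>""")

# Neighbors at any depth whose direction attribute is "W".
west = [n.get("name") for n in root.findall(".//neighbor[@direction='W']")]

# Countries whose year child has the text "2011".
in_2011 = [c.get("name") for c in root.findall("country[year='2011']")]

# Parent of the neighbor named Colombia.
borders_colombia = [
    c.get("name") for c in root.findall(".//neighbor[@name='Colombia']/..")
]

print(west)              # ['Switzerland', 'Costa Rica']
print(in_2011)           # ['Singapore', 'Panama']
print(borders_colombia)  # ['Panama']
```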
    
    

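    root.iter(): recursively iterate, in document order, over the element itself and all of its descendants, or only over those with the given tag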

    In [29]: root.iter()
    Out[29]: <_elementtree._element_iterator at 0x7f67355dd728>
    
    In [30]: list(root.iter())
    Out[30]: 
    [<Element 'data' at 0x7f673581c778>,
     <Element 'country' at 0x7f673581c728>,
     <Element 'rank' at 0x7f673581c8b8>,
     <Element 'year' at 0x7f673581cbd8>,
     <Element 'gdppc' at 0x7f673581c958>,
     <Element 'neighbor' at 0x7f673581c688>,
     <Element 'neighbor' at 0x7f673581cb38>,
     <Element 'country' at 0x7f673581ca98>,
     <Element 'rank' at 0x7f673581c6d8>,
     <Element 'year' at 0x7f673581c5e8>,
     <Element 'gdppc' at 0x7f673581c868>,
     <Element 'neighbor' at 0x7f673581cb88>,
     <Element 'country' at 0x7f673581cc28>,
     <Element 'rank' at 0x7f673581cc78>,
     <Element 'year' at 0x7f673581ccc8>,
     <Element 'gdppc' at 0x7f673581cd18>,
     <Element 'neighbor' at 0x7f673581cd68>,
     <Element 'neighbor' at 0x7f673581cdb8>]
    
    In [31]: list(root.iter('rank'))
    Out[31]: 
    [<Element 'rank' at 0x7f673581c8b8>,
     <Element 'rank' at 0x7f673581c6d8>,
     <Element 'rank' at 0x7f673581cc78>]
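
    Because iter() walks the whole subtree, it is convenient for collecting or counting elements regardless of nesting depth. A minimal sketch, assuming a hypothetical two-country version of the sample file:

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical two-country version of the sample file.
root = ET.fromstring("""<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <neighbor name="Malaysia" direction="N"/>
    </country>
</data>""")

# All neighbor elements at any depth, in document order.
neighbor_names = [n.get("name") for n in root.iter("neighbor")]

# Without a tag argument, iter() also yields the root element itself.
tag_counts = Counter(element.tag for element in root.iter())

print(neighbor_names)    # ['Austria', 'Switzerland', 'Malaysia']
print(dict(tag_counts))  # {'data': 1, 'country': 2, 'rank': 2, 'neighbor': 3}
```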
    

      

  • Original article: https://www.cnblogs.com/Peter2014/p/8065114.html