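• Parsing XML files in Python: the ElementTree approach


    Environment

    Python: 3.4.4

    Prepare the XML file

    First, create an XML file named countries.xml. Its content is taken from the official Python documentation.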

    <?xml version="1.0"?>
    <data>
        <country name="Liechtenstein">
            <rank>1</rank>
            <year>2008</year>
            <gdppc>141100</gdppc>
            <neighbor name="Austria" direction="E"/>
            <neighbor name="Switzerland" direction="W"/>
        </country>
        <country name="Singapore">
            <rank>4</rank>
            <year>2011</year>
            <gdppc>59900</gdppc>
            <neighbor name="Malaysia" direction="N"/>
        </country>
        <country name="Panama">
            <rank>68</rank>
            <year>2011</year>
            <gdppc>13600</gdppc>
            <neighbor name="Costa Rica" direction="W"/>
            <neighbor name="Colombia" direction="E"/>
        </country>
    </data>

    Prepare the Python file

    Create test_ET.py, which parses the XML file.

    #!/usr/bin/python
    # -*- coding=utf-8 -*-
    
    import xml.etree.ElementTree as ET
    from xml.etree.ElementTree import Element
    
    tree = ET.parse('countries.xml')
    
    # findall on the tree matches direct children of the root element
    nodes = tree.findall("country")
    
    for node in nodes:
        # search node & attribute & text
        print("*****Country*****")
        # get() returns None instead of raising KeyError if "name" is missing
        if node.get("name"):
            print("Name:", node.get("name"))
    
        rank = node.find("rank")
        print("Rank:", rank.text)
    
        year = node.find("year")
        print("Year:", year.text)
    
        gdppc = node.find("gdppc")
        print("Gdppc:", gdppc.text)
    
        neighbors = node.findall("neighbor")
        for neighbor in neighbors:
            print("Neighbor:", neighbor.attrib["name"])
    
        # add node: append a new <rank_next> child under <rank>
        rank = node.find("rank")
        element = Element("rank_next", {"name": "Rank", "create": "20151231"})
        element.text = "5"
        rank.append(element)
    
        # delete node: remove <year> from the current <country>
        year = node.find("year")
        node.remove(year)
    
        # add node attribute
        node.set("force", "NewForce")
        # update node attribute
        node.set("name", "NewNode")
        # delete node attribute
        neighbors = node.findall("neighbor")
        for neighbor in neighbors:
            del neighbor.attrib["direction"]
    
        # add node text
        neighbors = node.findall("neighbor")
        for neighbor in neighbors:
            neighbor.text = "Hello,Neighbor"
        # update node text
        gdppc = node.find("gdppc")
        gdppc.text = "11111"
        # delete node text
        rank = node.find("rank")
        rank.text = ""
    
    tree.write("./out.xml", encoding="utf-8", xml_declaration=True)
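
    The script above assumes that every <country> has a name attribute and rank, year and gdppc children. As a minimal sketch of reading the same kind of data more defensively, the example below uses get() and findtext() with defaults. The inline SAMPLE string and the summarize helper are hypothetical and only serve the illustration; they are not part of the original script.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the second country deliberately lacks the name
# attribute and the <year> child, to show the fallback behaviour.
SAMPLE = """<data>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
    </country>
    <country>
        <rank>68</rank>
    </country>
</data>"""


def summarize(country):
    """Return (name, rank, year), using defaults for anything missing."""
    name = country.get("name", "unknown")            # attribute or default
    rank = country.findtext("rank", default="n/a")   # child text or default
    year = country.findtext("year", default="n/a")
    return name, rank, year


root = ET.fromstring(SAMPLE)
rows = [summarize(c) for c in root.findall("country")]
for row in rows:
    print(row)
```

    With this style, a missing element does not raise AttributeError on .text, at the cost of having to pick a sensible default value.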

    Results

    Console:

    >python test_ET.py
    *****Country*****
    Name: Liechtenstein
    Rank: 1
    Year: 2008
    Gdppc: 141100
    Neighbor: Austria
    Neighbor: Switzerland
    *****Country*****
    Name: Singapore
    Rank: 4
    Year: 2011
    Gdppc: 59900
    Neighbor: Malaysia
    *****Country*****
    Name: Panama
    Rank: 68
    Year: 2011
    Gdppc: 13600
    Neighbor: Costa Rica
    Neighbor: Colombia

    The out.xml file:

    <?xml version='1.0' encoding='utf-8'?>
    <data>
        <country force="NewForce" name="NewNode">
            <rank><rank_next create="20151231" name="Rank">5</rank_next></rank>
            <gdppc>11111</gdppc>
            <neighbor name="Austria">Hello,Neighbor</neighbor>
            <neighbor name="Switzerland">Hello,Neighbor</neighbor>
        </country>
        <country force="NewForce" name="NewNode">
            <rank><rank_next create="20151231" name="Rank">5</rank_next></rank>
            <gdppc>11111</gdppc>
            <neighbor name="Malaysia">Hello,Neighbor</neighbor>
        </country>
        <country force="NewForce" name="NewNode">
            <rank><rank_next create="20151231" name="Rank">5</rank_next></rank>
            <gdppc>11111</gdppc>
            <neighbor name="Costa Rica">Hello,Neighbor</neighbor>
            <neighbor name="Colombia">Hello,Neighbor</neighbor>
        </country>
    </data>
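
    In the output above every country ends up with name="NewNode", because the loop applies the same edits to each <country>. If only one element should change, ElementTree's limited XPath support can select it by attribute value. The following is a small sketch under that assumption; the inline SAMPLE data is a shortened, hypothetical version of countries.xml.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical version of countries.xml.
SAMPLE = """<data>
    <country name="Liechtenstein"><rank>1</rank></country>
    <country name="Singapore"><rank>4</rank></country>
</data>"""

root = ET.fromstring(SAMPLE)

# The [@name='...'] predicate keeps only elements whose attribute matches.
for country in root.findall("country[@name='Singapore']"):
    country.set("force", "NewForce")     # add an attribute
    country.find("rank").text = "5"      # update the child text

result = ET.tostring(root, encoding="unicode")
print(result)
```

    Here only the Singapore entry is modified; the Liechtenstein entry is left untouched.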

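    Notes

    ElementTree has a convenient, friendly API. Code written with it is easy to use, and it is fast and uses little memory.

    This makes it a very good fit for processing XML documents.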
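
    How much memory is used depends on how the file is read: ET.parse builds the whole tree in memory. For large files, one possible approach is ET.iterparse, which yields elements as they are completed so that each one can be discarded after use. The sketch below assumes a small in-memory document (XML_BYTES is hypothetical) in place of a real large file.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a large file on disk.
XML_BYTES = b"""<data>
    <country name="Liechtenstein"><rank>1</rank></country>
    <country name="Singapore"><rank>4</rank></country>
    <country name="Panama"><rank>68</rank></country>
</data>"""

names = []
# iterparse yields (event, element) pairs; "end" fires when an element closes.
for event, elem in ET.iterparse(io.BytesIO(XML_BYTES), events=("end",)):
    if elem.tag == "country":
        names.append(elem.get("name"))
        elem.clear()   # drop this element's children, text and attributes

print(names)
```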

    Reference: https://docs.python.org/2/library/xml.etree.elementtree.html

    tree = ET.parse('countries.xml')

    Parses countries.xml and returns an ElementTree.
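
    As a brief illustration of what ET.parse returns, the sketch below parses from a file object instead of a file name and compares the result with ET.fromstring. The XML_TEXT string is a hypothetical one-country document used only for this example.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical one-country document.
XML_TEXT = "<data><country name='Panama'><rank>68</rank></country></data>"

# ET.parse accepts a file name or a file object and returns an ElementTree.
tree = ET.parse(io.StringIO(XML_TEXT))
root = tree.getroot()          # the top-level Element, here <data>

# ET.fromstring parses a string and returns the root Element directly.
root2 = ET.fromstring(XML_TEXT)

print(type(tree).__name__, root.tag, root2.tag)
```

    The ElementTree wrapper is what provides write(); a bare Element obtained from fromstring can be wrapped with ET.ElementTree(element) when it needs to be saved.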

    tree.write("./out2.xml", encoding="utf-8",xml_declaration=True)

    Writes the element tree to a file, using “utf-8” encoding and including an XML declaration.

    write(file, encoding="us-ascii", xml_declaration=None, default_namespace=None, method="xml")
    Writes the element tree to a file, as XML. file is a file name, or a file object opened for writing. encoding is the output encoding (default is US-ASCII). xml_declaration controls if an XML declaration should be added to the file. Use False for never, True for always, None for only if not US-ASCII or UTF-8 (default is None). default_namespace sets the default XML namespace (for “xmlns”). method is either "xml", "html" or "text" (default is "xml"). Returns an encoded string.
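
    The effect of xml_declaration can be checked without touching the disk. The sketch below is run under Python 3 and writes a tiny tree into an in-memory buffer, once with xml_declaration=True and once with the default None. The serialize helper and the sample tree are hypothetical names for this illustration.

```python
import io
import xml.etree.ElementTree as ET

tree = ET.ElementTree(ET.fromstring("<data><rank>1</rank></data>"))


def serialize(xml_declaration):
    """Write the tree into an in-memory buffer and return the bytes."""
    buf = io.BytesIO()
    tree.write(buf, encoding="utf-8", xml_declaration=xml_declaration)
    return buf.getvalue()


with_decl = serialize(True)      # always emit the declaration
default_decl = serialize(None)   # None: only if not US-ASCII or UTF-8

print(with_decl)
print(default_decl)
```

    Since the encoding here is UTF-8, the default setting omits the declaration, which is why the script passes xml_declaration=True explicitly.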
  • Original article: https://www.cnblogs.com/miniren/p/5092200.html