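What is XML
- XML stands for EXtensible Markup Language.
- XML is a markup language, much like HTML.
- XML is designed to transport data, not to display it.
- XML tags are not predefined; you define your own tags.
- XML is designed to be self-descriptive.
- XML is a W3C Recommendation.
In Python, XML can be processed with the standard library's ElementTree API (the xml.etree package), imported as ET in the examples below.
The following XML file (test.xml) is used throughout: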
<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>
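If you would rather experiment without creating test.xml first, the same kind of document can be parsed from a string. The following is a minimal sketch, assuming an abbreviated two-country copy of the sample; the names SAMPLE and country_names are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the sample document (two countries only); an assumption
# made to keep the sketch short.
SAMPLE = """<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
    </country>
</data>"""

# fromstring() parses text and returns the root Element directly,
# so no getroot() call is needed.
root = ET.fromstring(SAMPLE)
country_names = [country.get("name") for country in root]
print(root.tag, country_names)
```

ET.parse reads from a file and returns an ElementTree object, while ET.fromstring returns the root element itself.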
1. Finding and traversing XML
Get the root node:
import xml.etree.ElementTree as ET  # cElementTree was removed in Python 3.9
tree = ET.parse('test.xml')
root = tree.getroot()
print(dir(root))
#['__class__', '__copy__', '__deepcopy__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__len__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'attrib', 'clear', 'extend', 'find', 'findall', 'findtext', 'get', 'getchildren', 'getiterator', 'insert', 'items', 'iter', 'iterfind', 'itertext', 'keys', 'makeelement', 'remove', 'set', 'tag', 'tail', 'text']
print(root.tag)  # prints: data, which is the root node of our document
import xml.etree.ElementTree as ET
tree = ET.parse('test.xml')
root = tree.getroot()  # get the root node
print(root.tag)
# Traverse the document
for child in root:
    print(child.tag, child.attrib)  # tag name and attribute dictionary
    print('----------------------')
    for i in child:
        print(i.tag, i.text)
Output:
data
country {'name': 'Liechtenstein'}
----------------------
rank 2
year 2008
gdppc 141100
neighbor None
neighbor None
country {'name': 'Singapore'}
----------------------
rank 5
year 2011
gdppc 59900
neighbor None
country {'name': 'Panama'}
----------------------
rank 69
year 2011
gdppc 13600
neighbor None
neighbor None
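The neighbor lines show None because those elements are empty: their data is stored in attributes rather than in text. A small sketch of reading the attributes, assuming a single country element copied from the sample (the variable names are illustrative):

```python
import xml.etree.ElementTree as ET

# One country element taken from the sample document.
country = ET.fromstring(
    '<country name="Panama">'
    '<neighbor name="Costa Rica" direction="W"/>'
    '<neighbor name="Colombia" direction="E"/>'
    '</country>'
)

# Empty (self-closing) elements have no text, so .text is None.
first_text = country[0].text

# Their data is read with get(), or through the .attrib dictionary.
neighbors = [(n.get("name"), n.get("direction")) for n in country.iter("neighbor")]
print(first_text, neighbors)
```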
Iterate over only one kind of node:
import xml.etree.ElementTree as ET
tree = ET.parse('test.xml')
root = tree.getroot()
print(root.tag)
# Iterate over the year nodes only
for node in root.iter('year'):
    print(node.tag, node.text)
# Output:
# data
# year 2008
# year 2011
# year 2011
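Besides iter, the element methods find, findall and findtext (all visible in the dir(root) listing earlier) are handy for targeted lookups. This is a minimal sketch, assuming a trimmed two-country version of the sample; it also uses the limited XPath predicate syntax that ElementTree supports.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the sample document, used only for this sketch.
root = ET.fromstring(
    "<data>"
    '<country name="Liechtenstein"><rank>2</rank><year>2008</year></country>'
    '<country name="Singapore"><rank>5</rank><year>2011</year></country>'
    "</data>"
)

# find() returns the first matching direct child (or None).
first_name = root.find("country").get("name")

# findall() returns a list of all matching direct children;
# findtext() returns the text of the first matching child.
years = [c.findtext("year") for c in root.findall("country")]

# A simple attribute predicate selects one country by name.
singapore_rank = root.find("./country[@name='Singapore']").findtext("rank")

print(first_name, years, singapore_rank)
```

Unlike iter, which walks the whole subtree, find and findall with a bare tag name look only at direct children.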
---------------------------------------- divider ----------------------------------------
2. Modifying and deleting XML content
Modify:
import xml.etree.ElementTree as ET
tree = ET.parse('test.xml')
root = tree.getroot()
print(root.tag)
# Modify: increment every year by one and add an attribute
for node in root.iter('year'):
    node.text = str(int(node.text) + 1)
    node.set('colr', 'red')
    print(node.tag, node.text)
# Write the modified tree to a new file
tree.write('xml_test2.xml')
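Result: in xml_test2.xml every year element is incremented by one and carries the attribute colr="red".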
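To check a modification without opening the written file, the tree can be serialized to a string. A minimal sketch, assuming a one-country document rather than the full sample:

```python
import xml.etree.ElementTree as ET

# Tiny stand-in document, used only for this sketch.
root = ET.fromstring("<data><country><year>2008</year></country></data>")

for node in root.iter("year"):
    node.text = str(int(node.text) + 1)  # element text is always a string
    node.set("colr", "red")              # add or overwrite an attribute

# encoding="unicode" makes tostring() return str instead of bytes.
result = ET.tostring(root, encoding="unicode")
print(result)
```

The printed string is `<data><country><year colr="red">2009</year></country></data>`.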
Delete:
import xml.etree.ElementTree as ET
tree = ET.parse('test.xml')
root = tree.getroot()
# Delete every country whose rank is greater than 50
for country in root.findall('country'):
    rank = int(country.find('rank').text)
    if rank > 50:
        root.remove(country)
tree.write('xml_test3.xml')
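The deletion loop uses root.findall('country') rather than iterating over root directly. findall returns a list, so removing children inside the loop does not disturb the iteration. A small in-memory sketch, assuming a trimmed copy of the sample that keeps only the rank elements:

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample: three countries, rank only.
root = ET.fromstring(
    "<data>"
    '<country name="Liechtenstein"><rank>2</rank></country>'
    '<country name="Singapore"><rank>5</rank></country>'
    '<country name="Panama"><rank>69</rank></country>'
    "</data>"
)

# findall() builds a list up front, so it is safe to remove from root here.
for country in root.findall("country"):
    if int(country.findtext("rank")) > 50:
        root.remove(country)

remaining = [c.get("name") for c in root]
print(remaining)
```

Only Panama has a rank above 50, so the two other countries remain.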
---------------------------------------- divider ----------------------------------------
3. Creating XML
import xml.etree.ElementTree as ET
# Create an XML document
# Build the root element
new_xml = ET.Element('namelist')
name = ET.SubElement(new_xml, "name", attrib={"country": "Peking"})
age = ET.SubElement(name, "age", attrib={"type": "child"})
age.text = "22"
et = ET.ElementTree(new_xml)  # wrap the root element in a document object
et.write("create_xml.xml", encoding="utf-8", xml_declaration=True)
ET.dump(new_xml)  # print the generated XML to stdout
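Output (ET.dump prints the element without an XML declaration):
<namelist><name country="Peking"><age type="child">22</age></name></namelist>
The script also creates the file create_xml.xml, which holds the same element preceded by the declaration <?xml version='1.0' encoding='utf-8'?>.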
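ElementTree serializes a newly built tree without line breaks or indentation. On Python 3.9 or later, ET.indent can add them in place before writing. A minimal sketch that rebuilds the same namelist tree (the two-space indent is an arbitrary choice):

```python
import xml.etree.ElementTree as ET

# Rebuild the same tree as in the example above.
new_xml = ET.Element("namelist")
name = ET.SubElement(new_xml, "name", attrib={"country": "Peking"})
age = ET.SubElement(name, "age", attrib={"type": "child"})
age.text = "22"

# indent() inserts whitespace into the tree in place (requires Python 3.9+).
ET.indent(new_xml, space="  ")

pretty = ET.tostring(new_xml, encoding="unicode")
print(pretty)
```

Because indent changes the tree itself, calling et.write afterwards would write the indented form to the file as well.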