Today I stumbled on a gotcha in lxml.
Suppose we have a node
<id>123</id>
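If two parent nodes both need this node, you have to create it twice! Appending the same element object to both parents does not work!
A failing example: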
#!/usr/bin/env python
# encoding: utf8
from lxml import etree

if __name__ == "__main__":
    root1 = etree.Element("root1")  # root node 1
    root2 = etree.Element("root2")  # root node 2
    ver_node = etree.Element("id")  # child node
    ver_node.text = "123"
    root1.append(ver_node)  # the same child object is appended to both roots
    root2.append(ver_node)
    # tostring() returns bytes when an encoding is given, so decode before printing
    print(etree.tostring(root1, pretty_print=True, xml_declaration=True, encoding='UTF-8').decode('utf-8'))
    print(etree.tostring(root2, pretty_print=True, xml_declaration=True, encoding='UTF-8').decode('utf-8'))
Result:
<?xml version='1.0' encoding='UTF-8'?>
<root1/>

<?xml version='1.0' encoding='UTF-8'?>
<root2>
  <id>123</id>
</root2>

Only the second root has the child node; the first one is empty!
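The reason, as far as I can tell, is that an lxml element has exactly one parent, so append() moves an already-attached node instead of copying it. A minimal sketch to check this with getparent() (the variable names here are only for illustration):

```python
from lxml import etree

root1 = etree.Element("root1")
root2 = etree.Element("root2")
node = etree.Element("id")
node.text = "123"

root1.append(node)
parent_after_first = node.getparent()   # node now sits under root1

root2.append(node)                      # moves node out of root1 into root2
parent_after_second = node.getparent()

print(parent_after_first.tag, parent_after_second.tag)
print(len(root1), len(root2))           # number of children under each root
```

If the node really is moved, root1 should end up with no children and root2 with one.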
The correct way:
#!/usr/bin/env python
# encoding: utf8
from lxml import etree
import copy

if __name__ == "__main__":
    root1 = etree.Element("root1")
    root2 = etree.Element("root2")
    ver_node1 = etree.Element("id")
    ver_node1.text = "123"
    ver_node2 = copy.deepcopy(ver_node1)  # deep copy!
    root1.append(ver_node1)
    root2.append(ver_node2)
    # tostring() returns bytes when an encoding is given, so decode before printing
    print(etree.tostring(root1, pretty_print=True, xml_declaration=True, encoding='UTF-8').decode('utf-8'))
    print(etree.tostring(root2, pretty_print=True, xml_declaration=True, encoding='UTF-8').decode('utf-8'))
Result:
<?xml version='1.0' encoding='UTF-8'?>
<root1>
  <id>123</id>
</root1>

<?xml version='1.0' encoding='UTF-8'?>
<root2>
  <id>123</id>
</root2>
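If the same node has to go under more than two parents, the deep copy can be wrapped in a small helper. This is just a sketch: append_copies is a made-up name, not something lxml provides, and it assumes a fresh copy per parent is what you want.

```python
import copy
from lxml import etree


def append_copies(parents, node):
    """Hypothetical helper: give every parent its own deep copy of node."""
    for parent in parents:
        parent.append(copy.deepcopy(node))


root1 = etree.Element("root1")
root2 = etree.Element("root2")
template = etree.Element("id")
template.text = "123"

append_copies([root1, root2], template)

# tostring() returns bytes by default, so decode before printing
out1 = etree.tostring(root1).decode("utf-8")
out2 = etree.tostring(root2).decode("utf-8")
print(out1)
print(out2)
```

The template itself is never attached anywhere, so it can be reused later for yet another parent.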