Learning Python: Basics (3) Modules (continued)

re (regular expressions), shutil, configparser, xml

I. re

Regular expression metacharacters and syntax:

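Syntax	Description	Example	Full matches
Characters
ordinary characters	Match themselves	abc	abc
.	Matches any single character except the newline "\n"	a.c	abc
\	Escape character; makes a special character match itself literally	a\.c a\\c	a.c a\c
[...]	Character set: matches any one character in the set. A "-" denotes a range, e.g. [a-zA-Z0-9] matches any one character in those ranges. A leading "^", as in [^...], negates the set.	[abc]efg	aefg befg cefg
Predefined character classes
\d	Digit: [0-9]	a\dc	a1c
\D	Non-digit: [^\d]	a\Dc	abc
\s	Whitespace: [<space>\t\r\n\f\v]	a\sc	a c
\S	Non-whitespace: [^\s]	a\Sc	abc
\w	Word character: [a-zA-Z0-9_]	a\wc	abc
\W	Non-word character: [^\w]	a\Wc	a c
Quantifiers
*	Matches the preceding character 0 or more times	a*b	aab ab b
+	Matches the preceding character 1 or more times	a+b	aab aaaab
?	Matches the preceding character 0 or 1 times	a?b	b ab
{m}	Matches the preceding character exactly m times	a{2}c	aac
{m,n}	Matches the preceding character m to n times. Either bound may be omitted: without m it means 0 to n times; without n it means m or more times.	a{1,2}c	ac aac
*? +? ?? {m,n}?	Make *, +, ? and {m,n} non-greedy	see below
Anchors
^	Matches the start of the string	^abc	abc
$	Matches the end of the string	abc$	abc
\A	Matches only at the start of the string	\Aabc	abc
\Z	Matches only at the end of the string	abc\Z	abc
\b	Matches a word boundary, i.e. the position between a word character and a non-word character. For example, "er\b" matches the "er" in "never" but not the "er" in "verb".	ab\b	ab
\B	Matches a non-word boundary. "er\B" matches the "er" in "verb" but not the "er" in "never".	ab\Bc	abc
Logic and grouping
|	Matches either the left or the right expression. The left side is tried first; once it matches, the right side is skipped. If | is not enclosed in (), its scope is the whole expression.	abc|def	abc def
()	A parenthesized expression is a group. Groups are numbered from the left: each opening parenthesis "(" increases the number by 1. A group is treated as a unit and can be followed by a quantifier. A | inside a group applies only within that group.	(abc){2} (abc|bcd)	abcabc abc
(?P<name>...)	A group that, besides its number, also gets a name. group(1) == group(name)	(?P<id>abc)	abc
(?P=name)	Backreference to the string matched by the group named name	(?P<id>123)abc(?P=id)	123abc123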
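The last two rows of the table, named groups and backreferences, are easier to follow with a short run. The following is a minimal sketch; the pattern and the sample strings are chosen here for illustration only.

```python
import re

# The named group "id" captures one or more digits;
# (?P=id) then requires exactly the same text to appear again.
pattern = re.compile(r"(?P<id>\d+)abc(?P=id)")

m = pattern.fullmatch("123abc123")
by_name = m.group("id")    # access the group by name
by_number = m.group(1)     # the same group, accessed by number
print(by_name, by_number)

# The backreference must repeat the captured text, so this does not match.
mismatch = pattern.fullmatch("123abc456")
print(mismatch)
```

A named group keeps its ordinary number, so `group("id")` and `group(1)` return the same string.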

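Greedy and non-greedy quantifiers

Regular expressions are usually used to find matching strings in text. In Python, quantifiers are greedy by default (in a few languages the default may be non-greedy): they try to match as many characters as possible. Non-greedy quantifiers do the opposite and match as few characters as possible. For example, the pattern "ab*" searched in "abbbc" finds "abbb", whereas the non-greedy "ab*?" finds "a".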
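The difference can be checked directly. This sketch reuses the "abbbc" example from the paragraph above and adds a second, hypothetical HTML-like string to show where greediness usually causes surprises.

```python
import re

text = "abbbc"
greedy = re.search(r"ab*", text).group()   # takes as many "b" as possible
lazy = re.search(r"ab*?", text).group()    # takes as few "b" as possible
print(greedy, lazy)

# Illustrative second case: matching tags in a small HTML-like string.
html = "<b>bold</b> and <i>italic</i>"
greedy_tags = re.findall(r"<.*>", html)    # one match from the first "<" to the last ">"
lazy_tags = re.findall(r"<.*?>", html)     # each tag separately
print(greedy_tags)
print(lazy_tags)
```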

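The backslash problem

As in most programming languages, regular expressions use "\" as the escape character, which can lead to a pile-up of backslashes. To match a literal "\" in text, a pattern written as an ordinary string literal needs four backslashes, "\\\\": the string literal turns each pair into a single backslash, and the resulting two backslashes are then read by the regex engine as one literal backslash. Python's raw strings solve this neatly: the same pattern can be written r"\\". Likewise, "\\d", which matches a digit, can be written r"\d". With raw strings you no longer have to worry about a missing backslash, and the patterns are easier to read.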
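A small check of the claim above: the ordinary and the raw spelling produce the same pattern. The sample path is a made-up value used only for illustration.

```python
import re

path = "C:\\temp"   # the text C:\temp, which contains one backslash

normal = re.search("\\\\", path)   # ordinary string literal: four backslashes in the source
raw = re.search(r"\\", path)       # raw string literal: two backslashes in the source
print(normal.group() == raw.group())

# "\\d" and r"\d" are the same two-character pattern.
same_pattern = "\\d" == r"\d"
digits = re.findall(r"\d", "a1b22")
print(same_pattern, digits)
```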

Matching functions in re

1. match

match tries to match from the start of the string. It returns a match object on success and None otherwise.

import re

text = "the Attila the Hun show"
m = re.match(".", text)
print(m.group())  # "t"; group() is the same as group(0), the whole match

m = re.match("(.)(.)(.)", text)
print(m.group(0))  # "the"

# Groups
print(m.group(1, 2, 3))  # ('t', 'h', 'e'), the text captured by each group

# Compile the regular expression into a Pattern object
pattern = re.compile(".")
m = pattern.match(text)
print(m.group())  # 't'

2. search

search scans the whole string and returns the first match, or None if nothing matches.

import re

text = "Example 3:there is 1 date 11/5/2016 in here"
m = re.search(r"(\d{1,2})/(\d{1,2})/(\d{2,4})", text)
print(m.group(1), m.group(2), m.group(3))  # 11 5 2016

3. sub

Replace the parts of the string that match the pattern.

import re

# sub(pattern, repl, string, count=0, flags=0)
# pattern: the regular expression
# repl   : the replacement string, or a callable
# string : the string to search
# count  : maximum number of replacements (0 means all)
# flags  : matching flags
text = "you're no fun anymore fun"
m = re.sub("fun", "entertaining", text, count=2)
print(m)
# "you're no entertaining anymore entertaining"

4. split

Split a string at the places where the pattern matches.

import re

# split(pattern, string, maxsplit=0, flags=0)
# pattern : the regular expression
# string  : the string to split
# maxsplit: maximum number of splits (0 means no limit)
# flags   : matching flags

# Without groups
origin = "hello alex bcd alex lge alex acd 19"
r = re.split("alex", origin, maxsplit=1)
print(r)  # ['hello ', ' bcd alex lge alex acd 19']

# With groups: the captured text is included in the result
origin = "hello alex bcd alex lge alex acd 19"
r1 = re.split("(alex)", origin, maxsplit=1)
print(r1)  # ['hello ', 'alex', ' bcd alex lge alex acd 19']

r2 = re.split("(al(ex))", origin, maxsplit=1)
print(r2)  # ['hello ', 'alex', 'ex', ' bcd alex lge alex acd 19']

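5. findall

Return a list of all non-overlapping matches. If the pattern has no group, each item is the whole match as a string; with one group, each item is that group's string; with several groups, each item is a tuple of the groups.

Empty matches are included in the result.

import re

# Without groups
origin = "hello alex bcd abcd lge acd 19"
r = re.findall(r"a\w+", origin)
print(r)  # ['alex', 'abcd', 'acd']

# With groups
origin = "hello alex bcd abcd lge acd 19"
r = re.findall(r"a((\w*)c)(d)", origin)
# Two substrings match: "abcd" and "acd". For each match the outer group comes
# first in the tuple, followed by the inner group, then the last group.
print(r)  # [('bc', 'b', 'd'), ('c', '', 'd')]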

IP address:
^(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}$
Mobile phone number:
^1[3458][0-9]\d{8}$
Email:
[a-zA-Z0-9_-]+@[a-zA-Z0-9_-]+(\.[a-zA-Z0-9_-]+)+
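To see how these three patterns behave, the sketch below wraps them in small helper functions. The helper names and the sample inputs are assumptions made for this example. The patterns are written as raw strings with their backslashes spelled out, and the phone pattern uses the character class [3458], because a "|" inside square brackets is a literal character rather than an alternation.

```python
import re

IP_RE = re.compile(
    r"^(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}$"
)
PHONE_RE = re.compile(r"^1[3458][0-9]\d{8}$")
EMAIL_RE = re.compile(r"[a-zA-Z0-9_-]+@[a-zA-Z0-9_-]+(\.[a-zA-Z0-9_-]+)+")


def is_ip(text):
    """Return True if text looks like a dotted IPv4 address."""
    return IP_RE.match(text) is not None


def is_phone(text):
    """Return True if text is an 11-digit number accepted by PHONE_RE."""
    return PHONE_RE.match(text) is not None


def is_email(text):
    """Return True if the whole text is accepted by EMAIL_RE."""
    # The email pattern has no anchors, so fullmatch is used here.
    return EMAIL_RE.fullmatch(text) is not None


print(is_ip("192.168.1.1"), is_ip("256.1.1.1"))
print(is_phone("13812345678"), is_phone("12812345678"))
print(is_email("user_01@example.com"), is_email("user@localhost"))
```

These patterns are deliberately simple; the email one in particular accepts only a subset of valid addresses.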

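II. shutil

A module for high-level operations on files, directories, and archives.

1. Copy the contents of one file object to another

shutil.copyfileobj(fsrc, fdst[, length])

import shutil

# Open both files in with blocks so they are closed after the copy
with open('old.xml', 'r') as fsrc, open('new.xml', 'w') as fdst:
    shutil.copyfileobj(fsrc, fdst)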

2. Copy a file

shutil.copyfile(src, dst)

shutil.copyfile('f1.log', 'f2.log')

3. Copy the permission bits only. The destination's content, group, and owner are left unchanged.

shutil.copymode(src, dst)

shutil.copymode('f1.log', 'f2.log')

4. Copy the status information only: mode bits, atime, mtime, flags

shutil.copystat(src, dst)

shutil.copystat('f1.log', 'f2.log')

5. Copy a file and its permissions

shutil.copy(src, dst)

shutil.copy('f1.log', 'f2.log')

6. Copy a file and its status information

shutil.copy2(src, dst)

shutil.copy2('f1.log', 'f2.log')
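The practical difference between copy and copy2 is the metadata that survives the copy. The following is a minimal sketch that works in a temporary directory; the file names and the artificial timestamp are assumptions made for the demonstration.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "f1.log")
    with open(src, "w") as f:
        f.write("hello")

    # Give the source an old modification time so the difference is visible.
    old_time = 1_000_000_000
    os.utime(src, (old_time, old_time))

    plain = shutil.copy(src, os.path.join(tmp, "copy.log"))    # content + permissions
    full = shutil.copy2(src, os.path.join(tmp, "copy2.log"))   # content + status info

    plain_mtime = int(os.path.getmtime(plain))
    full_mtime = int(os.path.getmtime(full))
    with open(full) as f:
        content = f.read()

print("copy  kept mtime:", plain_mtime == old_time)
print("copy2 kept mtime:", full_mtime == old_time)
```

Both functions return the path of the new file, which is why the results can be passed straight to os.path.getmtime.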

7. Recursively copy a directory tree

shutil.ignore_patterns(*patterns) excludes files that match the given patterns
shutil.copytree(src, dst, symlinks=False, ignore=None)

import shutil
shutil.copytree('folder1', 'folder2', ignore=shutil.ignore_patterns('*.pyc', 'tmp*'))

import shutil
shutil.copytree('f1', 'f2', symlinks=True, ignore=shutil.ignore_patterns('*.pyc', 'tmp*'))

8. Recursively delete a directory tree

shutil.rmtree(path[, ignore_errors[, onerror]])

import shutil
shutil.rmtree('folder1')

9. Recursively move a file or directory. It behaves like the mv command; when source and destination are on the same filesystem it is simply a rename.

shutil.move(src, dst)

import shutil
shutil.move('folder1', 'folder3')

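10. Create an archive (for example zip or tar) and return its path

shutil.make_archive(base_name, format, ...) This function archives a single directory.

base_name: the file name of the archive, or a path to it. A bare file name saves the archive in the current directory; a path saves it in that location. For example, www saves to the current directory, while /Users/lcy/www saves to /Users/lcy/.

format: archive type: "zip", "tar", "bztar", "gztar"

root_dir: the directory to archive (defaults to the current directory)

owner: owner of the archived files, defaults to the current user

group: group of the archived files, defaults to the current group

logger: used for logging, usually a logging.Logger object

# Archive the files under /Users/lcy/Downloads/test and put the archive in the current directory
import shutil
ret = shutil.make_archive("www", 'gztar', root_dir='/Users/lcy/Downloads/test')

# Archive the files under /Users/lcy/Downloads/test and put the archive in /Users/lcy/
import shutil
ret = shutil.make_archive("/Users/lcy/www", 'gztar', root_dir='/Users/lcy/Downloads/test')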
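The paths above exist only on the author's machine. The following sketch does the same thing inside a temporary directory so it can be run anywhere, and adds shutil.unpack_archive to show the round trip. The directory and file names are made up for the example.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Build a small directory to archive.
    src_dir = os.path.join(tmp, "test")
    os.makedirs(src_dir)
    with open(os.path.join(src_dir, "a.log"), "w") as f:
        f.write("data")

    # base_name has no extension; make_archive appends one for the format.
    base_name = os.path.join(tmp, "www")
    archive_path = shutil.make_archive(base_name, "gztar", root_dir=src_dir)
    archive_name = os.path.basename(archive_path)

    # Unpack into a fresh directory and list what came out.
    out_dir = os.path.join(tmp, "out")
    shutil.unpack_archive(archive_path, out_dir)
    extracted = sorted(os.listdir(out_dir))

print(archive_name)
print(extracted)
```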

Appendix: zipfile and tarfile, which are used more often in practice

import zipfile

# Compress
z = zipfile.ZipFile('laxi.zip', 'w')  # create an archive; mode 'a' appends files to an existing archive
z.write('a.log')  # write a file into the archive
z.write('data.data')
z.close()

# Extract
z = zipfile.ZipFile('laxi.zip', 'r')
z.extractall()  # extract everything
z.close()

import zipfile

# Extract a single file
z = zipfile.ZipFile('laxi.zip', 'r')
for item in z.namelist():  # print the names of the archive members
    print(item)
z.extract('a.log')  # extract one member by its name
z.close()

The tarfile module

import tarfile

# Compress
tar = tarfile.open('your.tar', 'w')
tar.add('/Users/wupeiqi/PycharmProjects/bbs2.log', arcname='bbs3.log')  # arcname sets the file's name inside the archive
tar.add('/Users/wupeiqi/PycharmProjects/cmdb.log', arcname='cmdb2.log')
tar.close()

# Extract
tar = tarfile.open('your.tar', 'r')
tar.extractall()  # a target directory can be given
# tar.getmembers() returns all members as tarfile.TarInfo objects.
# To extract one file: obj = tar.getmember("file name"), then tar.extract(obj)
tar.close()
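The comment about getmember can be tried without touching real project files. This is a sketch under the assumption that reading a member's content is enough; it uses tar.extractfile, which returns a file-like object instead of writing to disk. All file names are created in a temporary directory for the example.

```python
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create two small source files; each file contains its own name.
    for name in ("bbs2.log", "cmdb.log"):
        with open(os.path.join(tmp, name), "w") as f:
            f.write(name)

    tar_path = os.path.join(tmp, "your.tar")
    with tarfile.open(tar_path, "w") as tar:
        # arcname renames the files inside the archive.
        tar.add(os.path.join(tmp, "bbs2.log"), arcname="bbs3.log")
        tar.add(os.path.join(tmp, "cmdb.log"), arcname="cmdb2.log")

    with tarfile.open(tar_path, "r") as tar:
        names = tar.getnames()
        member = tar.getmember("cmdb2.log")   # a tarfile.TarInfo object
        data = tar.extractfile(member).read().decode()

print(names)
print(data)
```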

III. configparser

configparser handles files in a specific (INI-style) format. Under the hood it simply uses open to read and write the file.

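# The expected file format is shown below.
# [section1] and [section2] are sections; each "key = value" or "key:value" line is an option.
[section1]
k1 = v1
k2:v2

[section2]
k1 = v1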
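To experiment with this format without creating a file, the same text can be fed to the parser with read_string. A minimal sketch, with the sample content kept in a string constant named SAMPLE for illustration:

```python
import configparser

SAMPLE = """\
[section1]
k1 = v1
k2:v2

[section2]
k1 = v1
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE)

sections = config.sections()        # section names, in file order
items = config.items("section1")    # (key, value) pairs of one section
print(sections)
print(items)
```

Both "=" and ":" are accepted as the separator between key and value.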

1. Get all sections

import configparser

config = configparser.ConfigParser()
config.read('xxxooo', encoding='utf-8')
ret = config.sections()
print(ret)

2. Get all key-value pairs of a section

import configparser

config = configparser.ConfigParser()
config.read('conf', encoding='utf-8')
ret = config.items('section1')

3. Get all keys of a section

import configparser

config = configparser.ConfigParser()
config.read('conf', encoding='utf-8')
ret = config.options('section1')
print(ret)

4. Get the value of a given key in a section

import configparser

config = configparser.ConfigParser()
config.read('conf', encoding='utf-8')

v = config.get('section1', 'k1')
# v = config.getint('section1', 'k1')
# v = config.getfloat('section1', 'k1')
# v = config.getboolean('section1', 'k1')
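The commented-out getters convert the stored string to another type. The sketch below shows them on a hypothetical [server] section whose keys and values are invented for this example; the content is loaded with read_string so no file is needed.

```python
import configparser

config = configparser.ConfigParser()
config.read_string("""
[server]
port = 8080
timeout = 2.5
debug = yes
""")

raw = config.get("server", "port")                   # always a string
port = config.getint("server", "port")               # converted to int
timeout = config.getfloat("server", "timeout")       # converted to float
debug = config.getboolean("server", "debug")         # yes/no, on/off, true/false, 1/0
missing = config.get("server", "host", fallback="n/a")  # default for an absent key
print(raw, port, timeout, debug, missing)
```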

5. Check, add, and remove sections

import configparser

config = configparser.ConfigParser()
config.read('conf', encoding='utf-8')

# Check
has_sec = config.has_section('section1')
print(has_sec)

# Add a section
config.add_section("SEC_1")
with open('conf', 'w', encoding='utf-8') as f:
    config.write(f)

# Remove a section
config.remove_section("SEC_1")
with open('conf', 'w', encoding='utf-8') as f:
    config.write(f)

6. Check, remove, and set key-value pairs in a section

import configparser

config = configparser.ConfigParser()
config.read('conf', encoding='utf-8')

# Check
has_opt = config.has_option('section1', 'k1')
print(has_opt)

# Remove
config.remove_option('section1', 'k1')
with open('conf', 'w', encoding='utf-8') as f:
    config.write(f)

# Set
config.set('section1', 'k10', "123")
with open('conf', 'w', encoding='utf-8') as f:
    config.write(f)  # write the in-memory state back to the file

IV. XML

Format of the XML file:

<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2023</year>
        <gdppc>141100</gdppc>
        <neighbor direction="E" name="Austria" />
        <neighbor direction="W" name="Switzerland" />
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2026</year>
        <gdppc>59900</gdppc>
        <neighbor direction="N" name="Malaysia" />
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2026</year>
        <gdppc>13600</gdppc>
        <neighbor direction="W" name="Costa Rica" />
        <neighbor direction="E" name="Colombia" />
    </country>
</data>

  1 class Element:
  2     """An XML element.
  3 
  4     This class is the reference implementation of the Element interface.
  5 
  6     An element's length is its number of subelements.  That means if you
  7     want to check if an element is truly empty, you should check BOTH
  8     its length AND its text attribute.
  9 
 10     The element tag, attribute names, and attribute values can be either
 11     bytes or strings.
 12 
 13     *tag* is the element name.  *attrib* is an optional dictionary containing
 14     element attributes. *extra* are additional element attributes given as
 15     keyword arguments.
 16 
 17     Example form:
 18         <tag attrib>text<child/>...</tag>tail
 19 
 20     """
 21 
 22     # Tag name of the current element
 23     tag = None
 24     """The element's name."""
 25 
 26     # Attributes of the current element
 27 
 28     attrib = None
 29     """Dictionary of the element's attributes."""
 30 
 31     # Text content of the current element
 32     text = None
 33     """
 34     Text before first subelement. This is either a string or the value None.
 35     Note that if there is no text, this attribute may be either
 36     None or the empty string, depending on the parser.
 37 
 38     """
 39 
 40     tail = None
 41     """
 42     Text after this element's end tag, but before the next sibling element's
 43     start tag.  This is either a string or the value None.  Note that if there
 44     was no text, this attribute may be either None or an empty string,
 45     depending on the parser.
 46 
 47     """
 48 
 49     def __init__(self, tag, attrib={}, **extra):
 50         if not isinstance(attrib, dict):
 51             raise TypeError("attrib must be dict, not %s" % (
 52                 attrib.__class__.__name__,))
 53         attrib = attrib.copy()
 54         attrib.update(extra)
 55         self.tag = tag
 56         self.attrib = attrib
 57         self._children = []
 58 
 59     def __repr__(self):
 60         return "<%s %r at %#x>" % (self.__class__.__name__, self.tag, id(self))
 61 
 62     def makeelement(self, tag, attrib):
 63         # Create a new element
 64         """Create a new element with the same type.
 65 
 66         *tag* is a string containing the element name.
 67         *attrib* is a dictionary containing the element attributes.
 68 
 69         Do not call this method, use the SubElement factory function instead.
 70 
 71         """
 72         return self.__class__(tag, attrib)
 73 
 74     def copy(self):
 75         """Return copy of current element.
 76 
 77         This creates a shallow copy. Subelements will be shared with the
 78         original tree.
 79 
 80         """
 81         elem = self.makeelement(self.tag, self.attrib)
 82         elem.text = self.text
 83         elem.tail = self.tail
 84         elem[:] = self
 85         return elem
 86 
 87     def __len__(self):
 88         return len(self._children)
 89 
 90     def __bool__(self):
 91         warnings.warn(
 92             "The behavior of this method will change in future versions.  "
 93             "Use specific 'len(elem)' or 'elem is not None' test instead.",
 94             FutureWarning, stacklevel=2
 95             )
 96         return len(self._children) != 0 # emulate old behaviour, for now
 97 
 98     def __getitem__(self, index):
 99         return self._children[index]
100 
101     def __setitem__(self, index, element):
102         # if isinstance(index, slice):
103         #     for elt in element:
104         #         assert iselement(elt)
105         # else:
106         #     assert iselement(element)
107         self._children[index] = element
108 
109     def __delitem__(self, index):
110         del self._children[index]
111 
112     def append(self, subelement):
113         # Append a child element to the current element
114         """Add *subelement* to the end of this element.
115 
116         The new element will appear in document order after the last existing
117         subelement (or directly after the text, if it's the first subelement),
118         but before the end tag for this element.
119 
120         """
121         self._assert_is_element(subelement)
122         self._children.append(subelement)
123 
124     def extend(self, elements):
125         # Extend the current element with several child elements
126         """Append subelements from a sequence.
127 
128         *elements* is a sequence with zero or more elements.
129 
130         """
131         for element in elements:
132             self._assert_is_element(element)
133         self._children.extend(elements)
134 
135     def insert(self, index, subelement):
136         # Insert a child element at the given position among the current element's children
137         """Insert *subelement* at position *index*."""
138         self._assert_is_element(subelement)
139         self._children.insert(index, subelement)
140 
141     def _assert_is_element(self, e):
142         # Need to refer to the actual Python implementation, not the
143         # shadowing C implementation.
144         if not isinstance(e, _Element_Py):
145             raise TypeError('expected an Element, not %s' % type(e).__name__)
146 
147     def remove(self, subelement):
148         # Remove a child element from the current element
149         """Remove matching subelement.
150 
151         Unlike the find methods, this method compares elements based on
152         identity, NOT ON tag value or contents.  To remove subelements by
153         other means, the easiest way is to use a list comprehension to
154         select what elements to keep, and then use slice assignment to update
155         the parent element.
156 
157         ValueError is raised if a matching element could not be found.
158 
159         """
160         # assert iselement(element)
161         self._children.remove(subelement)
162 
163     def getchildren(self):
164         # Return all child elements (deprecated)
165         """(Deprecated) Return all subelements.
166 
167         Elements are returned in document order.
168 
169         """
170         warnings.warn(
171             "This method will be removed in future versions.  "
172             "Use 'list(elem)' or iteration over elem instead.",
173             DeprecationWarning, stacklevel=2
174             )
175         return self._children
176 
177     def find(self, path, namespaces=None):
178         # Return the first matching child element
179         """Find first matching element by tag name or path.
180 
181         *path* is a string having either an element tag or an XPath,
182         *namespaces* is an optional mapping from namespace prefix to full name.
183 
184         Return the first matching element, or None if no element was found.
185 
186         """
187         return ElementPath.find(self, path, namespaces)
188 
189     def findtext(self, path, default=None, namespaces=None):
190         # Return the text of the first matching child element
191         """Find text for first matching element by tag name or path.
192 
193         *path* is a string having either an element tag or an XPath,
194         *default* is the value to return if the element was not found,
195         *namespaces* is an optional mapping from namespace prefix to full name.
196 
197         Return text content of first matching element, or default value if
198         none was found.  Note that if an element is found having no text
199         content, the empty string is returned.
200 
201         """
202         return ElementPath.findtext(self, path, default, namespaces)
203 
204     def findall(self, path, namespaces=None):
205         # Return all matching child elements as a list
206         """Find all matching subelements by tag name or path.
207 
208         *path* is a string having either an element tag or an XPath,
209         *namespaces* is an optional mapping from namespace prefix to full name.
210 
211         Returns list containing all matching elements in document order.
212 
213         """
214         return ElementPath.findall(self, path, namespaces)
215 
216     def iterfind(self, path, namespaces=None):
217         # Find all matching elements and return an iterator (usable in a for loop)
218         """Find all matching subelements by tag name or path.
219 
220         *path* is a string having either an element tag or an XPath,
221         *namespaces* is an optional mapping from namespace prefix to full name.
222 
223         Return an iterable yielding all matching elements in document order.
224 
225         """
226         return ElementPath.iterfind(self, path, namespaces)
227 
228     def clear(self):
229         # Reset the element
230         """Reset element.
231 
232         This function removes all subelements, clears all attributes, and sets
233         the text and tail attributes to None.
234 
235         """
236         self.attrib.clear()
237         self._children = []
238         self.text = self.tail = None
239 
240     def get(self, key, default=None):
241         # Get an attribute value of the current element
242         """Get element attribute.
243 
244         Equivalent to attrib.get, but some implementations may handle this a
245         bit more efficiently.  *key* is what attribute to look for, and
246         *default* is what to return if the attribute was not found.
247 
248         Returns a string containing the attribute value, or the default if
249         attribute was not found.
250 
251         """
252         return self.attrib.get(key, default)
253 
254     def set(self, key, value):
255         # Set an attribute value on the current element
256         """Set element attribute.
257 
258         Equivalent to attrib[key] = value, but some implementations may handle
259         this a bit more efficiently.  *key* is what attribute to set, and
260         *value* is the attribute value to set it to.
261 
262         """
263         self.attrib[key] = value
264 
265     def keys(self):
266         # Get the keys of all attributes of the current element
267 
268         """Get list of attribute names.
269 
270         Names are returned in an arbitrary order, just like an ordinary
271         Python dict.  Equivalent to attrib.keys()
272 
273         """
274         return self.attrib.keys()
275 
276     def items(self):
277         # Get all attributes of the current element as (name, value) pairs
278         """Get element attributes as a sequence.
279 
280         The attributes are returned in arbitrary order.  Equivalent to
281         attrib.items().
282 
283         Return a list of (name, value) tuples.
284 
285         """
286         return self.attrib.items()
287 
288     def iter(self, tag=None):
289         # Iterate over the current element and its descendants, filtered by tag name (usable in a for loop)
290         """Create tree iterator.
291 
292         The iterator loops over the element and all subelements in document
293         order, returning all elements with a matching tag.
294 
295         If the tree structure is modified during iteration, new or removed
296         elements may or may not be included.  To get a stable set, use the
297         list() function on the iterator, and loop over the resulting list.
298 
299         *tag* is what tags to look for (default is to return all elements)
300 
301         Return an iterator containing all the matching elements.
302 
303         """
304         if tag == "*":
305             tag = None
306         if tag is None or self.tag == tag:
307             yield self
308         for e in self._children:
309             yield from e.iter(tag)
310 
311     # compatibility
312     def getiterator(self, tag=None):
313         # Change for a DeprecationWarning in 1.4
314         warnings.warn(
315             "This method will be removed in future versions.  "
316             "Use 'elem.iter()' or 'list(elem.iter())' instead.",
317             PendingDeprecationWarning, stacklevel=2
318         )
319         return list(self.iter(tag))
320 
321     def itertext(self):
322         # Iterate over the text of the current element and all its descendants (usable in a for loop)
323         """Create text iterator.
324 
325         The iterator loops over the element and all subelements in document
326         order, returning all inner text.
327 
328         """
329         tag = self.tag
330         if not isinstance(tag, str) and tag is not None:
331             return
332         if self.text:
333             yield self.text
334         for e in self:
335             yield from e.itertext()
336             if e.tail:
337                 yield e.tail
338 
Overview of Element methods
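The lookup methods listed in the class above are easiest to compare side by side. This sketch parses a shortened copy of the sample document from an inline string (the constant name XML_TEXT is an assumption for the example) and calls find, findtext, findall, iter, and get on it.

```python
from xml.etree import ElementTree as ET

XML_TEXT = """
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <neighbor direction="E" name="Austria" />
        <neighbor direction="W" name="Switzerland" />
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <neighbor direction="N" name="Malaysia" />
    </country>
</data>
"""

root = ET.fromstring(XML_TEXT)

first = root.find("country")                    # first matching child
first_name = first.get("name")                  # attribute lookup
first_rank = root.findtext("country/rank")      # text of the first match on a path
countries = [c.get("name") for c in root.findall("country")]    # direct children only
neighbors = [n.get("name") for n in root.iter("neighbor")]      # all descendants
child_count = len(first)                        # number of child elements

print(first_name, first_rank, child_count)
print(countries)
print(neighbors)
```

Note that findall("country") only looks at direct children of root, while iter("neighbor") walks the whole subtree.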

1. Parse XML

Parse a file into an XML object

from xml.etree import ElementTree as ET

# Parse the XML file directly
tree = ET.parse("xo.xml")

# Get the root element of the XML file
root = tree.getroot()

Parse a string

from xml.etree import ElementTree as ET

# Open the file and read the XML content
with open('xo.xml', 'r', encoding='utf-8') as f:
    str_xml = f.read()

# Parse the string into an Element; root is the root element of the document
root = ET.XML(str_xml)

2. Work with XML

Iterate over all elements

from xml.etree import ElementTree as ET

# Parse the XML file; tree is an ElementTree object used for the operations below
tree = ET.parse("xo.xml")

# Get the root element. tag is the element name, attrib holds the attributes,
# and text is the content between the tags: <xml>text</xml>
root = tree.getroot()
print(root, root.tag, root.attrib)

# Iterate over the elements
for child in root:
    print(child, child.tag, child.attrib)
    for grandchild in child:
        print(grandchild, grandchild.tag, grandchild.attrib, grandchild.text)

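Iterate over specific elements, modify and delete

from xml.etree import ElementTree as ET

# Parse from a string to get an XML object
# Open the file and read the XML content
with open('xo.xml', 'r', encoding='utf-8') as f:
    str_xml = f.read()

# Parse the string into an Element; root is the root element of the document
root = ET.XML(str_xml)

############ Operations ############

# Top-level tag
print(root.tag)

# Loop over all year elements
for node in root.iter('year'):
    # Increment the content of the year element by one
    new_year = int(node.text) + 1
    node.text = str(new_year)

    # Set attributes
    node.set('name', 'alex')
    node.set('age', '18')
    # Delete an attribute
    del node.attrib['name']

############ Save to file ############
# Only an ElementTree has write(), so the root Element is wrapped in one.
# This works whether the XML came from a file or from a string.
tree = ET.ElementTree(root)
tree.write("newnew.xml", encoding='utf-8')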
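The section title also mentions deleting, which the listing above does not show for whole elements. The following sketch modifies and removes elements in memory and serializes the result with ET.tostring instead of writing a file. The inline document and the rank threshold of 50 are assumptions chosen for the example.

```python
from xml.etree import ElementTree as ET

root = ET.fromstring(
    "<data>"
    "<country name='Singapore'><rank>5</rank><year>2026</year></country>"
    "<country name='Panama'><rank>69</rank><year>2026</year></country>"
    "</data>"
)

# Modify: increment every year element.
for node in root.iter("year"):
    node.text = str(int(node.text) + 1)

# Delete: findall returns a list, so removing from root inside the loop is safe.
for country in root.findall("country"):
    if int(country.findtext("rank")) > 50:
        root.remove(country)

remaining = [c.get("name") for c in root.findall("country")]
years = [y.text for y in root.iter("year")]
result = ET.tostring(root, encoding="unicode")   # str instead of bytes
print(remaining, years)
print(result)
```

remove works on direct children only, which is why it is called on root, the parent of the country elements.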

3. Create XML

SubElement
Elements can also be created with Element plus append
or with makeelement plus append

from xml.etree import ElementTree as ET

from xml.dom import minidom


def prettify(elem):
    """Convert an element to a string and add indentation."""
    rough_string = ET.tostring(elem, 'utf-8')
    reparsed = minidom.parseString(rough_string)
    return reparsed.toprettyxml(indent="\t")


# Create the root element
root = ET.Element("family")

# Create the elder son
son1 = ET.SubElement(root, "son", attrib={'name': 'son 1'})
# Create the younger son
son2 = ET.SubElement(root, "son", attrib={"name": "son 2"})

# Create a grandson under the elder son
grandson1 = ET.SubElement(son1, "age", attrib={'name': 'son 1-1'})
grandson1.text = 'grandson'

# et = ET.ElementTree(root)  # build the document object
# et.write("test.xml", encoding="utf-8", xml_declaration=True, short_empty_elements=False)

# Call the indentation helper so the resulting string is indented
raw_str = prettify(root)

with open("xxxoo.xml", 'w', encoding='utf-8') as f:
    f.write(raw_str)
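As an alternative to the minidom round trip, newer Python versions (3.9 and later) provide ET.indent, which adds indentation to the tree in place. A minimal sketch, assuming Python 3.9+ is available; the element names are illustrative.

```python
from xml.etree import ElementTree as ET

root = ET.Element("family")
son = ET.SubElement(root, "son", attrib={"name": "son1"})
ET.SubElement(son, "grandson").text = "g"

# indent() modifies the tree in place by inserting whitespace text and tails.
ET.indent(root, space="    ")

pretty = ET.tostring(root, encoding="unicode")
lines = pretty.splitlines()
print(pretty)
```

Because the tree itself is changed, a later tree.write call produces the indented form without any extra string handling.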

makeelement

from xml.etree import ElementTree as ET

# Create the root element
root = ET.Element("family")

# Create the elder son
# son1 = ET.Element('son', {'name': 'son 1'})
son1 = root.makeelement('son', {'name': 'son 1'})
# Create the younger son
# son2 = ET.Element('son', {"name": 'son 2'})
son2 = root.makeelement('son', {"name": 'son 2'})

# Create two grandsons under the elder son
# grandson1 = ET.Element('grandson', {'name': 'son 1-1'})
grandson1 = son1.makeelement('grandson', {'name': 'son 1-1'})
# grandson2 = ET.Element('grandson', {'name': 'son 1-2'})
grandson2 = son1.makeelement('grandson', {'name': 'son 1-2'})

son1.append(grandson1)
son1.append(grandson2)

# Add both sons to the root element
root.append(son1)
root.append(son2)

tree = ET.ElementTree(root)
tree.write('oooo.xml', encoding='utf-8', short_empty_elements=False)


Element

from xml.etree import ElementTree as ET

# Create the root element
root = ET.Element("family")

# Create the elder son
son1 = ET.Element('son', {'name': 'son 1'})
# Create the younger son
son2 = ET.Element('son', {"name": 'son 2'})

# Create two grandsons under the elder son
grandson1 = ET.Element('grandson', {'name': 'son 1-1'})
grandson2 = ET.Element('grandson', {'name': 'son 1-2'})
son1.append(grandson1)
son1.append(grandson2)

# Add both sons to the root element
root.append(son1)
root.append(son2)

tree = ET.ElementTree(root)
tree.write('oooo.xml', encoding='utf-8', short_empty_elements=False)


Original article: https://www.cnblogs.com/lcysen/p/6032932.html