Below is the data exported after labeling a bank card image:
<?xml version="1.0" encoding="ISO-8859-1"?>
<annotation>
  <filename>a001.jpg</filename>
  <folder>users/three33//card</folder>
  <source>
    <submittedBy>three</submittedBy>
  </source>
  <imagesize>
    <nrows>2240</nrows>
    <ncols>3968</ncols>
  </imagesize>
  <object>
    <name>numbers</name>
    <deleted>0</deleted>
    <verified>0</verified>
    <occluded>no</occluded>
    <attributes>6228480808055442079</attributes>
    <parts>
      <hasparts/>
      <ispartof/>
    </parts>
    <date>12-May-2019 06:21:39</date>
    <id>0</id>
    <type>bounding_box</type>
    <polygon>
      <username>anonymous</username>
      <pt> <x>927</x> <y>1278</y> </pt>
      <pt> <x>3269</x> <y>1278</y> </pt>
      <pt> <x>3269</x> <y>1475</y> </pt>
      <pt> <x>927</x> <y>1475</y> </pt>
    </polygon>
  </object>
</annotation>
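The goal is to read the four labeled corner coordinates and the bank card number out of each annotation file and save them to a text file. Since there are several hundred images, this has to be done as a batch job.
Code: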
import os
import xml.etree.ElementTree as ET  # cElementTree was removed in Python 3.9

from_path = "./card"     # input folder (XML annotations)
to_path = "./cardout"    # output folder (text files)
os.makedirs(to_path, exist_ok=True)  # make sure the output folder exists

files = os.listdir(from_path)
files.sort()  # sort in lexicographic order

i = 1
for filename in files:
    dir1 = os.path.join(from_path, filename)
    tree = ET.parse(dir1)

    # a001.xml -> a001.txt
    new_filename = filename[:-4] + ".txt"
    dir2 = os.path.join(to_path, new_filename)

    print("time: %d, from_filename: %s, to_filename: %s" % (i, dir1, dir2))

    with open(dir2, 'w') as fobj:
        # each <pt> holds <x> and <y>; write them as "x,y,"
        for elem in tree.iterfind('object/polygon/pt'):
            fobj.write(elem[0].text + ',' + elem[1].text + ',')

        # the card number is stored in <attributes>
        for elem in tree.iterfind('object/attributes'):
            fobj.write(elem.text)

    i = i + 1
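To check the extraction logic without touching the file system, the same two lookups can be wrapped in a small function and run on an in-memory annotation. This is only a sketch: the names `annotation_to_line` and `SAMPLE` are made up for illustration, and `SAMPLE` is a trimmed copy of the annotation shown earlier (only the elements the script reads are kept).

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample annotation; hypothetical name, illustration only.
SAMPLE = """<annotation>
  <object>
    <attributes>6228480808055442079</attributes>
    <polygon>
      <pt> <x>927</x> <y>1278</y> </pt>
      <pt> <x>3269</x> <y>1278</y> </pt>
      <pt> <x>3269</x> <y>1475</y> </pt>
      <pt> <x>927</x> <y>1475</y> </pt>
    </polygon>
  </object>
</annotation>"""


def annotation_to_line(xml_text):
    """Return 'x1,y1,...,x4,y4,card_number' for one annotation string."""
    root = ET.fromstring(xml_text)
    fields = []
    # Corner points first, in document order.
    for pt in root.iterfind('object/polygon/pt'):
        fields.append(pt.findtext('x').strip())
        fields.append(pt.findtext('y').strip())
    # Then the card number stored in <attributes>.
    for attr in root.iterfind('object/attributes'):
        fields.append((attr.text or '').strip())
    return ','.join(fields)


print(annotation_to_line(SAMPLE))
# -> 927,1278,3269,1278,3269,1475,927,1475,6228480808055442079
```

Looking up `x` and `y` by tag name rather than by child position is a design choice in this sketch; it keeps working if the order of children inside `<pt>` ever changes.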
Result: