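XML files are usually parsed with dom4j, SAX, or the W3C DOM API, and most solutions found online cover that case.
In real projects, however, you sometimes have to parse an XML message that arrives as a string rather than as a file.
For example, take the following string: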
String content = "<Response service=\"OrderWebService\"><Head>OK</Head><Body><OrderResponse><customerOrderNo>201605110015</customerOrderNo><mailNo>070000314903</mailNo><printUrl>http://10.202.18.24:8080/osms/wbs/print/printOrder.pub?mailno=vYrygnDPNe9Csjz35xwzwQ==</printUrl><invoiceUrl>http://10.202.18.24:8080/osms/wbs/print/printInvoice.pub?mailno=vYrygnDPNe9Csjz35xwzwQ==</invoiceUrl></OrderResponse></Body></Response>";
After formatting, the string looks like this:
<Response service="OrderWebService">
  <Head>OK</Head>
  <Body>
    <OrderResponse>
      <customerOrderNo>201605110015</customerOrderNo>
      <mailNo>070000314903</mailNo>
      <printUrl>http://10.202.18.24:8080/osms/wbs/print/printOrder.pub?mailno=vYrygnDPNe9Csjz35xwzwQ==</printUrl>
      <invoiceUrl>http://10.202.18.24:8080/osms/wbs/print/printInvoice.pub?mailno=vYrygnDPNe9Csjz35xwzwQ==</invoiceUrl>
    </OrderResponse>
  </Body>
</Response>
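Even after formatting, this message stands out: the root element carries data (service) in an attribute rather than in a child element, so the parser has to read attributes as well as element nodes.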
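To make the attribute/element distinction concrete, here is a minimal sketch that reads one of each from a shortened copy of the message. It uses the JDK's built-in W3C DOM parser instead of dom4j so that it runs without extra dependencies; the class name AttributeVsElement and the helper parseRoot are made up for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class AttributeVsElement {

    // Hypothetical helper: parses an XML string with the JDK DOM parser
    // and returns the root element.
    static Element parseRoot(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML string", e);
        }
    }

    public static void main(String[] args) {
        // Shortened version of the message shown above.
        String xml = "<Response service=\"OrderWebService\"><Head>OK</Head>"
                + "<Body><OrderResponse><mailNo>070000314903</mailNo>"
                + "</OrderResponse></Body></Response>";
        Element root = parseRoot(xml);

        // An attribute is read from the element that carries it.
        System.out.println(root.getAttribute("service")); // prints OrderWebService

        // Element text is read from the child node itself.
        System.out.println(
                root.getElementsByTagName("mailNo").item(0).getTextContent()); // prints 070000314903
    }
}
```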
A dom4j solution follows:
// Imports needed by the method below
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.List;

import org.dom4j.Attribute;
import org.dom4j.io.SAXReader;

public HashMap<String, Object> stringToXmlByDom4j(String content) {
    HashMap<String, Object> result = new HashMap<String, Object>();
    try {
        // Wrap the string in a stream so SAXReader can parse it
        SAXReader saxReader = new SAXReader();
        org.dom4j.Document docDom4j = saxReader.read(
                new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8)));
        org.dom4j.Element root = docDom4j.getRootElement();

        // Attributes of the root element, e.g. service="OrderWebService"
        List<Attribute> rootAttrList = root.attributes();
        for (Attribute rootAttr : rootAttrList) {
            System.out.println(rootAttr.getName() + ": " + rootAttr.getValue());
            result.put(rootAttr.getName(), rootAttr.getValue());
        }

        // Direct children of the root (Head, Body)
        List<org.dom4j.Element> childElements = root.elements();
        for (org.dom4j.Element e1 : childElements) {
            System.out.println("Level 1: " + e1.getName() + ": " + e1.getText());
            result.put(e1.getName(), e1.getText());
        }

        for (org.dom4j.Element child : childElements) {
            // When the attribute names are not known in advance
            List<Attribute> attributeList = child.attributes();
            for (Attribute attr : attributeList) {
                System.out.println("Level 2: " + attr.getName() + ": " + attr.getValue());
                result.put(attr.getName(), attr.getValue());
            }
            // When the attribute name is known
            // System.out.println("id: " + child.attributeValue("id"));

            // When the child element names are not known in advance
            List<org.dom4j.Element> elementList = child.elements();
            for (org.dom4j.Element ele : elementList) {
                System.out.println("Level 2: " + ele.getName() + ": " + ele.getText());
                result.put(ele.getName(), ele.getText());

                List<Attribute> kidAttrList = ele.attributes();
                for (Attribute kidAttr : kidAttrList) {
                    System.out.println("Level 3: " + kidAttr.getName() + ": " + kidAttr.getValue());
                    result.put(kidAttr.getName(), kidAttr.getValue());
                }

                // Grandchildren, e.g. customerOrderNo and mailNo
                List<org.dom4j.Element> kidList = ele.elements();
                for (org.dom4j.Element e2 : kidList) {
                    System.out.println("Level 3: " + e2.getName() + ": " + e2.getText());
                    result.put(e2.getName(), e2.getText());
                }
            }
            // When the child element names are known
            // System.out.println("title" + child.elementText("title"));
            // System.out.println("author" + child.elementText("author"));
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
    return result;
}
Calling the method from a main method prints the following:
Response: <Response service="OrderWebService"><Head>OK</Head><Body><OrderResponse><customerOrderNo>201605110015</customerOrderNo><mailNo>070000314903</mailNo><printUrl>http://10.202.18.24:8080/osms/wbs/print/printOrder.pub?mailno=vYrygnDPNe9Csjz35xwzwQ==</printUrl><invoiceUrl>http://10.202.18.24:8080/osms/wbs/print/printInvoice.pub?mailno=vYrygnDPNe9Csjz35xwzwQ==</invoiceUrl></OrderResponse></Body></Response>
service: OrderWebService
Level 1: Head: OK
Level 1: Body:
Level 2: OrderResponse:
Level 3: customerOrderNo: 201605110015
Level 3: mailNo: 070000314903
Level 3: printUrl: http://10.202.18.24:8080/osms/wbs/print/printOrder.pub?mailno=vYrygnDPNe9Csjz35xwzwQ==
Level 3: invoiceUrl: http://10.202.18.24:8080/osms/wbs/print/printInvoice.pub?mailno=vYrygnDPNe9Csjz35xwzwQ==
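The dom4j method above walks a fixed three levels, which is enough for this message but would miss anything nested deeper. One possible alternative is a recursive walk. The sketch below is not the dom4j code from this post: it uses the JDK's built-in W3C DOM parser so it runs without extra dependencies, and the names XmlFlattener, flatten and collect are invented for the example. It also differs in one detail: container elements such as Body and OrderResponse, which have no text of their own, are not stored in the map.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class XmlFlattener {

    // Parses an XML string and flattens attributes and leaf elements
    // into name -> value pairs, at any nesting depth.
    static Map<String, String> flatten(String xml) {
        Map<String, String> result = new LinkedHashMap<>();
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            collect(root, result);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML string", e);
        }
        return result;
    }

    // Records the element's attributes, then recurses into child elements.
    // An element without child elements is treated as a leaf and stored
    // under its tag name.
    private static void collect(Element element, Map<String, String> result) {
        NamedNodeMap attrs = element.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            result.put(attr.getNodeName(), attr.getNodeValue());
        }
        boolean hasChildElement = false;
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                hasChildElement = true;
                collect((Element) n, result);
            }
        }
        if (!hasChildElement) {
            result.put(element.getTagName(), element.getTextContent());
        }
    }

    public static void main(String[] args) {
        String xml = "<Response service=\"OrderWebService\"><Head>OK</Head>"
                + "<Body><OrderResponse><mailNo>070000314903</mailNo>"
                + "</OrderResponse></Body></Response>";
        // Prints one "name: value" line per attribute or leaf element.
        flatten(xml).forEach((name, value) -> System.out.println(name + ": " + value));
    }
}
```

As with the original method, names that repeat anywhere in the document overwrite each other in the map, so this only suits messages whose attribute and element names are unique.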