DOM | SAX | JDOM
For details, see: "Java解析XML方法总结" (a summary of XML parsing methods in Java)
★₯₰☆ DOM
Tree model. Supports bidirectional traversal and in-place modification, but holds the whole document in memory.
/* Other argument forms accepted by parse(), e.g. for "cof/11.xml":
 * 1. File file = new File(filepath)
 * 2. InputStream xmlIns = new FileInputStream(filepath)
 * 3. InputSource is = new InputSource(filepath)
 * 4. InputStream in = Thread.currentThread().getContextClassLoader().getResourceAsStream(filepath)
 * 5. ByteArrayInputStream bais = new ByteArrayInputStream(xml_str.getBytes()) // for XML held in a String */
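import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public static Document getDocument(String filepath) throws ParserConfigurationException, IOException, SAXException {
    // Create the factory for DOM parsers
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    // Obtain a DOM parser
    DocumentBuilder builder = factory.newDocumentBuilder();
    // Parse the XML document into a complete Document object
    Document doc = builder.parse(filepath);
    return doc;
}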
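In forms 1-3, a relative path is resolved against the JVM's current working directory (user.dir). The ClassLoader's getResourceAsStream() resolves from the classpath root, so the name must not start with /.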
Note: parse(String) treats its argument as a URI, so passing raw XML text directly fails with: java.net.MalformedURLException: no protocol
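To parse XML that is already in a String, one option is to wrap it in a character stream, in the spirit of form 5 above. The following is a minimal sketch using only JDK classes; the class name DomStringParse and the method name parseXml are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class; the names are illustrative only.
class DomStringParse {
    /** Parses XML held in a String rather than a file path or URI. */
    static Document parseXml(String xml)
            throws ParserConfigurationException, IOException, SAXException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // A character stream avoids the URI lookup that parse(String) performs.
        return builder.parse(new InputSource(new StringReader(xml)));
    }
}
```

Usage: DomStringParse.parseXml("<root><a>1</a></root>").getDocumentElement() gives the root element. A StringReader also sidesteps the byte-encoding question that getBytes() raises.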
Element root = doc.getDocumentElement();
NodeList list = doc.getElementsByTagName("nodename");
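As a rough illustration of the "bidirectional traversal and modification" point, the sketch below walks a NodeList, looks upward to each element's parent, and rewrites text in place. The class DomWalk and its methods are hypothetical names, and the XML is parsed from a String only to keep the example self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; the names are illustrative only.
class DomWalk {
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Returns "parentTag/tag=text" for every element with the given tag name. */
    static List<String> describe(Document doc, String tag) {
        List<String> out = new ArrayList<>();
        NodeList list = doc.getElementsByTagName(tag);
        for (int i = 0; i < list.getLength(); i++) {
            Element el = (Element) list.item(i);   // getElementsByTagName yields elements only
            Node parent = el.getParentNode();      // upward navigation
            out.add(parent.getNodeName() + "/" + el.getTagName() + "=" + el.getTextContent());
        }
        return out;
    }

    /** Overwrites the text of every element with the given tag name, in place. */
    static void setText(Document doc, String tag, String text) {
        NodeList list = doc.getElementsByTagName(tag);
        for (int i = 0; i < list.getLength(); i++) {
            list.item(i).setTextContent(text);
        }
    }
}
```

Because the whole tree stays in memory, the change made by setText is visible to any later describe call on the same Document.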
For reference, a few DocumentBuilderFactory settings:
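factory.setValidating(true); // validate the document (against its DTD) while parsing
factory.setIgnoringComments(true); // drop comment nodes
factory.setIgnoringElementContentWhitespace(false); // false keeps whitespace; true would strip "ignorable whitespace" in element content (only effective when validating)
factory.setCoalescing(false); // false keeps CDATA nodes; true would convert them to text nodes and merge them into adjacent text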
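To see what two of these switches do, a small experiment can count the root element's child nodes under different settings. This is a sketch under the assumption that the JDK's built-in parser is in use; DomFactoryOptions and rootChildCount are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class; the names are illustrative only.
class DomFactoryOptions {
    /** Counts the root element's direct child nodes under the given settings. */
    static int rootChildCount(String xml, boolean ignoreComments, boolean coalesce)
            throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setIgnoringComments(ignoreComments);
        factory.setCoalescing(coalesce);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getChildNodes().getLength();
    }
}
```

For the input <r><!--c-->a<![CDATA[b]]></r>, the default-like call rootChildCount(xml, false, false) should see a comment, a text node and a CDATA node, while rootChildCount(xml, true, true) should see a single merged text node.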
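★₯₰☆ SAX
"Push"-style streaming model, event driven. It does not load the whole XML document, so memory use is low, but finding a node means scanning from the start, and modification is not supported.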
Extend DefaultHandler (a class, not an interface) and override the callback methods for the nodes you need to handle.
public class MySaxHandler extends DefaultHandler {
    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // Receives character data between tags; may be called more than once per text run
        super.characters(ch, start, length);
    }
}
public static void saxParse(String filepath) throws ParserConfigurationException, SAXException, IOException {
    SAXParserFactory factory = SAXParserFactory.newInstance();
    SAXParser sax = factory.newSAXParser();
    // parse(String, DefaultHandler) treats the string as a URI and declares IOException
    sax.parse(filepath, new MySaxHandler());
}
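The handler above only delegates to the superclass. As a sketch of what a working handler might look like, the following collects the text of every element with a given name. TextCollector and collect are hypothetical names, and nested elements with the same name are not handled.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text of every element named `target`.
class TextCollector extends DefaultHandler {
    private final String target;
    private final List<String> texts = new ArrayList<>();
    private StringBuilder current;   // non-null only while inside a target element

    TextCollector(String target) {
        this.target = target;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (qName.equals(target)) {
            current = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver one text run in several chunks, so accumulate.
        if (current != null) {
            current.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals(target) && current != null) {
            texts.add(current.toString());
            current = null;
        }
    }

    /** Parses the XML string and returns the collected texts in document order. */
    static List<String> collect(String xml, String target)
            throws ParserConfigurationException, SAXException, IOException {
        TextCollector handler = new TextCollector(target);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.texts;
    }
}
```

Usage: TextCollector.collect("<users><user>Ann</user><user>Bob</user></users>", "user"). The state kept in the handler's fields is the price of the push model: the parser drives, and the handler has to remember where it is.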
For more detail, search online.
★₯₰☆ JDOM
Java-based Document Object Model; a tree model.
/* Other argument forms accepted by build(), e.g. for "cof/11.xml":
 * 1. ByteArrayInputStream bais = new ByteArrayInputStream(xml_str.getBytes())
 * 2. File file = new File(filepath)
 * 3. InputSource is = new InputSource(new StringReader(xml_str)) */
public static Element getRootElement(String filepath) throws JDOMException, IOException {
    SAXBuilder builder = new SAXBuilder();
    Document doc = builder.build(filepath);
    return doc.getRootElement();
}
Note: build(String) likewise treats its argument as a URI, so passing raw XML text directly fails with: java.net.MalformedURLException: no protocol
DOM4J
Document Object Model for Java. Built on the Java Collections Framework, with full support for DOM and SAX. Recommended.
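import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.io.SAXReader;

/* Other argument forms accepted by read(), e.g. for "cof/11.xml":
 * see the lists in the sections above */
public static Document getDocument(String filepath) throws DocumentException {
    SAXReader reader = new SAXReader();
    Document doc = reader.read(filepath);
    return doc;
}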
Element root = doc.getRootElement();
Element el = root.element("xxx");
List<Element> list = root.elements("xx");
Required dependency: <groupId>org.dom4j</groupId><artifactId>dom4j</artifactId><version>2.1.1</version>
Combine dom4j with XPath to locate elements quickly. dom4j does not support this out of the box; add jaxen-1.1.6 to the classpath.
List<Node> nodes = doc.selectNodes("/xx/aa");
Node node = doc.selectSingleNode("/xx/aa");
List<Node> children = doc.selectNodes("/xx/aa/*"); // all child elements of aa
A few commonly used helper methods for working with XML via dom4j:
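import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

public static void createEmptyXmlFile(String xmlPath) {
    Document document = DocumentHelper.createDocument();
    doc2XML(document, xmlPath); // write the document out as an XML file
}
public static void doc2XML(Document doc, String fileName) {
    OutputFormat format = OutputFormat.createPrettyPrint();
    format.setEncoding("UTF-8");
    // Use an OutputStream so XMLWriter encodes with the format's UTF-8;
    // a FileWriter would use the platform default charset instead.
    try (OutputStream out = new FileOutputStream(fileName)) {
        XMLWriter writer = new XMLWriter(out, format);
        writer.write(doc);
        writer.flush();
    } catch (IOException e) { ... }
}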
In addition, the JDK's javax.xml.xpath package also provides XPath support; worth knowing about.
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
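Once an XPath object exists, an expression can be evaluated against an XML source. The sketch below is one way to do that with JDK classes only; XPathDemo and selectTexts are hypothetical names, and the XML comes from a String to keep the example self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; the names are illustrative only.
class XPathDemo {
    /** Returns the text content of every node matched by the expression. */
    static List<String> selectTexts(String xml, String expression)
            throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // NODESET asks for all matches; the result is a DOM NodeList.
        NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        List<String> texts = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            texts.add(nodes.item(i).getTextContent());
        }
        return texts;
    }
}
```

Usage: XPathDemo.selectTexts("<xx><aa>1</aa><aa>2</aa><bb>3</bb></xx>", "/xx/aa") selects only the aa elements. When many expressions run against the same document, parsing it to a DOM Document once and passing that to evaluate avoids re-parsing.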
StAX
Streaming API for XML: XMLStreamReader + XMLStreamWriter. A "pull"-style streaming model, introduced in JDK 1.6.
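It offers two APIs for processing XML documents:
- Cursor-based API: more efficient, lower level of abstraction
- Event-iterator-based API: less efficient, higher level of abstraction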
InputStream in = XMLParseUtils.getXmlInputStream("filepath");
XMLInputFactory factory = XMLInputFactory.newFactory();
XMLStreamReader parser = factory.createXMLStreamReader(in);
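To contrast the two StAX styles, the sketch below collects element names once with the cursor API and once with the event-iterator API. StaxDemo and its method names are hypothetical, and the input is a String rather than the InputStream used above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper class; the names are illustrative only.
class StaxDemo {
    /** Cursor API: the reader itself is the cursor, and next() advances it. */
    static List<String> namesWithCursor(String xml) throws XMLStreamException {
        List<String> names = new ArrayList<>();
        XMLStreamReader reader = XMLInputFactory.newFactory()
                .createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    names.add(reader.getLocalName());
                }
            }
        } finally {
            reader.close();
        }
        return names;
    }

    /** Event-iterator API: each step returns an XMLEvent object. */
    static List<String> namesWithEvents(String xml) throws XMLStreamException {
        List<String> names = new ArrayList<>();
        XMLEventReader reader = XMLInputFactory.newFactory()
                .createXMLEventReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    names.add(event.asStartElement().getName().getLocalPart());
                }
            }
        } finally {
            reader.close();
        }
        return names;
    }
}
```

Both methods return the element names in document order. The cursor version reads state straight off the reader, while the event version allocates an object per event, which is where the efficiency versus abstraction trade-off comes from.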
For parsing XML with Woodstox's StAX2 implementation, see the linked article.
Jackson
Dependencies: jackson-core, jackson-dataformat-xml, jackson-databind
References: "jackson序列化xml" (Jackson XML serialization), "jackson快速学习" (Jackson quick start)
A small XML utility class:
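import java.io.IOException;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.dataformat.xml.JacksonXmlModule;
import com.fasterxml.jackson.dataformat.xml.XmlMapper;

public static String jacksonBean2XML(Object obj) throws JsonProcessingException {
    JacksonXmlModule module = new JacksonXmlModule();
    module.setDefaultUseWrapper(false);
    XmlMapper xmlMapper = new XmlMapper(module);
    return xmlMapper.writeValueAsString(obj);
}
public static <T> T jacksonXML2Bean(String xml, Class<T> cls) throws IOException {
    JacksonXmlModule module = new JacksonXmlModule();
    module.setDefaultUseWrapper(false);
    XmlMapper xmlMapper = new XmlMapper(module);
    // Silently ignore XML fields that have no matching POJO property
    xmlMapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);
    return xmlMapper.readValue(xml, cls);
}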
Note: class T must have a default (no-argument) constructor.
To use property annotations as well, also add the jackson-annotations jar; see the linked article for details.
For formatted (pretty-printed) output, add woodstox-core: a high-performance XML processor that implements the StAX, SAX2 and StAX2 APIs.
For a related example, see: writerWithDefaultPrettyPrinter()