• Java


    DOM | SAX | JDOM

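    For details, see: Summary of Java XML parsing methods (Java解析XML方法总结)
    DOM
    Tree model. Supports bidirectional traversal and in-place modification, but uses a lot of memory because the whole document is loaded.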

    /* Other forms the parse() argument can take, e.g. for "cof/11.xml":
     * 1. File file = new File(filepath)
     * 2. InputStream xmlIns = new FileInputStream(filepath)
     * 3. InputSource is = new InputSource(filepath)
     * 4. InputStream in = Thread.currentThread().getContextClassLoader().getResourceAsStream(filepath)
     * 5. ByteArrayInputStream bais = new ByteArrayInputStream(xml_str.getBytes()) // for XML held in a string */
    public static Document getDocument(String filepath) throws ParserConfigurationException, IOException, SAXException {
        // Create the factory for DOM parsers
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Create the DOM parser
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Parse the XML document into a complete Document object
        return builder.parse(filepath);
    }
    

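    Forms 1-3 resolve a relative path against the JVM's working directory (the user.dir property). ClassLoader.getResourceAsStream() in form 4 looks the path up from the root of the classpath, so it must not start with /.
    Note that parse() cannot be given the XML text itself: a String argument is treated as a URI, so raw XML fails with java.net.MalformedURLException: no protocol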

    Element root = doc.getDocumentElement();
    NodeList list = doc.getElementsByTagName("nodename");
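The note about XML strings can be illustrated with a minimal sketch. It assumes the XML is held in memory as a String; the class name DomStringDemo, the helper parseXmlString and the sample document are hypothetical, not part of the original post.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class DomStringDemo {
    // Hypothetical helper: parses XML held in a String rather than at a file path.
    static Document parseXmlString(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Wrap the text in a stream; a bare String argument would be treated as a URI.
        return builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseXmlString("<books><book>DOM</book><book>SAX</book></books>");
        Element root = doc.getDocumentElement();
        NodeList list = doc.getElementsByTagName("book");
        System.out.println(root.getTagName() + " contains " + list.getLength() + " book elements");
    }
}
```

The sample XML carries no encoding declaration, so encoding it as UTF-8 is safe here; a document that declares a different encoding would need matching bytes.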
    

    For reference, a few DocumentBuilderFactory settings:

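    factory.setValidating(true); // validate the document (against its DTD) while parsing
    factory.setIgnoringComments(true); // drop comment nodes
    factory.setIgnoringElementContentWhitespace(false); // true would strip whitespace in element content ("ignorable whitespace"); false keeps it
    factory.setCoalescing(false); // true would convert CDATA nodes to text nodes and merge them into adjacent text; false keeps them separate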
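To see what two of these switches do, the following sketch counts the children of a root element under different settings. The class DomFactoryDemo, the helper childCount and the sample XML are assumptions made for illustration, and the behaviour described is that of the parser bundled with the JDK.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class DomFactoryDemo {
    // Hypothetical helper: number of child nodes under the root for the given settings.
    static int childCount(String xml, boolean ignoreComments, boolean coalesce) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setIgnoringComments(ignoreComments);
        factory.setCoalescing(coalesce);
        Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getChildNodes().getLength();
    }

    public static void main(String[] args) throws Exception {
        // The root holds a comment, a text node and a CDATA section.
        String xml = "<a><!--note-->x<![CDATA[y]]></a>";
        System.out.println("defaults: " + childCount(xml, false, false));
        System.out.println("ignoring comments + coalescing: " + childCount(xml, true, true));
    }
}
```

With the defaults the comment, the text and the CDATA section stay as separate nodes; with both switches on, the comment is dropped and the CDATA content is merged into the neighbouring text node.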
    

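    SAX
    "Push"-style streaming model, event-driven. It does not load the whole XML document, so memory use is low, but finding a node means scanning from the start, and modification is not supported.
    Extend the DefaultHandler class (it is a class, not an interface) and override the callbacks for the nodes you want to handle.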

    public class MySaxHandler extends DefaultHandler {
        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            // Called with the text content of a node; add custom handling here
            super.characters(ch, start, length);
        }
    }

    public static void saxParse(String filepath) throws ParserConfigurationException, SAXException, IOException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser sax = factory.newSAXParser();
        // parse(String, DefaultHandler) treats filepath as a URI and can throw IOException
        sax.parse(filepath, new MySaxHandler());
    }
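As a slightly fuller illustration of the event-driven model, here is a sketch of a handler that collects the text of every element with a given name. TextCollector, collect and the sample XML are hypothetical names introduced for this example; nested elements of the same name are not handled.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

final class SaxTextDemo {
    // Hypothetical handler: gathers the text of each element named `target`.
    static final class TextCollector extends DefaultHandler {
        private final String target;
        private final List<String> values = new ArrayList<>();
        private StringBuilder current; // non-null only while inside a target element

        TextCollector(String target) { this.target = target; }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if (qName.equals(target)) current = new StringBuilder();
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // The parser may deliver one text run in several calls, so append.
            if (current != null) current.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equals(target) && current != null) {
                values.add(current.toString());
                current = null;
            }
        }

        List<String> values() { return values; }
    }

    static List<String> collect(String xml, String elementName) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        TextCollector handler = new TextCollector(elementName);
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.values();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(collect("<r><n>a</n><m>b</m><n>c</n></r>", "n"));
    }
}
```

State has to be kept in the handler between callbacks because SAX only pushes events and never hands back a tree.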
    

    For further details, search online.

    JDOM
    Java-based Document Object Model; tree model

    /* Other forms the build() argument can take, e.g. for "cof/11.xml":
     * 1. ByteArrayInputStream bais = new ByteArrayInputStream(xml_str.getBytes())
     * 2. File file = new File(filepath)
     * 3. InputSource is = new InputSource(new StringReader(xml_str)) */
    public static Element getRootElement(String filepath) throws JDOMException, IOException {
        SAXBuilder builder = new SAXBuilder();
        Document doc = builder.build(filepath);
        return doc.getRootElement();
    }
    

    Note that build() cannot be given the XML text itself either; it fails with java.net.MalformedURLException: no protocol

    DOM4J

    Document Object Model for Java. Built on the Java Collections Framework, with full support for DOM and SAX. Recommended.

    /* Other forms the read() argument can take, e.g. for "cof/11.xml":
     * see the sections above for the alternatives */
    public static Document getDocument(String filepath) throws DocumentException {
        SAXReader reader = new SAXReader();
        Document doc = reader.read(filepath);
        return doc;
    }
    
    Element root = doc.getRootElement();
    Element el = root.element("xxx");
    List<Element> list = root.elements("xx");
    

    Required dependency: <groupId>org.dom4j</groupId><artifactId>dom4j</artifactId><version>2.1.1</version>
    XPath can be used to locate elements quickly. dom4j does not support it out of the box, so also add jaxen-1.1.6.

    List<Node> nodes = doc.selectNodes("/xx/aa");
    Node node = doc.selectSingleNode("/xx/aa");
    List<Node> children = doc.selectNodes("/xx/aa/*"); // all child elements of node aa
    

    Some commonly used helpers: Parsing XML with dom4j (基于dom4j解析xml)

    public static void createEmptyXmlFile(String xmlPath) {
        Document document = DocumentHelper.createDocument();
        doc2XML(document, xmlPath); // write the document out as an XML file
    }
    public static void doc2XML(Document doc, String fileName) {
        OutputFormat format = OutputFormat.createPrettyPrint();
        format.setEncoding("UTF-8");
        XMLWriter writer = null;
        try {
            // Write through an OutputStream so the bytes use the encoding set on the format;
            // a FileWriter would use the platform default charset instead
            writer = new XMLWriter(new FileOutputStream(fileName), format);
            writer.write(doc);
            writer.close();
        } catch (IOException e) { ... }
    }
     
    

    In addition, javax.xml.xpath in the JDK also provides XPath support, which is worth knowing about.

    XPathFactory factory = XPathFactory.newInstance();
    XPath xpath = factory.newXPath();
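A short sketch of how the XPath object might then be used against a DOM Document. The class XPathDemo, its helper methods and the sample XML are hypothetical and only illustrate the javax.xml.xpath calls.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XPathDemo {
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Evaluates an expression and returns its string value.
    static String text(Document doc, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, doc);
    }

    // Evaluates an expression as a node set and returns how many nodes matched.
    static int count(Document doc, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<books><book><title>A</title></book><book><title>B</title></book></books>");
        System.out.println(text(doc, "/books/book[2]/title"));
        System.out.println(count(doc, "/books/book"));
    }
}
```

The two-argument evaluate() returns the string value of the first match, while passing XPathConstants.NODESET returns every matching node.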
    

    StAX

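    Streaming API for XML: XMLStreamReader + XMLStreamWriter, a "pull"-style streaming model, added in JDK 1.6.
    It offers two APIs for processing XML documents:

    • Cursor-based API (XMLStreamReader): more efficient, lower level of abstraction
    • Event-iterator API (XMLEventReader): less efficient, higher level of abstraction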
    InputStream in = XMLParseUtils.getXmlInputStream("filepath");      
    XMLInputFactory factory = XMLInputFactory.newFactory();
    XMLStreamReader parser = factory.createXMLStreamReader(in);
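The two StAX APIs can be compared side by side with a sketch like the one below, which counts elements of a given name with each. The class StaxDemo, its methods and the sample XML are assumptions for illustration; input comes from a StringReader rather than the author's XMLParseUtils helper.

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.events.XMLEvent;

final class StaxDemo {
    // Cursor API: the reader itself is the cursor; next() returns an int event code.
    static int countWithCursor(String xml, String name) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newFactory().createXMLStreamReader(new StringReader(xml));
        int count = 0;
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals(name)) {
                    count++;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }

    // Event-iterator API: each step yields an XMLEvent object describing the event.
    static int countWithEvents(String xml, String name) throws Exception {
        XMLEventReader reader = XMLInputFactory.newFactory().createXMLEventReader(new StringReader(xml));
        int count = 0;
        try {
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()
                        && event.asStartElement().getName().getLocalPart().equals(name)) {
                    count++;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<r><i/><x/><i/></r>";
        System.out.println(countWithCursor(xml, "i") + " / " + countWithEvents(xml, "i"));
    }
}
```

In both cases the application pulls the next event when it is ready, which is the difference from SAX, where the parser pushes events into callbacks.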
    

    For parsing XML with Woodstox's StAX2, see the linked article.

    Jackson

    Prerequisites: jackson-core, jackson-dataformat-xml, jackson-databind
    References: Jackson XML serialization (jackson序列化xml), Jackson quick start (jackson快速学习)
    An XML utility class:

    public static String jacksonBean2XML(Object obj) throws JsonProcessingException {
        JacksonXmlModule module = new JacksonXmlModule();
        module.setDefaultUseWrapper(false);
        XmlMapper xmlMapper = new XmlMapper(module);
        return xmlMapper.writeValueAsString(obj);
    }
    public static <T> T jacksonXML2Bean(String xml, Class<T> cls) throws IOException {
        JacksonXmlModule module = new JacksonXmlModule();
        module.setDefaultUseWrapper(false);
        XmlMapper xmlMapper = new XmlMapper(module);
        // Ignore XML fields that have no matching property on the POJO
        xmlMapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);
        return xmlMapper.readValue(xml, cls);
    }
    

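    Note: class T must have a default (no-argument) constructor.
    To use property annotations, also add the jackson-annotations jar; see the linked article for details.
    For formatted output, add woodstox-core: a high-performance XML processor that implements the StAX, SAX2 and StAX2 APIs.
    For a related example, see writerWithDefaultPrettyPrinter().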

  • Original article: https://www.cnblogs.com/wjcx-sqh/p/12364744.html