• Parsing XML with SAX: validating the format and supporting custom tags


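    I. Introduction to SAX

      SAX is an event-driven way of parsing XML. It reads the XML file sequentially, generates events, and passes them to user-defined callback methods that process the document.

      Advantages:

        The XML is processed piece by piece rather than loaded into memory all at once, so memory usage is low and parsing is fast.

      Disadvantages:

        Access is sequential and cannot go backwards. The code is more complex, because the user has to maintain the data structures.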
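To make the event-driven model concrete, here is a minimal sketch that records each callback in the order it fires. The class name SaxEventOrderDemo, the helper collectEvents and the sample XML are made up for illustration; only the JDK's built-in JAXP classes are used.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records SAX callbacks in the order they fire.
public class SaxEventOrderDemo {

    /** Parses the XML text and returns the callbacks in the order they were delivered. */
    static List<String> collectEvents(String xml) throws Exception {
        List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                events.add("start:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // The parser may deliver text in several chunks; blank chunks are skipped here.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("text:" + text);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + qName);
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(collectEvents("<order><item>book</item><item>pen</item></order>"));
    }
}
```

The list is filled strictly in document order, and nothing of the document is kept unless the handler stores it itself. That is exactly the trade-off described above: low memory use, but no way to go back.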

    II. Usage flow

      1. Create the factory

    SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
    

       2. Set factory properties (optional)

    saxParserFactory.setValidating(true); // whether to validate the XML, default false
    saxParserFactory.setNamespaceAware(true); // whether to enable namespace awareness, default false
    

       3. Create the parser

    SAXParser parser = saxParserFactory.newSAXParser();
    

      4. Set parser properties (optional)

    parser.setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage", "http://www.w3.org/2001/XMLSchema"); // required when validating against an XSD
    

       5. Get the XMLReader (optional)

    XMLReader reader = parser.getXMLReader();
    

       6. Configure the XMLReader (optional)

    reader.setContentHandler(new MyDefalutHandler()); // content handler
    reader.setEntityResolver(new MyEntityResolver()); // entity resolver (resolves schemas)
    reader.setErrorHandler(new MyErrorHandler()); // error handler
    

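       7. Parse the XML file

    reader.parse(SaxDemo.class.getResource("/").getPath() + "/saxDemo.xml");
      or parser.parse(SaxDemo.class.getResource("/").getPath() + "/saxDemo.xml", new DefaultHandler());

       Both SAXParser and XMLReader can parse XML, but SAXParser puts error handling, content handling, and schema resolution into a single handler to be overridden. In my view that blurs the division of responsibilities, so I recommend using XMLReader.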
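The difference between the two entry points can be sketched as follows. The class SaxEntryPointsDemo and its CountingHandler are hypothetical names used only for this illustration. With SAXParser a single DefaultHandler plays every role, while XMLReader lets each role be registered separately.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class comparing SAXParser.parse with XMLReader.parse.
public class SaxEntryPointsDemo {

    /** Content handler that only counts start tags. */
    static class CountingHandler extends DefaultHandler {
        int elements;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            elements++;
        }
    }

    /** SAXParser style: one DefaultHandler object acts as content handler, error handler and entity resolver. */
    static int countWithSaxParser(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        CountingHandler handler = new CountingHandler();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.elements;
    }

    /** XMLReader style: each responsibility is registered on its own. */
    static int countWithXmlReader(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        CountingHandler handler = new CountingHandler();
        reader.setContentHandler(handler);
        reader.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) throws SAXException {
                throw e; // recoverable errors are rethrown in this sketch
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.elements;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<a><b/><b/></a>";
        System.out.println(countWithSaxParser(xml) + " " + countWithXmlReader(xml));
    }
}
```

Both methods see the same events. The only difference is where the handlers are plugged in.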

    III. A typical usage example

    saxDemo.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <beans xmlns="http://www.springframework.org/schema/beans"
    	   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    	   xmlns:util="http://www.springframework.org/schema/util"
    	   xmlns:aop="http://www.springframework.org/schema/cache"
    	   xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
    	   http://www.springframework.org/schema/util http://www.springframework.org/schema/util/spring-util.xsd
    	     http://www.springframework.org/schema/cache http://www.springframework.org/schema/cache/spring-cache.xsd"
    
    	   default-autowire="byName">
    
    	<aop:annotation-driven></aop:annotation-driven>
    
    	<util:map key-type="java.lang.String" value-type="java.lang.String">
    		<entry key="key" value="value"></entry>
    	</util:map>
    
    	<bean id="commonMap" class="java.util.HashMap">
    		<constructor-arg>
    			<map>
    				<entry key="key" value="value"></entry>
    			</map>
    		</constructor-arg>
    	</bean>
    </beans>
    

     MyDefalutHandler

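    import org.apache.commons.lang3.StringUtils; // assumed to be Apache Commons Lang; the original does not show its imports
    import org.xml.sax.Attributes;
    import org.xml.sax.SAXException;
    import org.xml.sax.helpers.DefaultHandler;

    public class MyDefalutHandler extends DefaultHandler{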
        @Override
        public void startDocument() throws SAXException {
            System.out.println("startDocument()");
        }
    
        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
            StringBuilder builder = new StringBuilder();
            int length = attributes.getLength();
            if (length > 0) {
                for (int i = 0; i < length; i++) {
                    builder.append(attributes.getLocalName(i).trim())
                            .append(":")
                            .append(attributes.getValue(i).trim())
                            .append(" ");
                }
            }
    
            System.out.println(String.format("startElement uri=%s localName=%s qname=%s attributes=%s", uri, localName, qName, builder));
        }
    
        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            String content = new String(ch, start, length);
            if (StringUtils.isNotBlank(content)) {
                System.out.println("characters=" + content);
            }
        }
    
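        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            // Note: the label below reads "startElement", so end-of-element events appear in the output as "startElement" lines without an attributes field.
            System.out.println(String.format("startElement uri=%s localName=%s qname=%s", uri, localName, qName));
        }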
    
        @Override
        public void endDocument() throws SAXException {
            System.out.println("endDocument()");
        }
    }
    

     Test class

    public class SaxDemo {
        public static void main(String[] args) throws Exception {
            SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
    //        saxParserFactory.setValidating(true); // whether to validate the XML, default false
    //        saxParserFactory.setNamespaceAware(true); // whether to enable namespace awareness, default false
            SAXParser parser = saxParserFactory.newSAXParser();
    //        parser.setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage", "http://www.w3.org/2001/XMLSchema");
            XMLReader reader = parser.getXMLReader();
            reader.setContentHandler(new MyDefalutHandler()); // content handler
    //        reader.setEntityResolver(new MyEntityResolver()); // entity resolver (resolves schemas)
    //        reader.setErrorHandler(new MyErrorHandler()); // error handler
            reader.parse(SaxDemo.class.getResource("/").getPath() + "/saxDemo.xml");
        }
    }

     Test output

    startDocument()
    startElement uri= localName= qname=beans attributes=xmlns:http://www.springframework.org/schema/beans xmlns:xsi:http://www.w3.org/2001/XMLSchema-instance xmlns:util:http://www.springframework.org/schema/util xmlns:aop:http://www.springframework.org/schema/cache xsi:schemaLocation:http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd     http://www.springframework.org/schema/util http://www.springframework.org/schema/util/spring-util.xsd       http://www.springframework.org/schema/cache http://www.springframework.org/schema/cache/spring-cache.xsd default-autowire:byName 
    startElement uri= localName= qname=aop:annotation-driven attributes=
    startElement uri= localName= qname=aop:annotation-driven
    startElement uri= localName= qname=util:map attributes=key-type:java.lang.String value-type:java.lang.String 
    startElement uri= localName= qname=entry attributes=key:key value:value 
    startElement uri= localName= qname=entry
    startElement uri= localName= qname=util:map
    startElement uri= localName= qname=bean attributes=id:commonMap class:java.util.HashMap 
    startElement uri= localName= qname=constructor-arg attributes=
    startElement uri= localName= qname=map attributes=
    startElement uri= localName= qname=entry attributes=key:key value:value 
    startElement uri= localName= qname=entry
    startElement uri= localName= qname=map
    startElement uri= localName= qname=constructor-arg
    startElement uri= localName= qname=bean
    startElement uri= localName= qname=beans
    endDocument()
    

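     In the output, uri and localName are both empty, and prefixed elements such as aop:annotation-driven are reported only as raw qualified names, which is not convenient to work with.

    Uncomment saxParserFactory.setNamespaceAware(true) in the test class; the output becomes: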

    startDocument()
    startElement uri=http://www.springframework.org/schema/beans localName=beans qname=beans attributes=schemaLocation:http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd     http://www.springframework.org/schema/util http://www.springframework.org/schema/util/spring-util.xsd       http://www.springframework.org/schema/cache http://www.springframework.org/schema/cache/spring-cache.xsd default-autowire:byName 
    startElement uri=http://www.springframework.org/schema/cache localName=annotation-driven qname=aop:annotation-driven attributes=
    startElement uri=http://www.springframework.org/schema/cache localName=annotation-driven qname=aop:annotation-driven
    startElement uri=http://www.springframework.org/schema/util localName=map qname=util:map attributes=key-type:java.lang.String value-type:java.lang.String 
    startElement uri=http://www.springframework.org/schema/beans localName=entry qname=entry attributes=key:key value:value 
    startElement uri=http://www.springframework.org/schema/beans localName=entry qname=entry
    startElement uri=http://www.springframework.org/schema/util localName=map qname=util:map
    startElement uri=http://www.springframework.org/schema/beans localName=bean qname=bean attributes=id:commonMap class:java.util.HashMap 
    startElement uri=http://www.springframework.org/schema/beans localName=constructor-arg qname=constructor-arg attributes=
    startElement uri=http://www.springframework.org/schema/beans localName=map qname=map attributes=
    startElement uri=http://www.springframework.org/schema/beans localName=entry qname=entry attributes=key:key value:value 
    startElement uri=http://www.springframework.org/schema/beans localName=entry qname=entry
    startElement uri=http://www.springframework.org/schema/beans localName=map qname=map
    startElement uri=http://www.springframework.org/schema/beans localName=constructor-arg qname=constructor-arg
    startElement uri=http://www.springframework.org/schema/beans localName=bean qname=bean
    startElement uri=http://www.springframework.org/schema/beans localName=beans qname=beans
    endDocument()
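The effect of setNamespaceAware can also be checked in isolation. The sketch below uses a hypothetical NamespaceAwareDemo class and parses a one-element document borrowed from saxDemo.xml, returning what startElement receives for the root element.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: shows what startElement receives with and without namespace awareness.
public class NamespaceAwareDemo {

    /** Returns the uri/localName/qName reported for the root element. */
    static String describeRoot(String xml, boolean namespaceAware) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(namespaceAware);
        String[] result = new String[1];
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                if (result[0] == null) { // keep only the first (root) element
                    result[0] = "uri=" + uri + " localName=" + localName + " qName=" + qName;
                }
            }
        });
        return result[0];
    }

    public static void main(String[] args) throws Exception {
        String xml = "<util:map xmlns:util='http://www.springframework.org/schema/util'/>";
        System.out.println(describeRoot(xml, false));
        System.out.println(describeRoot(xml, true));
    }
}
```

With namespace awareness on, the util prefix is resolved to its namespace URI and the local name map is reported separately. With it off, only the qualified name util:map is usable.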
    

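     IV. Validating XML

    If we use an IDE, its validation plugins help us avoid some of the format errors caused by typos when writing XML. But when XML is exchanged in RPC calls, we should pessimistically assume that the XML we receive is not necessarily correct, so verifying that it matches the agreed format becomes very important.

    Uncomment saxParserFactory.setValidating(true) in the test class to enable XML validation.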

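    Warning: validation was turned on but an org.xml.sax.ErrorHandler was not set, which is probably not what is desired. Parser will use a default ErrorHandler to print the first 0 errors. Please call the 'setErrorHandler' method to fix this.
    Error: URI=file:///D:/idea/springboot2/target/classes//saxDemo.xml Line=2: Document is invalid: no grammar found.
    Error: URI=file:///D:/idea/springboot2/target/classes//saxDemo.xml Line=2: Document root element "beans", must match DOCTYPE root "null".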
    

     Following the hint, create an error handler

    public class MyErrorHandler implements ErrorHandler {
    
        @Override
        public void warning(SAXParseException exception) throws SAXException {
            System.out.println("------------warning------------");
            throw exception;
        }
    
        @Override
        public void error(SAXParseException exception) throws SAXException {
            System.out.println("------------error------------");
            throw exception;
        }
    
        @Override
        public void fatalError(SAXParseException exception) throws SAXException {
            System.out.println("------------fatalError------------");
            throw exception;
        }
    }
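MyErrorHandler stops at the first problem. When it is more useful to report every problem in one pass, a handler can collect them instead. The following is only a sketch of that alternative; CollectingErrorHandler and its check helper are hypothetical names and are not part of the original example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

// Hypothetical alternative to MyErrorHandler: records problems instead of throwing on the first one.
public class CollectingErrorHandler implements ErrorHandler {

    private final List<String> problems = new ArrayList<>();

    @Override
    public void warning(SAXParseException e) {
        record("warning", e);
    }

    @Override
    public void error(SAXParseException e) {
        record("error", e); // recoverable: keep parsing so later errors are reported too
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        record("fatalError", e);
        throw e; // the document is not well-formed, so parsing cannot reliably continue
    }

    private void record(String level, SAXParseException e) {
        problems.add(level + " at line " + e.getLineNumber() + ": " + e.getMessage());
    }

    public List<String> getProblems() {
        return problems;
    }

    /** Parses the XML text and returns every problem that was reported. */
    static List<String> check(String xml) throws Exception {
        CollectingErrorHandler handler = new CollectingErrorHandler();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setErrorHandler(handler);
        try {
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXParseException e) {
            // already recorded by fatalError(); nothing more to do in this sketch
        }
        return handler.getProblems();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(check("<a><b></a>")); // mismatched tags trigger a fatal error
    }
}
```

Whether to throw or to collect is a design choice. Throwing fails fast, which suits untrusted input. Collecting gives the sender a complete report.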
    

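     Uncomment reader.setErrorHandler(new MyErrorHandler()) in the test class

    Exception in thread "main" org.xml.sax.SAXParseException; systemId: file:///D:/idea/springboot2/target/classes//saxDemo.xml; lineNumber: 2; columnNumber: 7; Document is invalid: no grammar found.

     Still an exception!

    The cause is that SAX does not know which XML specification to validate against: with validation enabled and no schema language set, the parser validates against a DTD, and saxDemo.xml declares none (hence "no grammar found").

     Uncomment parser.setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage", "http://www.w3.org/2001/XMLSchema") in the test class, and the program runs normally.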
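To see XSD validation accept and reject a document without any network access, the sketch below feeds an inline schema to the parser through the JAXP schemaSource property. That property is not used in the article. The class XsdValidationDemo, the tiny schema and the sample documents are all assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: XSD validation with an inline schema.
public class XsdValidationDemo {

    static final String SCHEMA_LANGUAGE = "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
    static final String SCHEMA_SOURCE = "http://java.sun.com/xml/jaxp/properties/schemaSource";
    static final String W3C_XML_SCHEMA = "http://www.w3.org/2001/XMLSchema";

    // Illustrative schema: a <user> element with a required integer id attribute.
    static final String XSD =
            "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
          + "<xsd:element name='user'><xsd:complexType>"
          + "<xsd:attribute name='id' type='xsd:int' use='required'/>"
          + "</xsd:complexType></xsd:element></xsd:schema>";

    /** Returns true if the XML text validates against the inline schema. */
    static boolean isValid(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // XSD validation needs namespace awareness
        factory.setValidating(true);
        SAXParser parser = factory.newSAXParser();
        parser.setProperty(SCHEMA_LANGUAGE, W3C_XML_SCHEMA); // must be set before schemaSource
        parser.setProperty(SCHEMA_SOURCE, new InputSource(new StringReader(XSD)));
        try {
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e; // validation errors are recoverable by default, so rethrow them
                }
            });
            return true;
        } catch (SAXParseException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<user id='1'/>"));
        System.out.println(isValid("<user id='abc'/>"));
    }
}
```

The first document satisfies the schema. The second fails because abc is not an xsd:int, and the violation reaches the handler's error callback.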

     V. Custom tags

    In a Maven project, create a META-INF folder under the resources directory and add two files to it: saxDemo.schemas and user.xsd

    Contents of saxDemo.schemas:

    # The colon is escaped because ':' is also a key/value separator in .properties files
    http\://www.ym.com/schema/user.xsd=META-INF/user.xsd
    

     Contents of user.xsd:

    <?xml version="1.0" encoding="UTF-8"?>
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                xmlns="http://www.ym.com/schema/user" targetNamespace="http://www.ym.com/schema/user"
                elementFormDefault="qualified">
        <xsd:element name="user">
            <xsd:complexType>
                <xsd:attribute name="id" type="xsd:string" />
                <xsd:attribute name="userName" type="xsd:string" />
                <xsd:attribute name="email" type="xsd:string" />
            </xsd:complexType>
        </xsd:element>
    </xsd:schema>
    

     Change saxDemo.xml to:

    <?xml version="1.0" encoding="UTF-8"?>
    <beans xmlns="http://www.springframework.org/schema/beans"
    	   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    	   xmlns:user="http://www.ym.com/schema/user"
    	   xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
    	   http://www.ym.com/schema/user http://www.ym.com/schema/user.xsd"
    	   default-autowire="byName">
    	<user:user id="userTag" userName="userName" email="email"></user:user>
    </beans>
    

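     Run the test code

    ------------fatalError------------
    Exception in thread "main" org.xml.sax.SAXParseException; systemId: http://www.ym.com/schema/user.xsd; lineNumber: 1; columnNumber: 50; White spaces are required between publicId and systemId.

     An exception is thrown.

    Why do Spring's tags work while our custom one does not? Spring's schemas can be fetched from the network, but custom schemas usually exist only locally (the systemId in the exception shows the parser tried to download http://www.ym.com/schema/user.xsd). In many cases we also do not want to fetch these resources over the network and would rather use the copies in a local jar.

    The job of an EntityResolver is to load the schema resources locally so the XML can be validated.

    Create MyEntityResolver to read the resources from META-INF.

    (The code comes from Spring's org.springframework.beans.factory.xml.PluggableSchemaResolver, with all logging removed.)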

    public class MyEntityResolver implements EntityResolver {
    
        private volatile Map<String, String> schemaMappings;
        private final String schemaMappingsLocation = "META-INF/saxDemo.schemas";
        @Override
        public InputSource resolveEntity(String publicId, String systemId) throws SAXException, IOException {
            if (systemId != null) {
                String resourceLocation = getSchemaMappings().get(systemId);
                if (resourceLocation != null) {
                    Resource resource = new ClassPathResource(resourceLocation, Thread.currentThread().getContextClassLoader());
                    try {
                        InputSource source = new InputSource(resource.getInputStream());
                        source.setPublicId(publicId);
                        source.setSystemId(systemId);
                        return source;
                    }
                    catch (FileNotFoundException ex) {
                        // Resource not found on the classpath: fall through and return null
                    }
                }
            }
            return null;
        }
    
        private Map<String, String> getSchemaMappings() {
            Map<String, String> schemaMappings = this.schemaMappings;
            if (schemaMappings == null) {
                synchronized (this) {
                    schemaMappings = this.schemaMappings;
                    if (schemaMappings == null) {
                        try {
                            Properties mappings = PropertiesLoaderUtils.loadAllProperties(this.schemaMappingsLocation, Thread.currentThread().getContextClassLoader());
                            Map<String, String> mappingsToUse = new ConcurrentHashMap<>(mappings.size());
                            CollectionUtils.mergePropertiesIntoMap(mappings, mappingsToUse);
                            schemaMappings = mappingsToUse;
                            this.schemaMappings = schemaMappings;
                        }
                        catch (IOException ex) {
                            throw new IllegalStateException("Unable to load schema mappings from location [" + this.schemaMappingsLocation + "]", ex);
                        }
                    }
                }
            }
            return schemaMappings;
        }
    }

    Now uncomment reader.setEntityResolver(new MyEntityResolver()) in the test code

    startDocument()
    23:23:53.832 [main] DEBUG com.ym.xml.MyEntityResolver - Loading schema mappings from [META-INF/saxDemo.schemas]
    23:23:53.839 [main] DEBUG com.ym.xml.MyEntityResolver - Loaded schema mappings: {http://www.ym.com/schema/user.xsd=META-INF/user.xsd}
    startElement uri=http://www.springframework.org/schema/beans localName=beans qname=beans attributes=schemaLocation:http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd http://www.ym.com/schema/user http://www.ym.com/schema/user.xsd default-autowire:byName default-lazy-init:default default-merge:default 
    23:23:54.614 [main] DEBUG com.ym.xml.MyEntityResolver - Found XML schema [http://www.ym.com/schema/user.xsd] in classpath: META-INF/user.xsd
    startElement uri=http://www.ym.com/schema/user localName=user qname=user:user attributes=id:userTag userName:userName email:email 
    startElement uri=http://www.ym.com/schema/user localName=user qname=user:user
    startElement uri=http://www.springframework.org/schema/beans localName=beans qname=beans
    endDocument()
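The same idea can be tried without Spring on the classpath. The sketch below is a simplified stand-in for MyEntityResolver, not the Spring implementation. It serves the schema from an in-memory map rather than from META-INF, and the names LocalSchemaResolverDemo, MapEntityResolver and parseWithResolver are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: resolves a schema URL locally instead of downloading it.
public class LocalSchemaResolverDemo {

    static final String USER_XSD_URL = "http://www.ym.com/schema/user.xsd";

    // Same schema as user.xsd above, inlined so the sketch is self-contained.
    static final String USER_XSD =
            "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'"
          + " xmlns='http://www.ym.com/schema/user' targetNamespace='http://www.ym.com/schema/user'"
          + " elementFormDefault='qualified'>"
          + "<xsd:element name='user'><xsd:complexType>"
          + "<xsd:attribute name='id' type='xsd:string'/>"
          + "<xsd:attribute name='userName' type='xsd:string'/>"
          + "<xsd:attribute name='email' type='xsd:string'/>"
          + "</xsd:complexType></xsd:element></xsd:schema>";

    /** Serves known system ids from memory; returning null lets the parser use its default lookup. */
    static class MapEntityResolver implements EntityResolver {
        private final Map<String, String> schemas;
        final List<String> resolved = new ArrayList<>();

        MapEntityResolver(Map<String, String> schemas) {
            this.schemas = schemas;
        }

        @Override
        public InputSource resolveEntity(String publicId, String systemId) {
            String content = systemId == null ? null : schemas.get(systemId);
            if (content == null) {
                return null;
            }
            resolved.add(systemId);
            InputSource source = new InputSource(new StringReader(content));
            source.setPublicId(publicId);
            source.setSystemId(systemId);
            return source;
        }
    }

    /** Validates the XML text and returns the system ids that were resolved locally. */
    static List<String> parseWithResolver(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setValidating(true);
        SAXParser parser = factory.newSAXParser();
        parser.setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage",
                "http://www.w3.org/2001/XMLSchema");
        XMLReader reader = parser.getXMLReader();
        MapEntityResolver resolver = new MapEntityResolver(Collections.singletonMap(USER_XSD_URL, USER_XSD));
        reader.setEntityResolver(resolver);
        reader.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) throws SAXException {
                throw e; // fail on validation errors
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return resolver.resolved;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<user:user xmlns:user='http://www.ym.com/schema/user'"
                + " xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'"
                + " xsi:schemaLocation='http://www.ym.com/schema/user " + USER_XSD_URL + "'"
                + " id='userTag' userName='userName' email='email'/>";
        System.out.println(parseWithResolver(xml));
    }
}
```

The parser asks the resolver for the URL named in xsi:schemaLocation before trying to fetch it. Returning an InputSource short-circuits the download, and returning null falls back to the default behavior.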
    

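     VI. Summary

      This article introduced the basic concepts of SAX, its call flow, and common ways of using it.

      saxParserFactory.setValidating(true) enables validation

      setNamespaceAware(true) makes the events carry namespace URIs and local names

      setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage", "http://www.w3.org/2001/XMLSchema") validates against XSD rules

      setEntityResolver(new MyEntityResolver()) enables custom tags by resolving their schemas locally

      setErrorHandler(new MyErrorHandler()) customizes error handling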

      

      

      

      

  • Original article: https://www.cnblogs.com/yangmengdx3/p/8947922.html