Obtaining the Document

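Preface

The previous article covered the first step of loading beans: determining the validation mode of the XML file. This article covers the second step: loading the XML and obtaining the corresponding Document object.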

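Obtaining the Document

Once the validation mode has been determined, the Document can be loaded. Here again, XmlBeanDefinitionReader does not read the document itself; it delegates the work to a DocumentLoader. Here is that interface: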

public interface DocumentLoader {
    Document loadDocument(
            InputSource inputSource, EntityResolver entityResolver,
            ErrorHandler errorHandler, int validationMode, boolean namespaceAware)
            throws Exception;
}

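DocumentLoader declares a single method, loadDocument(), which takes five parameters:

(1) inputSource: the source from which the Document is loaded (the InputSource that wraps the Resource).

(2) entityResolver: the resolver used to resolve entities referenced by the file, such as its DTD or XSD.

(3) errorHandler: handles errors raised while the Document is being loaded.

(4) validationMode: the validation mode.

(5) namespaceAware: namespace support; true if XML namespaces should be supported.

The method is implemented by DefaultDocumentLoader, the default implementation of DocumentLoader: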

public Document loadDocument(InputSource inputSource, EntityResolver entityResolver,
        ErrorHandler errorHandler, int validationMode, boolean namespaceAware) throws Exception {

    DocumentBuilderFactory factory = createDocumentBuilderFactory(validationMode, namespaceAware);
    if (logger.isTraceEnabled()) {
        logger.trace("Using JAXP provider [" + factory.getClass().getName() + "]");
    }
    DocumentBuilder builder = createDocumentBuilder(factory, entityResolver, errorHandler);
    return builder.parse(inputSource);
}

The method first creates a DocumentBuilderFactory via createDocumentBuilderFactory(), then uses that factory to create a DocumentBuilder, and finally parses inputSource and returns the resulting Document.
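The same three steps (factory, builder, parse) can be tried with plain JAXP, outside Spring. This is only a minimal sketch: the class name LoadDocumentSketch and the helper rootName() are made up for illustration, and the validation-mode handling that createDocumentBuilderFactory() performs is deliberately left out.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of Spring.
class LoadDocumentSketch {

    // Step 1: create the factory; step 2: create the builder; step 3: parse.
    static Document load(InputSource source, EntityResolver entityResolver,
            ErrorHandler errorHandler, boolean namespaceAware) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(namespaceAware);
        DocumentBuilder builder = factory.newDocumentBuilder();
        if (entityResolver != null) {
            builder.setEntityResolver(entityResolver);
        }
        if (errorHandler != null) {
            builder.setErrorHandler(errorHandler);
        }
        return builder.parse(source);
    }

    // Convenience helper: parse an XML string and return the root element name.
    static String rootName(String xml) {
        try {
            Document doc = load(new InputSource(new StringReader(xml)), null, null, true);
            return doc.getDocumentElement().getNodeName();
        }
        catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootName("<beans><bean id=\"demo\"/></beans>"));
    }
}
```

Passing null for the resolver and the error handler leaves the JAXP defaults in place, which is enough for a document that has no DOCTYPE.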

EntityResolver

One of the arguments passed to loadDocument() is entityResolver, which is obtained through getEntityResolver().

getEntityResolver() returns the configured resolver; if none has been set, it builds a default one.

protected EntityResolver getEntityResolver() {
        if (this.entityResolver == null) {
            // Determine default EntityResolver to use.
            ResourceLoader resourceLoader = getResourceLoader();
            if (resourceLoader != null) {
                this.entityResolver = new ResourceEntityResolver(resourceLoader);
            }
            else {
                this.entityResolver = new DelegatingEntityResolver(getBeanClassLoader());
            }
        }
        return this.entityResolver;
    }

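If the ResourceLoader is not null, a ResourceEntityResolver is created from it. Otherwise a DelegatingEntityResolver is created, which delegates to the default BeansDtdResolver and PluggableSchemaResolver.

❤ ResourceEntityResolver: an EntityResolver implementation (it extends DelegatingEntityResolver) that resolves entity references through a ResourceLoader.

❤ DelegatingEntityResolver: an EntityResolver implementation that delegates to BeansDtdResolver for DTDs and to PluggableSchemaResolver for XML Schemas.

❤ BeansDtdResolver: the resolver for the Spring beans DTD. An EntityResolver implementation that loads the DTD from the classpath or a jar file.

❤ PluggableSchemaResolver: uses a set of mapping files to resolve schema URLs to local classpath resources.

getEntityResolver() returns an EntityResolver, but what exactly is an EntityResolver?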

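The official documentation explains it this way: if a SAX application needs to implement customized handling for external entities, it must implement this interface and register an instance with the SAX driver using the setEntityResolver method. In other words, when parsing an XML document, SAX first reads the declaration in the document and uses it to locate the corresponding DTD so that the document can be validated. By default the DTD is located over the network: it is downloaded from the URL given in the declaration and the document is then validated against it. The download is slow, and when the network is down or unavailable the parse fails because the DTD cannot be found.

An EntityResolver lets the project itself supply the logic for locating the DTD. For example, the DTD file can be placed somewhere in the project, and the implementation simply reads it and returns it to SAX, which avoids fetching the declaration over the network.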
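To make the idea concrete, here is a minimal sketch of a resolver that answers a DTD request from local content instead of the network. Everything named here is an assumption for illustration: the class LocalDtdSketch, the tiny DTD text (standing in for a file shipped with the project), and the unreachable URL http://example.invalid/demo.dtd.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of Spring.
class LocalDtdSketch {

    // Stand-in for a DTD file that would normally live inside the project.
    static final String LOCAL_DTD =
            "<!ELEMENT beans (bean*)>\n" +
            "<!ELEMENT bean EMPTY>\n" +
            "<!ATTLIST bean id CDATA #IMPLIED>";

    // The system ID points at a host that cannot be reached on purpose.
    static final String XML =
            "<?xml version=\"1.0\"?>\n" +
            "<!DOCTYPE beans PUBLIC \"-//Example//DTD DEMO//EN\" \"http://example.invalid/demo.dtd\">\n" +
            "<beans><bean id=\"a\"/></beans>";

    // Returns true if the document validates against the locally supplied DTD.
    static boolean validatesOffline() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();

            EntityResolver resolver = (publicId, systemId) -> {
                if (systemId != null && systemId.endsWith("demo.dtd")) {
                    InputSource source = new InputSource(new StringReader(LOCAL_DTD));
                    source.setPublicId(publicId);
                    source.setSystemId(systemId);
                    return source;
                }
                return null; // fall back to the parser's default behavior
            };
            builder.setEntityResolver(resolver);

            // Rethrow validation errors so that an invalid document fails the parse.
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException ex) throws SAXException {
                    throw ex;
                }
            });

            builder.parse(new InputSource(new StringReader(XML)));
            return true;
        }
        catch (Exception ex) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(validatesOffline());
    }
}
```

Because the resolver hands back an InputSource with its own character stream, the parser never has to open the URL in the DOCTYPE declaration.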

First, look at the method declared by the EntityResolver interface:

public abstract InputSource resolveEntity (String publicId,String systemId) throws SAXException, IOException;

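The method takes two parameters, publicId and systemId, and returns an InputSource. The following concrete files show what these parameters contain.

(1) When parsing a configuration file whose validation mode is XSD, such as: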

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                            http://www.springframework.org/schema/beans/spring-beans.xsd">

    ......
</beans>

the following two parameters are read:

publicId: null

systemId: http://www.springframework.org/schema/beans/spring-beans.xsd

(2) When parsing a configuration file whose validation mode is DTD, such as:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE beans PUBLIC "-//Spring//DTD BEAN 2.0//EN" "http://www.springframework.org/dtd/spring-beans-2.0.dtd">
<beans>
    ......
</beans>

the following two parameters are read:

publicId: -//Spring//DTD BEAN 2.0//EN

systemId: http://www.springframework.org/dtd/spring-beans-2.0.dtd
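The DTD case can be checked with a small recording resolver. This is a rough sketch rather than Spring code: the class EntityIdsSketch is hypothetical, the parser is non-validating, and the resolver returns an empty DTD so that nothing is downloaded. It covers only the DOCTYPE case; observing the XSD case requires schema validation to be enabled and is not shown here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of Spring.
class EntityIdsSketch {

    static final String DTD_XML =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
            "<!DOCTYPE beans PUBLIC \"-//Spring//DTD BEAN 2.0//EN\" " +
            "\"http://www.springframework.org/dtd/spring-beans-2.0.dtd\">\n" +
            "<beans/>";

    // Returns {publicId, systemId} as passed to resolveEntity() for the DOCTYPE.
    static String[] idsSeenByResolver(String xml) {
        String[] seen = new String[2];
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setEntityResolver((publicId, systemId) -> {
                seen[0] = publicId;
                seen[1] = systemId;
                // An empty DTD keeps this demo away from the network.
                return new InputSource(new StringReader(""));
            });
            builder.parse(new InputSource(new StringReader(xml)));
        }
        catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
        return seen;
    }

    public static void main(String[] args) {
        String[] ids = idsSeenByResolver(DTD_XML);
        System.out.println("publicId: " + ids[0]);
        System.out.println("systemId: " + ids[1]);
    }
}
```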

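As noted above, validation files are by default downloaded over the network from their URL, which causes delays and a poor user experience. The usual approach is to ship the validation files with the project. How, then, is the URL mapped to the corresponding file inside the project? Take DTD loading as an example of how Spring does it. From getEntityResolver() above, we know that Spring uses DelegatingEntityResolver (directly, or through its subclass ResourceEntityResolver) as its EntityResolver implementation. Its resolveEntity method is implemented as follows: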

    public InputSource resolveEntity(String publicId, @Nullable String systemId) throws SAXException, IOException {
        if (systemId != null) {
            if (systemId.endsWith(DTD_SUFFIX)) {
                return this.dtdResolver.resolveEntity(publicId, systemId);
            }
            else if (systemId.endsWith(XSD_SUFFIX)) {
                return this.schemaResolver.resolveEntity(publicId, systemId);
            }
        }
        return null;
    }

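As we can see, Spring uses a different resolver for each validation mode. In brief: BeansDtdResolver, which handles DTDs, checks the file name at the end of systemId and then looks for the DTD on the classpath next to its own class, while PluggableSchemaResolver, which handles XSDs, by default looks up the XSD file that corresponds to systemId in META-INF/spring.schemas and loads it. Here is the source of BeansDtdResolver: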

public InputSource resolveEntity(String publicId, @Nullable String systemId) throws IOException {
    if (logger.isTraceEnabled()) {
        logger.trace("Trying to resolve XML entity with public ID [" + publicId +
                "] and system ID [" + systemId + "]");
    }
    if (systemId != null && systemId.endsWith(DTD_EXTENSION)) {
        int lastPathSeparator = systemId.lastIndexOf('/');
        int dtdNameStart = systemId.indexOf(DTD_NAME, lastPathSeparator);
        if (dtdNameStart != -1) {
            String dtdFile = DTD_NAME + DTD_EXTENSION;
            if (logger.isTraceEnabled()) {
                logger.trace("Trying to locate [" + dtdFile + "] in Spring jar on classpath");
            }
            try {
                Resource resource = new ClassPathResource(dtdFile, getClass());
                InputSource source = new InputSource(resource.getInputStream());
                source.setPublicId(publicId);
                source.setSystemId(systemId);
                if (logger.isDebugEnabled()) {
                    logger.debug("Found beans DTD [" + systemId + "] in classpath: " + dtdFile);
                }
                return source;
            }
            catch (IOException ex) {
                if (logger.isDebugEnabled()) {
                    logger.debug("Could not resolve beans DTD [" + systemId + "]: not found in classpath", ex);
                }
            }

        }
    }

    // Use the default behavior -> download from website or wherever.
    return null;
}

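As the code shows, BeansDtdResolver.resolveEntity() performs only a simple check on systemId (whether the part after the last "/" contains spring-beans). If the check passes, it loads spring-beans.dtd from the classpath, builds an InputSource from it, sets publicId and systemId, and returns it.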
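The check on systemId is easy to isolate and try against a few inputs. The sketch below copies only the string test from the listing above; the class DtdNameCheckSketch and the method matches() are hypothetical names, and the two constants are assumed to hold the values implied by the log messages and the description (".dtd" and "spring-beans").

```java
// Hypothetical demo class, not part of Spring.
class DtdNameCheckSketch {

    // Assumed values of the constants used by BeansDtdResolver.
    static final String DTD_EXTENSION = ".dtd";
    static final String DTD_NAME = "spring-beans";

    // True if systemId ends with ".dtd" and its last path segment contains "spring-beans".
    static boolean matches(String systemId) {
        if (systemId == null || !systemId.endsWith(DTD_EXTENSION)) {
            return false;
        }
        int lastPathSeparator = systemId.lastIndexOf('/');
        // Searching from the last '/' ignores any "spring-beans" earlier in the path.
        return systemId.indexOf(DTD_NAME, lastPathSeparator) != -1;
    }

    public static void main(String[] args) {
        System.out.println(matches("http://www.springframework.org/dtd/spring-beans-2.0.dtd"));
        System.out.println(matches("http://www.springframework.org/schema/beans/spring-beans.xsd"));
        System.out.println(matches("http://example.com/spring-beans/other.dtd"));
    }
}
```

Note that the version suffix in the requested file name plays no part in the check: any matching systemId is answered with the same spring-beans.dtd resource.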

PluggableSchemaResolver resolves entities as follows:

    public InputSource resolveEntity(String publicId, @Nullable String systemId) throws IOException {
        if (logger.isTraceEnabled()) {
            logger.trace("Trying to resolve XML entity with public id [" + publicId +
                    "] and system id [" + systemId + "]");
        }

        if (systemId != null) {
            String resourceLocation = getSchemaMappings().get(systemId);
            if (resourceLocation != null) {
                Resource resource = new ClassPathResource(resourceLocation, this.classLoader);
                try {
                    InputSource source = new InputSource(resource.getInputStream());
                    source.setPublicId(publicId);
                    source.setSystemId(systemId);
                    if (logger.isDebugEnabled()) {
                        logger.debug("Found XML schema [" + systemId + "] in classpath: " + resourceLocation);
                    }
                    return source;
                }
                catch (FileNotFoundException ex) {
                    if (logger.isDebugEnabled()) {
                        logger.debug("Couldn't find XML schema [" + systemId + "]: " + resource, ex);
                    }
                }
            }
        }
        return null;
    }

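It first calls getSchemaMappings() to obtain a mapping table (from each systemId to its local location), then looks up the local path resourceLocation for the given systemId, and finally builds an InputSource from resourceLocation, sets publicId and systemId, and returns it.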
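The mapping table is in java.util.Properties format, so the lookup can be sketched with the JDK alone. This is an illustration, not Spring's loading code: the class SchemaMappingSketch is hypothetical, the mapping is read from an in-memory string rather than from META-INF/spring.schemas on the classpath, and the single entry is only an example of what such a file may contain.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical demo class, not part of Spring.
class SchemaMappingSketch {

    // Example entry in properties format. The ':' in the key must be escaped,
    // because an unescaped ':' would end the key.
    static final String MAPPINGS =
            "http\\://www.springframework.org/schema/beans/spring-beans.xsd"
            + "=org/springframework/beans/factory/xml/spring-beans.xsd\n";

    // Returns the local classpath location mapped to systemId, or null if unmapped.
    static String resolveLocation(String systemId) {
        Properties mappings = new Properties();
        try {
            mappings.load(new StringReader(MAPPINGS));
        }
        catch (IOException ex) {
            throw new IllegalStateException(ex);
        }
        return mappings.getProperty(systemId);
    }

    public static void main(String[] args) {
        System.out.println(resolveLocation(
                "http://www.springframework.org/schema/beans/spring-beans.xsd"));
    }
}
```

A systemId with no entry yields null, which matches the fall-through at the end of resolveEntity(): the parser is then left to its default behavior.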

Reference: 《Spring源码深度解析》, by 郝佳 (Hao Jia)

Author: Joe

Source: https://www.cnblogs.com/Joe-Go/


Original article: https://www.cnblogs.com/Joe-Go/p/10061095.html