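Problem description
- Data is extracted from Hive over JDBC, so the Maven project includes the Hive dependencies;
- The data is loaded into Elasticsearch 2.3.1, which needs Guava 18 or later;
- The Guava versions pulled in by Hive and by Elasticsearch conflict;
- Symptom: java.lang.NoSuchMethodError: com.google.common.util.concurrent.MoreExecutors.directExecutor()Ljava/util/concurrent/Executor;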
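Before changing the build, it can help to confirm which jar actually supplies the conflicting class at runtime. The following is a minimal diagnostic sketch, not part of the original setup: the class name `GuavaConflictCheck` and both helper methods are hypothetical, and it relies only on standard JDK reflection.

```java
import java.security.CodeSource;

/** Hypothetical diagnostic helper for tracking down a Guava version conflict. */
final class GuavaConflictCheck {

    /** Returns the jar or directory a class was loaded from, or a marker string. */
    static String locationOf(String className) {
        try {
            // initialize=false: we only want to locate the class, not run its static init.
            Class<?> cls = Class.forName(className, false, GuavaConflictCheck.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            // Classes loaded by the JDK bootstrap loader have no code source.
            return source == null ? "<bootstrap>" : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "<not found>";
        }
    }

    /** True if the class is loadable and exposes a public no-argument method with this name. */
    static boolean hasNoArgMethod(String className, String methodName) {
        try {
            Class<?> cls = Class.forName(className, false, GuavaConflictCheck.class.getClassLoader());
            cls.getMethod(methodName);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String cls = "com.google.common.util.concurrent.MoreExecutors";
        System.out.println("loaded from: " + locationOf(cls));
        System.out.println("has directExecutor(): " + hasNoArgMethod(cls, "directExecutor"));
    }
}
```

If the reported location is the Guava jar brought in through the Hive dependencies and `directExecutor()` is missing, the older Guava is the one winning on the classpath, which matches the NoSuchMethodError above.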
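Solution
- Relocate (rename) the conflicting libraries bundled with Elasticsearch and repackage it;
- Depend on the repackaged ES jar in the new project.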
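Method 1: Shade and relocate
Overview
- To keep Elasticsearch's libraries from clashing with other dependencies, relocate the conflicting libraries that ES depends on and map them to new package names, so that neither copy overrides the other.
- Upgrading libraries in a production Hadoop environment is inconvenient, so remapping the packages with the Maven shade plugin is the more dependable option.
Shade Elasticsearch
This step shades the Elasticsearch dependency: create a new Maven project, add the required Elasticsearch dependencies, relocate the conflicting packages, and build a new jar.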
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>my.elasticsearch</groupId>
  <artifactId>es-shaded</artifactId>
  <version>1.0-SNAPSHOT</version>
  <properties>
    <elasticsearch.version>2.3.1</elasticsearch.version>
  </properties>
  <dependencies>
    <dependency>
      <groupId>org.elasticsearch</groupId>
      <artifactId>elasticsearch</artifactId>
      <version>${elasticsearch.version}</version>
    </dependency>
    <dependency>
      <groupId>org.elasticsearch.plugin</groupId>
      <artifactId>shield</artifactId>
      <version>${elasticsearch.version}</version>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>2.4.1</version>
        <configuration>
          <createDependencyReducedPom>false</createDependencyReducedPom>
        </configuration>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
            <configuration>
              <relocations>
                <relocation>
                  <pattern>com.google.guava</pattern>
                  <shadedPattern>my.elasticsearch.guava</shadedPattern>
                </relocation>
                <relocation>
                  <pattern>org.joda</pattern>
                  <shadedPattern>my.elasticsearch.joda</shadedPattern>
                </relocation>
                <relocation>
                  <pattern>com.google.common</pattern>
                  <shadedPattern>my.elasticsearch.common</shadedPattern>
                </relocation>
                <relocation>
                  <pattern>com.google.thirdparty</pattern>
                  <shadedPattern>my.elasticsearch.thirdparty</shadedPattern>
                </relocation>
              </relocations>
              <transformers>
                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer" />
              </transformers>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
  <repositories>
    <repository>
      <id>elasticsearch-releases</id>
      <url>http://maven.elasticsearch.org/releases</url>
      <releases>
        <enabled>true</enabled>
        <updatePolicy>daily</updatePolicy>
      </releases>
      <snapshots>
        <enabled>false</enabled>
      </snapshots>
    </repository>
  </repositories>
</project>
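To make the effect of the relocations concrete, here is a simplified sketch of how a class name is mapped by prefix. It only illustrates the idea and is not the shade plugin's real implementation (the plugin also rewrites bytecode references and resource paths); the class `RelocationSketch` and its `relocate` method are hypothetical, while the patterns are copied from the POM.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Simplified illustration of prefix-based package relocation. */
final class RelocationSketch {

    // Pattern -> shadedPattern pairs, in the same order as in the POM.
    static final Map<String, String> RELOCATIONS = new LinkedHashMap<>();
    static {
        RELOCATIONS.put("com.google.guava", "my.elasticsearch.guava");
        RELOCATIONS.put("org.joda", "my.elasticsearch.joda");
        RELOCATIONS.put("com.google.common", "my.elasticsearch.common");
        RELOCATIONS.put("com.google.thirdparty", "my.elasticsearch.thirdparty");
    }

    /** Returns the relocated class name, or the input unchanged if no pattern matches. */
    static String relocate(String className) {
        for (Map.Entry<String, String> entry : RELOCATIONS.entrySet()) {
            String pattern = entry.getKey();
            // Match whole package segments only, so "com.google.commonx" is not touched.
            if (className.equals(pattern) || className.startsWith(pattern + ".")) {
                return entry.getValue() + className.substring(pattern.length());
            }
        }
        return className;
    }

    public static void main(String[] args) {
        System.out.println(relocate("com.google.common.util.concurrent.MoreExecutors"));
        System.out.println(relocate("org.joda.time.DateTime"));
        System.out.println(relocate("java.util.List"));
    }
}
```

Under this mapping, the Guava classes that ES uses live under `my.elasticsearch.common`, so they no longer collide with the `com.google.common` classes that come with the Hive dependencies.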
Import the shaded ES jar
In the new project, add the ES jar built in the previous step as a dependency.
<dependency>
  <groupId>com.google.guava</groupId>
  <artifactId>guava</artifactId>
  <version>${guava.version}</version>
</dependency>
<dependency>
  <groupId>my.elasticsearch</groupId>
  <artifactId>es-shaded</artifactId>
  <version>1.0-SNAPSHOT</version>
</dependency>
Reference: https://www.elastic.co/blog/to-shade-or-not-to-shade
Method 2: Change the cluster job's library loading policy (untested)
<property>
<name>mapreduce.job.user.classpath.first</name>
<value>true</value>
</property>