生产环境中hadoop一般会选择64位版本,官方下载的hadoop安装包中的native库是32位的,因此运行64位版本时,需要自己编译64位的native库,并替换掉自带native库。
源码包下的BUILDING.txt,包含官方介绍,如:如何在linux、windows下编译;编译过程中的错误处理;将源码导入eclipse中的步骤等,推荐看一看。
本文环境同:hadoop2.5.2学习及实践笔记(一)—— 伪分布式学习环境搭建
一.编译源码
附:常用工具包网盘路径(部分包非编译时使用):http://pan.baidu.com/s/1c0evtgW
BUILDING.txt文档列举的linux中编译条件如下:
* Unix System
* JDK 1.6+
* Maven 3.0 or later
* Findbugs 1.3.9 (if running findbugs)
* ProtocolBuffer 2.5.0
* CMake 2.6 or newer (if compiling native code)
* Zlib devel (if compiling native code)
* openssl devel ( if compiling native hadoop-pipes )
* Internet connection for first build (to fetch all Maven and Hadoop dependencies)
注:jdk建议使用1.7,楼主使用1.8编译时报错,换回1.7编译成功。
编译过程
1. linux系统包安装
# yum install cmake lzo-devel zlib-devel gcc autoconf automake libtool ncurses-devel openssl-devel
2. 安装Maven、Ant、Findbugs
>解压tar包,并将解压后的目录移动到安装路径
>配置环境变量
# vim /etc/profile export MAVEN_HOME=/opt/apache-maven-3.1.4
export ANT_HOME=/opt/apache-ant-1.9.4 export FINDBUGS_HOME=/opt/findbugs-2.5.0
export PATH=$PATH:$MAVEN_HOME/bin:$ANT_HOME/bin:$FINDBUGS_HOME/bin
>环境变量生效
# source /etc/rofile
>maven国内镜像配置
修改MAVEN_HOME/conf/settings.xml文件<mirrors>内容
<mirror> <id>nexus-osc</id> <mirrorOf>*</mirrorOf> <name>Nexus osc</name> <url>http://maven.oschina.net/content/groups/public/</url> </mirror>
3.从源码安装protobuf
>解压tar包,并进入解压后的根目录
>编译并安装
# ./configure
# make
# sudo make install
>验证安装
# protoc --version
4.编译hadoop源码
>解压源码tar包,并进入解压后的根目录
$ mvn package -DskipTests -Pdist,native -Dtar
>其他编译选项
附:BUILDING.txt中原文
Create binary distribution without native code and without documentation:
$ mvn package -Pdist -DskipTests -Dtar
Create binary distribution with native code and with documentation:
$ mvn package -Pdist,native,docs -DskipTests -Dtar
Create source distribution:
$ mvn package -Psrc -DskipTests
Create source and binary distributions with native code and documentation:
$ mvn package -Pdist,native,docs,src -DskipTests -Dtar
Create a local staging version of the website (in /tmp/hadoop-site)
$ mvn clean site; mvn site:stage -DstagingDirectory=/tmp/hadoop-site
>编译过程中内存溢出错误处理
编译前设置MAVEN_OPTS环境变量
export MAVEN_OPTS="-Xms256m -Xmx512m" --内存大小自己调节
附:BUILDING.txt中原文
If the build process fails with an out of memory error, you should be able to fix it by increasing the memory used by maven
which can be done via the environment variable MAVEN_OPTS.
Here is an example setting to allocate between 256 and 512 MB of heap space to Maven
export MAVEN_OPTS="-Xms256m -Xmx512m"
5.替换native库
>编译后native路径:Hadoop-2.5.2-src/hadoop-dist/target/hadoop-2.5.2/lib/native
未替换前,调用相关脚本会报警告,如:
[hadoop@localhost hadoop-2.5.2]$ ./sbin/stop-dfs.sh 15/03/22 05:45:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable Stopping namenodes on [localhost] localhost: stopping namenode localhost: stopping datanode Stopping secondary namenodes [0.0.0.0] 0.0.0.0: stopping secondarynamenode 15/03/22 05:45:54 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
编译后,替换掉native目录下文件后,不会再提示警告
[hadoop@localhost hadoop-2.5.2]$ ./sbin/start-dfs.sh Starting namenodes on [localhost] localhost: starting namenode, logging to /opt/modules/hadoop-2.5.2/logs/hadoop-hadoop-namenode-localhost.localdomain.out localhost: starting datanode, logging to /opt/modules/hadoop-2.5.2/logs/hadoop-hadoop-datanode-localhost.localdomain.out Starting secondary namenodes [0.0.0.0] 0.0.0.0: starting secondarynamenode, logging to /opt/modules/hadoop-2.5.2/logs/hadoop-hadoop-secondarynamenode-localhost.localdomain.out
二. 导入源码到eclipse
1. 安装hadoop-maven-plugins
进入源码下hadoop-maven-plugins目录,执行命令:
$ mvn install
2. 生成eclipse项目文件
返回源码根目录,执行:
$ mvn eclipse:eclipse -DskipTests
3. 导入eclipse
[File] > [Import] > [Existing Projects into Workspace]
4. 导入后错误解决
>hadoop-streaming中build path错误:
Java Build Path->Source:删除 hadoop-yarn-server-resourcemanager/conf
>AvroRecord相关错误
a. 下载avro-tools-1.7.4.jar
b. 执行命令
$ cd hadoop-2.5.2-src/hadoop-common-project/hadoop-common/src/test/avro $ java -jar avro-tools所在目录/avro-tools-1.7.4.jar compile schema avroRecord.avsc ../java
c. eclipse刷新项目
>protobuf相关错误
a. 执行命令
$ cd hadoop-2.5.2-src/hadoop-common-project/hadoop-common/src/test/proto $ protoc --java_out=../java *.proto
b. eclipse刷新项目
附:BUILDING.txt中原文
Importing projects to eclipse
When you import the project to eclipse, install hadoop-maven-plugins at first.
$ cd hadoop-maven-plugins
$ mvn install
Then, generate eclipse project files.
$ mvn eclipse:eclipse -DskipTests
At last, import to eclipse by specifying the root directory of the project via
[File] > [Import] > [Existing Projects into Workspace].