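The cluster has three servers: yiprod01, yiprod02, and yiprod03. yiprod01 is the namenode, yiprod02 is the secondary namenode, and all three are datanodes.
The configuration described here must be identical on all three servers.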
0. Prerequisites:
0.1 Make sure Java is installed
After installing Java, JAVA_HOME must be set in .bash_profile:
export JAVA_HOME=/home/yimr/local/jdk
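One way to confirm that the setting took effect is a small shell check like the sketch below. The function name check_java_home is made up for illustration, and it verifies only that $JAVA_HOME/bin/java exists and is executable, not which Java version it is.

```shell
# Hypothetical helper (not part of Hadoop): succeeds only if JAVA_HOME is set
# and contains an executable bin/java.
check_java_home() {
    if [ -z "${JAVA_HOME:-}" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$JAVA_HOME/bin/java" ]; then
        echo "no executable java under $JAVA_HOME/bin" >&2
        return 1
    fi
    echo "JAVA_HOME ok: $JAVA_HOME"
}
```

Running check_java_home in a fresh login shell on each of the three servers shows whether .bash_profile is being picked up.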
0.2 Make sure the three machines trust one another (passwordless SSH); see the separate article for details.
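Before going further, it can help to confirm that the trust relationship really works from each machine. The sketch below does not set anything up. It only checks that every host accepts a non-interactive SSH login. The function name check_trust and the SSH_CMD override are hypothetical, and the setup commands in the comment are the usual OpenSSH tools, assumed here rather than taken from this article.

```shell
# Typical one-time setup, run by hand on each machine (an assumption, the
# article defers the details to a separate post):
#   ssh-keygen -t rsa
#   ssh-copy-id yiprod01; ssh-copy-id yiprod02; ssh-copy-id yiprod03

# Hypothetical helper: report whether each host accepts a login without a
# password or host-key prompt. SSH_CMD can be overridden, for example to
# stub out ssh when trying the function without a cluster.
check_trust() {
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for input.
        if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
            echo "$host ok"
        else
            echo "$host FAILED"
        fi
    done
}
```

A call such as check_trust yiprod01 yiprod02 yiprod03, run on each of the three servers, should print "ok" for every host before the cluster is started.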
1. core-site.xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/home/sdc/tmp/hadoop-${user.name}</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://yiprod01:9000</value>
  </property>
</configuration>
2. hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>yiprod02:9001</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/yimr/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/yimr/dfs/data</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
3. hadoop-env.sh
export JAVA_HOME=/usr/local/jdk1.6.0_27
4. mapred-site.xml
<configuration>
  <property>
    <!-- Use YARN as the resource allocation and job management framework -->
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <!-- JobHistory Server address -->
    <name>mapreduce.jobhistory.address</name>
    <value>yiprod01:10020</value>
  </property>
  <property>
    <!-- JobHistory web UI address -->
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>yiprod01:19888</value>
  </property>
  <property>
    <!-- Maximum number of streams merged at once when sorting files -->
    <name>mapreduce.task.io.sort.factor</name>
    <value>100</value>
  </property>
  <property>
    <name>mapreduce.reduce.shuffle.parallelcopies</name>
    <value>50</value>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>file:/home/yimr/dfs/mr/system</value>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>file:/home/sdc/dfs/mr/local</value>
  </property>
  <property>
    <!-- Memory each map task requests from the ResourceManager (MB) -->
    <name>mapreduce.map.memory.mb</name>
    <value>1536</value>
  </property>
  <property>
    <!-- JVM options for the container of each map task -->
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx1024M</value>
  </property>
  <property>
    <!-- Memory each reduce task requests from the ResourceManager (MB) -->
    <name>mapreduce.reduce.memory.mb</name>
    <value>2048</value>
  </property>
  <property>
    <!-- JVM options for the container of each reduce task -->
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx1536M</value>
  </property>
  <property>
    <!-- Memory limit for sorting (MB) -->
    <name>mapreduce.task.io.sort.mb</name>
    <value>512</value>
  </property>
</configuration>
5. yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>yiprod01:8080</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>yiprod01:8081</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>yiprod01:8082</value>
  </property>
  <property>
    <!-- Total memory each NodeManager can allocate to containers (MB) -->
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>2048</value>
  </property>
  <property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>${hadoop.tmp.dir}/nodemanager/remote</value>
  </property>
  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>${hadoop.tmp.dir}/nodemanager/logs</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>yiprod01:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>yiprod01:8088</value>
  </property>
</configuration>
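The memory values in mapred-site.xml and yarn-site.xml interact, and a little arithmetic shows how. The sketch below is a rough estimate only: the helper names are hypothetical, and it ignores any rounding the scheduler applies to requests as well as the container needed by the MapReduce ApplicationMaster.

```shell
# Values copied from the mapred-site.xml and yarn-site.xml settings above.
node_mb=2048         # yarn.nodemanager.resource.memory-mb
map_mb=1536          # mapreduce.map.memory.mb
map_heap_mb=1024     # -Xmx in mapreduce.map.java.opts
reduce_mb=2048       # mapreduce.reduce.memory.mb
reduce_heap_mb=1536  # -Xmx in mapreduce.reduce.java.opts

# Hypothetical helper: how many containers of a given size fit on one node.
containers_per_node() {
    echo $(( $1 / $2 ))
}

# Hypothetical helper: JVM heap as a whole-number percentage of the container.
heap_percent() {
    echo $(( $1 * 100 / $2 ))
}

echo "map containers per node:    $(containers_per_node "$node_mb" "$map_mb")"
echo "reduce containers per node: $(containers_per_node "$node_mb" "$reduce_mb")"
echo "map heap / container:       $(heap_percent "$map_heap_mb" "$map_mb")%"
echo "reduce heap / container:    $(heap_percent "$reduce_heap_mb" "$reduce_mb")%"
```

By this estimate each node can host one map container or one reduce container at a time, and the JVM heap takes 66% and 75% of the respective container, leaving the rest for non-heap memory.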
6. Format the namenode
If the namenode has not been formatted, startup fails with:
java.io.IOException: NameNode is not formatted.
Format it before the first start:
hadoop namenode -format
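Formatting wipes existing namenode metadata, so it may be worth guarding the command. The sketch below relies on the fact that a formatted name directory contains a current/VERSION file. The function name namenode_is_formatted is hypothetical, and the path in the usage comment is the dfs.namenode.name.dir value from this article with the file: prefix dropped.

```shell
# Hypothetical helper: a formatted name directory contains current/VERSION.
namenode_is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# Example use on the namenode (format only when no metadata exists yet):
#   namenode_is_formatted /home/yimr/dfs/name || hadoop namenode -format
```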
7. Troubleshooting
7.1 32-bit native library
Symptom:
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
14/08/01 11:59:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /home/yimr/local/hadoop-2.3.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
yiprod01]
sed: -e expression #1, char 6: unknown option to `s'
-c: Unknown cipher type 'cd'
The authenticity of host 'yiprod01 (192.168.1.131)' can't be established.
RSA key fingerprint is ac:9e:e0:db:d8:7a:29:5c:a1:d4:7f:4c:38:c0:72:30.
Are you sure you want to continue connecting (yes/no)?
64-Bit: ssh: Could not resolve hostname 64-Bit: Name or service not known
You: ssh: Could not resolve hostname You: Name or service not known
VM: ssh: Could not resolve hostname VM: Name or service not known
loaded: ssh: Could not resolve hostname loaded: Name or service not known
have: ssh: Could not resolve hostname have: Name or service not known
HotSpot(TM): ssh: Could not resolve hostname HotSpot(TM): Name or service not known
Server: ssh: Could not resolve hostname Server: Name or service not known
guard.: ssh: Could not resolve hostname guard.: Name or service not known
The cause is that the downloaded Hadoop release ships a native library compiled for 32-bit by default, while the JVM here is 64-bit:
file libhadoop.so.1.0.0
libhadoop.so.1.0.0: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked, not stripped
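The same check can be scripted so that a mismatch is reported explicitly. This is a minimal sketch: the function name elf_bits is hypothetical, and it only looks for the "ELF 32-bit" or "ELF 64-bit" wording that file prints, as in the output above.

```shell
# Hypothetical helper: extract the word size from a line of `file` output.
elf_bits() {
    case "$1" in
        *"ELF 32-bit"*) echo 32 ;;
        *"ELF 64-bit"*) echo 64 ;;
        *)              echo unknown ;;
    esac
}

# Example use, run in the directory that holds the library:
#   lib_bits=$(elf_bits "$(file libhadoop.so.1.0.0)")
#   os_bits=$(getconf LONG_BIT)
#   [ "$lib_bits" = "$os_bits" ] || echo "library is $lib_bits-bit, OS is $os_bits-bit"
```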
Temporary workaround:
Edit hadoop-env.sh under the etc directory
and append the following two lines at the end:
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_PREFIX}/lib/native
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true -Djava.library.path=$HADOOP_PREFIX/lib"
The following warning still appears, however:
14/08/01 11:46:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
At this point Hadoop starts normally. A separate article describes how to fix this problem for good.