Installing Hadoop
Log in as the root administrator.
Extract hadoop-2.6.3.tar.gz:
tar -zxvf hadoop-2.6.3.tar.gz
After extraction, change the owner and group to hadoop:
chown -R hadoop:hadoop hadoop-2.6.3
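Editing the configuration files
Before editing the configuration, create the tmp and dfs directories under /home/hadoop/, then create the name and data directories under dfs. The configuration below refers to these paths.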
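The directory layout described above can be created in one step. The following is a minimal sketch: the function name make_hadoop_dirs is hypothetical, and it assumes the caller is allowed to write under the base directory.

```shell
# make_hadoop_dirs BASE_DIR
# Hypothetical helper: creates BASE_DIR/tmp, BASE_DIR/dfs/name and BASE_DIR/dfs/data.
make_hadoop_dirs() {
    base_dir="$1"
    # -p creates missing parent directories and does not fail if they already exist.
    mkdir -p "$base_dir/tmp" "$base_dir/dfs/name" "$base_dir/dfs/data"
}
```

For this guide the call would be make_hadoop_dirs /home/hadoop. If the directories are created as root, they probably also need chown -R hadoop:hadoop so that daemons running as the hadoop user can write to them; that is an assumption about how the cluster is run, not something these notes state.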
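1) hadoop-env.sh: only JAVA_HOME needs to be set
2) yarn-env.sh: only JAVA_HOME needs to be set
3) slaves: add the hostname of each slave machine
4) master: no change needed (the NameNode and SecondaryNameNode currently run on the same machine; if they ran on different machines, this file would need to be configured)
5) core-site.xml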
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master.hadoop:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/home/hadoop/tmp</value>
</property>
<property>
<name>hadoop.proxyuser.spark.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.spark.groups</name>
<value>*</value>
</property>
</configuration>
6) hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>master.hadoop:9001</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hadoop/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hadoop/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
7) mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>master.hadoop:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>master.hadoop:19888</value>
</property>
</configuration>
8) yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master.hadoop:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master.hadoop:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master.hadoop:8035</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>master.hadoop:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master.hadoop:8088</value>
</property>
</configuration>
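Steps 1) to 3) above are one-line edits and can be scripted. The following is a rough sketch only: set_java_home and write_slaves are hypothetical helper names, and the JDK path and slave hostnames in the usage note are placeholders, not values taken from these notes.

```shell
# set_java_home FILE JAVA_HOME_PATH
# Replaces an existing "export JAVA_HOME=..." line, or appends one if the file
# has none (for example when the line is commented out).
# Limitation: the path must not contain '|' or '&', which sed treats specially.
set_java_home() {
    file="$1"
    java_home="$2"
    if grep -q '^export JAVA_HOME=' "$file"; then
        # '|' is the sed delimiter because the path contains '/'.
        sed "s|^export JAVA_HOME=.*|export JAVA_HOME=${java_home}|" "$file" > "$file.tmp" &&
            cat "$file.tmp" > "$file" &&   # rewrite in place to keep owner and mode
            rm -f "$file.tmp"
    else
        echo "export JAVA_HOME=${java_home}" >> "$file"
    fi
}

# write_slaves FILE HOST...
# Overwrites FILE with one hostname per line.
write_slaves() {
    file="$1"
    shift
    printf '%s\n' "$@" > "$file"
}
```

In Hadoop 2.x these files live under etc/hadoop/ inside the installation directory, so a call might look like set_java_home etc/hadoop/hadoop-env.sh /usr/java/jdk1.8.0 followed by write_slaves etc/hadoop/slaves slave1.hadoop slave2.hadoop, with the path and hostnames replaced by the real ones.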
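Once configuration is complete, distribute the hadoop directory to every machine and change its owner and group on each one (replace slavex with each slave's hostname, and run chown in /opt on that slave):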
scp -r hadoop-2.6.3 root@slavex:/opt
chown hadoop:hadoop -R hadoop-2.6.3
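With more than a couple of slaves, the copy and chown can be looped over the slaves file. The following is a sketch under assumptions: distribute_hadoop and the DRY_RUN switch are hypothetical, root SSH access to every slave is assumed, and the function is expected to run from the directory that contains hadoop-2.6.3.

```shell
# distribute_hadoop SLAVES_FILE
# Copies hadoop-2.6.3 to /opt on every host listed in SLAVES_FILE and fixes ownership.
# Set DRY_RUN=1 to print the commands instead of running them.
distribute_hadoop() {
    slaves_file="$1"
    while read -r host || [ -n "$host" ]; do
        # Skip blank lines and comment lines.
        case "$host" in ''|'#'*) continue ;; esac
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "scp -r hadoop-2.6.3 root@${host}:/opt"
            echo "ssh root@${host} chown -R hadoop:hadoop /opt/hadoop-2.6.3"
        else
            # Keep scp and ssh from consuming the loop's stdin (the slaves file).
            scp -r hadoop-2.6.3 "root@${host}:/opt" < /dev/null
            ssh -n "root@${host}" chown -R hadoop:hadoop /opt/hadoop-2.6.3
        fi
    done < "$slaves_file"
}
```

Running it once with DRY_RUN=1 first shows exactly which hosts would be touched before anything is copied.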
Configure the Hadoop environment variables on master:
# set Hadoop environment
export HADOOP_HOME=/opt/hadoop-2.6.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
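These notes do not say which file the exports belong in. A minimal sketch, assuming they go into a shell profile such as /etc/profile or the hadoop user's ~/.bashrc (the helper name add_hadoop_env is hypothetical), could append them without duplicating on reruns:

```shell
# add_hadoop_env PROFILE_FILE
# Appends the Hadoop exports to PROFILE_FILE unless HADOOP_HOME is already exported there.
add_hadoop_env() {
    profile="$1"
    if grep -q '^export HADOOP_HOME=' "$profile" 2>/dev/null; then
        return 0
    fi
    # The quoted 'EOF' keeps $PATH and $HADOOP_HOME literal in the file.
    cat >> "$profile" <<'EOF'
# set Hadoop environment
export HADOOP_HOME=/opt/hadoop-2.6.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
EOF
}
```

After sourcing the profile in the current shell, running hadoop version is a quick way to confirm that the PATH change took effect.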
Format the Hadoop NameNode:
./bin/hdfs namenode -format
Start Hadoop:
./sbin/start-all.sh
Check the running Java processes with jps:
master: ResourceManager, NameNode, SecondaryNameNode
slave: DataNode, NodeManager
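The expected process lists above can be checked mechanically. The following is a sketch only: check_daemons is a hypothetical helper that reads jps output from standard input and reports any daemon from the lists above that is absent.

```shell
# check_daemons ROLE   (ROLE is "master" or "slave")
# Reads jps output on stdin; prints "missing: NAME" for each absent daemon
# and returns 1 if anything is missing, 0 otherwise.
check_daemons() {
    case "$1" in
        master) expected="ResourceManager NameNode SecondaryNameNode" ;;
        slave)  expected="DataNode NodeManager" ;;
        *) echo "usage: check_daemons master|slave" >&2; return 2 ;;
    esac
    jps_output=$(cat)
    missing=0
    for daemon in $expected; do
        # jps prints "<pid> <name>"; compare the whole second field so that
        # SecondaryNameNode is not mistaken for NameNode.
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "missing: $daemon"
            missing=1
        fi
    done
    return $missing
}
```

On the master this would be run as jps | check_daemons master, and on each slave as jps | check_daemons slave.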
Check the cluster status: ./bin/hdfs dfsadmin -report
View HDFS: http://localhost:50070
View MapReduce: http://localhost:8088