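1. Stop the firewall: service iptables stop (skip this step if the firewall is already disabled at boot)
2. Go to the hadoop directory and edit the four Hadoop configuration files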
core-site.xml (core configuration. fs.defaultFS names the machine that runs the NameNode; the DataNodes are listed in the slaves file; the SecondaryNameNode is set in hdfs-site.xml through dfs.namenode.secondary.http-address, which defaults to the local machine; hadoop.tmp.dir sets the directory where temporary files are stored)
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost.localdomain:8020</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/hadoop/data/tmp</value>
    </property>
</configuration>
hdfs-site.xml (configuration for the distributed file system; dfs.replication sets the number of replicas kept for each block)
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
yarn-site.xml (configuration for YARN, the "data operating system". yarn.nodemanager.aux-services configures auxiliary services, and MapReduce programs run only when it is set to mapreduce_shuffle; yarn.resourcemanager.hostname sets the host of the ResourceManager; yarn.log-aggregation-enable turns on log aggregation; yarn.log-aggregation.retain-seconds sets how long the aggregated logs are kept, in seconds)
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>192.168.41.134</value>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>640800</value>
    </property>
</configuration>
mapred-site.xml (configuration for the distributed computing framework, MapReduce)
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>192.168.41.134:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>192.168.41.134:19888</value>
    </property>
</configuration>
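After editing the four files it can help to read a value back and confirm that it is what you meant to set. The following is only a rough sketch: get_property is a hypothetical helper (not part of Hadoop) that pulls one <value> out of a *-site.xml file with sed, assuming each <name> is directly followed by its <value> as in the files above. It is not a real XML parser.

```shell
# Hypothetical helper, not a Hadoop command.
# $1 = path to a *-site.xml file, $2 = property name.
# Prints the text of the <value> element that follows the matching <name>.
get_property() {
    # Join all lines so single-line and multi-line files are handled alike.
    tr -d '\n' < "$1" \
        | sed -n "s|.*<name>$2</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p"
}

# Example use (the etc/hadoop path is an assumption about your install layout):
#   get_property etc/hadoop/core-site.xml fs.defaultFS
```

On a working installation, bin/hdfs getconf -confKey fs.defaultFS asks Hadoop itself for the effective value and is the more reliable check.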
3. Start the services
1. Start the NameNode (HDFS)
sbin/hadoop-daemon.sh start namenode
2. Start the DataNode (HDFS)
sbin/hadoop-daemon.sh start datanode
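Note: HDFS can also be started with sbin/start-dfs.sh
3. Start the ResourceManager (YARN)
sbin/yarn-daemon.sh start resourcemanager
4. Start the NodeManager (YARN)
sbin/yarn-daemon.sh start nodemanager
Note: YARN can also be started with sbin/start-yarn.sh
5. Start the job history service (MapReduce)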
sbin/mr-jobhistory-daemon.sh start historyserver
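Once the five daemons have been started, the JDK's jps tool lists the running Java processes, which gives a quick way to see whether any of them failed to come up. A minimal sketch, assuming jps is on the PATH; missing_daemons is a hypothetical helper name, and the process names are the ones jps normally shows for these services.

```shell
# Hypothetical helper: reads `jps` output on stdin and prints the name of
# every expected daemon that does not appear in it.
missing_daemons() {
    jps_output=$(cat)
    for d in NameNode DataNode ResourceManager NodeManager JobHistoryServer; do
        # jps prints "<pid> <name>"; compare the name column exactly so that
        # SecondaryNameNode is not mistaken for NameNode.
        printf '%s\n' "$jps_output" | awk '{print $2}' | grep -qx "$d" || echo "$d"
    done
}

# Example use:
#   jps | missing_daemons
```

No output means all five are running. If a name is printed, the matching log file under the logs directory is the first place to look.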
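6. Run the test program (the output directory given as the last argument must not already exist; if it does, choose a different name)
bin/yarn jar \
share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar \
wordcount \
/user/hub/mapreduce/wordcount/input \
/user/hub/mapreduce/wordcount/output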
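Because the job fails when the output directory already exists, one option is to pick an unused name before submitting. The sketch below is only an illustration: pick_output_dir is a hypothetical helper, and the -1, -2, ... suffix scheme is an assumption of this example rather than anything Hadoop prescribes. It relies on hdfs dfs -test -e, which exits with status 0 when the path exists.

```shell
# Path to the hdfs launcher; override HDFS if you are not in the hadoop directory.
HDFS=${HDFS:-bin/hdfs}

# Hypothetical helper: prints the first of base, base-1, base-2, ...
# that does not yet exist in HDFS.
pick_output_dir() {
    base=$1
    candidate=$base
    n=0
    while "$HDFS" dfs -test -e "$candidate"; do
        n=$((n + 1))
        candidate="$base-$n"
    done
    echo "$candidate"
}

# Example use:
#   out=$(pick_output_dir /user/hub/mapreduce/wordcount/output)
#   bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar \
#       wordcount /user/hub/mapreduce/wordcount/input "$out"
```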
3. Check HDFS usage
http://192.168.41.134:50070
3.1 List HDFS files from inside the virtual machine (the commands resemble their Linux counterparts, with a leading -)
bin/hdfs dfs -ls /user/hub/...
3.2 When deleting, -rm -R removes a directory together with its contents
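The same dfs subcommands can be combined into small helpers. This is a sketch under stated assumptions: stage_input and remove_if_exists are hypothetical names, and the file and directory arguments are placeholders you supply. Only -test -e, -rm -R, -mkdir -p, and -put are used.

```shell
# Path to the hdfs launcher; override HDFS if you are not in the hadoop directory.
HDFS=${HDFS:-bin/hdfs}

# Hypothetical helper: delete an HDFS directory tree only if it exists,
# so that a missing path is not treated as an error.
remove_if_exists() {
    if "$HDFS" dfs -test -e "$1"; then
        "$HDFS" dfs -rm -R "$1"
    fi
}

# Hypothetical helper: create an HDFS directory (with parents) and upload
# one local file into it. $1 = local file, $2 = HDFS directory.
stage_input() {
    "$HDFS" dfs -mkdir -p "$2" && "$HDFS" dfs -put "$1" "$2"
}

# Example use (words.txt is a placeholder local file):
#   stage_input words.txt /user/hub/mapreduce/wordcount/input
#   remove_if_exists /user/hub/mapreduce/wordcount/output
```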
4. Check the status of running applications
http://192.168.41.134:8088
5. Official Hadoop documentation
http://hadoop.apache.org
6. Archive of all past Hadoop releases
http://archive.apache.org/dist/
7. Configuration notes: which file sets the location of each node
HDFS
    NameNode: core-site.xml
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://hadoop-senior.ibeifeng.com:8020</value>
        </property>
    DataNodes: slaves
        hadoop-senior.ibeifeng.com
    SecondaryNameNode: hdfs-site.xml
        <property>
            <name>dfs.namenode.secondary.http-address</name>
            <value>hadoop-senior.ibeifeng.com:50090</value>
        </property>
YARN
    ResourceManager: yarn-site.xml
        <property>
            <name>yarn.resourcemanager.hostname</name>
            <value>hadoop-senior.ibeifeng.com</value>
        </property>
    NodeManagers: slaves
        hadoop-senior.ibeifeng.com
MapReduce
    HistoryServer: mapred-site.xml
        <property>
            <name>mapreduce.jobhistory.address</name>
            <value>hadoop-senior.ibeifeng.com:10020</value>
        </property>
        <property>
            <name>mapreduce.jobhistory.webapp.address</name>
            <value>hadoop-senior.ibeifeng.com:19888</value>
        </property>