• Hadoop Installation


    First, upload the Hadoop installation package to /home/hpc/ on the server.

             Note: in Hadoop 2.x the configuration files live under $HADOOP_HOME/etc/hadoop

             Pseudo-distributed mode requires modifying 5 configuration files

          3.1 Configure Hadoop -- configuration file directory: $HADOOP_HOME/etc/hadoop/

             File 1: hadoop-env.sh

                       vi hadoop-env.sh

                       # line 25 (change to the JDK version installed on your system)

                       export JAVA_HOME=/usr/java/jdk1.8.0_91
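Editing the export by hand works, but the change can also be scripted. A minimal sketch on a throwaway copy (the file name `demo-hadoop-env.sh` and the JDK path are stand-ins for your real hadoop-env.sh and install location):

```shell
# Create a demo copy mimicking the relevant line of hadoop-env.sh.
cat > demo-hadoop-env.sh <<'EOF'
# The java implementation to use.
export JAVA_HOME=${JAVA_HOME}
EOF

# Rewrite the export to an absolute JDK path (line 25 in the real file).
sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/java/jdk1.8.0_91|' demo-hadoop-env.sh

grep '^export JAVA_HOME=' demo-hadoop-env.sh
```

Hadoop daemons do not inherit a `${JAVA_HOME}` from the login shell reliably, which is why the absolute path is required here.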

             File 2: core-site.xml

                       <!-- Specify the file system schema (URI) used by Hadoop: the address of the HDFS
                            master (NameNode). Note: fs.defaultFS is the Hadoop 2.x name; fs.default.name
                            is deprecated. -->

                       <property>

                                <name>fs.defaultFS</name>

                                <value>hdfs://hostname:9000</value>

                       <description>change to your own hadoop hostname</description>

                       </property>

                       <property>

                                <name>hadoop.tmp.dir</name>

                                <value>/usr/local/hadoop/tmp</value>

                       </property>

             File 3: hdfs-site.xml

                       <property>

                                <name>dfs.name.dir</name>

                                <value>/usr/hadoop/hdfs/name</value>

                                <description>where the NameNode stores the HDFS namespace metadata</description>

                       </property>



                       <property>

                                <name>dfs.data.dir</name>

                                <value>/usr/hadoop/hdfs/data</value>

                                <description>physical storage location of data blocks on the DataNode</description>

                       </property>



                       <!-- Specify the number of HDFS replicas -->

                       <property>

                                <name>dfs.replication</name>

                                <value>1</value>

                                <description>replica count; the default is 3, and it should not exceed the number of DataNodes</description>

                       </property>

                       <!-- SecondaryNameNode web address. Note the default port is 50090;
                            50070 is the NameNode web UI and would conflict. -->

                       <property>

                                <name>dfs.namenode.secondary.http-address</name>

                                <value>hostname:50090</value>

                       </property>

     

       

             File 4: mapred-site.xml

                       mv mapred-site.xml.template mapred-site.xml

                       vim mapred-site.xml

                       <!-- Specify that MapReduce runs on YARN -->

                       <property>

                                <name>mapreduce.framework.name</name>

                                <value>yarn</value>

                       </property>

                       (Note: the dfs.replication and dfs.permissions properties sometimes pasted into this
                       file are HDFS settings and belong in hdfs-site.xml, not mapred-site.xml.)

             File 5: yarn-site.xml

                       <!-- Specify the address of the YARN master (ResourceManager) -->

                       <property>

                                <name>yarn.resourcemanager.hostname</name>

                                <value>hostname</value>

                       </property>

                       <property>

                                <name>yarn.nodemanager.aux-services</name>

                                <value>mapreduce_shuffle</value>

                       </property>
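All five site files repeat the same `<property>` shape, so a tiny generator keeps them consistent. A sketch only: the function name `hadoop_property` is our own invention, not a Hadoop tool.

```shell
# Emit one Hadoop <property> block for a given configuration name and value.
hadoop_property() {
    printf '<property>\n    <name>%s</name>\n    <value>%s</value>\n</property>\n' "$1" "$2"
}

# Example: the YARN shuffle service property from above.
hadoop_property yarn.nodemanager.aux-services mapreduce_shuffle
```

The same call can then be looped over a list of name/value pairs when assembling core-site.xml, hdfs-site.xml, and the rest.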

             3.2 Add Hadoop to the environment variables

             vim /etc/profile

                       export JAVA_HOME=/usr/java/jdk1.8.0_91

                       export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

                       export HADOOP_HOME=/home/qpx/hadoop-2.8.0

                       export PATH=.:$JAVA_HOME/bin:$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

                       export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native

                       export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"

      Reload the environment variables:

             source /etc/profile
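A quick sanity check that the exports took effect can be done on a throwaway profile fragment before touching /etc/profile. The temp file path is arbitrary; the values are the ones from this guide, so substitute your own install locations:

```shell
# Write the export lines to a demo file and source it.
cat > /tmp/hadoop-profile-demo.sh <<'EOF'
export JAVA_HOME=/usr/java/jdk1.8.0_91
export HADOOP_HOME=/home/qpx/hadoop-2.8.0
export PATH=.:$JAVA_HOME/bin:$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
EOF

. /tmp/hadoop-profile-demo.sh

# Confirm the variables resolved and hadoop's bin dir is on PATH.
echo "HADOOP_HOME=$HADOOP_HOME"
case ":$PATH:" in
    *":$HADOOP_HOME/bin:"*) echo "hadoop bin is on PATH" ;;
    *)                      echo "hadoop bin missing from PATH" ;;
esac
```

On a real machine, `which hadoop` after `source /etc/profile` performs the same check.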

            

             3.3 Format the NameNode (initializes the NameNode) -- run from the bin directory

                       ./hadoop namenode -format

          Then start it: ./hadoop-daemon.sh start namenode, or start all HDFS daemons with ./start-dfs.sh (both in sbin/)

                       Check that the NameNode is listening on port 9000:

                       netstat -lnp|grep 9000

                      Upload a file to HDFS, e.g. copy NOTICE.txt from the current directory to the HDFS root:

         bin/hdfs dfs -put NOTICE.txt /

                       or: hadoop fs -put **.txt /

             3.4 Start Hadoop

                       Start HDFS first:

                       sbin/start-dfs.sh

                       Then start YARN:

                       sbin/start-yarn.sh

             3.5 Verify that startup succeeded

                       Verify with the jps command:

                       27408 NameNode

                       28218 Jps

                       27643 SecondaryNameNode

                       28066 NodeManager

                       27803 ResourceManager

                       27512 DataNode
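The jps check above can be scripted. A sketch that feeds in the sample output shown; on a live system, replace the `sample` variable with `"$(jps)"`:

```shell
# Sample jps output (from the listing above).
sample='27408 NameNode
28218 Jps
27643 SecondaryNameNode
28066 NodeManager
27803 ResourceManager
27512 DataNode'

# Check that all five pseudo-distributed daemons are present.
missing=0
for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    echo "$sample" | grep -qw "$daemon" || { echo "missing: $daemon"; missing=1; }
done
[ "$missing" -eq 0 ] && echo "all daemons running"   # prints "all daemons running"
```

`grep -w` matches whole words, so searching for NameNode does not falsely match the SecondaryNameNode line.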

         Troubleshooting: the NameNode and DataNode processes must both be running. If they do not all
         start, delete the contents of the storage directories (/usr/hadoop/hdfs/) and reformat.

             Error: a DataNode has died

                                1) Check logs/xx.datanode.log

                                2) Copy the NameNode's clusterID, then locate the DataNode's disk storage directory

                                3) In the DataNode directory, find the VERSION file and change its clusterID to the NameNode's clusterID

                                4) Restart the NameNode and DataNode
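Steps 2 and 3 above can be sketched in shell. This demo works on throwaway files with a simplified one-line VERSION; on a real cluster the VERSION files live under the dfs.name.dir / dfs.data.dir paths configured in hdfs-site.xml (e.g. /usr/hadoop/hdfs/name/current/VERSION) and contain several other keys:

```shell
# Create demo NameNode/DataNode storage dirs with mismatched clusterIDs.
mkdir -p demo/name/current demo/data/current
echo 'clusterID=CID-namenode-1234'  > demo/name/current/VERSION
echo 'clusterID=CID-stale-datanode' > demo/data/current/VERSION

# Read the NameNode's clusterID and copy it into the DataNode's VERSION file.
nn_cid=$(grep '^clusterID=' demo/name/current/VERSION | cut -d= -f2)
sed -i "s|^clusterID=.*|clusterID=$nn_cid|" demo/data/current/VERSION

grep '^clusterID=' demo/data/current/VERSION   # now matches the NameNode's ID
```

The mismatch usually appears after reformatting the NameNode while old DataNode data dirs are still around, which is why the log in step 1 reports an incompatible clusterID.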

               Browser access

                       http://192.168.1.101:50070 (HDFS web UI)

  • Original article: https://www.cnblogs.com/hmpcly/p/7327845.html