• Fully Distributed Hadoop Setup (Non-HA)


    I. Preparation

    1. Create the virtual machines: a static IP (NAT) and a fixed hostname for each node

    ## fix the hostname:
    vi /etc/sysconfig/network
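
    For reference, on CentOS 6 this file holds the hostname as a key/value pair; a minimal example for node01 (use the matching name on each node):

    NETWORKING=yes
    HOSTNAME=node01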
    

    2. Disable the firewall, or open the required ports

    service iptables stop   ## stop the firewall
    chkconfig iptables off  ## disable it at boot
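
    If you would rather keep the firewall running, open just the Hadoop ports instead; a minimal iptables sketch for CentOS 6, assuming the default ports used later in this guide:

    ## NameNode RPC (9820) and NameNode web UI (9870)
    iptables -I INPUT -p tcp --dport 9820 -j ACCEPT
    iptables -I INPUT -p tcp --dport 9870 -j ACCEPT
    ## persist the rules across reboots
    service iptables save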
    

    3. Required software: JDK and Hadoop
    4. Configure passwordless SSH login [a key pair must be generated on every node]

    (1) Generate the public/private key pair
        ssh-keygen -t rsa
    (2) Configure the hosts file (/etc/hosts) with the IP-to-hostname mapping:
        192.168.121.101 node01
        192.168.121.102 node02
        192.168.121.103 node03
        192.168.121.104 node04
        ...
        After editing this file on node01, use the scp command to copy it to node02, node03, and node04.
    (3) Append the public key to each node's authorized_keys file
        ssh-copy-id -i /root/.ssh/id_rsa.pub node01
        ssh-copy-id -i /root/.ssh/id_rsa.pub node02
        ssh-copy-id -i /root/.ssh/id_rsa.pub node03
        ssh-copy-id -i /root/.ssh/id_rsa.pub node04
        ...
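
    Running ssh-copy-id once per target gets repetitive; a minimal loop sketch to run on each node (enter the root password when prompted), followed by a quick check that login now works without a password:

    for h in node01 node02 node03 node04; do
        ssh-copy-id -i /root/.ssh/id_rsa.pub $h
    done
    ## should print the remote hostname without asking for a password
    ssh node02 hostname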
    

    5. Configure NTP so the cluster clocks stay in sync (optional)

    • Install NTP:
    yum install ntp
    
    • On node01, edit the /etc/ntp.conf file:
    ## comment out the existing lines that start with "server", then add:
    ## allow clients from the cluster subnet
    restrict 192.168.121.0 mask 255.255.255.0 nomodify notrap
    ## use the local clock as the time source
    server 127.127.1.0
    fudge 127.127.1.0 stratum 10
    
    • On node02, node03, and node04, edit /etc/ntp.conf as well:
    ## comment out the existing lines that start with "server", then add:
    server node01
    
    • Start the NTP service and enable it at boot:
    service ntpd start && chkconfig ntpd on
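
    To confirm that synchronization is working, ntpq lists the peers each node syncs against; on node02, node03, and node04 the remote shown should be node01:

    ntpq -p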
    

    6. [Recommended] Delete Hadoop's bundled doc directory; it wastes more than 400 MB of disk space.

    II. Cluster Plan

    Role                 node01    node02    node03    node04
    NameNode             √
    SecondaryNameNode              √
    DataNode                       √         √         √
    ResourceManager                √
    NodeManager          √         √         √         √

    III. Files to Configure

    Seven files need to be configured:

    • $HADOOP_HOME/etc/hadoop/hadoop-env.sh
    • $HADOOP_HOME/etc/hadoop/yarn-env.sh
    • $HADOOP_HOME/etc/hadoop/slaves [Hadoop 2.x] or etc/hadoop/workers [Hadoop 3.x]
    • $HADOOP_HOME/etc/hadoop/core-site.xml
    • $HADOOP_HOME/etc/hadoop/hdfs-site.xml
    • $HADOOP_HOME/etc/hadoop/yarn-site.xml
    • $HADOOP_HOME/etc/hadoop/mapred-site.xml

    1. Configure etc/hadoop/hadoop-env.sh

    Hadoop reads the JDK location only from this file. In Hadoop 2.x only the JDK needs to be set here; Hadoop 3.x enforces strict role management, so the user for each HDFS role must also be configured:

    export JAVA_HOME=/opt/app/jdk1.8.0_201
    export HDFS_NAMENODE_USER=root
    export HDFS_DATANODE_USER=root
    export HDFS_SECONDARYNAMENODE_USER=root
    

    2. Configure etc/hadoop/yarn-env.sh

    export JAVA_HOME=/opt/app/jdk1.8.0_201
    

    3. Configure etc/hadoop/slaves (2.x) or etc/hadoop/workers (3.x), the list of DataNode hosts

    node02
    node03
    node04
    

    4. Configure etc/hadoop/core-site.xml

    <configuration>
            <!-- Note: the default port is 9000 in Hadoop 2.x and 9820 in Hadoop 3.x -->
            <property>
                    <name>fs.defaultFS</name>
                    <value>hdfs://node01:9820</value>
            </property>
            <!-- Note: create the temporary directory yourself -->
            <property>
                    <name>hadoop.tmp.dir</name>
                    <value>/opt/tmp/hadoop/full</value>
            </property>
    </configuration>
    
    

    5. Configure etc/hadoop/hdfs-site.xml

    <configuration>
            <!-- The replication factor defaults to 3 if not configured -->
            <property>
                    <name>dfs.replication</name>
                    <value>2</value>
            </property>
            <property>
                <!-- Run the SecondaryNameNode on node02; the port is 50090 in Hadoop 2.x, 9868 in 3.x -->
                <name>dfs.namenode.secondary.http-address</name>
                <value>node02:9868</value>
            </property>
            <!-- Disable HDFS permission checking -->
            <property>
                <name>dfs.permissions.enabled</name>
                <value>false</value>
            </property>
    </configuration>
    
    

    6. Configure etc/hadoop/yarn-site.xml

    <configuration>
       <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
       </property>
       <!-- Run the ResourceManager on node02 -->
       <property>
            <name>yarn.resourcemanager.hostname</name>
            <value>node02</value>
       </property>
       
    </configuration>
    

    7. Configure etc/hadoop/mapred-site.xml

    <configuration>
        <!-- Run MapReduce on YARN -->
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
    
    </configuration>
    
    

    8. Distribute Hadoop to the other nodes

    scp -r /opt/app/hadoop-3.2.0 node02:/opt/app/hadoop-3.2.0
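
    The same copy must also reach node03 and node04; a small loop sketch, assuming /opt/app already exists on every node:

    for h in node02 node03 node04; do
        scp -r /opt/app/hadoop-3.2.0 $h:/opt/app/hadoop-3.2.0
    done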
    

    IV. Starting the Cluster

    1. Format the NameNode (on node01)

    bin/hdfs namenode -format
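
    Format only once: reformatting generates a new clusterID that the existing DataNodes will refuse to join. With the core-site.xml above, the metadata lands under hadoop.tmp.dir, so a quick check that the format succeeded:

    ## the VERSION file records the namespaceID and clusterID
    cat /opt/tmp/hadoop/full/dfs/name/current/VERSION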
    

    2. Start the NameNode, SecondaryNameNode, and DataNodes

    ## on node01, start the namenode
    sbin/hadoop-daemon.sh start namenode
    ## on node02, start the secondarynamenode
    sbin/hadoop-daemon.sh start secondarynamenode
    ## on node02, node03, and node04, start the datanodes
    sbin/hadoop-daemon.sh start datanode
    

    3. Start YARN: the ResourceManager and the NodeManagers

    ## on node02, start the resourcemanager and its nodemanager
    sbin/yarn-daemon.sh start resourcemanager
    sbin/yarn-daemon.sh start nodemanager
    ## on node01, node03, and node04, start the remaining nodemanagers
    sbin/yarn-daemon.sh start nodemanager
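
    In Hadoop 3.x the hadoop-daemon.sh and yarn-daemon.sh scripts still work but print a deprecation warning; the current equivalents use the --daemon option:

    bin/hdfs --daemon start namenode         ## likewise: secondarynamenode, datanode
    bin/yarn --daemon start resourcemanager  ## likewise: nodemanager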
    

    Note: if you configure the environment variables below, you no longer need to cd into the Hadoop directory to run these commands.

    [root@node01 hadoop-3.2.0]# vi /etc/profile
    ## JDK environment variable
    export JAVA_HOME=/opt/app/jdk1.8.0_201
    ## Hadoop environment variable
    export HADOOP_HOME=/opt/app/hadoop-3.2.0
    ## set the Hadoop log level to DEBUG (uncomment when troubleshooting)
    #export HADOOP_ROOT_LOGGER=DEBUG,console
    ## native library paths Hadoop depends on
    export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
    export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib:$HADOOP_COMMON_LIB_NATIVE_DIR"
    ## PATH
    export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
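
    The new variables take effect only after the profile is re-read; apply and verify with:

    source /etc/profile
    hadoop version   ## should report Hadoop 3.2.0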
    
    

    One-command start/stop: start-all.sh / stop-all.sh

    [root@node01 logs]# start-all.sh
    Starting namenodes on [node01]
    Starting datanodes
    Starting secondary namenodes [node02]
    2019-06-15 01:15:29,452 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Starting resourcemanager
    Starting nodemanagers
    [root@node01 logs]# stop-all.sh
    Stopping namenodes on [node01]
    Stopping datanodes
    Stopping secondary namenodes [node02]
    2019-06-15 01:21:10,936 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Stopping nodemanagers
    node03: WARNING: nodemanager did not stop gracefully after 5 seconds: Trying to kill with kill -9
    node02: WARNING: nodemanager did not stop gracefully after 5 seconds: Trying to kill with kill -9
    node04: WARNING: nodemanager did not stop gracefully after 5 seconds: Trying to kill with kill -9
    Stopping resourcemanager
    [root@node01 logs]# 
    
    
    

    Problems & Solutions:

    The NativeCodeLoader warning in the start-all.sh output above is caused by a glibc mismatch: libhadoop.so was built against GLIBC_2.14, while this system ships glibc 2.12, as the ldd output below shows. The warning is harmless, because Hadoop falls back to its built-in Java classes.

    [root@node01 ~]# cd /opt/app/hadoop-3.2.0/lib/native
    [root@node01 native]# ls
    examples     libhadooppipes.a  libhadoop.so.1.0.0  libnativetask.a   libnativetask.so.1.0.0
    libhadoop.a  libhadoop.so      libhadooputils.a    libnativetask.so
    [root@node01 native]# ldd libhadoop.so.1.0.0
    ./libhadoop.so.1.0.0: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by ./libhadoop.so.1.0.0)
    	linux-vdso.so.1 =>  (0x00007fff9bd8a000)
    	libdl.so.2 => /lib64/libdl.so.2 (0x00007f7f51dd7000)
    	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f7f51bb9000)
    	libc.so.6 => /lib64/libc.so.6 (0x00007f7f51825000)
    	/lib64/ld-linux-x86-64.so.2 (0x00007f7f52208000)
    [root@node01 native]# ldd --version
    ldd (GNU libc) 2.12
    Copyright (C) 2010 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    Written by Roland McGrath and Ulrich Drepper.
    [root@node01 native]# 
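
    Silencing the warning requires either a glibc that provides GLIBC_2.14 or native libraries built against the local glibc. Upgrading glibc on a running CentOS 6 system is risky, so the safer route is to rebuild the native libraries from the Hadoop source tree; a sketch, assuming the native build toolchain (gcc, cmake, zlib-devel, protobuf) is installed:

    ## list the GLIBC versions the local libc actually provides
    strings /lib64/libc.so.6 | grep GLIBC_
    ## in the Hadoop source tree: rebuild the native libraries against the local glibc
    mvn package -Pdist,native -DskipTests -Dtar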
    
    
    

    V. Verification

    Check the running processes with the jps command, and the listening ports with ss -nal. In the output below, 9870 is the NameNode web UI, 9820 the NameNode RPC port, and 8040, 8042, and 13562 belong to the NodeManager (localizer, web UI, and MapReduce shuffle service, respectively).

    [root@node01 hadoop-3.2.0]# jps
    1426 NodeManager
    1304 NameNode
    1550 Jps
    [root@node01 hadoop-3.2.0]# ss -nal
    State      Recv-Q Send-Q                 Local Address:Port                   Peer Address:Port 
    LISTEN     0      128                                *:9870                              *:*     
    LISTEN     0      128                                *:59635                             *:*     
    LISTEN     0      128                               :::22                               :::*     
    LISTEN     0      128                                *:22                                *:*     
    LISTEN     0      100                              ::1:25                               :::*     
    LISTEN     0      100                        127.0.0.1:25                                *:*     
    LISTEN     0      128                                *:13562                             *:*     
    LISTEN     0      128                  192.168.121.101:9820                              *:*     
    LISTEN     0      128                                *:8040                              *:*     
    LISTEN     0      128                                *:8042                              *:*     
    [root@node01 hadoop-3.2.0]# 
    
    

    View the NameNode web dashboard (the YARN ResourceManager UI is served from node02 on port 8088 by default):

    http://192.168.121.101:9870/dfshealth.html#tab-overview
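
    As a final smoke test, writing a file into HDFS and running the bundled example job exercises both HDFS and YARN; a minimal sketch using the examples jar shipped with this release:

    ## HDFS round trip
    hdfs dfs -mkdir -p /test
    hdfs dfs -put /etc/hosts /test/
    hdfs dfs -ls /test
    ## MapReduce on YARN: estimate pi with 2 maps of 10 samples each
    yarn jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0.jar pi 2 10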
