• Deploying Hadoop 3 on CentOS 7


    1: Passwordless SSH login:

      1) Edit /etc/ssh/sshd_config (vim /etc/ssh/sshd_config) and uncomment or add:

          RSAAuthentication yes

          PubkeyAuthentication yes

    # Authentication:
    
    #LoginGraceTime 2m
    #PermitRootLogin yes
    #StrictModes yes
    #MaxAuthTries 6
    #MaxSessions 10
    
    RSAAuthentication yes
    PubkeyAuthentication yes
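
The edit in step 1 can also be scripted. The following is a minimal sketch, not the author's method: `enable_sshd_option` is a hypothetical helper name, and it takes the file path as an argument so that it can be tried on a copy of the file first.

```shell
#!/usr/bin/env bash
# enable_sshd_option FILE OPTION   (hypothetical helper)
# Rewrites "#OPTION ..." or "OPTION ..." as "OPTION yes";
# appends "OPTION yes" if the option does not occur in FILE at all.
enable_sshd_option() {
  local file=$1 opt=$2 tmp
  if grep -Eq "^[[:space:]]*#?[[:space:]]*${opt}[[:space:]]" "$file"; then
    tmp=$(mktemp)
    sed -E "s/^[[:space:]]*#?[[:space:]]*${opt}[[:space:]].*/${opt} yes/" "$file" > "$tmp" \
      && cat "$tmp" > "$file"   # write back in place, keeping owner and mode
    rm -f "$tmp"
  else
    printf '%s yes\n' "$opt" >> "$file"
  fi
}
```

On the real system this would be called as `enable_sshd_option /etc/ssh/sshd_config PubkeyAuthentication`, followed by `systemctl restart sshd` so that the daemon picks up the change.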

      2) Generate a key pair:

          ssh-keygen -t rsa

      3) Install the public key into authorized_keys:

          cp /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys

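      4) Copy the public key to the target server:

          ssh-copy-id <target-server-ip>

          Or copy it manually with scp and then, on the target server in /root/.ssh, append it:

          scp -p ./id_rsa.pub root@192.168.8.213:/root/.ssh/id_dsa.pub.214

          cat id_dsa.pub.214 >> ~/.ssh/authorized_keys

          Conversely, appending the target machine's id_dsa.pub to this machine's authorized_keys file enables passwordless login from the target machine to this one.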

      5) Add the host entries to the hosts file:

          vim /etc/hosts

      6) Test:

          ssh <target-server-hostname-or-ip>
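
The cp in step 3 overwrites any existing authorized_keys, and repeating the cat >> in step 4 leaves duplicate lines. A minimal sketch of an idempotent alternative follows; `append_key_once` is a hypothetical helper name, and 700/600 are the conventional permissions (sshd with StrictModes enabled refuses key files that are group- or world-writable).

```shell
#!/usr/bin/env bash
# append_key_once PUBKEY_FILE AUTHORIZED_KEYS_FILE   (hypothetical helper)
# Appends the public key only if that exact line is not already present.
append_key_once() {
  local pub=$1 auth=$2
  mkdir -p "$(dirname "$auth")"
  chmod 700 "$(dirname "$auth")"
  touch "$auth"
  chmod 600 "$auth"
  # -x: match whole lines only, -F: treat the key as a fixed string
  grep -qxF -- "$(cat "$pub")" "$auth" || cat "$pub" >> "$auth"
}
```

For example, on the target server: `append_key_once /root/.ssh/id_dsa.pub.214 /root/.ssh/authorized_keys`.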

    2: Install the JDK

      2.1) Remove the OpenJDK bundled with the system and its related packages:

        java -version

        rpm -qa | grep java

        Do not remove packages whose names contain noarch.

        rpm -e --nodeps java.....

        java -version (confirm that it has been removed)
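
The selection rule in 2.1 (Java packages, but not the noarch ones) can be expressed as a filter. This is only a sketch: `list_removable_java` is a hypothetical name, and the filter works on whatever package names it receives on stdin, so it can be checked before anything is removed.

```shell
#!/usr/bin/env bash
# list_removable_java   (hypothetical helper)
# Reads package names on stdin (for example from `rpm -qa`) and prints the
# Java-related ones, skipping every package whose name contains "noarch".
list_removable_java() {
  grep -i 'java' | grep -v 'noarch' || true
}

# Review the list first:
#   rpm -qa | list_removable_java
# Then remove (xargs -r skips the call when the list is empty):
#   rpm -qa | list_removable_java | xargs -r rpm -e --nodeps
```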

      2.2) Download the JDK

    http://download.oracle.com/otn-pub/java/jdk/10.0.1+10/fb4372174a714e6b8c52526dc134031e/jdk-10.0.1_linux-x64_bin.tar.gz

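      2.3) Extract the JDK

        mkdir -p /usr/local/java

        tar -zxvf jdk...tar.gz -C /usr/local/java --strip-components=1

        (-C needs an existing directory; --strip-components=1 drops the archive's top-level directory so that JAVA_HOME can be /usr/local/java.)

      2.4) Configure the JDK environment variables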

        vim /etc/profile

        export JAVA_HOME=/usr/local/java

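        export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

        (JDK 9 and later no longer ship rt.jar, dt.jar or tools.jar, so with JDK 10 these three entries match nothing and the line can be left out.)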

        export PATH=$PATH:$JAVA_HOME/bin
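
After reloading the profile it is worth confirming that JAVA_HOME really points at a JDK, since a path that is one directory level off is an easy mistake after unpacking an archive. A minimal sketch; `check_java_home` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# check_java_home [DIR]   (hypothetical helper)
# Succeeds if DIR (default: $JAVA_HOME) contains an executable bin/java.
check_java_home() {
  local home=${1:-${JAVA_HOME:-}}
  if [ -n "$home" ] && [ -x "$home/bin/java" ]; then
    echo "ok: $home/bin/java"
  else
    echo "missing: ${home:-<JAVA_HOME unset>}/bin/java" >&2
    return 1
  fi
}
```

Usage: `source /etc/profile && check_java_home && java -version`.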

    3: Install Hadoop:

      1) Download Hadoop:

          Note: download the binary release, not the source archive.

          wget http://www-eu.apache.org/dist/hadoop/common/hadoop-3.0.3/hadoop-3.0.3.tar.gz

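      2) Extract and install:

          mkdir -p /usr/local/hadoop

          cp /root/hadoop-3.0.3.tar.gz /usr/local/hadoop/

          cd /usr/local/hadoop

          tar -zxvf hadoop-3.0.3.tar.gz --strip-components=1

          (--strip-components=1 unpacks the contents directly into /usr/local/hadoop, which is the HADOOP_HOME used in the next step.)

      3) Set the environment variables:

          vim /etc/profile

          Append at the end: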

            export HADOOP_HOME=/usr/local/hadoop

            export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

          Save and quit: :wq

          Reload: source /etc/profile

      4) Verify the Hadoop installation:

          hadoop version

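    4: Set up local (standalone) mode:

      Characteristics: no HDFS is involved; only MapReduce can be tested.

      Go to the Hadoop configuration directory: cd /usr/local/hadoop/etc/hadoop/

        In hadoop-env.sh, set export JAVA_HOME=/usr/local/java

      Test demo: $HADOOP_HOME/share/hadoop/mapreduce/

        hadoop-mapreduce-examples-3.0.3.jar contains a word-count tool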

        mkdir -p /usr/local/data/input/

        mkdir -p /usr/local/data/output/

        vim /usr/local/data/input/data.txt

          I LOVE BEIJING

          I LOVE CHINA

          BEIJING IS THE CAPITAL OF CHINA

        cd /usr/local/hadoop/share/hadoop/mapreduce

        Run either form of the command (the output directory wc must not exist yet):

          hadoop jar hadoop-mapreduce-examples-3.0.3.jar wordcount /usr/local/data/input/data.txt /usr/local/data/output/wc

          hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.3.jar wordcount /usr/local/data/input/data.txt /usr/local/data/output/wc

        Log output:

          2018-06-18 12:57:23,440 INFO mapreduce.Job: map 100% reduce 100%

        cd /usr/local/data/output/wc/

          -rw-r--r--. 1 root root 55 Jun 18 12:57 part-r-00000
          -rw-r--r--. 1 root root 0 Jun 18 12:57 _SUCCESS

        vim part-r-00000

          BEIJING 2

          CAPITAL 1
          CHINA 2
          I 2
          IS 1
          LOVE 2
          OF 1
          THE 1

        MapReduce sorts the output keys in lexicographic order.
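
The result can be cross-checked without Hadoop. The sketch below counts words with standard Unix tools and sorts them in byte order (LC_ALL=C), which for this all-uppercase ASCII input gives the same ordering as the job output; `count_words` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# count_words FILE   (hypothetical helper)
# Prints "WORD<TAB>COUNT" for every whitespace-separated word in FILE.
count_words() {
  tr -s '[:space:]' '\n' < "$1" \
    | grep -v '^$' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{ print $2 "\t" $1 }'
}
```

Running `count_words /usr/local/data/input/data.txt` and comparing it with part-r-00000 (for example with diff) should show no differences for the sample file above.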

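    5: Pseudo-distributed mode:

      Provides all of Hadoop's functionality and simulates a distributed environment on a single machine:

        HDFS: master node: NameNode; data node: DataNode

        YARN: the container framework that runs MapReduce

            Master node: ResourceManager

            Worker node: NodeManager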

      

        5.1) Configure hdfs-site.xml

        cd /usr/local/hadoop/etc/hadoop/

        vim hdfs-site.xml

    <configuration>

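    <!-- Directory where the NameNode stores the HDFS namespace metadata (dfs.name.dir is the deprecated name of dfs.namenode.name.dir) -->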
    <property>
    <name>dfs.name.dir</name>
    <value>/usr/hadoop/hdfs/name</value>
    </property>

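    <!-- Physical storage location of the data blocks on the DataNode (dfs.data.dir is the deprecated name of dfs.datanode.data.dir) -->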
    <property>
    <name>dfs.data.dir</name>
    <value>/usr/hadoop/hdfs/data</value>
    </property>


    <!-- Replication factor -->
    <property>
    <name>dfs.replication</name>
    <value>1</value>
    </property>


    <!-- Whether permission checking is enabled (dfs.permissions is the deprecated name of dfs.permissions.enabled) -->
    <property>
     <name>dfs.permissions</name>
     <value>false</value>
    </property>
    </configuration>

        5.2) Configure core-site.xml

        vim core-site.xml

    <configuration>


    <!-- Address of the HDFS NameNode -->

    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://192.168.8.214:9000</value>
    </property>

    <!-- Base directory for Hadoop's working files; by default the NameNode and DataNode keep their data under it -->
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/usr/local/hadoop/tmp</value>
    </property>

    </configuration>

    
    

        5.3) Configure mapred-site.xml

        vim mapred-site.xml

    <configuration>
    <!-- Framework on which MapReduce runs -->
    <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    </property>
    </configuration>
    

      

        5.4) Configure yarn-site.xml

        vim yarn-site.xml

    <configuration>

    <!-- Host (IP address) on which the ResourceManager runs -->
    <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>192.168.8.214</value>
    </property>

    <!-- Auxiliary service the NodeManager uses to run jobs (the MapReduce shuffle) -->
    <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
    </property>

    <!-- Address of the ResourceManager web UI -->
    <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>192.168.8.214:8099</value>
    </property>

    </configuration>
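
With four XML files edited by hand, a quick way to read a value back helps catch typos. The sketch below is a deliberately simple line-based reader, not an XML parser: it assumes each <name> and <value> element sits on its own line, as in the files above, and `get_prop` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# get_prop FILE KEY   (hypothetical helper)
# Prints the <value> that follows <name>KEY</name> in a Hadoop *-site.xml file.
get_prop() {
  awk -v key="$2" '
    /<name>/ {
      n = $0
      sub(/.*<name>[ \t]*/, "", n)
      sub(/[ \t]*<\/name>.*/, "", n)
    }
    /<value>/ {
      if (n == key) {
        v = $0
        sub(/.*<value>[ \t]*/, "", v)
        sub(/[ \t]*<\/value>.*/, "", v)
        print v
        exit
      }
    }
  ' "$1"
}
```

For example, `get_prop /usr/local/hadoop/etc/hadoop/core-site.xml fs.defaultFS`. Once Hadoop is on the PATH, `hdfs getconf -confKey fs.defaultFS` reports the value that Hadoop itself resolves.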

        5.5) Format the NameNode

          hdfs namenode -format

          Output such as:

            INFO common.Storage: Storage directory /usr/local/hadoop/tmp/dfs/name has been successfully formatted.

          indicates that formatting succeeded.
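
Formatting an existing NameNode discards its namespace, so a guard before the format command can be useful. A formatted storage directory contains a current/VERSION file; the sketch below tests for it. `namenode_is_formatted` is a hypothetical helper name, and the directory to pass is the one reported in the log line above.

```shell
#!/usr/bin/env bash
# namenode_is_formatted DIR   (hypothetical helper)
# Succeeds if DIR already looks like a formatted NameNode storage directory.
namenode_is_formatted() {
  [ -f "$1/current/VERSION" ]
}

# Format only when the storage directory is still empty:
#   namenode_is_formatted /usr/local/hadoop/tmp/dfs/name || hdfs namenode -format
```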

        5.6) Add user definitions: cd /usr/local/hadoop/sbin

          vim start-dfs.sh

          vim stop-dfs.sh

          Add the following lines near the top of both files:

    HDFS_DATANODE_USER=root
    HADOOP_SECURE_DN_USER=hdfs
    HDFS_NAMENODE_USER=root
    HDFS_SECONDARYNAMENODE_USER=root
    If the settings above produce this warning:

    WARNING: HADOOP_SECURE_DN_USER has been replaced by HDFS_DATANODE_SECURE_USER. Using value of HADOOP_SECURE_DN_USER.

    then use these instead:

    HDFS_DATANODE_USER=root
    HDFS_DATANODE_SECURE_USER=hdfs
    HDFS_NAMENODE_USER=root
    HDFS_SECONDARYNAMENODE_USER=root

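          Without this change, startup fails with: ERROR: Attempting to operate on hdfs namenode as root

        5.7) Add user definitions: cd /usr/local/hadoop/sbin

          vim start-yarn.sh

          vim stop-yarn.sh

          Add the following lines near the top of both files: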

    YARN_RESOURCEMANAGER_USER=root
    HADOOP_SECURE_DN_USER=yarn
    YARN_NODEMANAGER_USER=root

          Without this change, startup fails with: ERROR: Attempting to operate on yarn resourcemanager as root
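
An alternative to editing the four sbin scripts is to define the same variables once in etc/hadoop/hadoop-env.sh, which the start and stop scripts read. This is a swapped-in approach rather than the one used above; the sketch assumes that file location, and `set_export` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# set_export FILE NAME VALUE   (hypothetical helper)
# Ensures FILE contains exactly one "export NAME=VALUE" line for NAME.
set_export() {
  local file=$1 name=$2 value=$3 tmp
  tmp=$(mktemp)
  # Drop any earlier definition of NAME, then append the new one.
  grep -v -E "^(export[[:space:]]+)?${name}=" "$file" > "$tmp" 2>/dev/null || true
  printf 'export %s=%s\n' "$name" "$value" >> "$tmp"
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Example (assumed path):
#   env_file=/usr/local/hadoop/etc/hadoop/hadoop-env.sh
#   for v in HDFS_NAMENODE_USER HDFS_DATANODE_USER HDFS_SECONDARYNAMENODE_USER \
#            YARN_RESOURCEMANAGER_USER YARN_NODEMANAGER_USER; do
#     set_export "$env_file" "$v" root
#   done
```

Keeping the settings in hadoop-env.sh means they survive a Hadoop upgrade that replaces the scripts in sbin.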

        5.8) Start:

          start-all.sh

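          HDFS: stores the data

          YARN: runs the computation

        5.9) Access:

            Command line

            Java API

            Web Console:

              HDFS: http://192.168.8.214:50070

              YARN: http://192.168.8.214:8088 (the default port; with the yarn.resourcemanager.webapp.address set in 5.4, the UI is served on 8099)

            If port 50070 cannot be reached (Hadoop 3 moved the NameNode web UI default from 50070 to 9870), apply the following settings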

              vi  /etc/selinux/config

            

    Change:
    # This file controls the state of SELinux on the system.
    # SELINUX= can take one of these three values:
    #     enforcing - SELinux security policy is enforced.
    #     permissive - SELinux prints warnings instead of enforcing.
    #     disabled - No SELinux policy is loaded.
    SELINUX=enforcing
    
    
    to:
    # This file controls the state of SELinux on the system.
    # SELINUX= can take one of these three values:
    #     enforcing - SELinux security policy is enforced.
    #     permissive - SELinux prints warnings instead of enforcing.
    #     disabled - No SELinux policy is loaded.
    #SELINUX=enforcing
    SELINUX=disabled
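
The same change can be made non-interactively. A minimal sketch, with `set_selinux_mode` as a hypothetical helper name; it only rewrites the SELINUX= line and leaves SELINUXTYPE= untouched.

```shell
#!/usr/bin/env bash
# set_selinux_mode FILE MODE   (hypothetical helper)
# Rewrites the "SELINUX=" line of FILE to the given mode.
set_selinux_mode() {
  local file=$1 mode=$2 tmp
  tmp=$(mktemp)
  sed -E "s/^SELINUX=.*/SELINUX=${mode}/" "$file" > "$tmp" && cat "$tmp" > "$file"
  rm -f "$tmp"
}

# On the real system:
#   set_selinux_mode /etc/selinux/config disabled
#   setenforce 0    # switches the running system to permissive mode
```

The file change only takes effect at the next boot, which is why `setenforce 0` is shown for the current session.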

      Set the default web access ports:

        cd /usr/local/hadoop/etc/hadoop

        vim mapred-site.xml and add:

    <property>
         <name>mapred.job.tracker.http.address</name>
         <value>192.168.8.214:50030</value>
    </property>
    
    <property>
         <name>mapred.task.tracker.http.address</name>
         <value>192.168.8.214:50060</value>
    </property>

        vim hdfs-site.xml and add:

    <property>
        <name>dfs.http.address</name>
        <value>192.168.8.214:50070</value>
    </property>

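          Then stop all processes:

            stop-all.sh

          Delete the data under the name and data directories (adjust the paths to the dfs.name.dir and dfs.data.dir values in hdfs-site.xml):

            rm -rf /usr/local/hadoop/hdfs/data/*

            rm -rf /usr/local/hadoop/hdfs/name/*

          Reformat:

            hdfs namenode -format

          After a restart the web UI is reachable:

            start-all.sh

          Run jps; output like the following is normal: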

            NodeManager

            Jps

            DataNode

            NameNode

            SecondaryNameNode

            ResourceManager
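
The check against the list above can be automated. The sketch below reads jps output (lines of the form "PID Name") on stdin and prints every expected daemon that is not running; `missing_daemons` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# missing_daemons   (hypothetical helper)
# Reads `jps` output on stdin and prints each expected Hadoop daemon
# that does not appear in it. Prints nothing when all five are present.
missing_daemons() {
  local out d
  out=$(cat)
  for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    printf '%s\n' "$out" \
      | awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' \
      || echo "$d"
  done
}
```

Usage: `jps | missing_daemons`. An empty result means every daemon is up.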

          Open in a browser: 192.168.8.214:50070


          

  • Original article: https://www.cnblogs.com/jackyzm/p/9187979.html