• Setting up a distributed Hadoop cluster with Docker


    1. Pull an Ubuntu image and set up the Java environment

    2. Download the Hadoop release and add the cluster hostnames to /etc/hosts:

    172.17.0.5 hadoop1
    172.17.0.6 hadoop2
    172.17.0.2 hadoop3
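The same entries must exist on every container. A minimal sketch for appending them; `HOSTS_FILE` is a hypothetical override for testing, and the IPs assume Docker's default bridge network assigned them in this order:

```shell
# Run inside every container. Verify each container's actual address with
#   docker inspect -f '{{.NetworkSettings.IPAddress}}' <container>
HOSTS_FILE="${HOSTS_FILE:-/etc/hosts}"
cat >> "$HOSTS_FILE" <<'EOF'
172.17.0.5 hadoop1
172.17.0.6 hadoop2
172.17.0.2 hadoop3
EOF
```

Addresses on the default bridge are assigned in container start order and are not stable across restarts; a user-defined Docker network with fixed IPs avoids re-editing this file.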

    3. Set JAVA_HOME (in hadoop-env.sh, mapred-env.sh, and yarn-env.sh)
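JAVA_HOME must be pinned explicitly in these scripts because daemons launched over ssh do not inherit the interactive shell's environment. A sketch, assuming a Hadoop install under /usr/local/hadoop and the Ubuntu OpenJDK 8 path; substitute your actual locations:

```shell
# Assumed paths -- adjust to your layout.
CONF_DIR="${HADOOP_CONF_DIR:-/usr/local/hadoop/etc/hadoop}"
JAVA_DIR="${JAVA_HOME:-/usr/lib/jvm/java-8-openjdk-amd64}"
mkdir -p "$CONF_DIR"   # no-op on a real install
# Append an explicit JAVA_HOME to each env script.
for f in hadoop-env.sh mapred-env.sh yarn-env.sh; do
  echo "export JAVA_HOME=$JAVA_DIR" >> "$CONF_DIR/$f"
done
```

The real JDK path can be found with `readlink -f "$(which java)"`.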

    4. Configure core-site.xml

    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://hadoop1:8020</value>
        </property>
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/home/root/data/tmp</value>
        </property>
    </configuration>

    5. Configure hdfs-site.xml

    <configuration>
        <property>
            <name>dfs.namenode.secondary.http-address</name>
            <value>hadoop3:50090</value>
        </property>
    </configuration>
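One optional addition (not in the original post): with three DataNodes, the default replication factor of 3 already fits, but stating it explicitly in hdfs-site.xml documents the intent:

```xml
<property>
    <name>dfs.replication</name>
    <value>3</value>
</property>
```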

    6. Configure the slaves file (one worker hostname per line)

    hadoop1
    hadoop2
    hadoop3

    7. Configure yarn-site.xml

    <configuration>
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
        <property>
            <name>yarn.resourcemanager.hostname</name>
            <value>hadoop2</value>
        </property>
        <property>
            <name>yarn.log-aggregation-enable</name>
            <value>true</value>
        </property>
        <property>
            <name>yarn.log-aggregation.retain-seconds</name>
            <value>106800</value>
        </property>
    </configuration>

    8. Configure mapred-site.xml

    cp mapred-site.xml.template mapred-site.xml
    <configuration>
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
        <property>
            <name>mapreduce.jobhistory.address</name>
            <value>hadoop1:10020</value>
        </property>
        <property>
            <name>mapreduce.jobhistory.webapp.address</name>
            <value>hadoop1:19888</value>
        </property>
    </configuration>
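The configs above are edited once but must be identical on all three nodes. A dry-run sketch of distributing them (the install path is an assumption; drop the `echo` to actually copy, once passwordless ssh from step 9 is in place):

```shell
CONF_SRC=/usr/local/hadoop/etc/hadoop   # assumed install path
for host in hadoop2 hadoop3; do
  # dry run: prints the command; remove `echo` to execute it
  echo scp -r "$CONF_SRC" "root@$host:${CONF_SRC%/hadoop}/"
done
```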

    9. Set up SSH access

    Install sshd

    apt-get install openssh-server
    service ssh start
    ps -e | grep ssh

    Generate an SSH key pair

    ssh-keygen -t rsa

    Set the root password

    passwd

    Allow root login over SSH: set PermitRootLogin yes in sshd_config, then restart the service

    vim /etc/ssh/sshd_config
    /etc/init.d/ssh restart

    Distribute the public key to every node

    ssh-copy-id hadoop1
    ssh-copy-id hadoop2
    ssh-copy-id hadoop3
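After distributing the keys, it is worth confirming that passwordless login really works to every node before starting any daemons; the start scripts hang on a password prompt otherwise. A sketch using the hostnames from /etc/hosts (BatchMode makes ssh fail instead of prompting):

```shell
for host in hadoop1 hadoop2 hadoop3; do
  ssh -o BatchMode=yes -o ConnectTimeout=3 "$host" hostname 2>/dev/null \
    || echo "passwordless ssh to $host not ready"
done
```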

    Format the NameNode (first start only)

    hdfs namenode -format

    Start the HDFS cluster on hadoop1

    sbin/start-dfs.sh

    Startup error

    The authenticity of host '127.17.0.2 (127.17.0.2)' can't be established.
    Host key verification failed.
    vi /etc/ssh/ssh_config

    Change StrictHostKeyChecking to no
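The relevant ssh_config fragment looks like this. Disabling host-key checking is convenient inside a disposable Docker network but weakens security; on OpenSSH 7.6+ the value accept-new is a safer alternative that only skips the prompt for first-time hosts:

```
Host *
    StrictHostKeyChecking no
```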

    Start YARN on hadoop1

    sbin/start-yarn.sh

    Start the ResourceManager on hadoop2 (yarn.resourcemanager.hostname points there, and start-yarn.sh does not start it on remote nodes)

    sbin/yarn-daemon.sh start resourcemanager
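With everything started, each node should show a specific set of daemons under jps, following the configs above: NameNode on hadoop1, ResourceManager on hadoop2, SecondaryNameNode on hadoop3, and DataNode plus NodeManager on all three (the JobHistoryServer configured for hadoop1 is started separately with `sbin/mr-jobhistory-daemon.sh start historyserver`). A sketch that checks every node over ssh:

```shell
for host in hadoop1 hadoop2 hadoop3; do
  echo "== $host =="
  ssh -o BatchMode=yes -o ConnectTimeout=3 "$host" jps 2>/dev/null \
    || echo "(unreachable)"
done
```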
• Original post: https://www.cnblogs.com/csig/p/9975195.html