• 搭建hadoop2.6.0集群环境 分类: A1_HADOOP 2015-04-20 07:21 459人阅读 评论(0) 收藏






    一、规划
    (一)硬件资源
    10.171.29.191 master
    10.171.94.155  slave1
    10.251.0.197 slave3


    (二)基本资料
    用户:  jediael
    目录:/mnt/jediael/

    二、环境配置
    (一)统一用户名密码,并为jediael赋予执行所有命令的权限
    #passwd  
    # useradd jediael  
    # passwd jediael  
    # vi /etc/sudoers  
    增加以下一行:
    jediael ALL=(ALL) ALL 

    (二)创建目录/mnt/jediael
    $sudo chown jediael:jediael /opt  
    $ cd /opt  
    $ sudo mkdir jediael  
    注意:/opt必须是jediael的,否则会在format namenode时出错。

    (三)修改用户名及/etc/hosts文件
    1、修改/etc/sysconfig/network
    NETWORKING=yes  
    HOSTNAME=*******

     
    2、修改/etc/hosts
    10.171.29.191 master
    10.171.94.155  slave1
    10.251.0.197 slave3
    注 意hosts文件不能有127.0.0.1  *****配置,否则会导致出现异常。org.apache.hadoop.ipc.Client: Retrying connect to server: master/10.171.29.191:9000. Already trie 


    3、hostname命令
    hostname ****  

    (四)配置免密码登录
    以上命令在master上使用jediael用户执行:
    $ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa  
    $ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys  
    然后,将authorized_keys复制到slave1,slave2
    scp ~/.ssh/authorized_keys slave1:~/.ssh/  
    scp ~/.ssh/authorized_keys slave2:~/.ssh/  
    注意
    (1)若提示.ssh目录不存在,则表示此机器从未运行过ssh,因此运行一次即可创建.ssh目录。
    (2).ssh/的权限为600,authorized_keys的权限为700,权限大了小了都不行。

    (五)在3台机器上分别安装java,并设置相关环境变量
    参考http://blog.csdn.net/jediael_lu/article/details/38925871

    (六)下载hadoop-2.6.0.tar.gz,并将其解压到/mnt/jediael
    wget http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
    tar -zxvf hadoop-2.6.0.tar.gz

    三、修改配置文件
    【3台机器上均要执行,一般先在一台机器上配置完成,再用scp复制到其它机器】
    (一)hadoop_env.sh
    export JAVA_HOME=/usr/java/jdk1.7.0_51  

    (二)修改core-site.xml
           

            <property>
                    <name>hadoop.tmp.dir</name>
                    <value>/mnt/tmp</value>
                    <description>Abase for other temporary directories.</description>
            </property>
            <property>
                    <name>fs.defaultFS</name>
                    <value>hdfs://master:9000</value>
            </property>
            <property>
                    <name>io.file.buffer.size</name>
                    <value>4096</value>
            </property>


    (三)修改hdfs-site.xml
            <property>
                    <name>dfs.replication</name>
                    <value>2</value>
            </property>


    (四)修改mapred-site.xml
     
           <property>
                    <name>mapreduce.framework.name</name>
                    <value>yarn</value>
                    <final>true</final>
            </property>
    
        <property>
            <name>mapreduce.jobtracker.http.address</name>
            <value>master:50030</value>
        </property>
        <property>
            <name>mapreduce.jobhistory.address</name>
            <value>master:10020</value>
        </property>
        <property>
            <name>mapreduce.jobhistory.webapp.address</name>
            <value>master:19888</value>
        </property>
            <property>
                    <name>mapred.job.tracker</name>
                    <value>http://master:9001</value>
            </property>


    (五)修改yarn.xml
            <property>
                    <name>yarn.resourcemanager.hostname</name>
                    <value>master</value>
            </property>
    
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
        <property>
            <name>yarn.resourcemanager.address</name>
            <value>master:8032</value>
        </property>
        <property>
            <name>yarn.resourcemanager.scheduler.address</name>
            <value>master:8030</value>
        </property>
        <property>
            <name>yarn.resourcemanager.resource-tracker.address</name>
            <value>master:8031</value>
        </property>
        <property>
            <name>yarn.resourcemanager.admin.address</name>
            <value>master:8033</value>
        </property>
        <property>
            <name>yarn.resourcemanager.webapp.address</name>
            <value>master:8088</value>
        </property>



    (六)修改slaves 【不用修改masters文件??】
    slaves:  
    slave1  
    slave3


    四、启动并验证


    1、格式 化namenode
    [jediael@master hadoop-1.2.1]$  bin/hadoop namenode -format  


    2、启动hadoop【此步骤只需要在master上执行】
    [jediael@master hadoop-1.2.1]$ bin/start-all.sh   

    3、验证1:向hdfs中写入内容
    [jediael@master hadoop-2.6.0]$ bin/hadoop fs -ls /
    [jediael@master hadoop-2.6.0]$ bin/hadoop fs -mkdir /test
    [jediael@master hadoop-2.6.0]$ bin/hadoop fs -ls /       
    Found 1 items
    drwxr-xr-x   - jediael supergroup          0 2015-04-19 23:41 /test

    4、验证:登录页面
    NameNode    http://ip:50070   

    5、查看各个主机的java进程
    (1)master:
    $ jps
    3694 NameNode
    3882 SecondaryNameNode
    7216 Jps
    4024 ResourceManager


    (2)slave1:
    $ jps
    1913 NodeManager
    2673 Jps
    1801 DataNode


    (3)slave3:
    $ jps
    1942 NodeManager
    2252 Jps
    1840 DataNode

    五、运行一个完整的mapreduce程序:运行自带的wordcount程序

    $ bin/hadoop fs -mkdir /input


    $ bin/hadoop fs -ls /        
    Found 2 items
    drwxr-xr-x   - jediael supergroup          0 2015-04-20 18:04 /input
    drwxr-xr-x   - jediael supergroup          0 2015-04-19 23:41 /test


    $ bin/hadoop fs -copyFromLocal etc/hadoop/mapred-site.xml.template /input  


    $ pwd
    /mnt/jediael/hadoop-2.6.0/share/hadoop/mapreduce


    $ /mnt/jediael/hadoop-2.6.0/bin/hadoop jar hadoop-mapreduce-examples-2.6.0.jar wordcount /input /output
    15/04/20 18:15:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    15/04/20 18:15:48 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
    15/04/20 18:15:48 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
    15/04/20 18:15:49 INFO input.FileInputFormat: Total input paths to process : 1
    15/04/20 18:15:49 INFO mapreduce.JobSubmitter: number of splits:1
    15/04/20 18:15:49 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local657082309_0001
    15/04/20 18:15:50 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
    15/04/20 18:15:50 INFO mapreduce.Job: Running job: job_local657082309_0001
    15/04/20 18:15:50 INFO mapred.LocalJobRunner: OutputCommitter set in config null
    15/04/20 18:15:50 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
    15/04/20 18:15:50 INFO mapred.LocalJobRunner: Waiting for map tasks
    15/04/20 18:15:50 INFO mapred.LocalJobRunner: Starting task: attempt_local657082309_0001_m_000000_0
    15/04/20 18:15:50 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
    15/04/20 18:15:50 INFO mapred.MapTask: Processing split: hdfs://master:9000/input/mapred-site.xml.template:0+2268
    15/04/20 18:15:51 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
    15/04/20 18:15:51 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
    15/04/20 18:15:51 INFO mapred.MapTask: soft limit at 83886080
    15/04/20 18:15:51 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
    15/04/20 18:15:51 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
    15/04/20 18:15:51 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
    15/04/20 18:15:51 INFO mapred.LocalJobRunner:
    15/04/20 18:15:51 INFO mapred.MapTask: Starting flush of map output
    15/04/20 18:15:51 INFO mapred.MapTask: Spilling map output
    15/04/20 18:15:51 INFO mapred.MapTask: bufstart = 0; bufend = 1698; bufvoid = 104857600
    15/04/20 18:15:51 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26213916(104855664); length = 481/6553600
    15/04/20 18:15:51 INFO mapred.MapTask: Finished spill 0
    15/04/20 18:15:51 INFO mapred.Task: Task:attempt_local657082309_0001_m_000000_0 is done. And is in the process of committing
    15/04/20 18:15:51 INFO mapred.LocalJobRunner: map
    15/04/20 18:15:51 INFO mapred.Task: Task 'attempt_local657082309_0001_m_000000_0' done.
    15/04/20 18:15:51 INFO mapred.LocalJobRunner: Finishing task: attempt_local657082309_0001_m_000000_0
    15/04/20 18:15:51 INFO mapred.LocalJobRunner: map task executor complete.
    15/04/20 18:15:51 INFO mapred.LocalJobRunner: Waiting for reduce tasks
    15/04/20 18:15:51 INFO mapred.LocalJobRunner: Starting task: attempt_local657082309_0001_r_000000_0
    15/04/20 18:15:51 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
    15/04/20 18:15:51 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@39be5e01
    15/04/20 18:15:51 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=363285696, maxSingleShuffleLimit=90821424, mergeThreshold=239768576, ioSortFactor=10, memToMemMergeOutputsThreshold=10
    15/04/20 18:15:51 INFO reduce.EventFetcher: attempt_local657082309_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
    15/04/20 18:15:51 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local657082309_0001_m_000000_0 decomp: 1566 len: 1570 to MEMORY
    15/04/20 18:15:51 INFO reduce.InMemoryMapOutput: Read 1566 bytes from map-output for attempt_local657082309_0001_m_000000_0
    15/04/20 18:15:51 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 1566, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->1566
    15/04/20 18:15:51 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
    15/04/20 18:15:51 INFO mapred.LocalJobRunner: 1 / 1 copied.
    15/04/20 18:15:51 INFO reduce.MergeManagerImpl: finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
    15/04/20 18:15:51 INFO mapred.Merger: Merging 1 sorted segments
    15/04/20 18:15:51 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 1560 bytes
    15/04/20 18:15:51 INFO reduce.MergeManagerImpl: Merged 1 segments, 1566 bytes to disk to satisfy reduce memory limit
    15/04/20 18:15:51 INFO reduce.MergeManagerImpl: Merging 1 files, 1570 bytes from disk
    15/04/20 18:15:51 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
    15/04/20 18:15:51 INFO mapred.Merger: Merging 1 sorted segments
    15/04/20 18:15:51 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 1560 bytes
    15/04/20 18:15:51 INFO mapred.LocalJobRunner: 1 / 1 copied.
    15/04/20 18:15:51 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
    15/04/20 18:15:51 INFO mapreduce.Job: Job job_local657082309_0001 running in uber mode : false
    15/04/20 18:15:51 INFO mapreduce.Job:  map 100% reduce 0%
    15/04/20 18:15:51 INFO mapred.Task: Task:attempt_local657082309_0001_r_000000_0 is done. And is in the process of committing
    15/04/20 18:15:51 INFO mapred.LocalJobRunner: 1 / 1 copied.
    15/04/20 18:15:51 INFO mapred.Task: Task attempt_local657082309_0001_r_000000_0 is allowed to commit now
    15/04/20 18:15:51 INFO output.FileOutputCommitter: Saved output of task 'attempt_local657082309_0001_r_000000_0' to hdfs://master:9000/output/_temporary/0/task_local657082309_0001_r_000000
    15/04/20 18:15:51 INFO mapred.LocalJobRunner: reduce > reduce
    15/04/20 18:15:51 INFO mapred.Task: Task 'attempt_local657082309_0001_r_000000_0' done.
    15/04/20 18:15:51 INFO mapred.LocalJobRunner: Finishing task: attempt_local657082309_0001_r_000000_0
    15/04/20 18:15:51 INFO mapred.LocalJobRunner: reduce task executor complete.
    15/04/20 18:15:52 INFO mapreduce.Job:  map 100% reduce 100%
    15/04/20 18:15:52 INFO mapreduce.Job: Job job_local657082309_0001 completed successfully
    15/04/20 18:15:52 INFO mapreduce.Job: Counters: 38
            File System Counters
                    FILE: Number of bytes read=544164
                    FILE: Number of bytes written=1040966
                    FILE: Number of read operations=0
                    FILE: Number of large read operations=0
                    FILE: Number of write operations=0
                    HDFS: Number of bytes read=4536
                    HDFS: Number of bytes written=1196
                    HDFS: Number of read operations=15
                    HDFS: Number of large read operations=0
                    HDFS: Number of write operations=4
            Map-Reduce Framework
                    Map input records=43
                    Map output records=121
                    Map output bytes=1698
                    Map output materialized bytes=1570
                    Input split bytes=114
                    Combine input records=121
                    Combine output records=92
                    Reduce input groups=92
                    Reduce shuffle bytes=1570
                    Reduce input records=92
                    Reduce output records=92
                    Spilled Records=184
                    Shuffled Maps =1
                    Failed Shuffles=0
                    Merged Map outputs=1
                    GC time elapsed (ms)=123
                    CPU time spent (ms)=0
                    Physical memory (bytes) snapshot=0
                    Virtual memory (bytes) snapshot=0
                    Total committed heap usage (bytes)=269361152
            Shuffle Errors
                    BAD_ID=0
                    CONNECTION=0
                    IO_ERROR=0
                    WRONG_LENGTH=0
                    WRONG_MAP=0
                    WRONG_REDUCE=0
            File Input Format Counters
                    Bytes Read=2268
            File Output Format Counters 


    $ /mnt/jediael/hadoop-2.6.0/bin/hadoop fs -cat /output/*


    版权声明:本文为博主原创文章,未经博主允许不得转载。

  • 相关阅读:
    关于按钮背景透明 + div拖拽
    asp.net 自带ajax 控件的小实例
    何去何从
    字符串的常用操作
    第一章
    C语言的基础知识2
    C语言的基础知识1
    socket
    缓冲区溢出学习
    OD调试
  • 原文地址:https://www.cnblogs.com/lujinhong2/p/4637197.html
Copyright © 2020-2023  润新知