HDFS Basic Commands and Running a Hadoop MapReduce Program


      I. HDFS Basic Commands

      1. Create a directory: -mkdir

    [jun@master ~]$ hadoop fs -mkdir /test
    [jun@master ~]$ hadoop fs -mkdir /test/input
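
      If the parent directory does not already exist, the two commands above can be combined into one by adding -p, which creates any missing parent directories along the way (same target path as above):

    [jun@master ~]$ hadoop fs -mkdir -p /test/input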

      2. List files: -ls

    [jun@master ~]$ hadoop fs -ls /
    Found 1 items
    drwxr-xr-x   - jun supergroup          0 2018-07-22 10:31 /test
    [jun@master ~]$ hadoop fs -ls /test
    Found 1 items
    drwxr-xr-x   - jun supergroup          0 2018-07-22 10:31 /test/input
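
      To see a whole directory tree at once, -ls also accepts -R for a recursive listing, for example:

    [jun@master ~]$ hadoop fs -ls -R /test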

      3. Upload files to HDFS

      Create two new files, jun.dat and jun.txt, under /home/jun.

      (1) Use -put to copy a file from the local filesystem to the HDFS cluster

    [jun@master ~]$ hadoop fs -put /home/jun/jun.dat /test/input/jun.dat

      (2) Use -copyFromLocal to copy a file from the local filesystem to the HDFS cluster

    [jun@master ~]$ hadoop fs -copyFromLocal -f /home/jun/jun.txt  /test/input/jun.txt

      (3) Verify that the files were copied successfully

    [jun@master ~]$ hadoop fs -ls /test/input
    Found 2 items
    -rw-r--r--   1 jun supergroup         22 2018-07-22 10:38 /test/input/jun.dat
    -rw-r--r--   1 jun supergroup         22 2018-07-22 10:39 /test/input/jun.txt
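
      For local files, -put and -copyFromLocal behave the same; the -f flag used above forces an overwrite when the target already exists. -put also accepts several source files at once when the destination is a directory, as in this small sketch (which simply overwrites the two copies just uploaded):

    [jun@master ~]$ hadoop fs -put -f /home/jun/jun.dat /home/jun/jun.txt /test/input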

      4. Download files to the local filesystem

      (1) Use -get to copy a file from the HDFS cluster to the local filesystem

    [jun@master ~]$ hadoop fs -get /test/input/jun.dat /home/jun/jun1.dat

      (2) Use -copyToLocal to copy a file from the HDFS cluster to the local filesystem

    [jun@master ~]$ hadoop fs -copyToLocal /test/input/jun.txt /home/jun/jun1.txt

      (3) Verify that the files were copied successfully

    [jun@master ~]$ ls -l /home/jun/
    total 16
    drwxr-xr-x.  2 jun jun   6 Jul 19 15:14 Desktop
    drwxr-xr-x.  2 jun jun   6 Jul 19 15:14 Documents
    drwxr-xr-x.  2 jun jun   6 Jul 19 15:14 Downloads
    drwxr-xr-x. 10 jun jun 161 Jul 21 19:25 hadoop
    drwxrwxr-x.  3 jun jun  17 Jul 20 20:07 hadoopdata
    -rw-r--r--.  1 jun jun  22 Jul 22 10:43 jun1.dat
    -rw-r--r--.  1 jun jun  22 Jul 22 10:44 jun1.txt
    -rw-rw-r--.  1 jun jun  22 Jul 22 10:35 jun.dat
    -rw-rw-r--.  1 jun jun  22 Jul 22 10:35 jun.txt
    drwxr-xr-x.  2 jun jun   6 Jul 19 15:14 Music
    drwxr-xr-x.  2 jun jun   6 Jul 19 15:14 Pictures
    drwxr-xr-x.  2 jun jun   6 Jul 19 15:14 Public
    drwxr-xr-x.  2 jun jun   6 Jul 20 16:43 Resources
    drwxr-xr-x.  2 jun jun   6 Jul 19 15:14 Templates
    drwxr-xr-x.  2 jun jun   6 Jul 19 15:14 Videos
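
      When an HDFS directory holds several files (as MapReduce output usually does), -getmerge concatenates everything under a path into a single local file; merged.txt below is just an illustrative local name:

    [jun@master ~]$ hadoop fs -getmerge /test/input /home/jun/merged.txt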

      5. View files in the HDFS cluster

    [jun@master ~]$ hadoop fs -cat /test/input/jun.txt
    This is the txt file.
    [jun@master ~]$ hadoop fs -text /test/input/jun.txt
    This is the txt file.
    [jun@master ~]$ hadoop fs -tail /test/input/jun.txt
    This is the txt file.
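
      The three commands differ slightly: -cat prints the raw file contents, -text also decodes compressed files and SequenceFiles into readable text, and -tail shows only the last kilobyte of a file. To check file sizes without printing them, -du with -h gives human-readable numbers:

    [jun@master ~]$ hadoop fs -du -h /test/input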

      6. Delete files from HDFS

    [jun@master ~]$ hadoop fs -rm /test/input/jun.txt
    Deleted /test/input/jun.txt
    [jun@master ~]$ hadoop fs -ls /test/input
    Found 1 items
    -rw-r--r--   1 jun supergroup         22 2018-07-22 10:38 /test/input/jun.dat
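
      To remove a directory and everything under it, add -r; -skipTrash bypasses the trash directory when the trash feature is enabled on the cluster. The path below is only a placeholder:

    [jun@master ~]$ hadoop fs -rm -r -skipTrash /test/tmpdir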

      7. Commands can also be run on a slave node

    [jun@slave0 ~]$ hadoop fs -ls /test/input
    Found 1 items
    -rw-r--r--   1 jun supergroup         22 2018-07-22 10:38 /test/input/jun.dat
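
      Because every node's HDFS client reads the same configuration and simply contacts the NameNode, cluster-wide queries also work from a slave node, for example a capacity summary:

    [jun@slave0 ~]$ hadoop fs -df -h /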

      II. Running a Program on the Hadoop Cluster

      The Hadoop distribution ships with a jar of MapReduce example programs; one of them estimates the value of pi.

      Argument description: pi (the example program to run), 10 (the number of map tasks), 10 (the number of random samples generated per map).
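
      Running the examples jar without any arguments prints the list of bundled programs (pi, wordcount, grep, and so on), which is a quick way to check a program name before submitting a job; the command below assumes the same jar path used in this section:

    [jun@master ~]$ hadoop jar /home/jun/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.4.jar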

    [jun@master ~]$ hadoop jar /home/jun/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.4.jar pi 10 10
    Number of Maps  = 10
    Samples per Map = 10
    Wrote input for Map #0
    Wrote input for Map #1
    Wrote input for Map #2
    Wrote input for Map #3
    Wrote input for Map #4
    Wrote input for Map #5
    Wrote input for Map #6
    Wrote input for Map #7
    Wrote input for Map #8
    Wrote input for Map #9
    Starting Job
    18/07/22 10:55:07 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.1.100:18040
    18/07/22 10:55:08 INFO input.FileInputFormat: Total input files to process : 10
    18/07/22 10:55:08 INFO mapreduce.JobSubmitter: number of splits:10
    18/07/22 10:55:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1532226440522_0001
    18/07/22 10:55:10 INFO impl.YarnClientImpl: Submitted application application_1532226440522_0001
    18/07/22 10:55:10 INFO mapreduce.Job: The url to track the job: http://master:18088/proxy/application_1532226440522_0001/
    18/07/22 10:55:10 INFO mapreduce.Job: Running job: job_1532226440522_0001
    18/07/22 10:55:20 INFO mapreduce.Job: Job job_1532226440522_0001 running in uber mode : false
    18/07/22 10:55:20 INFO mapreduce.Job:  map 0% reduce 0%
    18/07/22 10:56:21 INFO mapreduce.Job:  map 10% reduce 0%
    18/07/22 10:56:22 INFO mapreduce.Job:  map 40% reduce 0%
    18/07/22 10:56:23 INFO mapreduce.Job:  map 50% reduce 0%
    18/07/22 10:56:33 INFO mapreduce.Job:  map 100% reduce 0%
    18/07/22 10:56:34 INFO mapreduce.Job:  map 100% reduce 100%
    18/07/22 10:56:36 INFO mapreduce.Job: Job job_1532226440522_0001 completed successfully
    18/07/22 10:56:36 INFO mapreduce.Job: Counters: 49
        File System Counters
            FILE: Number of bytes read=226
            FILE: Number of bytes written=1738836
            FILE: Number of read operations=0
            FILE: Number of large read operations=0
            FILE: Number of write operations=0
            HDFS: Number of bytes read=2590
            HDFS: Number of bytes written=215
            HDFS: Number of read operations=43
            HDFS: Number of large read operations=0
            HDFS: Number of write operations=3
        Job Counters 
            Launched map tasks=10
            Launched reduce tasks=1
            Data-local map tasks=10
            Total time spent by all maps in occupied slots (ms)=635509
            Total time spent by all reduces in occupied slots (ms)=10427
            Total time spent by all map tasks (ms)=635509
            Total time spent by all reduce tasks (ms)=10427
            Total vcore-milliseconds taken by all map tasks=635509
            Total vcore-milliseconds taken by all reduce tasks=10427
            Total megabyte-milliseconds taken by all map tasks=650761216
            Total megabyte-milliseconds taken by all reduce tasks=10677248
        Map-Reduce Framework
            Map input records=10
            Map output records=20
            Map output bytes=180
            Map output materialized bytes=280
            Input split bytes=1410
            Combine input records=0
            Combine output records=0
            Reduce input groups=2
            Reduce shuffle bytes=280
            Reduce input records=20
            Reduce output records=0
            Spilled Records=40
            Shuffled Maps =10
            Failed Shuffles=0
            Merged Map outputs=10
            GC time elapsed (ms)=59206
            CPU time spent (ms)=54080
            Physical memory (bytes) snapshot=2953310208
            Virtual memory (bytes) snapshot=23216238592
            Total committed heap usage (bytes)=2048393216
        Shuffle Errors
            BAD_ID=0
            CONNECTION=0
            IO_ERROR=0
            WRONG_LENGTH=0
            WRONG_MAP=0
            WRONG_REDUCE=0
        File Input Format Counters 
            Bytes Read=1180
        File Output Format Counters 
            Bytes Written=97
    Job Finished in 88.689 seconds
    Estimated value of Pi is 3.20000000000000000000

      As the last line shows, the estimated value of pi is approximately 3.2. With only 10 maps and 10 samples per map the estimate is coarse; increasing either argument brings the result closer to 3.14159.
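
      The same jar can run other examples against the data uploaded earlier. As a minimal sketch, the wordcount program would count word occurrences in /test/input and write its result to a new /test/output directory (a name chosen here for illustration; it must not exist beforehand). With the default single reducer, the result ends up in part-r-00000:

    [jun@master ~]$ hadoop jar /home/jun/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.4.jar wordcount /test/input /test/output
    [jun@master ~]$ hadoop fs -cat /test/output/part-r-00000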
