• Running word count on Hadoop


    1. Create the input directory in HDFS

    hadoop fs -mkdir input

    2. Upload the file to Hadoop

    hadoop fs -put /root/data/output.txt input

    3. Run wordcount (delete the old output directory first, for example from Eclipse; the job fails if the output directory already exists)

    hadoop jar ./hadoop-examples-1.2.1.jar wordcount input output

    4. Download the results to the local filesystem

    hadoop fs -get output /root/data/
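
The four steps above can be sketched as one shell function. This is a minimal sketch, not part of the original walkthrough: the function name `run_wordcount` and the `HADOOP` override variable are hypothetical names introduced here, and the cleanup uses `hadoop fs -rmr`, the recursive delete of the Hadoop 1.x shell, in place of deleting the directory from Eclipse.

```shell
#!/bin/sh
# HADOOP is a hypothetical override hook (not a Hadoop setting):
# set HADOOP="echo hadoop" to print the commands instead of running them.
HADOOP="${HADOOP:-hadoop}"

run_wordcount() {
    # 1. Create the input directory in HDFS
    $HADOOP fs -mkdir input
    # 2. Upload the local file into it
    $HADOOP fs -put /root/data/output.txt input
    # 3. Remove any old output directory (ignore the error if there is none),
    #    then run the example job
    $HADOOP fs -rmr output || true
    # 4. Copy the results to the local filesystem only if the job succeeded
    $HADOOP jar ./hadoop-examples-1.2.1.jar wordcount input output &&
        $HADOOP fs -get output /root/data/
}
```

Calling `run_wordcount` from the Hadoop installation directory runs the whole sequence; with `HADOOP="echo hadoop"` set beforehand it only prints the five commands, which is a cheap way to check the paths first.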

    Output:

    [root@VM_238_215_centos hadoop-1.2.1]# hadoop jar ./hadoop-examples-1.2.1.jar wordcount input output
    Warning: $HADOOP_HOME is deprecated.
    
    17/05/08 13:31:19 INFO input.FileInputFormat: Total input paths to process : 1
    17/05/08 13:31:19 INFO util.NativeCodeLoader: Loaded the native-hadoop library
    17/05/08 13:31:19 WARN snappy.LoadSnappy: Snappy native library not loaded
    17/05/08 13:31:20 INFO mapred.JobClient: Running job: job_201705080035_0003
    17/05/08 13:31:21 INFO mapred.JobClient:  map 0% reduce 0%
    17/05/08 13:31:27 INFO mapred.JobClient:  map 100% reduce 0%
    17/05/08 13:31:34 INFO mapred.JobClient:  map 100% reduce 33%
    17/05/08 13:31:36 INFO mapred.JobClient:  map 100% reduce 100%
    17/05/08 13:31:37 INFO mapred.JobClient: Job complete: job_201705080035_0003
    17/05/08 13:31:37 INFO mapred.JobClient: Counters: 29
    17/05/08 13:31:37 INFO mapred.JobClient:   Map-Reduce Framework
    17/05/08 13:31:37 INFO mapred.JobClient:     Spilled Records=8008
    17/05/08 13:31:37 INFO mapred.JobClient:     Map output materialized bytes=51608
    17/05/08 13:31:37 INFO mapred.JobClient:     Reduce input records=4004
    17/05/08 13:31:37 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=587849728
    17/05/08 13:31:37 INFO mapred.JobClient:     Map input records=1
    17/05/08 13:31:37 INFO mapred.JobClient:     SPLIT_RAW_BYTES=118
    17/05/08 13:31:37 INFO mapred.JobClient:     Map output bytes=203949
    17/05/08 13:31:37 INFO mapred.JobClient:     Reduce shuffle bytes=51608
    17/05/08 13:31:37 INFO mapred.JobClient:     Physical memory (bytes) snapshot=196730880
    17/05/08 13:31:37 INFO mapred.JobClient:     Reduce input groups=4004
    17/05/08 13:31:37 INFO mapred.JobClient:     Combine output records=4004
    17/05/08 13:31:37 INFO mapred.JobClient:     Reduce output records=4004
    17/05/08 13:31:37 INFO mapred.JobClient:     Map output records=19391
    17/05/08 13:31:37 INFO mapred.JobClient:     Combine input records=19391
    17/05/08 13:31:37 INFO mapred.JobClient:     CPU time spent (ms)=1230
    17/05/08 13:31:37 INFO mapred.JobClient:     Total committed heap usage (bytes)=177016832
    17/05/08 13:31:37 INFO mapred.JobClient:   File Input Format Counters 
    17/05/08 13:31:37 INFO mapred.JobClient:     Bytes Read=126386
    17/05/08 13:31:37 INFO mapred.JobClient:   FileSystemCounters
    17/05/08 13:31:37 INFO mapred.JobClient:     HDFS_BYTES_READ=126504
    17/05/08 13:31:37 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=213603
    17/05/08 13:31:37 INFO mapred.JobClient:     FILE_BYTES_READ=51608
    17/05/08 13:31:37 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=35986
    17/05/08 13:31:37 INFO mapred.JobClient:   Job Counters 
    17/05/08 13:31:37 INFO mapred.JobClient:     Launched map tasks=1
    17/05/08 13:31:37 INFO mapred.JobClient:     Launched reduce tasks=1
    17/05/08 13:31:37 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=9105
    17/05/08 13:31:37 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
    17/05/08 13:31:37 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=5744
    17/05/08 13:31:37 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
    17/05/08 13:31:37 INFO mapred.JobClient:     Data-local map tasks=1
    17/05/08 13:31:37 INFO mapred.JobClient:   File Output Format Counters 
    17/05/08 13:31:37 INFO mapred.JobClient:     Bytes Written=35986
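
In the counters above, `Map output records=19391` is the number of words the mapper emitted and `Reduce output records=4004` is the number of distinct words in the result. To sanity-check those two figures without a cluster, the same count can be approximated locally. The following is a rough sketch, assuming the example job splits its input on whitespace; `local_wordcount` and `token_count` are hypothetical helper names, and awk's default field splitting may differ from Hadoop's tokenizer on unusual whitespace characters.

```shell
#!/bin/sh
# local_wordcount FILE: print one "word<TAB>count" line per distinct word,
# sorted in byte order.
local_wordcount() {
    awk '{ for (i = 1; i <= NF; i++) count[$i]++ }
         END { for (w in count) printf "%s\t%d\n", w, count[w] }' "$1" |
        LC_ALL=C sort
}

# token_count FILE: print the total number of whitespace-separated words,
# the local analogue of "Map output records".
token_count() {
    awk '{ n += NF } END { print n + 0 }' "$1"
}
```

For example, `token_count /root/data/output.txt` can be compared with the map output record count, and `local_wordcount /root/data/output.txt | grep -c ''` with the reduce output record count.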

  • Original article: https://www.cnblogs.com/bincoding/p/6824673.html