• Interpreting Hadoop MapReduce counters (Counter)


    Counters: 44
    File System Counters
            FILE: Number of bytes read=655771325
            FILE: Number of bytes written=984244425
            FILE: Number of read operations=0
            FILE: Number of large read operations=0
            FILE: Number of write operations=0
            HDFS: Number of bytes read=260407668
            HDFS: Number of bytes written=17681802
            HDFS: Number of read operations=37
            HDFS: Number of large read operations=0
            HDFS: Number of write operations=10
    Job Counters 
            Launched map tasks=4
            Launched reduce tasks=5
            Other local map tasks=1
            Data-local map tasks=3
            Total time spent by all maps in occupied slots (ms)=60987
            Total time spent by all reduces in occupied slots (ms)=50362
    Map-Reduce Framework
            Map input records=1152870
            Map output records=22472940
            Map output bytes=282888289
            Map output materialized bytes=327843405
            Input split bytes=1173
            Combine input records=0
            Combine output records=0
            Reduce input groups=579532
            Reduce shuffle bytes=327843405
            Reduce input records=22472940
            Reduce output records=579532
            Spilled Records=67418820
            Shuffled Maps =20
            Failed Shuffles=0
            Merged Map outputs=20
            GC time elapsed (ms)=2826
            CPU time spent (ms)=69670
            Physical memory (bytes) snapshot=2287190016
            Virtual memory (bytes) snapshot=7904223232
            Total committed heap usage (bytes)=1572864000
    Shuffle Errors
            BAD_ID=0
            CONNECTION=0
            IO_ERROR=0
            WRONG_LENGTH=0
            WRONG_MAP=0
            WRONG_REDUCE=0
    File Input Format Counters 
            Bytes Read=0
    File Output Format Counters 
            Bytes Written=17681802

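    "Counters: 44" means the job reported 44 counters in total. They fall into 6 groups, named by the less-indented headings in the output above (File System Counters, Job Counters, and so on).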
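The group/counter layout described above can be checked mechanically. The following is a minimal sketch, not part of Hadoop: `CounterDumpParser` and its methods are hypothetical names, and it assumes the dump format shown above, where a line without `=` is a group heading and a line with `=` is a counter belonging to the most recent heading.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not a Hadoop class): groups "name=value" lines of a
// counter dump under the most recent group heading.
class CounterDumpParser {

    static Map<String, Map<String, Long>> parse(String dump) {
        Map<String, Map<String, Long>> groups = new LinkedHashMap<>();
        Map<String, Long> current = null;
        for (String raw : dump.split("\n")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("Counters:")) {
                continue; // skip blank lines and the "Counters: N" summary line
            }
            int eq = line.lastIndexOf('=');
            if (eq < 0) {
                // No '=' on the line: treat it as a group heading
                current = new LinkedHashMap<>();
                groups.put(line, current);
            } else if (current != null) {
                String name = line.substring(0, eq).trim();
                long value = Long.parseLong(line.substring(eq + 1).trim());
                current.put(name, value);
            }
        }
        return groups;
    }

    static int totalCounters(Map<String, Map<String, Long>> groups) {
        int total = 0;
        for (Map<String, Long> group : groups.values()) {
            total += group.size();
        }
        return total;
    }

    public static void main(String[] args) {
        // A short excerpt of the dump above; paste the full text to check all groups
        String dump = String.join("\n",
            "Counters: 44",
            "Job Counters ",
            "        Launched map tasks=4",
            "        Launched reduce tasks=5",
            "Shuffle Errors",
            "        BAD_ID=0");
        Map<String, Map<String, Long>> groups = parse(dump);
        System.out.println("groups=" + groups.size()
            + ", counters=" + totalCounters(groups));
    }
}
```

Run against the full dump instead of the excerpt, the same parser should report 6 groups and 44 counters, matching the "Counters: 44" summary line.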

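    1). File System Counters: an MR job reads and writes data on different file systems; this group records the job's read/write statistics for each of them.

    • HDFS: Number of bytes read=260407668  // Bytes the map tasks read from HDFS, covering both the source file content and the split metadata, so this value is normally slightly larger than FileInputFormatCounters.BYTES_READ.
    • FILE: Number of bytes written=984244425  // Total bytes the map tasks wrote to local disk (the reduce-side merge also writes to local files).
    • FILE: Number of bytes read=655771325  // Bytes the reduce tasks read from the local file system (map output is stored on local disk).
    • HDFS: Number of bytes written=17681802  // Bytes of final output written to HDFS.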
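To put the local-disk figures in perspective, they can be compared with the size of the intermediate data. The following is a minimal sketch using only values copied from the dump above; `FileSystemCounterMath` and `ratio` are hypothetical names chosen for illustration.

```java
// Hypothetical helper: compares local-disk traffic with the intermediate data
// size, using values copied from the counter dump above.
class FileSystemCounterMath {
    static final long FILE_BYTES_READ = 655_771_325L;
    static final long FILE_BYTES_WRITTEN = 984_244_425L;
    static final long HDFS_BYTES_WRITTEN = 17_681_802L;
    static final long MAP_OUTPUT_MATERIALIZED_BYTES = 327_843_405L;
    static final long FILE_OUTPUT_FORMAT_BYTES_WRITTEN = 17_681_802L;

    static double ratio(long numerator, long denominator) {
        return (double) numerator / denominator;
    }

    public static void main(String[] args) {
        // How many times the intermediate data was written to / read from local disk
        System.out.printf("local write / materialized map output = %.3f%n",
            ratio(FILE_BYTES_WRITTEN, MAP_OUTPUT_MATERIALIZED_BYTES));
        System.out.printf("local read  / materialized map output = %.3f%n",
            ratio(FILE_BYTES_READ, MAP_OUTPUT_MATERIALIZED_BYTES));
        // The final output size reported by both groups should agree
        System.out.println("HDFS bytes written == output format bytes written: "
            + (HDFS_BYTES_WRITTEN == FILE_OUTPUT_FORMAT_BYTES_WRITTEN));
    }
}
```

In this run the ratios come out at roughly 3 and 2. One plausible reading is that the intermediate data passed through local disk more than once (spill, merge, reduce-side merge), but the counters alone do not say which stage contributed what.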

    2). Job Counters: statistics on the job's subtasks, i.e. its map tasks and reduce tasks

    • Launched map tasks=4  // Number of map tasks launched
    • Launched reduce tasks=5  // Number of reduce tasks launched

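    3). Map-Reduce Framework: counters maintained by the MR framework itself

    • Map input records=1152870  // Total number of records (lines of the input files) the map tasks read from HDFS
    • Reduce input groups=579532  // Number of key groups passed to reduce, i.e. distinct keys; for example <hello,{1,1}>, <me,{1}>, <you,{1}> counts as 3 groups. The record-level figure is Reduce input records: with a Combiner it equals the number of records left after map-side combining, and without one (as here) it equals Map output records, 22472940.
    • Combine input records=0  // Records fed to the Combiner, which takes its input from the map output; 0 here because this job ran no Combiner
    • Spilled Records=67418820  // Spilling happens on both the map side and the reduce side; this is the total number of records spilled from memory to disk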
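The relationships between these counters can be verified with a little arithmetic. The following is a minimal sketch using only values from the dump above; `FrameworkCounterChecks` and its method names are hypothetical, not Hadoop APIs.

```java
// Hypothetical helper: sanity checks on the Map-Reduce Framework counters,
// using values copied from the counter dump above.
class FrameworkCounterChecks {
    static final long MAP_OUTPUT_RECORDS = 22_472_940L;
    static final long COMBINE_INPUT_RECORDS = 0L;
    static final long REDUCE_INPUT_RECORDS = 22_472_940L;
    static final long REDUCE_INPUT_GROUPS = 579_532L;
    static final long REDUCE_OUTPUT_RECORDS = 579_532L;
    static final long SPILLED_RECORDS = 67_418_820L;

    // Without a Combiner, every map output record reaches the reducers unchanged
    static boolean reduceInputMatchesMapOutput(long mapOutput, long reduceInput) {
        return mapOutput == reduceInput;
    }

    // Average number of values each reduce() call receives
    static double averageValuesPerGroup(long reduceInputRecords, long reduceInputGroups) {
        return (double) reduceInputRecords / reduceInputGroups;
    }

    // How many times, on average, each map output record was spilled to disk
    static double spillFactor(long spilledRecords, long mapOutputRecords) {
        return (double) spilledRecords / mapOutputRecords;
    }

    public static void main(String[] args) {
        System.out.println("combiner ran: " + (COMBINE_INPUT_RECORDS > 0));
        System.out.println("reduce input == map output: "
            + reduceInputMatchesMapOutput(MAP_OUTPUT_RECORDS, REDUCE_INPUT_RECORDS));
        System.out.println("one output record per group: "
            + (REDUCE_INPUT_GROUPS == REDUCE_OUTPUT_RECORDS));
        System.out.printf("average values per group: %.2f%n",
            averageValuesPerGroup(REDUCE_INPUT_RECORDS, REDUCE_INPUT_GROUPS));
        System.out.printf("spill factor: %.1f%n",
            spillFactor(SPILLED_RECORDS, MAP_OUTPUT_RECORDS));
    }
}
```

For this job, Spilled Records is exactly three times Map output records, and each key group holds about 38.8 values on average. A spill factor well above 1 is often taken as a hint that sort-buffer settings or a Combiner are worth looking at, though that is a tuning judgment rather than something the counters prove.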

    4). Shuffle Errors

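    5). File Input Format Counters: counters for the file input format

      Bytes Read=0  // Sum, over all map tasks in the map phase, of the bytes of every value passed to the map method

    6). File Output Format Counters: counters for the file output format

      Bytes Written=17681802  // Total bytes the MR job wrote as output; in a word-count job this covers each [word], the [space] separator, the [word count], and the [newline] at the end of every line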

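    Custom counters

    // A custom counter is identified by a <group name, counter name> pair;
    // the group name "查找hello" means "find hello".
    // Counter is org.apache.hadoop.mapreduce.Counter; context is the task's Context.
    Counter counter = context.getCounter("查找hello", "hello");
    if (string.contains("hello")) {
        counter.increment(1L); // add 1 each time a string contains "hello"
    }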

     

  • Original article: https://www.cnblogs.com/skyl/p/4739803.html