• 4 weekend110: source-code analysis of TextInputFormat split planning + an MR implementation of an inverted index + submitting multiple jobs from a single main method


      

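    Now let's walk through the weekend110 source-code analysis of how TextInputFormat plans input splits.

    The default InputFormat is TextInputFormat; once you understand this one, the others follow the same pattern.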

     

     

     

     

    This is the entry point for today's weekend110 source-code analysis of TextInputFormat split planning.

    [LocatedFileStatus{path=hdfs://weekend110:9000/wc/srcdata/words.log; isDirectory=false; length=90; replication=1; blocksize=134217728; modification_time=1469247371536; access_time=1469501356933; owner=hadoop; group=supergroup; permission=rw-r--r--; isSymlink=false}]

     

     

     

    [hdfs://weekend110:9000/wc/srcdata/words.log:0+90]
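
    The split shown above can be reproduced with a small sketch. It is plain Java with no Hadoop classes, and it only mirrors the core arithmetic of FileInputFormat.getSplits: the split size is max(minSize, min(maxSize, blockSize)), and a file is cut into splitSize chunks while the remainder is more than 1.1 times splitSize. The class name SplitPlanSketch and the method planSplits are hypothetical; the sketch ignores block locations, unsplittable files, and zero-length files.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch of the split-planning arithmetic; not the Hadoop source itself.
class SplitPlanSketch {
    // Slack factor: the last split may be up to 10% larger than splitSize.
    static final double SPLIT_SLOP = 1.1;

    // Same formula as FileInputFormat.computeSplitSize.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Returns splits rendered as "path:start+length" (hypothetical helper).
    static List<String> planSplits(String path, long length, long splitSize) {
        List<String> splits = new ArrayList<>();
        long remaining = length;
        // Cut full-size splits while the remainder is clearly larger than one split.
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits.add(path + ":" + (length - remaining) + "+" + splitSize);
            remaining -= splitSize;
        }
        // Whatever is left becomes the final split.
        if (remaining != 0) {
            splits.add(path + ":" + (length - remaining) + "+" + remaining);
        }
        return splits;
    }

    public static void main(String[] args) {
        // Values taken from the LocatedFileStatus above: length=90, blocksize=134217728.
        // minSize=1 and maxSize=Long.MAX_VALUE are the defaults when nothing is configured.
        long splitSize = computeSplitSize(134217728L, 1L, Long.MAX_VALUE);
        System.out.println(planSplits(
                "hdfs://weekend110:9000/wc/srcdata/words.log", 90L, splitSize));
    }
}
```

    With the 90-byte words.log and a 128 MB block size, the loop body never runs and the whole file becomes a single split, which is why the debugger shows one entry of the form path:0+90.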

    [hadoop@weekend110 ~]$ /home/hadoop/app/hadoop-2.4.1/bin/hadoop fs -ls /tmp

    Found 1 items

    drwx------   - hadoop supergroup          0 2016-07-23 12:25 /tmp/hadoop-yarn

    [hadoop@weekend110 ~]$ /home/hadoop/app/hadoop-2.4.1/bin/hadoop fs -ls /tmp/hadoop-yarn

    Found 1 items

    drwx------   - hadoop supergroup          0 2016-07-23 12:26 /tmp/hadoop-yarn/staging

    [hadoop@weekend110 ~]$ /home/hadoop/app/hadoop-2.4.1/bin/hadoop fs -ls /tmp/hadoop-yarn/staging

    Found 2 items

    drwx------   - hadoop supergroup          0 2016-07-23 12:25 /tmp/hadoop-yarn/staging/hadoop

    drwxr-xr-x   - hadoop supergroup          0 2016-07-23 12:26 /tmp/hadoop-yarn/staging/history

    [hadoop@weekend110 ~]$ /home/hadoop/app/hadoop-2.4.1/bin/hadoop fs -ls /tmp/hadoop-yarn/staging/history

    Found 1 items

    drwxrwxrwt   - hadoop supergroup          0 2016-07-23 12:26 /tmp/hadoop-yarn/staging/history/done_intermediate

    [hadoop@weekend110 ~]$ /home/hadoop/app/hadoop-2.4.1/bin/hadoop fs -ls /tmp/hadoop-yarn/staging/history/done_intermediate

    Found 1 items

    drwxrwx---   - hadoop supergroup          0 2016-07-28 09:51 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop

    [hadoop@weekend110 ~]$ /home/hadoop/app/hadoop-2.4.1/bin/hadoop fs -ls /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop

    Found 48 items

    -rwxrwx---   1 hadoop supergroup      32973 2016-07-23 12:29 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469234255012_0001-1469247921943-hadoop-wc.jar-1469248148068-1-1-SUCCEEDED-default-1469248027901.jhist

    -rwxrwx---   1 hadoop supergroup        347 2016-07-23 12:29 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469234255012_0001.summary

    -rwxrwx---   1 hadoop supergroup      91579 2016-07-23 12:29 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469234255012_0001_conf.xml

    -rwxrwx---   1 hadoop supergroup      32957 2016-07-25 19:45 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469446305412_0001-1469447061251-hadoop-wc.jar-1469447138744-1-1-SUCCEEDED-default-1469447093632.jhist

    -rwxrwx---   1 hadoop supergroup        347 2016-07-25 19:45 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469446305412_0001.summary

    -rwxrwx---   1 hadoop supergroup      91579 2016-07-25 19:45 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469446305412_0001_conf.xml

    -rwxrwx---   1 hadoop supergroup      33003 2016-07-26 20:38 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469500058449_0001-1469536528574-hadoop-flow.jar-1469536711053-1-1-SUCCEEDED-default-1469536621793.jhist

    -rwxrwx---   1 hadoop supergroup        349 2016-07-26 20:38 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469500058449_0001.summary

    -rwxrwx---   1 hadoop supergroup      91594 2016-07-26 20:38 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469500058449_0001_conf.xml

    -rwxrwx---   1 hadoop supergroup      32975 2016-07-27 09:07 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0001-1469581609069-hadoop-flow.jar-1469581669098-1-1-SUCCEEDED-default-1469581639942.jhist

    -rwxrwx---   1 hadoop supergroup        349 2016-07-27 09:07 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0001.summary

    -rwxrwx---   1 hadoop supergroup      91594 2016-07-27 09:07 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0001_conf.xml

    -rwxrwx---   1 hadoop supergroup      32966 2016-07-27 09:13 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0002-1469581980369-hadoop-flow.jar-1469582016624-1-1-SUCCEEDED-default-1469581991321.jhist

    -rwxrwx---   1 hadoop supergroup        348 2016-07-27 09:13 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0002.summary

    -rwxrwx---   1 hadoop supergroup      91594 2016-07-27 09:13 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0002_conf.xml

    -rwxrwx---   1 hadoop supergroup      32947 2016-07-27 09:34 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0003-1469583259497-hadoop-flow.jar-1469583283697-1-1-SUCCEEDED-default-1469583266059.jhist

    -rwxrwx---   1 hadoop supergroup        347 2016-07-27 09:34 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0003.summary

    -rwxrwx---   1 hadoop supergroup      91594 2016-07-27 09:34 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0003_conf.xml

    -rwxrwx---   1 hadoop supergroup      32973 2016-07-27 09:56 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0004-1469584535785-hadoop-flow.jar-1469584574236-1-1-SUCCEEDED-default-1469584549659.jhist

    -rwxrwx---   1 hadoop supergroup        347 2016-07-27 09:56 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0004.summary

    -rwxrwx---   1 hadoop supergroup      91594 2016-07-27 09:56 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469581296388_0004_conf.xml

    -rwxrwx---   1 hadoop supergroup      32994 2016-07-27 16:51 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469604813941_0001-1469609254627-hadoop-flowSort.jar-1469609480611-1-1-SUCCEEDED-default-1469609373636.jhist

    -rwxrwx---   1 hadoop supergroup        353 2016-07-27 16:51 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469604813941_0001.summary

    -rwxrwx---   1 hadoop supergroup      91630 2016-07-27 16:51 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469604813941_0001_conf.xml

    -rwxrwx---   1 hadoop supergroup      32989 2016-07-27 17:01 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469604813941_0002-1469609990434-hadoop-flowSort.jar-1469610090600-1-1-SUCCEEDED-default-1469610004692.jhist

    -rwxrwx---   1 hadoop supergroup        353 2016-07-27 17:01 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469604813941_0002.summary

    -rwxrwx---   1 hadoop supergroup      91622 2016-07-27 17:01 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469604813941_0002_conf.xml

    -rwxrwx---   1 hadoop supergroup      52581 2016-07-27 22:28 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0001-1469629441509-hadoop-flowArea.jar-1469629695512-1-0-FAILED-default-1469629461365.jhist

    -rwxrwx---   1 hadoop supergroup        352 2016-07-27 22:28 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0001.summary

    -rwxrwx---   1 hadoop supergroup      91494 2016-07-27 22:28 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0001_conf.xml

    -rwxrwx---   1 hadoop supergroup      30548 2016-07-27 22:42 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0002-1469629856935-hadoop-flowArea.jar-1469630543551-1-0-FAILED-default-1469630477324.jhist

    -rwxrwx---   1 hadoop supergroup        350 2016-07-27 22:42 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0002.summary

    -rwxrwx---   1 hadoop supergroup      91494 2016-07-27 22:42 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0002_conf.xml

    -rwxrwx---   1 hadoop supergroup      30560 2016-07-27 22:55 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0003-1469630391568-hadoop-flowArea.jar-1469631307275-1-0-FAILED-default-1469631249046.jhist

    -rwxrwx---   1 hadoop supergroup        350 2016-07-27 22:55 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0003.summary

    -rwxrwx---   1 hadoop supergroup      91494 2016-07-27 22:55 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469628834819_0003_conf.xml

    -rwxrwx---   1 hadoop supergroup      54558 2016-07-28 09:12 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0001-1469668063936-hadoop-flowArea.jar-1469668319036-1-0-FAILED-default-1469668087466.jhist

    -rwxrwx---   1 hadoop supergroup        352 2016-07-28 09:11 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0001.summary

    -rwxrwx---   1 hadoop supergroup      91494 2016-07-28 09:12 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0001_conf.xml

    -rwxrwx---   1 hadoop supergroup      30329 2016-07-28 09:25 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0002-1469669047716-hadoop-flow.jar-1469669116225-1-0-FAILED-default-1469669070963.jhist

    -rwxrwx---   1 hadoop supergroup        346 2016-07-28 09:25 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0002.summary

    -rwxrwx---   1 hadoop supergroup      91595 2016-07-28 09:25 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0002_conf.xml

    -rwxrwx---   1 hadoop supergroup      30331 2016-07-28 09:38 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0003-1469669444122-hadoop-flow.jar-1469669914163-1-0-FAILED-default-1469669867080.jhist

    -rwxrwx---   1 hadoop supergroup        346 2016-07-28 09:38 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0003.summary

    -rwxrwx---   1 hadoop supergroup      91595 2016-07-28 09:38 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0003_conf.xml

    -rwxrwx---   1 hadoop supergroup      32950 2016-07-28 09:51 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0004-1469670210160-hadoop-flow.jar-1469670688549-1-1-SUCCEEDED-default-1469670670491.jhist

    -rwxrwx---   1 hadoop supergroup        347 2016-07-28 09:51 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0004.summary

    -rwxrwx---   1 hadoop supergroup      91619 2016-07-28 09:51 /tmp/hadoop-yarn/staging/history/done_intermediate/hadoop/job_1469667659405_0004_conf.xml

    [hadoop@weekend110 ~]$

     

     file:/tmp/hadoop-Administrator/mapred/staging/Administrator1242101173/.staging/job_local1242101173_0001/job.xml

    job-id : job_local1242101173_0001uber-mode : falsemap-progress : 1.0reduce-progress : 1.0cleanup-progress : 1.0setup-progress : 1.0runstate : SUCCEEDEDstart-time : 0user-name : Administratorpriority : NORMALscheduling-info : NAnum-used-slots0num-reserved-slots0used-mem0reserved-mem0needed-mem0

    Subject:

        Principal: NTUserPrincipal: Administrator

        Principal: NTSidUserPrincipal: S-1-5-21-2155837731-1039603112-1552600933-500

        Principal: NTDomainPrincipal: WIN-BQOBV63OBNM

        Principal: NTSidDomainPrincipal: S-1-5-21-2155837731-1039603112-1552600933

        Principal: NTSidPrimaryGroupPrincipal: S-1-5-21-2155837731-1039603112-1552600933-513

        Principal: NTSidGroupPrincipal: S-1-1-0

        Principal: NTSidGroupPrincipal: S-1-5-114

        Principal: NTSidGroupPrincipal: S-1-5-32-544

        Principal: NTSidGroupPrincipal: S-1-5-32-545

        Principal: NTSidGroupPrincipal: S-1-5-4

        Principal: NTSidGroupPrincipal: S-1-2-1

        Principal: NTSidGroupPrincipal: S-1-5-11

        Principal: NTSidGroupPrincipal: S-1-5-15

        Principal: NTSidGroupPrincipal: S-1-5-113

        Principal: NTSidGroupPrincipal: S-1-5-5-0-112222

        Principal: NTSidGroupPrincipal: S-1-2-0

        Principal: NTSidGroupPrincipal: S-1-5-64-10

        Principal: NTSidGroupPrincipal: S-1-16-12288

        Principal: Administrator

        Public Credential: NTNumericCredential: 2088

        Private Credential: org.apache.hadoop.security.Credentials@77084cb5

    That concludes the weekend110 source-code analysis of TextInputFormat split planning.

     

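    Building an index

    Let's look at how an MR program implements an inverted index.

    I see:

    From the earlier analysis of the split-planning source code, we know that an InputSplit carries the block information, the file path, and so on.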
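
    Because the split knows its file path, the first step of an inverted index can tag every word with the file it came from. In a real mapper the file name would typically be read with ((FileSplit) context.getInputSplit()).getPath().getName(). The sketch below is a plain-Java simulation of that first step, not the course's InverseIndexStepOne class: the class name, the "-->" separator, and the sample file names are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of step one: count each (word, file) pair.
class InverseIndexStepOneSketch {
    // Hypothetical separator between the word and the file name in the key.
    static final String SEP = "-->";

    // Input: file name -> lines of that file. Output: "word-->file" -> count.
    static Map<String, Integer> run(Map<String, List<String>> linesByFile) {
        // A TreeMap keeps keys sorted, similar to the sorted keys a reducer sees.
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<String>> file : linesByFile.entrySet()) {
            for (String line : file.getValue()) {
                for (String word : line.trim().split("\\s+")) {
                    if (word.isEmpty()) {
                        continue; // skip blank lines
                    }
                    // "map" emits (word-->file, 1); "reduce" sums the ones.
                    counts.merge(word + SEP + file.getKey(), 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, List<String>> input = new LinkedHashMap<>();
        input.put("a.txt", Arrays.asList("hello tom", "hello jerry"));
        input.put("b.txt", Arrays.asList("hello tom"));
        // Print in the tab-separated "key<TAB>value" shape of a text output file.
        run(input).forEach((k, v) -> System.out.println(k + "\t" + v));
    }
}
```

    Putting the file name into the key means the ordinary word-count reduce logic can be reused unchanged for this step.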

     

     

     

    [hadoop@weekend110 ~]$ /home/hadoop/app/hadoop-2.4.1/bin/hadoop jar ii.jar cn.itcast.hadoop.mr.ii.InverseIndexStepOne /ii/data /ii/stepone

     

    Why does this work? Because:

     

     

    Take this result and use it as the input of the next step.
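
    A second step can then regroup the first step's output by word. The sketch below is again a plain-Java simulation under assumptions: it expects input lines of the form word-->file, a tab, then the count, and it joins the postings for each word with spaces. The class name and both formats are hypothetical, chosen only to illustrate the idea.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of step two: regroup "word-->file<TAB>count" lines by word.
class InverseIndexStepTwoSketch {
    // Output: word -> "file-->count file-->count ..."
    static Map<String, String> run(List<String> stepOneLines) {
        Map<String, String> index = new TreeMap<>();
        for (String line : stepOneLines) {
            String[] keyAndCount = line.split("\t");        // ["word-->file", "count"]
            String[] wordAndFile = keyAndCount[0].split("-->"); // ["word", "file"]
            // "map" emits (word, file-->count); "reduce" concatenates the postings.
            String posting = wordAndFile[1] + "-->" + keyAndCount[1];
            index.merge(wordAndFile[0], posting, (a, b) -> a + " " + b);
        }
        return index;
    }

    public static void main(String[] args) {
        List<String> stepOneOutput = Arrays.asList(
                "hello-->a.txt\t2",
                "hello-->b.txt\t1",
                "jerry-->a.txt\t1");
        run(stepOneOutput).forEach((k, v) -> System.out.println(k + "\t" + v));
    }
}
```

    Splitting the work into two steps keeps each step simple: the first only counts, the second only regroups.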

     

    That concludes the weekend110 MR implementation of the inverted index.

     

    Next: submitting multiple jobs from a single main method.

    In summary, this approach is not recommended; here it is only for experimentation.
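
    The control flow of submitting several jobs from one main method can be sketched as follows. In Hadoop each step would be a Job, and job.waitForCompletion(true) returns a boolean that tells whether the job succeeded; here each job is stood in for by a BooleanSupplier so the example runs without a cluster. The class name JobChainSketch and the method runInOrder are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BooleanSupplier;

// Plain-Java sketch of running dependent jobs one after another from a single main.
class JobChainSketch {
    // Runs jobs in order and stops at the first failure.
    // Returns how many jobs completed successfully.
    static int runInOrder(List<BooleanSupplier> jobs) {
        int completed = 0;
        for (BooleanSupplier job : jobs) {
            // Stand-in for: if (!job.waitForCompletion(true)) break;
            if (!job.getAsBoolean()) {
                break;
            }
            completed++;
        }
        return completed;
    }

    public static void main(String[] args) {
        BooleanSupplier stepOne = () -> {
            System.out.println("step one finished");
            return true;
        };
        BooleanSupplier stepTwo = () -> {
            System.out.println("step two finished");
            return true;
        };
        List<BooleanSupplier> jobs = Arrays.asList(stepOne, stepTwo);
        int completed = runInOrder(jobs);
        // Exit code 0 only when every job in the chain succeeded.
        System.exit(completed == jobs.size() ? 0 : 1);
    }
}
```

    The later job only starts once the earlier one has succeeded, which matters when the second job reads the first job's output directory.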

  • Original article: https://www.cnblogs.com/zlslch/p/5901865.html