• Exceptions encountered when running a MapReduce program with Python


    landen@Master:~/UntarFile/hadoop-1.0.4$ bin/hadoop jar contrib/streaming/hadoop-streaming-1.0.4.jar -mapper home/landen/UntarFile/hadoop-1.0.4/PythonMR/wordMapper.py -reducer /home/landen/UntarFile/hadoop-1.0.4/PythonMR/wordReducer.py -input /input/* -output wordCountOutput
    Warning: $HADOOP_HOME is deprecated.

    packageJobJar: [/home/landen/UntarFile/hadoop-1.0.4/datas/tmp/hadoop-unjar2023262079914179173/] [] /tmp/streamjob1615815049526219730.jar tmpDir=null
    14/03/19 11:22:49 INFO util.NativeCodeLoader: Loaded the native-hadoop library
    14/03/19 11:22:49 WARN snappy.LoadSnappy: Snappy native library not loaded
    14/03/19 11:22:49 INFO mapred.FileInputFormat: Total input paths to process : 1
    14/03/19 11:22:50 INFO streaming.StreamJob: getLocalDirs(): [/home/landen/UntarFile/hadoop-1.0.4/datas/tmp/mapred/local]
    14/03/19 11:22:50 INFO streaming.StreamJob: Running job: job_201403182127_0006
    14/03/19 11:22:50 INFO streaming.StreamJob: To kill this job, run:
    14/03/19 11:22:50 INFO streaming.StreamJob: /home/landen/UntarFile/hadoop-1.0.4/libexec/../bin/hadoop job  -Dmapred.job.tracker=Master:9001 -kill job_201403182127_0006
    14/03/19 11:22:50 INFO streaming.StreamJob: Tracking URL: http://Master:50030/jobdetails.jsp?jobid=job_201403182127_0006
    14/03/19 11:22:51 INFO streaming.StreamJob:  map 0%  reduce 0%
    14/03/19 11:23:27 INFO streaming.StreamJob:  map 100%  reduce 100%
    14/03/19 11:23:27 INFO streaming.StreamJob: To kill this job, run:
    14/03/19 11:23:27 INFO streaming.StreamJob: /home/landen/UntarFile/hadoop-1.0.4/libexec/../bin/hadoop job  -Dmapred.job.tracker=Master:9001 -kill job_201403182127_0006
    14/03/19 11:23:27 INFO streaming.StreamJob: Tracking URL: http://Master:50030/jobdetails.jsp?jobid=job_201403182127_0006
    The error appears here:
    14/03/19 11:23:27 ERROR streaming.StreamJob: Job not successful. Error: # of failed Map Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: task_201403182127_0006_m_000000
    14/03/19 11:23:27 INFO streaming.StreamJob: killJob...
    Streaming Command Failed!

    Inspecting the Hadoop log files shows:
    Caused by: java.io.IOException: Cannot run program "./PythonMR/wordMapper.py": error=2, No such file or directory
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
        at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:214)
        ... 23 more

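    Reason 1: The .py file does not start with lines that give the path to the Python interpreter and declare the source encoding:
    #!/usr/bin/python
    # coding=utf-8
    Reason 2: The Python files have not been given execute permission:
    chmod a+x *.py
    Reason 3: The scripts were not passed with the -file option when the job was submitted. For the example above, use "-file Mapper -file Reducer" or "-file Mapper.py -file Reducer.py"; Hadoop then distributes both files to every node automatically (Distributed Cache).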
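
Reasons 1 and 2 can be checked locally before a job is submitted. The following is a minimal sketch, not part of Hadoop: the helper name `check_script` and the file name `preflight.py` are assumptions made for illustration. It only verifies that a script exists, starts with a `#!` line, and has the owner execute bit set.

```python
#!/usr/bin/python
# coding=utf-8
"""Pre-flight check for streaming scripts (hypothetical helper, not part of Hadoop)."""
import os
import stat
import sys


def check_script(path):
    """Return a list of problems likely to make the streaming task fail."""
    if not os.path.isfile(path):
        return ["file not found: " + path]
    problems = []
    # Reason 1: the first line should name the interpreter.
    with open(path, "rb") as f:
        first_line = f.readline()
    if not first_line.startswith(b"#!"):
        problems.append("missing shebang line such as #!/usr/bin/python")
    # Reason 2: the owner execute bit should be set (chmod a+x sets it).
    if not os.stat(path).st_mode & stat.S_IXUSR:
        problems.append("not executable; run chmod a+x on it")
    return problems


if __name__ == "__main__":
    # Report every problem found in the scripts named on the command line.
    for script in sys.argv[1:]:
        for problem in check_script(script):
            print("%s: %s" % (script, problem))
```

Saved as `preflight.py`, it could be run as `python preflight.py PythonMR/wordMapper.py PythonMR/wordReducer.py`; no output means neither problem was detected. It cannot detect Reason 3, which depends on the options given to the streaming command.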

    landen@Master:~/UntarFile/hadoop-1.0.4$ bin/hadoop jar contrib/streaming/hadoop-streaming-1.0.4.jar -mapper ./PythonMR/wordMapper.py -reducer ./PythonMR/wordReducer.py -input /input/* -output wordCountOutput -file ./PythonMR/wordMapper.py -file ./PythonMR/wordReducer.py
    Warning: $HADOOP_HOME is deprecated.

    packageJobJar: [./PythonMR/wordMapper.py, ./PythonMR/wordReducer.py, /home/landen/UntarFile/hadoop-1.0.4/datas/tmp/hadoop-unjar3733581910057274756/] [] /tmp/streamjob8413860595071502704.jar tmpDir=null
    14/03/19 11:33:51 INFO util.NativeCodeLoader: Loaded the native-hadoop library
    14/03/19 11:33:51 WARN snappy.LoadSnappy: Snappy native library not loaded
    14/03/19 11:33:51 INFO mapred.FileInputFormat: Total input paths to process : 1
    14/03/19 11:33:51 INFO streaming.StreamJob: getLocalDirs(): [/home/landen/UntarFile/hadoop-1.0.4/datas/tmp/mapred/local]
    14/03/19 11:33:51 INFO streaming.StreamJob: Running job: job_201403182127_0007
    14/03/19 11:33:51 INFO streaming.StreamJob: To kill this job, run:
    14/03/19 11:33:51 INFO streaming.StreamJob: /home/landen/UntarFile/hadoop-1.0.4/libexec/../bin/hadoop job  -Dmapred.job.tracker=Master:9001 -kill job_201403182127_0007
    14/03/19 11:33:51 INFO streaming.StreamJob: Tracking URL: http://Master:50030/jobdetails.jsp?jobid=job_201403182127_0007
    14/03/19 11:33:52 INFO streaming.StreamJob:  map 0%  reduce 0%
    14/03/19 11:34:06 INFO streaming.StreamJob:  map 50%  reduce 0%
    14/03/19 11:34:07 INFO streaming.StreamJob:  map 100%  reduce 0%
    14/03/19 11:34:18 INFO streaming.StreamJob:  map 100%  reduce 100%
    14/03/19 11:34:24 INFO streaming.StreamJob: Job complete: job_201403182127_0007
    14/03/19 11:34:24 INFO streaming.StreamJob: Output: wordCountOutput
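
The contents of wordMapper.py and wordReducer.py are not shown in this post. As a rough sketch of what a streaming word count usually does, the code below puts the map and reduce logic in one file and imitates the map, sort, reduce sequence locally. The function names `map_lines`, `reduce_lines`, and `run_locally` are assumptions for illustration; a real streaming script would read from `sys.stdin` and print each record instead.

```python
#!/usr/bin/python
# coding=utf-8
from itertools import groupby


def map_lines(lines):
    """Emit one 'word<TAB>1' record per whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield "%s\t1" % word


def reduce_lines(sorted_lines):
    """Sum the counts per word; the input must already be sorted by key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield "%s\t%d" % (word, sum(int(count) for _, count in group))


def run_locally(lines):
    """Imitate map -> sort -> reduce without a cluster."""
    return list(reduce_lines(sorted(map_lines(lines))))


if __name__ == "__main__":
    for record in run_locally(["hello hadoop", "hello python"]):
        print(record)
```

Running the file prints `hadoop 1`, `hello 2`, and `python 1`, each as a tab-separated record. Testing the logic this way first helps separate script bugs from the submission problems described above.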

  • Original article: https://www.cnblogs.com/likai198981/p/3611606.html