• hive: "Error during job, obtaining debugging information" and "beyond physical memory limits" when inserting data


    insert overwrite table canal_amt1......
    2014-10-09 10:40:27,368 Stage-1 map = 100%,  reduce = 32%, Cumulative CPU 2772.48 sec
    2014-10-09 10:40:28,426 Stage-1 map = 100%,  reduce = 32%, Cumulative CPU 2772.48 sec
    2014-10-09 10:40:29,481 Stage-1 map = 100%,  reduce = 32%, Cumulative CPU 2774.12 sec
    2014-10-09 10:40:30,885 Stage-1 map = 100%,  reduce = 32%, Cumulative CPU 2774.36 sec
    2014-10-09 10:40:31,963 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 2693.96 sec
    2014-10-09 10:40:33,071 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 2693.96 sec
    2014-10-09 10:40:34,126 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 2693.96 sec
    2014-10-09 10:40:35,182 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 2693.96 sec
    MapReduce Total cumulative CPU time: 44 minutes 53 seconds 960 msec
    Ended Job = job_1409124602974_0745 with errors
    Error during job, obtaining debugging information...
    Examining task ID: task_1409124602974_0745_m_000003 (and more) from job job_1409124602974_0745
    Examining task ID: task_1409124602974_0745_m_000002 (and more) from job job_1409124602974_0745
    Examining task ID: task_1409124602974_0745_r_000000 (and more) from job job_1409124602974_0745
    Examining task ID: task_1409124602974_0745_r_000006 (and more) from job job_1409124602974_0745
    
    Task with the most failures(4): 
    -----
    Task ID:
      task_1409124602974_0745_r_000003
    
    URL:
      http://HADOOP2:8088/taskdetails.jsp?jobid=job_1409124602974_0745&tipid=task_1409124602974_0745_r_000003
    -----
    Diagnostic Messages for this Task:
    Container [pid=22068,containerID=container_1409124602974_0745_01_000047] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 2.6 GB of 2.1 GB virtual memory used. Killing container.
    Dump of the process-tree for container_1409124602974_0745_01_000047 :
            |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
            |- 22087 22068 22068 22068 (java) 2536 833 2730713088 265378 /usr/jdk64/jdk1.6.0_31/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx2048m -Djava.io.tmpdir=/hadoop/yarn/local/usercache/root/appcache/application_1409124602974_0745/container_1409124602974_0745_01_000047/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/hadoop/yarn/log/application_1409124602974_0745/container_1409124602974_0745_01_000047 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 54.0.88.58 41150 attempt_1409124602974_0745_r_000003_3 47 
            |- 22068 2381 22068 22068 (bash) 1 1 110755840 302 /bin/bash -c /usr/jdk64/jdk1.6.0_31/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx2048m -Djava.io.tmpdir=/hadoop/yarn/local/usercache/root/appcache/application_1409124602974_0745/container_1409124602974_0745_01_000047/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/hadoop/yarn/log/application_1409124602974_0745/container_1409124602974_0745_01_000047 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 54.0.88.58 41150 attempt_1409124602974_0745_r_000003_3 47 1>/hadoop/yarn/log/application_1409124602974_0745/container_1409124602974_0745_01_000047/stdout 2>/hadoop/yarn/log/application_1409124602974_0745/container_1409124602974_0745_01_000047/stderr  
    
    Container killed on request. Exit code is 143
    
    
    FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
    MapReduce Jobs Launched: 
    Job 0: Map: 23  Reduce: 7   Cumulative CPU: 2693.96 sec   HDFS Read: 6278784712 HDFS Write: 590228229 FAIL
    Total MapReduce CPU Time Spent: 44 minutes 53 seconds 960 msec

    Cause: insufficient memory — the reducer's container exceeded its 1 GB physical memory limit.
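The two limits quoted in the diagnostic ("1.0 GB of 1 GB physical memory", "2.6 GB of 2.1 GB virtual memory") follow from the container's memory request and YARN's virtual-to-physical ratio. A minimal sketch of that arithmetic, assuming the YARN defaults of a 1024 MB container and `yarn.nodemanager.vmem-pmem-ratio=2.1`:

```python
# Reproduce the limits reported in the container diagnostic, assuming
# YARN defaults: container request = 1024 MB, vmem-pmem ratio = 2.1.
container_mb = 1024        # mapreduce.reduce.memory.mb (default 1024)
vmem_pmem_ratio = 2.1      # yarn.nodemanager.vmem-pmem-ratio (default 2.1)

# Physical limit: the container allocation itself.
physical_limit_gb = container_mb / 1024                    # 1.0 GB

# Virtual limit: physical allocation scaled by the vmem-pmem ratio.
virtual_limit_gb = container_mb * vmem_pmem_ratio / 1024   # 2.1 GB

print(physical_limit_gb, virtual_limit_gb)
```

The process tree in the log shows the JVM alone using about 2.7 GB of virtual memory, so the container was over both limits.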

    Solution:

    Add the following before running the Hive statement:

    set mapreduce.map.memory.mb=1025;    -- any value above 1024 makes YARN round the container up to the next increment, i.e. 2048 MB
    set mapreduce.reduce.memory.mb=1025;
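The trick of setting 1025 works because YARN rounds container requests up to a multiple of the scheduler's minimum allocation (`yarn.scheduler.minimum-allocation-mb`, 1024 MB by default). A quick sketch of that rounding, assuming the default increment:

```python
import math

def yarn_allocation(requested_mb, min_alloc_mb=1024):
    """Round a container request up to the next multiple of the
    scheduler's minimum allocation (default 1024 MB)."""
    return math.ceil(requested_mb / min_alloc_mb) * min_alloc_mb

print(yarn_allocation(1024))  # 1024 MB: the failing 1 GB container
print(yarn_allocation(1025))  # 2048 MB: requesting 1025 yields a 2 GB container
```

Requesting 2048 directly would be equivalent and arguably clearer; also consider raising `mapreduce.reduce.java.opts` so the JVM heap fits inside the larger container.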


    Result after the change:

    MapReduce Total cumulative CPU time: 0 days 1 hours 10 minutes 14 seconds 590 msec
    Ended Job = job_1409124602974_0746
    Loading data to table default.canal_amt1
    Table default.canal_amt1 stats: [num_partitions: 0, num_files: 7, num_rows: 0, total_size: 4131948868, raw_data_size: 0]
    MapReduce Jobs Launched: 
    Job 0: Map: 23  Reduce: 7   Cumulative CPU: 4214.59 sec   HDFS Read: 6278784712 HDFS Write: 4131948868 SUCCESS
    Total MapReduce CPU Time Spent: 0 days 1 hours 10 minutes 14 seconds 590 msec
    OK
    Time taken: 673.851 seconds


    Other possible causes found via web search:

    1. NullPointerException during the map phase

    Cause: null values were inserted into data fields

    2.Exception in thread "Thread-19" java.lang.IllegalArgumentException: Does not contain a valid host:port authority: local 

    Reference: http://grokbase.com/p/cloudera/cdh-user/126wqvfwyt/hive-refuses-to-work-with-yarn 

    Solution:

    Add the following settings to hive-site.xml:

    In the meantime I recommend doing the following if you need to run Hive on MR2:
    * Keep Hive happy by setting mapred.job.tracker to a bogus value.
    * Disable task log retrieval by setting hive.exec.show.job.failure.debug.info=false
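Translated into a hive-site.xml fragment, the workaround quoted above would look roughly like this; the property names come from that mailing-list post, and the placeholder value is only illustrative:

```xml
<!-- Hypothetical hive-site.xml fragment for the MR2 workaround quoted above. -->
<property>
  <name>mapred.job.tracker</name>
  <value>bogus:10000</value> <!-- any bogus value keeps Hive happy on MR2 -->
</property>
<property>
  <name>hive.exec.show.job.failure.debug.info</name>
  <value>false</value> <!-- disables task log retrieval -->
</property>
```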

    3. Mismatched protobuf versions.


  • Original post: https://www.cnblogs.com/kxdblog/p/4034244.html