• troubleshooting-Container 'PHYSICAL' memory limit


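    Root cause

    The CDH cluster does not give the container enough memory to run in, so YARN kills it once it exceeds its physical memory limit.

    Solution

    Edit the configuration files below so that each setting matches the resources actually available on the cluster: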
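    | Config file | Setting | Description | Calculated value (reference) |
    |---|---|---|---|
    | yarn-site.xml | yarn.nodemanager.resource.memory-mb | Amount of physical memory that can be allocated to containers (MiB) | = 52 * 2 = 104 G |
    | yarn-site.xml | yarn.scheduler.minimum-allocation-mb | Smallest amount of physical memory a container can request (MiB) | = 2 G |
    | yarn-site.xml | yarn.scheduler.maximum-allocation-mb | Largest amount of physical memory a container can request (MiB) | = 52 * 2 = 104 G |
    | mapred-site.xml | yarn.app.mapreduce.am.resource.mb | Physical memory required by the ApplicationMaster (MiB) | = 2 * 2 = 4 G |
    | mapred-site.xml | yarn.app.mapreduce.am.command-opts | Java command-line arguments passed to the MapReduce ApplicationMaster (heap size, e.g. -Xmx) | = 0.8 * 2 * 2 = 3.2 G |
    | yarn-site.xml | yarn.nodemanager.vmem-pmem-ratio | Ratio of virtual to physical memory used when enforcing container memory limits | Default is 2.1; adjust to suit the actual workload |
    | mapred-site.xml | mapreduce.map.memory.mb | Physical memory allocated to each Map task of a job (MiB) | = 2 G |
    | mapred-site.xml | mapreduce.reduce.memory.mb | Physical memory allocated to each Reduce task of a job (MiB) | = 2 * 2 = 4 G |
    | mapred-site.xml | mapreduce.map.java.opts | Java options for the Map process (heap size, e.g. -Xmx) | = 0.8 * 2 = 1.6 G |
    | mapred-site.xml | mapreduce.reduce.java.opts | Java options for the Reduce process (heap size, e.g. -Xmx) | = 0.8 * 2 * 2 = 3.2 G |

    Note: the two yarn.app.mapreduce.am.* settings carry a yarn prefix but are MapReduce properties, so they belong in mapred-site.xml. The *-mb settings take values in MiB, so the G figures above need converting (for example 2 G = 2048).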
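The reference values above are given in G, while the settings themselves expect MiB (and the Java options expect a heap flag). The following is a minimal sketch of that conversion, not a sizing tool: the function names `plan` and `render` are hypothetical, the 2 GiB base container size, the 104 GiB node total and the 0.8 heap factor are taken directly from the reference values, and the output is only the `<property>` fragments, not complete files.

```python
from xml.sax.saxutils import escape

GIB_IN_MIB = 1024
HEAP_RATIO = 0.8  # heap-to-container factor used in the reference values


def plan(node_gib=104, base_gib=2):
    """Turn the reference values (in GiB) into per-file settings (in MiB)."""
    base = base_gib * GIB_IN_MIB   # map task container
    double = 2 * base              # reduce task and ApplicationMaster containers

    def xmx(container_mb):
        # JVM heap is sized below the container limit to leave room for
        # non-heap memory; the result is truncated to whole MiB.
        return f"-Xmx{int(container_mb * HEAP_RATIO)}m"

    return {
        "yarn-site.xml": {
            "yarn.nodemanager.resource.memory-mb": str(node_gib * GIB_IN_MIB),
            "yarn.scheduler.minimum-allocation-mb": str(base),
            "yarn.scheduler.maximum-allocation-mb": str(node_gib * GIB_IN_MIB),
        },
        "mapred-site.xml": {
            "yarn.app.mapreduce.am.resource.mb": str(double),
            "yarn.app.mapreduce.am.command-opts": xmx(double),
            "mapreduce.map.memory.mb": str(base),
            "mapreduce.reduce.memory.mb": str(double),
            "mapreduce.map.java.opts": xmx(base),
            "mapreduce.reduce.java.opts": xmx(double),
        },
    }


def render(props):
    """Render a dict of settings as Hadoop-style <property> elements."""
    lines = []
    for name, value in props.items():
        lines.append("<property>")
        lines.append(f"  <name>{escape(name)}</name>")
        lines.append(f"  <value>{escape(value)}</value>")
        lines.append("</property>")
    return "\n".join(lines)


if __name__ == "__main__":
    for filename, props in plan().items():
        print(f"<!-- {filename} -->")
        print(render(props))
```

On a cluster managed by Cloudera Manager these values would normally be entered through its configuration pages rather than by editing the XML by hand; the fragments are shown only to make the units explicit.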

    Exception log

    'PHYSICAL' memory limit. Current usage: 2.1 GB of 2 GB physical memory used; 21.2 GB of 4.2 GB virtual memory used. Killing container.
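The message above already contains the two limits being enforced. As a rough sketch, the snippet below pulls the four figures out of the message and derives the virtual-to-physical ratio they imply. The `parse_usage` helper is hypothetical, and the pattern only handles figures reported in GB, as in this message.

```python
import re

# Matches the usage summary exactly as it appears in the message above.
USAGE = re.compile(
    r"Current usage: (?P<pmem_used>[\d.]+) GB of (?P<pmem_limit>[\d.]+) GB "
    r"physical memory used; (?P<vmem_used>[\d.]+) GB of "
    r"(?P<vmem_limit>[\d.]+) GB virtual memory used"
)


def parse_usage(message):
    """Return the four GB figures as floats, or None if the text does not match."""
    match = USAGE.search(message)
    if match is None:
        return None
    return {key: float(value) for key, value in match.groupdict().items()}


if __name__ == "__main__":
    line = ("'PHYSICAL' memory limit. Current usage: 2.1 GB of 2 GB physical "
            "memory used; 21.2 GB of 4.2 GB virtual memory used. Killing container.")
    usage = parse_usage(line)
    # Virtual limit divided by physical limit gives the ratio in effect.
    print(usage, usage["vmem_limit"] / usage["pmem_limit"])
```

For this message the virtual limit is 2.1 times the physical limit, which is consistent with the default value of yarn.nodemanager.vmem-pmem-ratio.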
     
    Application application_1543392650432_0855 failed 2 times due to AM Container for appattempt_1543392650432_0855_000002 exited with exitCode: -104
    Failing this attempt.Diagnostics: [2018-12-01 14:57:17.762]Container [pid=31682,containerID=container_1543392650432_0855_02_000001] is running 120156160B beyond the 'PHYSICAL' memory limit. Current usage: 2.1 GB of 2 GB physical memory used; 21.2 GB of 4.2 GB virtual memory used. Killing container.
    Dump of the process-tree for container_1543392650432_0855_02_000001 :
    |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
    |- 1080 31768 31682 31682 (java) 2769 194 3968139264 299128 /usr/java/jdk1.8.0_141-cloudera/bin/java -Dproc_jar -Djava.net.preferIPv4Stack=true -Xmx2147483648 -Djava.net.preferIPv4Stack=true -Dlog4j.configurationFile=hive-log4j2.properties -Dlog4j.configurationFile=hive-log4j2.properties -Djava.util.logging.config.file=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/bin/../lib/hive/bin/../conf/parquet-logging.properties -Dyarn.log.dir=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop/logs -Dyarn.log.file=hadoop.log -Dyarn.home.dir=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop-yarn -Dyarn.root.logger=INFO,console -Djava.library.path=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop/lib/native -Dhadoop.log.dir=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop -Dhadoop.id.str=chenweidong -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml -Dhadoop.security.logger=INFO,NullAppender -Xmx2147483648 -Djava.net.preferIPv4Stack=true -Dlog4j.configurationFile=hive-log4j2.properties -Dlog4j.configurationFile=hive-log4j2.properties -Djava.util.logging.config.file=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/bin/../lib/hive/bin/../conf/parquet-logging.properties org.apache.hadoop.util.RunJar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hive-exec-2.1.1-cdh6.0.1.jar org.apache.hadoop.hive.ql.exec.mr.ExecDriver -libjars 
file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-hadoop2-compat.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-client.jar,file:/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hive/auxlib/hive-exec-2.1.1-cdh6.0.1-core.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-hadoop-compat.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-server.jar,file:/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hive/auxlib/hive-exec-core.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-protocol.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/lib/htrace-core.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-common.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hive/lib/hive-hbase-handler-2.1.1-cdh6.0.1.jar -localtask -plan file:/tmp/yarn/cfb1d927-a086-4b93-af4b-9816f2dc9f49/hive_2018-12-01_14-56-08_101_7346185201514786308-1/-local-10010/plan.xml -jobconffile file:/tmp/yarn/cfb1d927-a086-4b93-af4b-9816f2dc9f49/hive_2018-12-01_14-56-08_101_7346185201514786308-1/-local-10011/jobconf.xml
    |- 31682 31680 31682 31682 (bash) 0 0 11960320 344 /bin/bash -c /usr/java/jdk1.8.0_141-cloudera/bin/java -Dlog4j.configuration=container-log4j.properties -Dlog4j.debug=true -Dyarn.app.container.log.dir=/yarn/container-logs/application_1543392650432_0855/container_1543392650432_0855_02_000001 -Dyarn.app.container.log.filesize=1048576 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dsubmitter.user=chenweidong org.apache.oozie.action.hadoop.LauncherAM 1>/yarn/container-logs/application_1543392650432_0855/container_1543392650432_0855_02_000001/stdout 2>/yarn/container-logs/application_1543392650432_0855/container_1543392650432_0855_02_000001/stderr
    |- 31689 31682 31682 31682 (java) 355 28 14790037504 76787 /usr/java/jdk1.8.0_141-cloudera/bin/java -Dlog4j.configuration=container-log4j.properties -Dlog4j.debug=true -Dyarn.app.container.log.dir=/yarn/container-logs/application_1543392650432_0855/container_1543392650432_0855_02_000001 -Dyarn.app.container.log.filesize=1048576 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dsubmitter.user=chenweidong org.apache.oozie.action.hadoop.LauncherAM
    |- 31768 31756 31682 31682 (java) 1750 114 4003151872 176993 /usr/java/jdk1.8.0_141-cloudera/bin/java -Dproc_jar -Djava.net.preferIPv4Stack=true -Xmx2147483648 -Djava.net.preferIPv4Stack=true -Dlog4j.configurationFile=hive-log4j2.properties -Dlog4j.configurationFile=hive-log4j2.properties -Djava.util.logging.config.file=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/bin/../lib/hive/bin/../conf/parquet-logging.properties -Dyarn.log.dir=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop/logs -Dyarn.log.file=hadoop.log -Dyarn.home.dir=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop-yarn -Dyarn.root.logger=INFO,console -Djava.library.path=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop/lib/native -Dhadoop.log.dir=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop -Dhadoop.id.str=chenweidong -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/bin/../lib/hive/lib/hive-cli-2.1.1-cdh6.0.1.jar org.apache.hadoop.hive.cli.CliDriver --hiveconf hive.query.redaction.rules=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/bin/../lib/hive/conf/redaction-rules.json --hiveconf hive.exec.query.redactor.hooks=org.cloudera.hadoop.hive.ql.hooks.QueryRedactor --hiveconf 
hive.aux.jars.path=file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hive/lib/hive-hbase-handler-2.1.1-cdh6.0.1.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-hadoop-compat.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-hadoop2-compat.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-server.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/lib/htrace-core.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-protocol.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-common.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-client.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/bin/../lib/hive/auxlib/hive-exec-2.1.1-cdh6.0.1-core.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/bin/../lib/hive/auxlib/hive-exec-core.jar -S -v -e
    |- 31756 31689 31682 31682 (initialization_) 0 0 11960320 371 /bin/bash ./initialization_data_step2.sh 20181123 20181129 dwp_order_log_process
     
    [2018-12-01 14:57:17.770]Container killed on request. Exit code is 143
    [2018-12-01 14:57:17.778]Container exited with a non-zero exit code 143.
    For more detailed output, check the application tracking page: https://master.prodcdh.com:8090/cluster/app/application_1543392650432_0855 Then click on links to logs of each attempt.
    . Failing the application.
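The process-tree dump in the log is enough to reproduce the overage that YARN reports. The sketch below copies the VMEM_USAGE(BYTES) and RSSMEM_USAGE(PAGES) columns for the five processes and sums them. The 4096-byte page size is an assumption (it is the usual value on x86-64 Linux and is not stated in the log), and the 2 GiB limit is read from the "2 GB physical memory" figure in the message.

```python
PAGE_SIZE = 4096                 # assumption: 4 KiB pages, not stated in the log
CONTAINER_LIMIT = 2 * 1024 ** 3  # the 2 GB physical limit from the message

# (pid, command, vmem_bytes, rss_pages), copied from the process-tree dump
PROCESS_TREE = [
    (1080, "java", 3968139264, 299128),
    (31682, "bash", 11960320, 344),
    (31689, "java", 14790037504, 76787),
    (31768, "java", 4003151872, 176993),
    (31756, "initialization_", 11960320, 371),
]


def totals(tree, page_size=PAGE_SIZE):
    """Return (resident bytes, virtual bytes) summed over the whole tree."""
    rss_bytes = sum(pages for _, _, _, pages in tree) * page_size
    vmem_bytes = sum(vmem for _, _, vmem, _ in tree)
    return rss_bytes, vmem_bytes


if __name__ == "__main__":
    rss, vmem = totals(PROCESS_TREE)
    print("resident bytes:", rss)
    print("bytes over limit:", rss - CONTAINER_LIMIT)
    print("resident GB:", round(rss / 1024 ** 3, 1))
    print("virtual GB:", round(vmem / 1024 ** 3, 1))
```

Under the 4 KiB page assumption the resident total exceeds the limit by 120156160 bytes, the same figure as in the diagnostics, and the totals round to the reported 2.1 GB physical and 21.2 GB virtual. Most of the resident memory (about 86% of the pages) belongs to the two RunJar JVMs, pids 1080 and 31768, both started with -Xmx2147483648 inside the 2 GB Oozie LauncherAM container.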

    Further reading

    https://stackoverflow.com/questions/21005643/container-is-running-beyond-memory-limits

    https://yq.aliyun.com/articles/25470

  • Original article: https://www.cnblogs.com/chwilliam85/p/10061140.html