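Scenario:
For a job running on the Hadoop cluster, clicking the "application_1438756578740_5947" link shows the ApplicationMaster information, with N nodes running containers. Clicking the logs link of any of those nodes then fails with the following error: "Container does not exist."
According to the Hadoop JIRA, this appears to be a bug in 2.3 that was fixed in 2.4.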
The fix added a comment in ContainerLogsUtils.getContainerLogDirs(), as below:
"It is not required to have null check for container (container == null) and throw back exception. Because when container is completed, NodeManager remove container information from its NMContext. Configuring log aggregation to false, container log view request is forwarded to NM. NM does not have completed container information, but still NM serve request for reading container logs."
https://issues.apache.org/jira/browse/YARN-1206
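PS: In my testing, once I enabled log aggregation (log aggregation set to true) and then started the Hadoop history server process (port 19888), the "Container does not exist." error no longer appeared.
I had also run into this problem before: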
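As a rough sketch of the log aggregation setting mentioned above: in yarn-site.xml this is normally controlled by the yarn.log-aggregation-enable property, which defaults to false. The fragment below is an assumption about how it would look on this cluster, not a copy of the actual configuration used in the test.

```xml
<!-- yarn-site.xml: assumed fragment, enables YARN log aggregation -->
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
```

The NodeManagers generally need a restart to pick up the change. In Hadoop 2.x the history server is typically started with mr-jobhistory-daemon.sh start historyserver.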
Failed redirect for container_1412602970010_0037_01_000002
Failed while trying to construct the redirect url to the log server. Log Server url may not be configured
Container does not exist.
Adding the following to yarn-site.xml addresses it:
<property>
<name>yarn.log.server.url</name>
<value>http://hnn002.dev.com:19888/jobhistory/logs/</value>
</property>
By default, logs under hdfs://user/history/* are not accessible through the JobHistory server.
Changing the permissions on hdfs://user/history/ makes them accessible:
hadoop fs -chmod -R 777 /user/history/
References:
https://groups.google.com/a/cloudera.org/forum/#!topic/cdh-user/HBGzj_NG9_s
https://issues.apache.org/jira/browse/YARN-1206
Please respect the original work; do not repost without permission:
http://blog.csdn.net/stark_summer/article/details/47616773