• [Exception] org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 1 missing files; e.g.:


    1 Exception details

    org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 1 missing files; e.g.: /wm1/link/lib/hadoop-yarn/yarn-nm-recovery/yarn-nm-state/003993.sst
            at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
            at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
            at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartRecoveryStore(NodeManager.java:181)
            at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:245)
            at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
            at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:562)
            at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:609)
    Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 1 missing files; e.g.: /wm1/link/lib/hadoop-yarn/yarn-nm-recovery/yarn-nm-state/003993.sst
            at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
            at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
            at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
            at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.openDatabase(NMLeveldbStateStoreService.java:950)
            at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.initStorage(NMLeveldbStateStoreService.java:937)
            at org.apache.hadoop.yarn.server.nodemanager.recovery.NMStateStoreService.serviceInit(NMStateStoreService.java:210)
            at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
            ... 5 more
    2020-01-06 10:14:24,136 INFO org.apache.hadoop.yarn.server.nodemanager.NodeManager: SHUTDOWN_MSG: 
    /************************************************************
    SHUTDOWN_MSG: Shutting down NodeManager at ****.****
    ************************************************************/
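The key detail in the trace is the file path after "missing files; e.g.:". As a rough sketch (not part of the original troubleshooting), the path can be pulled out of the log text and checked on disk. The function name `missing_file_from_log` and the regular expression are illustrative assumptions based only on the message format shown above.

```python
import re
from pathlib import Path

# Assumed message shape, taken from the log above:
# "Corruption: <n> missing files; e.g.: <path>"
MISSING_FILE_RE = re.compile(r"Corruption: \d+ missing files; e\.g\.: (\S+)")


def missing_file_from_log(log_text):
    """Return the first path reported as missing, or None if no match."""
    match = MISSING_FILE_RE.search(log_text)
    return Path(match.group(1)) if match else None


if __name__ == "__main__":
    line = (
        "org.fusesource.leveldbjni.internal.NativeDB$DBException: "
        "Corruption: 1 missing files; e.g.: "
        "/wm1/link/lib/hadoop-yarn/yarn-nm-recovery/yarn-nm-state/003993.sst"
    )
    path = missing_file_from_log(line)
    # Report whether the file named in the exception is present on this host.
    state = "exists" if path.exists() else "is missing on this host"
    print(f"{path} {state}")
```

Run on the affected host against the NodeManager log, this confirms whether the `.sst` file that the state store expects is really absent.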
    

      

     
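    I located what appears to be the directory in question, /var/lib/hadoop-yarn/yarn-nm-recovery/yarn-nm-state, which contained: 005615.sst 005638.log 005640.log CURRENT LOCK MANIFEST-004397. There is no 003993.sst, the file named in the exception, in this listing. I removed all of these files, after which the NodeManager restarted successfully.

    Looking back, a possible cause is the following: while this NodeManager was stopped, I added new NodeManagers to the cluster, which increased the number of roles. When the failed NodeManager started, it tried to recover from its stored state, found during validation against the database that the counts did not match, and failed to start. That is why I deleted the files in the directory above.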
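The fix above deletes the state files outright. A more cautious variation, sketched here as an assumption rather than what the post actually did, is to move the files into a timestamped backup directory so they can be inspected later. The function name `set_aside_state_store` and the `backup_root` parameter are hypothetical; the NodeManager should be stopped before running anything like this, and `backup_root` must be outside the state directory.

```python
import shutil
import tempfile
import time
from pathlib import Path


def set_aside_state_store(state_dir, backup_root):
    """Move every entry in state_dir into a new timestamped backup directory.

    Returns the path of the backup directory. backup_root must not be
    located inside state_dir.
    """
    state_dir = Path(state_dir)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup_dir = Path(backup_root) / f"yarn-nm-state-{stamp}"
    backup_dir.mkdir(parents=True, exist_ok=False)
    # Materialize the listing first so moving files does not disturb iteration.
    for entry in sorted(state_dir.iterdir()):
        shutil.move(str(entry), str(backup_dir / entry.name))
    return backup_dir


if __name__ == "__main__":
    # Demo on a throwaway directory so no real state store is touched.
    with tempfile.TemporaryDirectory() as tmp:
        state = Path(tmp) / "yarn-nm-state"
        state.mkdir()
        for name in ("005615.sst", "005638.log", "CURRENT", "MANIFEST-004397"):
            (state / name).write_text("placeholder")
        backup = set_aside_state_store(state, Path(tmp) / "backup")
        print("moved:", sorted(p.name for p in backup.iterdir()))
        print("left in state dir:", list(state.iterdir()))
```

On a real node the first argument would be the recovery directory from the post, /var/lib/hadoop-yarn/yarn-nm-recovery/yarn-nm-state. Either way, the NodeManager starts with an empty state store and does not recover its previous state.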
     
     
     
  • Original post: https://www.cnblogs.com/QuestionsZhang/p/12182374.html