• Namenode主节点停止报错 Error: flush failed for required journal


    主节点间歇性报错其他没有问题 ,SNN的NN没有问题,相关的journalNode也都在,就是主节点的NN会停止。

    查看hadoop主节点的NN日志。

    2016-11-21 22:36:40,908 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 19822 ms (timeout=20000 ms) for a response for sendEdits. No responses yet.
    2016-11-21 22:36:41,088 FATAL org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: flush failed for required journal (JournalAndStream(mgr=QJM to [192.168.58.183:8485, 192.168.58.181:8485, 192.168.58.182:8485], stream=QuorumOutputStream starting at txid 24533))
    java.io.IOException: Timed out waiting 20000ms for a quorum of nodes to respond.
    	at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:137)
    	at org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:107)
    	at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113)
    	at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107)
    	at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:533)
    	at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
    	at org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:57)
    	at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:529)
    	at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:639)
    	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2645)
    	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2520)
    	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:579)
    	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:394)
    	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
    	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:975)
    	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
    	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2036)
    	at java.security.AccessController.doPrivileged(Native Method)
    	at javax.security.auth.Subject.doAs(Subject.java:422)
    	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
    	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2034)
    2016-11-21 22:36:41,089 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Aborting QuorumOutputStream starting at txid 24533
    2016-11-21 22:36:41,113 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
    2016-11-21 22:36:41,122 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Slave2/192.168.58.182:8485. Already tried 0 time(s); maxRetries=45
    2016-11-21 22:36:41,123 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: Slave1/192.168.58.181:8485. Already tried 0 time(s); maxRetries=45
    2016-11-21 22:36:41,123 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: StandByNameNode/192.168.58.183:8485. Already tried 0 time(s); maxRetries=45
    2016-11-21 22:36:41,137 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Took 20050ms to send a batch of 1 edits (218 bytes) to remote journal 192.168.58.182:8485
    2016-11-21 22:36:41,137 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Took 20052ms to send a batch of 1 edits (218 bytes) to remote journal 192.168.58.181:8485
    2016-11-21 22:36:41,137 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Took 20065ms to send a batch of 1 edits (218 bytes) to remote journal 192.168.58.183:8485
    2016-11-21 22:36:41,145 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG: 
    /************************************************************
    SHUTDOWN_MSG: Shutting down NameNode at CentOSMaster/192.168.58.180
    ************************************************************/

      首先保证设置dfs.namenode.edits.dir和dfs.journalnode.edits.dir,然后设置在hdfs-site.xml中超时时间如下:

    <property>
       <name>dfs.qjournal.start-segment.timeout.ms</name>
       <value>600000000</value>
      </property>
    
      <property>
       <name>dfs.qjournal.prepare-recovery.timeout.ms</name>
       <value>600000000</value>
      </property>
    
      <property>
       <name>dfs.qjournal.accept-recovery.timeout.ms</name>
       <value>600000000</value>
      </property>
      <property>
       <name>dfs.qjournal.prepare-recovery.timeout.ms</name>
       <value>600000000</value>
      </property>
    
      <property>
       <name>dfs.qjournal.accept-recovery.timeout.ms</name>
       <value>600000000</value>
      </property>
    
      <property>
       <name>dfs.qjournal.finalize-segment.timeout.ms</name>
       <value>600000000</value>
      </property>
    
      <property>
       <name>dfs.qjournal.select-input-streams.timeout.ms</name>
       <value>600000000</value>
      </property>
    
      <property>
       <name>dfs.qjournal.get-journal-state.timeout.ms</name>
       <value>600000000</value>
      </property>
    
      <property>
       <name>dfs.qjournal.new-epoch.timeout.ms</name>
       <value>600000000</value>
      </property>
    
      <property>
       <name>dfs.qjournal.write-txns.timeout.ms</name>
       <value>600000000</value>
      </property>
    

      貌似解决了,至今今天早上没出问题。

  • 相关阅读:
    tomcat发布的class中有一部分类会生成同名的XXX$1.class
    报错:The method encodeBase64String(byte[]) is undefined for the type Base64
    bootstrap中的fileInput上传文件时,文件名称中有-(中划线)改为了_下划线
    java中去html/jsp等前台页面&nbsp;造成的空格
    # 50 个最常被问到的 Selenium 面试问题和答案
    # 为什么测试人员学习测试自动化(仍然)如此困难
    # 如何引进高级的 IT 自动化项目:一个 3 步走计划
    **Selenium IDE、Selenium RC 和 WebDriver 之间有什么区别?**
    pandas 数据分析好的博文
    pandas contains 函数
  • 原文地址:https://www.cnblogs.com/hxsyl/p/6087573.html
Copyright © 2020-2023  润新知