• Kafka error troubleshooting notes


    1. Broker crashed. Keywords: LogDirFailureChannel, NoSuchFileException, "Shutdown broker because all log dirs in /tmp/kafka-logs have failed"

    This was a single-machine, single-node Kafka installation that died after running for a while. Looking back at the log (below) and searching through existing issue reports, this turns out to be a very common problem.

    [2022-03-28 10:36:38,194] ERROR Failed to clean up log for __consumer_offsets-2 in dir /tmp/kafka-logs due to IOException (kafka.server.LogDirFailureChannel)
    java.nio.file.NoSuchFileException: /tmp/kafka-logs/__consumer_offsets-2/00000000000000000000.log
            at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
            at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
            at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
            at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:409)
            at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
            at java.nio.file.Files.move(Files.java:1395)
            at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:806)
            at org.apache.kafka.common.record.FileRecords.renameTo(FileRecords.java:224)
            at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:489)
            at kafka.log.Log.kafka$log$Log$$asyncDeleteSegment(Log.scala:1960)
            at kafka.log.Log$$anonfun$replaceSegments$3.apply(Log.scala:2023)
            at kafka.log.Log$$anonfun$replaceSegments$3.apply(Log.scala:2018)
            at scala.collection.immutable.List.foreach(List.scala:392)
            at kafka.log.Log.replaceSegments(Log.scala:2018)
            at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:582)
            at kafka.log.Cleaner$$anonfun$doClean$4.apply(LogCleaner.scala:512)
            at kafka.log.Cleaner$$anonfun$doClean$4.apply(LogCleaner.scala:511)
            at scala.collection.immutable.List.foreach(List.scala:392)
            at kafka.log.Cleaner.doClean(LogCleaner.scala:511)
            at kafka.log.Cleaner.clean(LogCleaner.scala:489)
            at kafka.log.LogCleaner$CleanerThread.cleanLog(LogCleaner.scala:350)
            at kafka.log.LogCleaner$CleanerThread.cleanFilthiestLog(LogCleaner.scala:319)
            at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:300)
            at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)
            Suppressed: java.nio.file.NoSuchFileException: /tmp/kafka-logs/__consumer_offsets-2/00000000000000000000.log -> /tmp/kafka-logs/__consumer_offsets-2/00000000000000000000.log.deleted
                    at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
                    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
                    at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:396)
                    at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
                    at java.nio.file.Files.move(Files.java:1395)
                    at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:803)
                    ... 17 more
    ..................
    
    [2022-03-28 10:36:39,277] INFO [ReplicaManager broker=0] Broker 0 stopped fetcher for partitions __consumer_offsets-22,logaudit_20220314-0,__consumer_offsets-30,logaudit_20220307-4,logaudit_20220314-2,logaudit_20220314-11,__consumer_offsets-8,logaudit_20220314-7,__consumer_offsets-21,__consumer_offsets-4,__consumer_offsets-27,__consumer_offsets-7,logaudit_20220307-11,__consumer_offsets-9,__consumer_offsets-46,logaudit_20220307-8,__consumer_offsets-25,__consumer_offsets-35,__consumer_offsets-41,__consumer_offsets-33,__consumer_offsets-23,__consumer_offsets-49,logaudit_20220314-8,__consumer_offsets-47,__consumer_offsets-16,__consumer_offsets-28,logaudit_20220307-1,logaudit_20220314-3,__consumer_offsets-31,__consumer_offsets-36,__consumer_offsets-42,__consumer_offsets-3,logaudit_20220307-7,__consumer_offsets-18,__consumer_offsets-37,__consumer_offsets-15,__consumer_offsets-24,logaudit_20220307-6,logaudit_20220314-9,logaudit_20220314-4,__consumer_offsets-38,__consumer_offsets-17,logaudit_20220307-9,__consumer_offsets-48,__consumer_offsets-19,logaudit_20220307-2,__consumer_offsets-11,__consumer_offsets-13,__consumer_offsets-2,__consumer_offsets-43,__consumer_offsets-6,__consumer_offsets-14,logaudit_20220314-5,logaudit_20220314-1,logaudit_20220307-5,logaudit_20220314-6,__consumer_offsets-20,__consumer_offsets-0,logaudit_20220314-10,__consumer_offsets-44,__consumer_offsets-39,logaudit_20220307-3,__consumer_offsets-12,yanbiao_1-0,logaudit_20220307-10,__consumer_offsets-45,__consumer_offsets-1,__consumer_offsets-5,__consumer_offsets-26,__consumer_offsets-29,__consumer_offsets-34,__consumer_offsets-10,__consumer_offsets-32,logaudit_20220307-0,__consumer_offsets-40 and stopped moving logs for partitions  because they are in the failed log directory /tmp/kafka-logs. (kafka.server.ReplicaManager)
    [2022-03-28 10:36:39,279] INFO Stopping serving logs in dir /tmp/kafka-logs (kafka.log.LogManager)
    [2022-03-28 10:36:39,633] ERROR Shutdown broker because all log dirs in /tmp/kafka-logs have failed (kafka.log.LogManager)

    The source code at the error site (LogCleaner.CleanerThread.cleanLog):

    private def cleanLog(cleanable: LogToClean): Unit = {
      val startOffset = cleanable.firstDirtyOffset
      var endOffset = startOffset
      try {
        val (nextDirtyOffset, cleanerStats) = cleaner.clean(cleanable)
        endOffset = nextDirtyOffset
        recordStats(cleaner.id, cleanable.log.name, startOffset, endOffset, cleanerStats)
      } catch {
        case _: LogCleaningAbortedException => // task can be aborted, let it go.
        case _: KafkaStorageException => // partition is already offline. let it go.
        case e: IOException =>
          val logDirectory = cleanable.log.parentDir
          val msg = s"Failed to clean up log for ${cleanable.topicPartition} in dir $logDirectory due to IOException"
          logDirFailureChannel.maybeAddOfflineLogDir(logDirectory, msg, e)
      } finally {
        cleanerManager.doneCleaning(cleanable.topicPartition, cleanable.log.parentDirFile, endOffset)
      }
    }

    The error says a file could not be found. When I installed the broker I left the log location at the default, log.dirs=/tmp/kafka-logs, which means the files in that directory were silently cleaned up out from under Kafka.

    Since nobody would have come along and manually deleted these files for no reason, the most credible explanation found online is that Linux periodically cleans up files under /tmp. The usual workaround is to empty the log.dirs directory and then restart the broker.
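    On systemd-based distributions this periodic cleanup is done by systemd-tmpfiles (older setups use a tmpwatch cron job), which ages out files under /tmp that have not been touched recently. A minimal sketch of how an exclusion could be expressed; the drop-in file name kafka.conf is our own choice, not anything Kafka ships:

    ```shell
    # On systemd distros the stock rule lives in /usr/lib/tmpfiles.d/tmp.conf
    # and often looks like:
    #   v /tmp 1777 root root 10d      # prune entries unused for 10 days
    # An 'x' line tells systemd-tmpfiles to exclude a path from cleanup.
    # This prints the line one would place in /etc/tmpfiles.d/kafka.conf
    # (hypothetical file name) to protect the broker's data:
    printf 'x /tmp/kafka-logs\n'
    ```

    Even with such an exclusion, keeping broker data under /tmp stays fragile (on many systems /tmp is a tmpfs that is wiped on every reboot), so moving log.dirs off /tmp, as described below, is the safer fix.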

    But that workaround is only temporary, and brutally deleting everything loses data (Kafka persists its messages in exactly these log files) or leaves it inconsistent. An issue filed in Apache's JIRA describes the same symptom and notes that, at the time of writing, it has not been resolved:

    https://issues.apache.org/jira/browse/KAFKA-6188
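    The more durable fix is to move the data directory off /tmp entirely. A minimal sketch of the change in config/server.properties, assuming /data/kafka-logs is a path on a persistent filesystem (the name is our own choice):

    ```properties
    # config/server.properties
    # Move segment files out of /tmp so OS tmp-cleanup never touches them.
    # /data/kafka-logs is a hypothetical path; use any persistent location.
    log.dirs=/data/kafka-logs
    ```

    After stopping the broker, either copy the existing contents of /tmp/kafka-logs into the new directory before restarting, or accept starting with empty logs, which on a single-node setup loses retained messages and consumer offsets.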

  • Original post: https://www.cnblogs.com/yb38156/p/16106616.html