• kafka server: Tried to send a message to a replica that is not the leader for some partition. Your metadata is out of date


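    The error is the one in the title.

    Scenario: in a k8s container, an AsyncProducer is created with sarama (a Kafka client written in Go).

    Finding the cause

    1. Turn on sarama's logging (implement the logging interface yourself and reassign Logger)

    1.1 sarama source code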

    package sarama
    
    import (
    	"io/ioutil"
    	"log"
    )
    
    // Logger is the instance of a StdLogger interface that Sarama writes connection
    // management events to. By default it is set to discard all log messages via ioutil.Discard,
    // but you can set it to redirect wherever you want.
    var Logger StdLogger = log.New(ioutil.Discard, "[Sarama] ", log.LstdFlags)
    
    // StdLogger is used to log error messages.
    type StdLogger interface {
    	Print(v ...interface{})
    	Printf(format string, v ...interface{})
    	Println(v ...interface{})
    }
    

    1.2 A concrete implementation in the source code

    type Feedback struct {
    	out *log.Logger
    	log *log.Logger
    }
    
    func (fb *Feedback) Println(v ...interface{}) {
    	fb.output(fmt.Sprintln(v...))
    }
    
    func (fb *Feedback) Printf(format string, v ...interface{}) {
    	fb.output(fmt.Sprintf(format, v...))
    }
    
    func (fb *Feedback) Print(v ...interface{}) {
    	fb.output(fmt.Sprint(v...))
    }
    
    func (fb *Feedback) output(s string) {
    	if fb.out != nil {
    		fb.out.Output(2, s)
    	}
    	if fb.log != nil {
    		fb.log.Output(2, s)
    	}
    }
    

    1.3 A custom logging type

    type ourLog struct {
    }
    
    func (fb *ourLog) Println(v ...interface{}) {
    	log.Debug(fmt.Sprintln(v...))
    }
    
    func (fb *ourLog) Printf(format string, v ...interface{}) {
    	log.Debug(fmt.Sprintf(format, v...))
    }
    
    func (fb *ourLog) Print(v ...interface{}) {
    	log.Debug(fmt.Sprint(v...))
    }
    

    1.4 Reassign sarama.Logger

    // Add this in main
    
    sarama.Logger = &ourLog{}
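
    The type in 1.3 hard-wires one logging call into three methods. A minimal sketch of a more general adapter follows, assuming only the StdLogger interface shown in 1.1. The names funcLogger and Logger are hypothetical: StdLogger is redeclared locally so the example compiles without importing sarama, and the local Logger variable stands in for sarama.Logger. In a real program the assignment would target sarama.Logger instead.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"strings"
)

// StdLogger repeats the interface from section 1.1 so that this sketch
// compiles without importing sarama.
type StdLogger interface {
	Print(v ...interface{})
	Printf(format string, v ...interface{})
	Println(v ...interface{})
}

// Logger stands in for sarama.Logger in this sketch (assumption).
var Logger StdLogger

// funcLogger (hypothetical name) adapts any "log one string" function,
// such as a leveled logger's Debug call, to the StdLogger interface.
type funcLogger func(msg string)

func (f funcLogger) Print(v ...interface{}) { f(fmt.Sprint(v...)) }

func (f funcLogger) Printf(format string, v ...interface{}) {
	f(fmt.Sprintf(format, v...))
}

// Println drops the newline that fmt.Sprintln appends, because most
// loggers add their own line ending.
func (f funcLogger) Println(v ...interface{}) {
	f(strings.TrimSuffix(fmt.Sprintln(v...), "\n"))
}

func main() {
	// Forward everything to a standard library logger on stderr.
	std := log.New(os.Stderr, "[Sarama] ", log.LstdFlags)
	Logger = funcLogger(func(msg string) { std.Print(msg) })

	Logger.Printf("producer/broker/%d starting up", 1003)
}
```

    A *log.Logger from the standard library already has Print, Printf and Println, so it can also be assigned directly when no leveled logger is needed. The adapter is mainly useful when the target logger exposes a different method set.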
    

    2. After restarting the docker service in k8s, check the program's log output

    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/leader/bi-data-cti-prod-31/4 state change to [retrying-1]
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/leader/bi-data-cti-prod-31/4 abandoning broker 1003
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/broker/1003 state change to [closed] on bi-data-cti-prod-31/4
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/broker/1003 shut down
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) client/metadata fetching metadata for [bi-data-cti-prod-31] from broker sany-onprem-repm-node03:9092
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/broker/1003 starting up
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/broker/1003 state change to [open] on bi-data-cti-prod-31/4
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/leader/bi-data-cti-prod-31/4 selected broker 1003
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/leader/bi-data-cti-prod-31/4 state change to [flushing-1]
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/leader/bi-data-cti-prod-31/4 state change to [normal]
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/broker/1003 state change to [retrying] on bi-data-cti-prod-31/4 because kafka server: Tried to send a message to a replica that is not the leader for some partition. Your metadata is out of date.
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/leader/bi-data-cti-prod-31/4 state change to [retrying-2]
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/leader/bi-data-cti-prod-31/4 abandoning broker 1003
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/broker/1003 state change to [closed] on bi-data-cti-prod-31/4
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/broker/1003 shut down
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) client/metadata fetching metadata for [bi-data-cti-prod-31] from broker sany-onprem-repm-node03:9092
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/broker/1003 starting up
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/broker/1003 state change to [open] on bi-data-cti-prod-31/4
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/leader/bi-data-cti-prod-31/4 selected broker 1003
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/leader/bi-data-cti-prod-31/4 state change to [flushing-2]
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/leader/bi-data-cti-prod-31/4 state change to [normal]
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/broker/1003 state change to [retrying] on bi-data-cti-prod-31/4 because kafka server: Tried to send a message to a replica that is not the leader for some partition. Your metadata is out of date.
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/leader/bi-data-cti-prod-31/4 state change to [retrying-3]
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/leader/bi-data-cti-prod-31/4 abandoning broker 1003
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/broker/1003 state change to [closed] on bi-data-cti-prod-31/4
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/broker/1003 shut down
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) client/metadata fetching metadata for [bi-data-cti-prod-31] from broker sany-onprem-repm-node03:9092
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/broker/1003 starting up
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/broker/1003 state change to [open] on bi-data-cti-prod-31/4
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/leader/bi-data-cti-prod-31/4 selected broker 1003
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/leader/bi-data-cti-prod-31/4 state change to [flushing-3]
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/leader/bi-data-cti-prod-31/4 state change to [normal]
    [11:25:58 CST 2020/08/06] [DEBG] (main.(*ourLog).Printf:47) producer/broker/1003 state change to [retrying] on bi-data-cti-prod-31/4 because kafka server: Tried to send a message to a replica that is not the leader for some partition. Your metadata is out of date.
    [11:25:58 CST 2020/08/06] [DEBG] (app/kafka.(*AsyncProducer).run:92) p-,&{addrs:[sany-onprem-repm-node02:9092 sany-onprem-repm-node03:9092 sany-onprem-repm-node01:9092] username: password: certFile: channelBufferSize:102400 producer:0xc4202ec8c0 done:0xc4202142a0}
    [11:25:58 CST 2020/08/06] [DEBG] (app/kafka.(*AsyncProducer).run:93) p.producer-, &{client:0xc420218300 conf:0xc420092300 ownClient:true errors:0xc4202c8300 input:0xc4202c8360 successes:0xc4202c83c0 retries:0xc4202c8420 inFlight:{noCopy:{} state1:[0 0 0 0 0 0 0 0 0 0 0 0] sema:0} brokers:map[0xc4200ea160:0xc4200a4240 0xc4200ea6e0:0xc4204990e0 0xc4200eb080:0xc4200a48a0] brokerRefs:map[0xc4200a4240:2 0xc4204990e0:1 0xc4200a48a0:1] brokerLock:{state:0 sema:0}}
    [11:25:58 CST 2020/08/06] [DEBG] (app/kafka.(*AsyncProducer).run:94) p.producer.Input-,0xc4202c8360
    [11:25:58 CST 2020/08/06] [DEBG] (app/kafka.(*AsyncProducer).run:95) p.producer.Successes-, 0xc4202c8360
    [11:25:58 CST 2020/08/06] [EROR] (app/kafka.(*AsyncProducer).run:96) kafka: Failed to produce message to topic bi-data-cti-prod-31: kafka server: Tried to send a message to a replica that is not the leader for some partition. Your metadata is out of date., &{Topic:bi-data-cti-prod-31 Key:119 Value:{"MainType":2,"ExtType":9,"Mode":2,"ModeParm":"ag-19","MSGID":"753","TELID":"def","MSG":{"vcc_id":"1","ag_id":"19","que_id":["1"],"grp_id":"0","ag_sta":"3","ag_sta_reason":"1","ag_sta_id":"7","ag_sta_bef":"2","ag_sta_time":"1596684344"}} Metadata:<nil> Offset:0 Partition:4 Timestamp:0001-01-01 00:00:00 +0000 UTC retries:0 flags:0}
    

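    3. Log analysis

    By default the client refreshes its metadata from a Kafka broker every 10 minutes, and a newly created Producer retries a failed send 3 times. When none of the 3 retries obtains up-to-date metadata, the current message cannot be written to Kafka and the error above is returned. In the log, every retry re-fetches the metadata for the topic, selects broker 1003 again, and is rejected by that broker again; all three retries are used up within the same second.

    kafka server: Tried to send a message to a replica that is not the leader for some partition. Your metadata is out of date

    When the program was deployed on the server where Kafka itself is installed, stale metadata was never observed. When the same program runs in a k8s container, the problem shows up from time to time.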

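    4. Handling
    No fix has been found yet. One thing to try is increasing the producer's retries value (Producer.Retry.Max in sarama's Config).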
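
    To see why a larger retries value might help, here is a simplified model of the retry window. It is not sarama's implementation: retryPolicy, delivered and assumedOutage are hypothetical names, and the one-second period during which the cluster keeps reporting a leader that rejects writes is an assumed figure. The first policy uses sarama's default values for Producer.Retry.Max (3) and Producer.Retry.Backoff (100ms); the second uses illustrative larger values.

```go
package main

import (
	"fmt"
	"time"
)

// retryPolicy is a hypothetical stand-in for the producer's two retry settings.
type retryPolicy struct {
	Max     int           // retries allowed after the first failed attempt
	Backoff time.Duration // wait before each retry
}

var (
	// Values matching sarama's defaults for Producer.Retry.
	defaultPolicy = retryPolicy{Max: 3, Backoff: 100 * time.Millisecond}
	// Illustrative larger values, not a recommendation.
	widerPolicy = retryPolicy{Max: 10, Backoff: 500 * time.Millisecond}
)

// assumedOutage is an assumed duration during which the reported leader
// keeps rejecting writes.
const assumedOutage = time.Second

// delivered simulates the attempts on a virtual clock (no sleeping). An
// attempt succeeds once the elapsed time has reached staleFor. It returns
// whether the message got through and how many attempts were made.
func delivered(p retryPolicy, staleFor time.Duration) (ok bool, attempts int) {
	elapsed := time.Duration(0)
	for attempt := 0; attempt <= p.Max; attempt++ {
		if elapsed >= staleFor {
			return true, attempt + 1
		}
		elapsed += p.Backoff // back off before the next retry
	}
	return false, p.Max + 1
}

func main() {
	for _, p := range []retryPolicy{defaultPolicy, widerPolicy} {
		ok, attempts := delivered(p, assumedOutage)
		fmt.Printf("max=%d backoff=%v delivered=%v attempts=%d\n",
			p.Max, p.Backoff, ok, attempts)
	}
}
```

    Under these assumptions the default policy gives up after 4 attempts (the first send plus 3 retries), all inside roughly 300ms, while the wider policy gets through on its 3rd attempt. This suggests that the backoff between retries matters as much as the retry count, so raising Producer.Retry.Backoff together with Producer.Retry.Max may be worth trying.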

     

      

      

  • Original article: https://www.cnblogs.com/long-yuan/p/13445404.html