• Resetting the offsets of a running consumer group from outside


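    With Kafka 0.10.1 and later, how can the offsets of a running consumer group be reset from outside? For example, an admin console might reset any consumer group's offsets to where they were 12 hours ago, while the user's application keeps running and never has to be taken offline or restarted.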
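
    To make "12 hours ago" concrete: a reset first needs a target timestamp and then, per partition, the offset that corresponds to it. Since 0.10.1 the Java consumer offers offsetsForTimes, which returns the earliest offset whose timestamp is greater than or equal to the requested one. The sketch below only models that lookup with a TreeMap and does not talk to Kafka; ResetTarget and its method names are hypothetical and not part of the original project.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical helper for illustration; it does not call Kafka.
final class ResetTarget {

    // Epoch-millisecond timestamp lying `hours` hours before `now`.
    static long hoursBefore(Instant now, long hours) {
        return now.minus(Duration.ofHours(hours)).toEpochMilli();
    }

    // Earliest offset whose timestamp is >= targetTs, or -1 if there is none.
    // `index` maps message timestamp (ms) -> offset for a single partition.
    static long earliestOffsetAtOrAfter(NavigableMap<Long, Long> index, long targetTs) {
        Map.Entry<Long, Long> entry = index.ceilingEntry(targetTs);
        return entry == null ? -1L : entry.getValue();
    }

    public static void main(String[] args) {
        long target = hoursBefore(Instant.now(), 12);
        NavigableMap<Long, Long> index = new TreeMap<>();
        index.put(target - 1_000, 100L); // written one second before the target
        index.put(target + 1_000, 101L); // written one second after the target
        System.out.println(earliestOffsetAtOrAfter(index, target)); // prints 101
    }
}
```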

    The procedure takes these steps:

    1. Join the group.

    2. Evict all other group members.

    3. Try to assign every TopicPartition to this client.

    4. Commit the offsets.

    5. Leave the group.

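    Step 2 is there so that this client becomes the leader. It may of course become leader without evicting every other member, because which member is leader actually follows the iteration order of a hashmap.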
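
    The hashmap remark can be illustrated with a toy model. The snippet below is not broker code; the single-letter member IDs and the class name are made up. It only shows that the "first" key of a HashMap is decided by hashing, not by who joined first, which is why an external client cannot count on becoming leader merely by joining.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: the member IDs and this class are hypothetical.
final class LeaderByIteration {

    // Returns whichever key the map's iterator yields first.
    static String firstKey(Map<String, String> members) {
        return members.keySet().iterator().next();
    }

    public static void main(String[] args) {
        Map<String, String> joinOrder = new LinkedHashMap<>();
        joinOrder.put("c", "resetter"); // joined first
        joinOrder.put("a", "worker-1");
        joinOrder.put("b", "worker-2");

        // Same entries, but stored in a plain HashMap.
        Map<String, String> hashed = new HashMap<>(joinOrder);

        System.out.println(firstKey(joinOrder)); // "c": a LinkedHashMap keeps join order
        System.out.println(firstKey(hashed));    // decided by hash buckets, not by join order
    }
}
```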

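    Once it is the consumer group's leader, the client has to assign every partition to itself, which takes a special PartitionAssignor. This assignor's protocol differs from the one the group's other consumers use (although one could also write an assignor that nominally speaks the same protocol but applies different logic), and the coordinator rejects members whose protocol differs from the one the current leader uses, so the other members still have to be evicted.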
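
    A simplified model of that compatibility rule may help. It assumes the coordinator admits a joiner only if the joiner shares at least one assignment protocol with all current members, and that an empty group admits anyone. ProtocolCheck is a hypothetical name, and the real check lives in the broker.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified model of the coordinator's protocol check; not broker code.
final class ProtocolCheck {

    // Protocols supported by every current member.
    static Set<String> candidates(Collection<Set<String>> memberProtocols) {
        Set<String> common = null;
        for (Set<String> protocols : memberProtocols) {
            if (common == null) common = new HashSet<>(protocols);
            else common.retainAll(protocols);
        }
        return common == null ? new HashSet<>() : common;
    }

    // An empty group accepts anyone; otherwise one protocol must be shared.
    static boolean canJoin(Collection<Set<String>> memberProtocols, Set<String> joiner) {
        if (memberProtocols.isEmpty()) return true;
        Set<String> shared = candidates(memberProtocols);
        shared.retainAll(joiner);
        return !shared.isEmpty();
    }

    public static void main(String[] args) {
        List<Set<String>> running = List.of(Set.of("range", "roundrobin"), Set.of("range"));
        // The resetting client only speaks "exclusive", so it is rejected...
        System.out.println(canJoin(running, Set.of("exclusive"))); // false
        // ...until the group has been emptied.
        System.out.println(canJoin(List.of(), Set.of("exclusive"))); // true
    }
}
```

    Under this model the same rule also works in the resetter's favour: while it is the only member, consumers that speak only "range" cannot rejoin.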

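    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    import org.apache.kafka.clients.consumer.internals.AbstractPartitionAssignor;
    import org.apache.kafka.common.TopicPartition;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    // Assigns every partition to the group leader and none to the other members.
    public class ExclusiveAssignor extends AbstractPartitionAssignor {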
    
        public interface Callback {
            void onSuccess();
        }
    
    
        private static final Logger LOGGER = LoggerFactory.getLogger(ExclusiveAssignor.class);
    
        public static final String NAME = "exclusive";
    
    
        private String leaderId = null;
        private Callback callback = null;
    
        public void setLeaderId(String leaderId) {
            this.leaderId = leaderId;
        }
        public void setCallBack(Callback callBack){this.callback = callBack;}
    
    
        @Override
        public String name() {
            return NAME;
        }
    
        // Inverts member -> topics into topic -> members. Not used by assign() below.
        private Map<String, List<String>> consumersPerTopic(Map<String, List<String>> consumerMetadata) {
            Map<String, List<String>> res = new HashMap<>();
            for (Map.Entry<String, List<String>> subscriptionEntry : consumerMetadata.entrySet()) {
                String consumerId = subscriptionEntry.getKey();
                for (String topic : subscriptionEntry.getValue())
                    put(res, topic, consumerId);
            }
            return res;
        }
    
    
        @Override
        public Map<String, List<TopicPartition>> assign(Map<String, Integer> partitionsPerTopic,
                                                        Map<String, List<String>> subscriptions) {
            LOGGER.info("perform exclusive assign");
            if(leaderId == null)
                throw new IllegalArgumentException("leaderId must be set before assign is called");
            if(callback == null)
                throw new IllegalArgumentException("callback must be set before assign is called");
    
            List<TopicPartition> allPartitions = new ArrayList<TopicPartition>();
            partitionsPerTopic.forEach((topic, partitionNumber) -> {
                for(int i=0; i < partitionNumber; i++)
                    allPartitions.add(new TopicPartition(topic, i));
            });
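            // Every member gets an entry, but only the leader receives partitions.
            Map<String, List<TopicPartition>> assignment = new HashMap<>();
            for (String memberId : subscriptions.keySet()) {
                assignment.put(memberId, new ArrayList<TopicPartition>());
                if(memberId.equals(leaderId)){
                    assignment.get(memberId).addAll(allPartitions);
                }
            }
            // Tell the caller that the exclusive assignment has been computed.
            callback.onSuccess();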
            return assignment;
        }
    
    }
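
    As a worked example of what such an assignment looks like, here is a stdlib-only sketch of the same rule. Plain strings stand in for TopicPartition, and the topic name, member IDs and class name are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-alone illustration of the "leader takes everything" rule.
final class ExclusiveAssignDemo {

    static Map<String, List<String>> assign(Map<String, Integer> partitionsPerTopic,
                                            Collection<String> memberIds,
                                            String leaderId) {
        // Enumerate every partition of every topic as "topic-partition".
        List<String> all = new ArrayList<>();
        partitionsPerTopic.forEach((topic, count) -> {
            for (int i = 0; i < count; i++) all.add(topic + "-" + i);
        });
        // The leader receives the full list; everyone else an empty one.
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String member : memberIds) {
            result.put(member, member.equals(leaderId) ? new ArrayList<>(all) : new ArrayList<>());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(assign(Map.of("orders", 3), List.of("resetter", "worker-1"), "resetter"));
        // prints {resetter=[orders-0, orders-1, orders-2], worker-1=[]}
    }
}
```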

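    The assignor needs to know which member is the leader, and the leaderId is available inside the consumer's

     protected Map<String, ByteBuffer> performAssignment(String leaderId,
                                                            String assignmentStrategy,
                                                            Map<String, ByteBuffer> allSubscriptions)

    method (implemented in ConsumerCoordinator, which KafkaConsumer uses internally). KafkaConsumer's code therefore has to be modified as well, both to hand the leaderId to the assignor and to make sure that this KafkaConsumer's poll does not actually fetch messages and the client only performs the commit.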

    The other members can be evicted with AdminClient:

      def forceLeave(coordinator: Node, memberId: String, groupId: String) = {
        logger.info(s"forcing group member: $memberId to leave group: $groupId ")
        send(coordinator, ApiKeys.LEAVE_GROUP, new LeaveGroupRequest(groupId, memberId))
      }
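
    forceLeave removes one member at a time. The final logic further down calls a clearCurrentMembers helper with the resetting client's own ID, presumably so that everyone except that client is evicted. The selection step might look like the sketch below. It assumes members are identified by a client ID; EvictionList and the sample IDs are hypothetical, and no request is actually sent.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Chooses which members to evict; sending LeaveGroupRequest is left out.
final class EvictionList {

    // Returns every member ID whose client ID differs from ours.
    static List<String> membersToEvict(Map<String, String> memberIdToClientId, String ownClientId) {
        List<String> toEvict = new ArrayList<>();
        for (Map.Entry<String, String> entry : memberIdToClientId.entrySet()) {
            if (!entry.getValue().equals(ownClientId)) toEvict.add(entry.getKey());
        }
        return toEvict;
    }

    public static void main(String[] args) {
        Map<String, String> members = new LinkedHashMap<>();
        members.put("member-1", "app-consumer");
        members.put("member-2", "offset-resetter"); // ourselves
        members.put("member-3", "app-consumer");
        System.out.println(membersToEvict(members, "offset-resetter"));
        // prints [member-1, member-3]
    }
}
```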

    The final logic is:

      private def forceCommit(consumer: SimpleKafkaConsumer[_, _], groupId: String, topics: Seq[String], maxRetries: Int, toCommit: Map[TopicPartition, OffsetAndMetadata], coordinatorOpt: Option[Node] = None) = {
        consumer.subscribe(JavaConversions.seqAsJavaList(topics))
        val assignedAll = new AtomicBoolean(false)
        consumer.setExclusiveAssignorCallback(new Callback {
          override def onSuccess(): Unit = assignedAll.set(true)
        })
        var currentRetries = 0
        val coordinatorNode = coordinatorOpt.getOrElse(adminClient.findCoordinator(groupId))
        while (!assignedAll.get() && currentRetries < maxRetries) {
          logger.info(s"trying to reset offset for $groupId, retry count $currentRetries  ....")
          clearCurrentMembers(coordinatorNode, groupId, Some(ConsumerGroupManager.magicConsumerId))
          consumer.poll(5000)
          printCurrentAssignment(consumer)
          currentRetries = currentRetries + 1
        }
        if (!assignedAll.get())
          throw new RuntimeException(s"retry exhausted when getting leadership of $groupId")
        val javaOffsetToCommit = JavaConversions.mapAsJavaMap(toCommit)
        consumer.commitSync(javaOffsetToCommit)
        logger.info(s"successfully committed offset for $groupId: $toCommit")
        consumer.unsubscribe()
      }
      def forceReset(offsetLookupActor: ActorRef, groupId: String, ts: Long, maxRetries: Int)(implicit executionContext: ExecutionContext): Boolean = {
        logger.info(s"resetting offset for $groupId to $ts")
        val groupSummary = adminClient.describeConsumerGroup(groupId)
        val topics = groupSummary.subscribedTopics
        if (topics.isEmpty)
          throw new IllegalStateException(s"group $groupId is not currently subscribed to any topic")
        val offsetToCommit = getOffsetsBehindTs(offsetLookupActor, topics, ts, 10000)
        val consumer = createConsumer(groupId)
        try {
          forceCommit(consumer, groupId, topics, maxRetries, offsetToCommit)
          true
        } finally {
          consumer.close()
        }
      }
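
    A retry loop of the shape used in forceCommit has one subtlety: the counter equals maxRetries both when every attempt failed and when the very last attempt succeeded, so the outcome is best read from the success flag rather than from the counter. A minimal sketch of that idea, with a hypothetical Retry helper that is not part of the original project:

```java
import java.util.function.BooleanSupplier;

// Generic bounded retry; reports the outcome instead of exposing a counter.
final class Retry {

    // Runs `attempt` at most maxRetries times and returns true as soon as it succeeds.
    static boolean untilSuccess(BooleanSupplier attempt, int maxRetries) {
        for (int i = 0; i < maxRetries; i++) {
            if (attempt.getAsBoolean()) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Succeeds on the third and final attempt.
        boolean ok = untilSuccess(() -> ++calls[0] == 3, 3);
        System.out.println(ok); // prints true
    }
}
```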

    For the full code, see https://github.com/iBuddha/kafka-simple-ui/blob/master/app/kafka/authorization/manager/utils/ConsumerGroupManager.scala

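    Note that sending a LeaveGroupRequest may break some members' connections to the broker. This happens when a consumer has sent a JoinGroupRequest and an external client then sends a LeaveGroupRequest that kicks that consumer out: the consumer never receives its JoinGroupResponse, so its NetworkClient concludes that the connection is dead. The client reconnects afterwards, though. Moreover, while the external client is evicting the other members and committing the new offsets, the other consumers may never get a chance to join the group, and so may not be affected by this problem at all.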

  • Original article: https://www.cnblogs.com/devos/p/7223867.html