Kafka Producer (Part 1)


    1. Producer Message Sending Flow

    1.1 How Sending Works

    Message sending involves two threads: the main thread and the Sender thread. The main thread writes records into a RecordAccumulator (a buffer of per-partition double-ended queues), and the Sender thread continuously pulls batches from the RecordAccumulator and sends them to the Kafka broker.
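
    Three producer parameters govern this buffering. A minimal sketch, with the Kafka 3.0 defaults written out explicitly for illustration:

        // buffer.memory: total size of the RecordAccumulator (default 32 MB)
        properties.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 33554432L);
        // batch.size: a full batch (default 16 KB) is handed off to the Sender thread
        properties.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384);
        // linger.ms: max wait before sending a non-full batch (default 0, i.e. send immediately)
        properties.put(ProducerConfig.LINGER_MS_CONFIG, 0);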

    1.2 Producer Parameter List
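
    Commonly used producer parameters and their Kafka 3.0 defaults (a brief summary; see the official documentation for the full list):

    bootstrap.servers                    broker address list for the initial connection, e.g. hadoop103:9092
    key.serializer / value.serializer    fully qualified serializer classes for the record key and value (required)
    buffer.memory                        total size of the RecordAccumulator, default 32 MB
    batch.size                           maximum size of one batch, default 16 KB
    linger.ms                            max wait before sending a non-full batch, default 0 ms
    acks                                 0 (no wait), 1 (leader only), -1/all (leader + all ISR replicas), default all
    retries                              automatic retries on send failure, default Integer.MAX_VALUE
    compression.type                     none, gzip, snappy, lz4, or zstd; default none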

    2. Asynchronous Send API

    2.1 Plain Asynchronous Send

    Add the Maven dependency:

            <dependency>
                <groupId>org.apache.kafka</groupId>
                <artifactId>kafka-clients</artifactId>
                <version>3.0.0</version>
            </dependency>

    Asynchronous send test code:

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    import java.util.Properties;

    public class CustomProducer {
        public static void main(String[] args) {
            Properties properties = new Properties();
            // connect to the Kafka brokers (bootstrap.servers, not ZooKeeper)
            properties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "hadoop103:9092");
            // configure key/value serializers
            properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

            // 1. create the producer
            KafkaProducer<String, String> kafkaProducer = new KafkaProducer<>(properties);
            // 2. send data asynchronously
            for (int i = 0; i < 5; i++) {
                kafkaProducer.send(new ProducerRecord<>("first", i + "  hello wdh01"));
            }
            // 3. release resources
            kafkaProducer.close();
        }
    }

    Start a Kafka console consumer to read the data:

    [hui@hadoop103 kafka]$ bin/kafka-console-consumer.sh --bootstrap-server hadoop103:9092 --topic first
    0  hello wdh01
    1  hello wdh01
    2  hello wdh01
    3  hello wdh01
    4  hello wdh01

    2.2 Asynchronous Send with Callback

    The callback is invoked asynchronously when the producer receives the ack. It takes two arguments: the record metadata (RecordMetadata) and an exception (Exception). If the Exception is null, the send succeeded; if it is not null, the send failed.

    Note: a failed send is retried automatically by the producer; there is no need to retry manually in the callback.
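
    The retry behavior is configurable; a hedged sketch (in Kafka 3.0, retries defaults to Integer.MAX_VALUE and the retry interval to 100 ms):

        // cap the automatic retries and set the wait between attempts
        properties.put(ProducerConfig.RETRIES_CONFIG, 3);
        properties.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, 100L);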

    import org.apache.kafka.clients.producer.Callback;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.clients.producer.RecordMetadata;
    import org.apache.kafka.common.serialization.StringSerializer;

    import java.util.Properties;

    public class CustomProducerCallBack {
        public static void main(String[] args) {
            Properties properties = new Properties();
            // connect to the Kafka brokers
            properties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "hadoop103:9092");
            // configure key/value serializers
            properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

            // 1. create the producer
            KafkaProducer<String, String> kafkaProducer = new KafkaProducer<>(properties);
            // 2. send data asynchronously, with a callback per record
            for (int i = 0; i < 5; i++) {
                kafkaProducer.send(new ProducerRecord<>("first", i + "  hello wdh01"), new Callback() {
                    // invoked asynchronously when the producer receives the ack
                    @Override
                    public void onCompletion(RecordMetadata metadata, Exception exception) {
                        if (exception == null) {
                            System.out.println("topic " + metadata.topic() + " partition " + metadata.partition());
                        }
                    }
                });
            }
            // 3. release resources
            kafkaProducer.close();
        }
    }

    Consumed data:

    [hui@hadoop103 kafka]$ bin/kafka-console-consumer.sh --bootstrap-server hadoop103:9092 --topic first
    0  hello wdh01
    1  hello wdh01
    2  hello wdh01
    3  hello wdh01
    4  hello wdh01

    Callback output on the console:

    topic first partition 1
    topic first partition 1
    topic first partition 1
    topic first partition 1
    topic first partition 1

    3. Synchronous Send API

    Simply call get() on the Future returned by the asynchronous send; get() blocks until the broker acknowledges the record (or throws if the send fails).

    public class CustomProducerSync {
        // imports identical to CustomProducer above, plus java.util.concurrent.ExecutionException
        public static void main(String[] args) throws ExecutionException, InterruptedException {
            Properties properties = new Properties();
            // connect to the Kafka brokers
            properties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "hadoop103:9092");
            // configure key/value serializers
            properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

            // 1. create the producer
            KafkaProducer<String, String> kafkaProducer = new KafkaProducer<>(properties);
            // 2. send data synchronously: get() blocks until the send completes
            for (int i = 0; i < 5; i++) {
                kafkaProducer.send(new ProducerRecord<>("first", i + "  hello wdh01")).get();
            }
            // 3. release resources
            kafkaProducer.close();
        }
    }
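
    The Future returned by send() carries the record metadata, so a synchronous send can also inspect where the record landed. A small sketch:

        RecordMetadata metadata = kafkaProducer.send(new ProducerRecord<>("first", "hello wdh01")).get();
        System.out.println("partition=" + metadata.partition() + " offset=" + metadata.offset());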

    Consumed data:

    [hui@hadoop103 kafka]$ bin/kafka-console-consumer.sh --bootstrap-server hadoop103:9092 --topic first
    0  hello wdh01
    1  hello wdh01
    2  hello wdh01
    3  hello wdh01
    4  hello wdh01

    4. Producer Partitioning

    4.1 Benefits of Partitioning

    1. Better use of storage resources: each partition is stored on one broker, so a huge data set can be split into partition-sized chunks spread across multiple brokers. Distributing partitions sensibly balances load across the cluster.
    2. Higher parallelism: producers can send data partition by partition, and consumers can consume partition by partition (see the topic-creation sketch below).
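
    For reference, a multi-partition topic such as first (the outputs below show partitions 0, 1, and 2) could be created like this; the partition and replication counts here are assumptions matching this post's setup:

        [hui@hadoop103 kafka]$ bin/kafka-topics.sh --bootstrap-server hadoop103:9092 --create --topic first --partitions 3 --replication-factor 1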

    4.2 Partitioning Strategy

    The partitioning strategy is documented in detail on DefaultPartitioner; in IDEA, press Ctrl+N and type DefaultPartitioner to view its source:

    /**
     * The default partitioning strategy:
     * <ul>
     * <li>If a partition is specified in the record, use it
     * <li>If no partition is specified but a key is present choose a partition based on a hash of the key
     * <li>If no partition or key is present choose the sticky partition that changes when the batch is full.
     * 
     * See KIP-480 for details about sticky partitioning.
     */
    public class DefaultPartitioner implements Partitioner {

    The following constructors all take an explicit partition, and the given value is used directly as the partition. For example, with partition=0, all records are written to partition 0.

     public ProducerRecord(String topic, Integer partition, Long timestamp, K key, V value, Iterable<Header> headers) {
     public ProducerRecord(String topic, Integer partition, Long timestamp, K key, V value) {
     public ProducerRecord(String topic, Integer partition, K key, V value, Iterable<Header> headers) {
     public ProducerRecord(String topic, Integer partition, K key, V value) {

    Their implementations:

        public ProducerRecord(String topic, Integer partition, Long timestamp, K key, V value, Iterable<Header> headers) {
            if (topic == null)
                throw new IllegalArgumentException("Topic cannot be null.");
            if (timestamp != null && timestamp < 0)
                throw new IllegalArgumentException(
                        String.format("Invalid timestamp: %d. Timestamp should always be non-negative or null.", timestamp));
            if (partition != null && partition < 0)
                throw new IllegalArgumentException(
                        String.format("Invalid partition: %d. Partition number should always be non-negative or null.", partition));
            this.topic = topic;
            this.partition = partition;
            this.key = key;
            this.value = value;
            this.timestamp = timestamp;
            this.headers = new RecordHeaders(headers);
        }
    
        /**
         * Creates a record with a specified timestamp to be sent to a specified topic and partition
         *
         * @param topic The topic the record will be appended to
         * @param partition The partition to which the record should be sent
         * @param timestamp The timestamp of the record, in milliseconds since epoch. If null, the producer will assign the
         *                  timestamp using System.currentTimeMillis().
         * @param key The key that will be included in the record
         * @param value The record contents
         */
        public ProducerRecord(String topic, Integer partition, Long timestamp, K key, V value) {
            this(topic, partition, timestamp, key, value, null);
        }
    
        /**
         * Creates a record to be sent to a specified topic and partition
         *
         * @param topic The topic the record will be appended to
         * @param partition The partition to which the record should be sent
         * @param key The key that will be included in the record
         * @param value The record contents
         * @param headers The headers that will be included in the record
         */
        public ProducerRecord(String topic, Integer partition, K key, V value, Iterable<Header> headers) {
            this(topic, partition, null, key, value, headers);
        }
        
        /**
         * Creates a record to be sent to a specified topic and partition
         *
         * @param topic The topic the record will be appended to
         * @param partition The partition to which the record should be sent
         * @param key The key that will be included in the record
         * @param value The record contents
         */
        public ProducerRecord(String topic, Integer partition, K key, V value) {
            this(topic, partition, null, key, value, null);
        }

    With the constructor below, no partition is specified but a key is present: the partition is the key's hash modulo the topic's partition count. For example, if key1 hashes to 5, key2 hashes to 6, and the topic has 2 partitions, then key1's value is written to partition 1 and key2's value to partition 0.

        public ProducerRecord(String topic, K key, V value) {
            this(topic, null, null, key, value, null);
        }
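
    For context, the core of DefaultPartitioner.partition() in Kafka 3.0 looks roughly like this: murmur2-hash the key bytes, map the hash to a non-negative int, and take the remainder by the partition count; keyless records fall through to the sticky partition cache.

        public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) {
            if (keyBytes == null) {
                // no key: use (and stick to) the cached sticky partition for this topic
                return stickyPartitionCache.partition(topic, cluster);
            }
            List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
            int numPartitions = partitions.size();
            // hash the key bytes and mod by the partition count
            return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
        }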

    Finally, with neither a partition nor a key, Kafka uses the sticky partitioner (StickyPartition): it picks a random partition and keeps using it until the current batch is full or otherwise completed, then switches to a different random partition. For example, if partition 0 is chosen first, Kafka keeps writing to it until the current batch fills up (16 KB by default) or linger.ms expires, then randomly picks another partition (re-rolling if it draws 0 again).

        public ProducerRecord(String topic, V value) {
            this(topic, null, null, null, value, null);
        }
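
    The sticky behavior can be observed with a quick sketch, assuming a producer configured as in the earlier examples (the enclosing method must declare throws InterruptedException): send two bursts of keyless records separated by a pause, and the callbacks typically report one partition for the first burst and a different one for the second.

        for (int i = 0; i < 5; i++) {
            kafkaProducer.send(new ProducerRecord<>("first", i + "  no key"), (metadata, exception) -> {
                if (exception == null) {
                    System.out.println("partition " + metadata.partition());
                }
            });
        }
        Thread.sleep(1000);  // let the first batch complete so the sticky partition can change
        for (int i = 0; i < 5; i++) {
            kafkaProducer.send(new ProducerRecord<>("first", i + "  no key"), (metadata, exception) -> {
                if (exception == null) {
                    System.out.println("partition " + metadata.partition());
                }
            });
        }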

    Test 1: send data to a specified partition, e.g. send everything to partition 1.

    public class CustomProducerCallBackPartitions {
        // imports identical to CustomProducerCallBack above
        public static void main(String[] args) {
            Properties properties = new Properties();
            // connect to the Kafka brokers
            properties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "hadoop103:9092");
            // configure key/value serializers
            properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

            // 1. create the producer
            KafkaProducer<String, String> kafkaProducer = new KafkaProducer<>(properties);
            // 2. send data asynchronously to partition 1, with an empty key
            for (int i = 0; i < 5; i++) {
                kafkaProducer.send(new ProducerRecord<>("first", 1, "", i + "  hello wdh01"), new Callback() {
                    @Override
                    public void onCompletion(RecordMetadata metadata, Exception exception) {
                        if (exception == null) {
                            System.out.println("topic " + metadata.topic() + " partition " + metadata.partition());
                        }
                    }
                });
            }
            // 3. release resources
            kafkaProducer.close();
        }
    }

    Callback output:

    topic first partition 1
    topic first partition 1
    topic first partition 1
    topic first partition 1
    topic first partition 1

    Consumed data:

    [hui@hadoop103 kafka]$ bin/kafka-console-consumer.sh --bootstrap-server hadoop103:9092 --topic first
    0  hello wdh01
    1  hello wdh01
    2  hello wdh01
    3  hello wdh01
    4  hello wdh01

    Test 2: no partition specified but a key present; the key's hash modulo the topic's partition count selects the partition.

            // imports and Properties setup identical to CustomProducerCallBack above;
            // main must declare throws InterruptedException because of Thread.sleep
            KafkaProducer<String, String> kafkaProducer = new KafkaProducer<>(properties);
            // 2. send data asynchronously, keyed with "a", "b", then "f"
            for (int i = 0; i < 5; i++) {
                kafkaProducer.send(new ProducerRecord<>("first", "a", i + "  hello wdh01"), new Callback() {
                    @Override
                    public void onCompletion(RecordMetadata metadata, Exception exception) {
                        if (exception == null) {
                            System.out.println("a topic " + metadata.topic() + " partition " + metadata.partition());
                        }
                    }
                });
            }
            Thread.sleep(1000);
            for (int i = 0; i < 5; i++) {
                kafkaProducer.send(new ProducerRecord<>("first", "b", i + "  hello wdh01"), new Callback() {
                    @Override
                    public void onCompletion(RecordMetadata metadata, Exception exception) {
                        if (exception == null) {
                            System.out.println("b topic " + metadata.topic() + " partition " + metadata.partition());
                        }
                    }
                });
            }
            Thread.sleep(1000);
            for (int i = 0; i < 5; i++) {
                kafkaProducer.send(new ProducerRecord<>("first", "f", i + "  hello wdh01"), new Callback() {
                    @Override
                    public void onCompletion(RecordMetadata metadata, Exception exception) {
                        if (exception == null) {
                            System.out.println("f topic " + metadata.topic() + " partition " + metadata.partition());
                        }
                    }
                });
            }
            // 3. release resources
            kafkaProducer.close();

    Callback output:

    a topic first partition 1
    a topic first partition 1
    a topic first partition 1
    a topic first partition 1
    a topic first partition 1
    b topic first partition 2
    b topic first partition 2
    b topic first partition 2
    b topic first partition 2
    b topic first partition 2
    f topic first partition 0
    f topic first partition 0
    f topic first partition 0
    f topic first partition 0
    f topic first partition 0

    4.3 Custom Partitioner

    Kafka supports custom partitioning: just implement the Partitioner interface.

    Example: route any message containing "wdh01" to partition 0 and everything else to partition 1.

    import org.apache.kafka.clients.producer.Partitioner;
    import org.apache.kafka.common.Cluster;

    import java.util.Map;

    public class MyPartitioner implements Partitioner {
        @Override
        public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) {
            // route by message content
            int partition;
            String msg = value.toString();
            if (msg.contains("wdh01")) {
                partition = 0;
            } else {
                partition = 1;
            }
            return partition;
        }

        @Override
        public void close() {
        }

        @Override
        public void configure(Map<String, ?> configs) {
        }
    }

    Custom partitioner test:

    public class CustomProducerCallBackPartitionsCustom {
        // imports identical to CustomProducerCallBack above
        public static void main(String[] args) {
            Properties properties = new Properties();
            properties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "hadoop103:9092");
            properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            // register the custom partitioner
            properties.put(ProducerConfig.PARTITIONER_CLASS_CONFIG, "org.wdh01.kk.MyPartitioner");

            // 1. create the producer
            KafkaProducer<String, String> kafkaProducer = new KafkaProducer<>(properties);
            // 2. send data asynchronously; note the text "hello wdh1" does NOT contain "wdh01"
            for (int i = 0; i < 5; i++) {
                kafkaProducer.send(new ProducerRecord<>("first", i + "  hello wdh1"), new Callback() {
                    @Override
                    public void onCompletion(RecordMetadata metadata, Exception exception) {
                        if (exception == null) {
                            System.out.println("topic " + metadata.topic() + " partition " + metadata.partition());
                        }
                    }
                });
            }
            // 3. release resources
            kafkaProducer.close();
        }
    }

    Callback output. Every record lands in partition 1 because the message text "hello wdh1" does not contain "wdh01", so MyPartitioner takes the else branch:

    topic first partition 1
    topic first partition 1
    topic first partition 1
    topic first partition 1
    topic first partition 1