• BigData: Kafka MQ (Message Queue)


    2017-10-24 16:51:12

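    Traditional messaging has two models:

    1 Queue: a pool of consumers may read from a server, and each record goes to exactly one of them.

      Strength: it lets you divide data processing over multiple consumer instances, so processing can scale.

      Weakness: queues are not multi-subscriber; once one process reads a record, it is gone.

    2 Publish / Subscribe: each record is broadcast to all consumers.

      This lets you broadcast data to multiple processes, but processing cannot scale, because every message goes to every subscriber.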
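The difference between the two models can be sketched in a few lines of plain Python. This is only an illustration, not a real message broker: `deliver_queue`, `deliver_pubsub` and the consumer names are hypothetical, and the queue hands records out round-robin purely for simplicity.

```python
from collections import deque


def deliver_queue(records, consumers):
    """Queue model: each record goes to exactly one consumer (round-robin here)."""
    received = {name: [] for name in consumers}
    pending = deque(records)
    turn = 0
    while pending:
        record = pending.popleft()  # once read, the record is gone from the queue
        received[consumers[turn % len(consumers)]].append(record)
        turn += 1
    return received


def deliver_pubsub(records, subscribers):
    """Publish/subscribe model: every record is broadcast to every subscriber."""
    return {name: list(records) for name in subscribers}


records = ["r0", "r1", "r2", "r3"]
queue_result = deliver_queue(records, ["c1", "c2"])
pubsub_result = deliver_pubsub(records, ["s1", "s2"])
print(queue_result)
print(pubsub_result)
```

In the queue case the work is split between `c1` and `c2`; in the publish/subscribe case both subscribers do all of the work.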

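    As with a queue, Kafka lets you divide processing over a collection of processes (the members of a consumer group). As with publish/subscribe, Kafka lets you broadcast messages to multiple consumer groups.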

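      A traditional queue retains records in order on the server, and if multiple consumers consume from the queue, the server hands out records in the order they are stored.

      However, although the server hands out records in order, they are delivered to consumers asynchronously, so they may arrive out of order on different consumers.

      This effectively means that the ordering of records is lost under parallel consumption.

      Messaging systems often work around this with the notion of an "exclusive consumer" that allows only one process to consume from a queue, but of course this means there is no parallelism in processing.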
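A small deterministic simulation may make the ordering problem concrete. It is a sketch under simplifying assumptions: `completion_order` is a hypothetical helper, record number i is handed out at time i, and each consumer needs a fixed number of time units per record (queueing inside a consumer is ignored).

```python
def completion_order(records, latencies):
    """Return the records in the order in which their processing finishes.

    Record number i is handed out at time i to consumer i % len(latencies),
    which needs latencies[...] time units to finish it.
    """
    finished = []
    for i, record in enumerate(records):
        latency = latencies[i % len(latencies)]
        finished.append((i + latency, i, record))
    finished.sort()  # earliest finish time first
    return [record for _, _, record in finished]


records = ["r0", "r1", "r2", "r3"]
parallel = completion_order(records, latencies=[5, 1])  # one slow, one fast consumer
exclusive = completion_order(records, latencies=[1])    # a single "exclusive consumer"
print(parallel)
print(exclusive)
```

With two consumers of different speed the records finish as r1, r3, r0, r2, although they were handed out in order; with a single consumer the original order survives, at the cost of parallelism.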

      Kafka does it better. By having a notion of parallelism -- the partition -- within the topics,

      Kafka is able to provide both ordering guarantees and load balancing over a pool of consumer processes.

      This is achieved by assigning the partitions in the topic to the consumers in the consumer group so that each

      partition is consumed by exactly one consumer in the group. That consumer is the only reader of the partition and consumes

      the data in order.

      Since there are many partitions, this still balances the load over many consumer instances.

      Note, however, that there cannot be more active consumer instances in a consumer group than partitions; any extra instances sit idle.
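The assignment rule can be illustrated as follows. This is a simplified round-robin sketch, not Kafka's actual assignor; `assign_partitions` and the consumer names are hypothetical.

```python
def assign_partitions(partitions, consumers):
    """Give every partition to exactly one consumer of the group (round-robin)."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = [0, 1, 2, 3]
two = assign_partitions(partitions, ["c1", "c2"])
five = assign_partitions(partitions, ["c1", "c2", "c3", "c4", "c5"])
print(two)
print(five)
```

With two consumers each one owns two partitions. With five consumers and only four partitions, `c5` ends up with an empty list, which is why adding consumers beyond the partition count does not add parallelism.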

      

      Kafka for Stream Processing

      Data written to Kafka is written to disk and replicated for fault tolerance.

      Kafka allows producers to wait on acknowledgement so that a write isn't considered complete until it is fully

      replicated and guaranteed to persist even if the server written to fails.
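The idea of acknowledging a write only after replication can be sketched with a toy model. The `Replica` class, `replicated_write` and the broker names are hypothetical, and the rule used here (every replica must store the record) is a simplification of what a real cluster does.

```python
class Replica:
    """A toy broker that stores records in an in-memory list."""

    def __init__(self, name, alive=True):
        self.name = name
        self.alive = alive
        self.log = []

    def append(self, record):
        if not self.alive:
            return False
        self.log.append(record)
        return True


def replicated_write(record, leader, followers):
    """Acknowledge the write only if the leader and every follower stored it."""
    if not leader.append(record):
        return False
    results = [follower.append(record) for follower in followers]
    return all(results)


leader = Replica("broker-1")
followers = [Replica("broker-2"), Replica("broker-3")]
acked = replicated_write("order-42", leader, followers)

# The server that was written to fails after the acknowledgement.
leader.alive = False
survivors = [r.name for r in followers if "order-42" in r.log]

# A write with an unreachable follower is not acknowledged in this model.
not_acked = replicated_write("order-43", Replica("broker-1"), [Replica("broker-2", alive=False)])
print(acked, survivors, not_acked)
```

Because the producer only treats `order-42` as written once the followers hold it, losing the leader afterwards does not lose the record.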

      Kafka combines storage and stream processing, which makes it

      a platform for streaming applications as well as for streaming data pipelines.

      

      By combining storage and low-latency subscriptions, streaming applications can treat both past and future data the same way.

      That is, a single application can process historical, stored data, but rather than ending when it reaches the last record, it can keep processing as future data arrives.

      This is a generalized notion of stream processing that subsumes batch processing as well as message-driven applications.
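A minimal sketch of this idea, assuming an in-memory append-only log: the `Log` and `Consumer` classes are hypothetical stand-ins, not a Kafka client API. The consumer tracks an offset, so the same `poll` call first replays stored history and later picks up newly arrived records.

```python
class Log:
    """Append-only log; records stay stored after they have been read."""

    def __init__(self):
        self.records = []

    def append(self, record):
        self.records.append(record)

    def read_from(self, offset):
        return self.records[offset:]


class Consumer:
    """Reads the log from its current offset and remembers how far it got."""

    def __init__(self, log):
        self.log = log
        self.offset = 0

    def poll(self):
        batch = self.log.read_from(self.offset)
        self.offset += len(batch)
        return batch


log = Log()
for record in ["old-1", "old-2"]:
    log.append(record)

consumer = Consumer(log)
history = consumer.poll()  # historical, stored data
idle = consumer.poll()     # caught up: nothing new yet, but the consumer does not stop
log.append("new-1")
fresh = consumer.poll()    # data that arrived later
print(history, idle, fresh)
```

The application code is the same for past and future data; only the offset moves.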

      Likewise, for streaming data pipelines, subscription to real-time events makes it possible to use Kafka for very low-latency pipelines,

      while the ability to store data reliably makes it possible to use it for critical data where the delivery of data must be guaranteed, or for integration with offline systems that load data only periodically or may go down for extended periods of time for maintenance.

      The stream processing facilities make it possible to transform data as it arrives.
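Transforming data as it arrives can be pictured with Python generators. This is an analogy rather than Kafka's stream processing API; `parse`, `running_totals` and the `user,amount` record format are assumptions made for the example.

```python
def parse(lines):
    """Turn each raw 'user,amount' line into a (user, amount) pair as it arrives."""
    for line in lines:
        user, amount = line.split(",")
        yield user, int(amount)


def running_totals(events):
    """Emit the updated total per user after every incoming event."""
    totals = {}
    for user, amount in events:
        totals[user] = totals.get(user, 0) + amount
        yield user, totals[user]


incoming = iter(["alice,5", "bob,3", "alice,2"])
results = list(running_totals(parse(incoming)))
print(results)
```

Each record flows through both steps one at a time, so a result is available as soon as its input has arrived rather than after a whole batch has been collected.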

      

      

      

      

  • Original article: https://www.cnblogs.com/masterSoul/p/7727513.html