• Kafka 0.11 + Spark Streaming example (Scala 2.11 build)


    
    
    """
     Counts words in UTF8 encoded, '\n' delimited text received from the network every second.
     Usage: kafka_wordcount.py <zk> <topic>
     To run this on your local machine, you need to setup Kafka and create a producer first, see
     http://kafka.apache.org/documentation.html#quickstart
     and then run the example
        `$ bin/spark-submit --jars 
          external/kafka-assembly/target/scala-*/spark-streaming-kafka-assembly-*.jar 
          examples/src/main/python/streaming/kafka_wordcount.py 
          localhost:2181 test`
    """
    from __future__ import print_function
    
    import sys
    
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext
    from pyspark.streaming.kafka import KafkaUtils
    
    if __name__ == "__main__":
        if len(sys.argv) != 3:
            print("Usage: kafka_wordcount.py <zk> <topic>", file=sys.stderr)
            sys.exit(-1)
    
        sc = SparkContext(appName="PythonStreamingKafkaWordCount")
        ssc = StreamingContext(sc, 1)
    
        zkQuorum, topic = sys.argv[1:]
        kvs = KafkaUtils.createStream(ssc, zkQuorum, "spark-streaming-consumer", {topic: 1})
        lines = kvs.map(lambda x: x[1])
        counts = lines.flatMap(lambda line: line.split(" ")) \
            .map(lambda word: (word, 1)) \
            .reduceByKey(lambda a, b: a + b)
        counts.pprint()
    
        ssc.start()
        ssc.awaitTermination()
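The flatMap/map/reduceByKey pipeline above runs on each one-second micro-batch. What it computes per batch can be sketched in plain Python, with no Spark cluster required (this is an illustration of the same logic, not Spark itself):

```python
from collections import defaultdict

def word_count(lines):
    """Mimic the DStream pipeline on one micro-batch of lines:
    flatMap(split on space) -> map(word, 1) -> reduceByKey(sum)."""
    counts = defaultdict(int)
    for line in lines:               # flatMap: each line yields many words
        for word in line.split(" "):
            counts[word] += 1        # reduceByKey: sum the per-word 1s
    return dict(counts)

batch = ["This is a message", "This is another message"]
print(word_count(batch))  # {'This': 2, 'is': 2, 'a': 1, 'message': 2, 'another': 1}
```

In the real job, `lines` is the stream of Kafka message values (`kvs.map(lambda x: x[1])`), and the counts are recomputed independently for every batch interval.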

    /spark-kafka/spark-2.1.1-bin-hadoop2.6# ./bin/spark-submit --jars ~/spark-streaming-kafka-0-8-assembly_2.11-2.2.0.jar examples/src/main/python/streaming/kafka_wordcount.py localhost:2181 test

    Note: spark-streaming-kafka-0-8-assembly_2.11-2.2.0.jar can be downloaded from http://search.maven.org/#search%7Cga%7C1%7Cspark-streaming-kafka-0-8-assembly

    Kafka version 0.11 is used:

    1.3 Quick Start

    This tutorial assumes you are starting fresh and have no existing Kafka or ZooKeeper data. Since the Kafka console scripts differ between Unix-based and Windows platforms, on Windows use bin\windows\ instead of bin/, and change the script extension to .bat.

    Step 1: Download the code

    Download the 0.11.0.0 release and un-tar it.
    > tar -xzf kafka_2.11-0.11.0.0.tgz
    > cd kafka_2.11-0.11.0.0

    Step 2: Start the server

    Kafka uses ZooKeeper so you need to first start a ZooKeeper server if you don't already have one. You can use the convenience script packaged with kafka to get a quick-and-dirty single-node ZooKeeper instance.

    > bin/zookeeper-server-start.sh config/zookeeper.properties
    [2013-04-22 15:01:37,495] INFO Reading configuration from: config/zookeeper.properties (org.apache.zookeeper.server.quorum.QuorumPeerConfig)
    ...

    Now start the Kafka server:

    > bin/kafka-server-start.sh config/server.properties
    [2013-04-22 15:01:47,028] INFO Verifying properties (kafka.utils.VerifiableProperties)
    [2013-04-22 15:01:47,051] INFO Property socket.send.buffer.bytes is overridden to 1048576 (kafka.utils.VerifiableProperties)
    ...

    Step 3: Create a topic

    Let's create a topic named "test" with a single partition and only one replica:

    > bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test

    We can now see that topic if we run the list topic command:

    > bin/kafka-topics.sh --list --zookeeper localhost:2181
    test

    Alternatively, instead of manually creating topics you can also configure your brokers to auto-create topics when a non-existent topic is published to.

    Step 4: Send some messages

    Kafka comes with a command line client that will take input from a file or from standard input and send it out as messages to the Kafka cluster. By default, each line will be sent as a separate message.
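The default line-per-message framing described above can be sketched in a few lines of plain Python (a simplified illustration of the framing rule, not the real console producer):

```python
def frame_messages(console_input):
    """Split raw console input the way kafka-console-producer does by
    default: each non-empty line becomes one separate message."""
    return [line for line in console_input.split("\n") if line]

typed = "This is a message\nThis is another message\n"
print(frame_messages(typed))  # ['This is a message', 'This is another message']
```

Each element of the resulting list corresponds to one Kafka message sent to the topic; the trailing newline produces no empty message.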

    Run the producer and then type a few messages into the console to send to the server.

    > bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
    This is a message
    This is another message

    Step 5: Start a consumer

    Kafka also has a command line consumer that will dump out messages to standard output.

    > bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning
    This is a message
    This is another message
  • Original article: https://www.cnblogs.com/bonelee/p/7435506.html