• Flume configuration and usage


    1. Download Flume and extract it into a directory of your choice.

    2. Edit the flume-env.sh file

      Add JAVA_HOME to it.
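    A minimal sketch of that change (the JDK path below is an assumption; point it at your own installation):

      # conf/flume-env.sh -- example path only, use your own JDK location
      export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64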

    3. Edit the flume.conf file (that is not its original name; I renamed the template file myself).

    The configuration below is my own (for the full reference, see http://flume.apache.org/FlumeUserGuide.html):

    agent1 is the name of my agent

    the source is netcat (the data source)

    the channel is memory (in-memory)

    the sink is hdfs (the output)

    Note: be sure to add

    agent1.sinks.k1.hdfs.fileType = DataStream

    to the configuration; otherwise the files landing in HDFS will look garbled (the default SequenceFile format is binary).

    If you want events bucketed into HDFS directories by time, each incoming event must carry a timestamp header.

    # Licensed to the Apache Software Foundation (ASF) under one
    # or more contributor license agreements.  See the NOTICE file
    # distributed with this work for additional information
    # regarding copyright ownership.  The ASF licenses this file
    # to you under the Apache License, Version 2.0 (the
    # "License"); you may not use this file except in compliance
    # with the License.  You may obtain a copy of the License at
    #
    #  http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing,
    # software distributed under the License is distributed on an
    # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    # KIND, either express or implied.  See the License for the
    # specific language governing permissions and limitations
    # under the License.
    
    
    # The configuration file needs to define the sources, 
    # the channels and the sinks.
    # Sources, channels and sinks are defined per agent, 
    # in this case called 'agent'
    
    agent1.sources = r1
    agent1.channels = c1
    agent1.sinks = k1
    
    # For each one of the sources, the type is defined
    agent1.sources.r1.type = netcat
    agent1.sources.r1.channels = c1
    #agent1.sources.r1.ack-every-event = false
    agent1.sources.r1.max-line-length = 100
    agent1.sources.r1.bind = 192.168.19.107
    agent1.sources.r1.port = 44445
    
    
    # Describe/configure the interceptor
    
    #agent1.sources.r1.interceptors = i1
    #agent1.sources.r1.interceptors.i1.type = com.nd.bigdata.insight.interceptor.KeyTimestampForKafka$Builder
    
    
    # Each sink's type must be defined
    #agent1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
    #agent1.sinks.k1.topic = insight-test6
    #agent1.sinks.k1.brokerList =192.168.181.120:9092,192.168.181.121:9092,192.168.181.66:9092
    #agent1.sinks.k1.batchSize=100
    #agent1.sinks.k1.requiredAcks = 0
    
    # logger
    #agent1.sinks.k1.type = logger
    #agent1.sinks.k1.channel = c1
    
    #HDFS
    agent1.sinks.k1.type = hdfs
    agent1.sinks.k1.channel = c1
    agent1.sinks.k1.hdfs.path = test/flume/events/
    agent1.sinks.k1.hdfs.filePrefix = events-
    agent1.sinks.k1.hdfs.round = true
    agent1.sinks.k1.hdfs.roundValue = 10
    agent1.sinks.k1.hdfs.roundUnit = minute
    agent1.sinks.k1.hdfs.fileType = DataStream
    
    # Each channel's type is defined.
    agent1.channels.c1.type = memory
    agent1.channels.c1.capacity = 1000
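    To get the time-based bucketing mentioned above, one approach (a sketch, not tested against this exact setup) is to put time escapes in the sink path and make sure every event carries a timestamp:

      # time-bucketed output path; the escapes require a timestamp on each event
      agent1.sinks.k1.hdfs.path = test/flume/events/%Y-%m-%d/%H%M
      # option 1: let the sink stamp events with the local time
      agent1.sinks.k1.hdfs.useLocalTimeStamp = true
      # option 2: have the source add a timestamp header via an interceptor
      #agent1.sources.r1.interceptors = i1
      #agent1.sources.r1.interceptors.i1.type = timestamp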
    

    4. Start Flume (run this from the Flume root directory):

     bin/flume-ng agent --conf conf --conf-file conf/flume.conf --name agent1 -Dflume.root.logger=INFO,console

    (replace flume.conf and agent1 with your own file name and agent name)

    5. From another machine, try sending data:

     telnet 192.168.19.107  44445  (open the connection)

    then type the content you want to send.
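    A session might look like this (the OK lines are the netcat source's default per-event acknowledgement; the output shown is illustrative):

     $ telnet 192.168.19.107 44445
     Trying 192.168.19.107...
     Connected to 192.168.19.107.
     Escape character is '^]'.
     hello flume
     OK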

    6. Check the files generated in HDFS.
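    To inspect the result, the standard HDFS shell can be used (the path follows the sink configuration above; file names start with the events- prefix):

     hdfs dfs -ls test/flume/events/
     hdfs dfs -cat 'test/flume/events/events-*'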

  • Original post: https://www.cnblogs.com/LazyJoJo/p/7356975.html