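Big Data Study Notes 6: Flume, a Distributed Log Collection Framework. Flume in Practice: Collecting Data from a Network Port and Printing It to the Console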

Tags: Flume, flume, java, agent, source, conf, console, logs, channel

Collecting data from a specified network port and printing it to the console

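Go to the official Flume website and open the documentation; the "Setting up an agent" section contains "A simple example".
The key to using Flume is writing the agent configuration file:
1. Configure the source
2. Configure the channel
3. Configure the sink
4. Wire the three components together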
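
The fourth step, wiring, is the one that is easiest to get wrong. As a minimal sketch (plain Python, not part of Flume), the script below parses a properties-style configuration and reports sources or sinks that are not bound to a declared channel. The helper names `parse_properties` and `check_wiring` are hypothetical, and the parser only handles simple `key = value` lines.

```python
def parse_properties(text):
    """Parse simple 'key = value' lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_wiring(props, agent):
    """Return a list of wiring problems found for the given agent name."""
    problems = []
    channels = set(props.get(f"{agent}.channels", "").split())
    # A source lists one or more channels under '<agent>.sources.<name>.channels'.
    for source in props.get(f"{agent}.sources", "").split():
        bound = props.get(f"{agent}.sources.{source}.channels", "").split()
        if not bound:
            problems.append(f"source {source} has no channel")
        for ch in bound:
            if ch not in channels:
                problems.append(f"source {source} uses undeclared channel {ch}")
    # A sink names exactly one channel under '<agent>.sinks.<name>.channel'.
    for sink in props.get(f"{agent}.sinks", "").split():
        ch = props.get(f"{agent}.sinks.{sink}.channel", "")
        if not ch:
            problems.append(f"sink {sink} has no channel")
        elif ch not in channels:
            problems.append(f"sink {sink} uses undeclared channel {ch}")
    return problems


# Illustrative single-node configuration: netcat source, memory channel, logger sink.
EXAMPLE = """
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = netcat
a1.sinks.k1.type = logger
a1.channels.c1.type = memory
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
"""

if __name__ == "__main__":
    print(check_wiring(parse_properties(EXAMPLE), "a1"))
```

An empty list means every source and sink points at a declared channel. Note the asymmetry in the key names: `channels` (plural) for a source, `channel` (singular) for a sink.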

(1) For example, write an example.conf configuration file and place it in Flume's conf directory

# example.conf: A single-node Flume configuration

# Name the components on this agent     
# Note: a1 is the agent name, r1 the source name, k1 the sink name, and c1 the channel name
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
# See the NetCat TCP Source section of the official documentation
a1.sources.r1.type = netcat
a1.sources.r1.bind = hadoop000
a1.sources.r1.port = 44444

# Describe the sink
# See the Logger Sink section of the official documentation
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
# See the Memory Channel section of the official documentation
a1.channels.c1.type = memory
#a1.channels.c1.capacity = 1000
#a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
# source-channel-sink: a source can write to multiple channels, but a sink can read from only one channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

(2) Start the agent (see "Starting an agent" in the official documentation)

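The flume-ng agent command starts an agent.
-n, i.e. --name: the agent name (required)
-c, i.e. --conf: the directory that holds the configuration files
-f, i.e. --conf-file: the path of the specific configuration file to use
-Dflume.root.logger=INFO,console: print log output to the console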

$ bin/flume-ng agent -n $agent_name -c conf -f conf/flume-conf.properties.template

For example:

flume-ng agent \
--name a1 \
--conf $FLUME_HOME/conf   \
--conf-file $FLUME_HOME/conf/example.conf \
-Dflume.root.logger=INFO,console

Error:

2021-04-04 14:54:28,745 (lifecycleSupervisor-1-5) [ERROR - org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:253)] Unable to start EventDrivenSourceRunner: { source:org.apache.flume.source.NetcatSource{name:r1,state:IDLE} } - Exception follows.
org.apache.flume.FlumeException: java.net.BindException: Cannot assign requested address
  at org.apache.flume.source.NetcatSource.start(NetcatSource.java:173)
  at org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44)
  at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.BindException: Cannot assign requested address
  at sun.nio.ch.Net.bind0(Native Method)
  at sun.nio.ch.Net.bind(Net.java:433)
  at sun.nio.ch.Net.bind(Net.java:425)
  at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
  at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
  at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67)
  at org.apache.flume.source.NetcatSource.start(NetcatSource.java:167)
  ... 9 more

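Fix:

The IP address that hadoop000 resolves to was incorrect. "Cannot assign requested address" means that address is not assigned to any network interface on this machine, so the hostname mapping (typically in /etc/hosts) must be corrected to point at the machine's actual IP.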
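
To check the mapping without starting Flume, a small script can resolve the hostname and attempt the same bind that the netcat source performs. This is a rough diagnostic sketch in Python, not a Flume tool; `can_bind` is a hypothetical helper name, and hadoop000 and 44444 are simply the values from example.conf.

```python
import socket


def can_bind(host, port):
    """Try to bind a TCP socket to (host, port); return (ok, detail)."""
    try:
        address = socket.gethostbyname(host)
    except socket.gaierror as exc:
        return False, f"cannot resolve {host}: {exc}"
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((address, port))
    except OSError as exc:
        # An address that belongs to no local interface fails here.
        return False, f"{host} resolves to {address}, bind failed: {exc}"
    finally:
        sock.close()
    return True, f"{host} resolves to {address}, bind succeeded"


if __name__ == "__main__":
    # Host and port taken from example.conf.
    print(can_bind("hadoop000", 44444))
```

Run it on the machine that will host the agent, while the agent is stopped; otherwise the bind fails because the port is already in use, which is a different problem from the one above.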

(3) Test with telnet

Use Git Bash or another terminal to connect to the virtual machine remotely

telnet hadoop000 44444

Type any text and press Enter.

Flume then receives the input and prints it to the console.
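
If telnet is not installed, the same test can be done with a few lines of Python, since the netcat source just reads newline-terminated text over TCP. The sketch below is one possible way to do it; `send_line` is a hypothetical helper, and it assumes the source's default behaviour of acknowledging each event with "OK".

```python
import socket


def send_line(host, port, text, timeout=5.0):
    """Send one newline-terminated line over TCP and return the reply, if any."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(text.encode("utf-8") + b"\n")
        try:
            reply = conn.recv(1024)
        except socket.timeout:
            # No acknowledgement arrived within the timeout.
            reply = b""
    return reply.decode("utf-8", errors="replace").strip()


if __name__ == "__main__":
    # Host and port taken from example.conf.
    print(send_line("hadoop000", 44444, "hello"))
```

Each call sends one line, which should show up as one event in the agent's console output.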

An Event is the basic unit of data transfer in Flume:

Event = optional headers + byte array
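
To make that structure concrete, here is a conceptual model in Python. It is only an illustration of "headers plus byte array", not Flume's actual Java `Event` interface; the `Event` class and `hex_body` method below are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Event:
    """Conceptual model of a Flume event: optional headers plus a byte-array body."""
    body: bytes
    headers: Dict[str, str] = field(default_factory=dict)

    def hex_body(self) -> str:
        """Render the body as space-separated uppercase hex bytes."""
        return " ".join(f"{b:02X}" for b in self.body)


# A line typed into telnet becomes the body; headers are empty unless something sets them.
event = Event(body="hello".encode("utf-8"))
print(event.headers, event.hex_body())  # prints: {} 68 65 6C 6C 6F
```

The body is raw bytes rather than a string, which is why the console output of the logger sink shows the payload as hex values alongside the readable text.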

From： https://blog.51cto.com/u_12528551/5900314