
Spark Practice: Spark Streaming


First, install Flume (I chose version 1.9.0). For configuration, you only need to set the relevant environment variables and the JDK path in flume-env.sh.

flume-env.sh

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# If this file is placed at FLUME_CONF_DIR/flume-env.sh, it will be sourced
# during Flume startup.

# Environment variables can be set here.

export JAVA_HOME=/usr/local/java8/jdk1.8.0_371

# Give Flume more memory and pre-allocate, enable remote monitoring via JMX
# export JAVA_OPTS="-Xms100m -Xmx2000m -Dcom.sun.management.jmxremote"

# Let Flume write raw event data and configuration information to its log files for debugging
# purposes. Enabling these flags is not recommended in production,
# as it may result in logging sensitive user information or encryption secrets.
# export JAVA_OPTS="$JAVA_OPTS -Dorg.apache.flume.log.rawdata=true -Dorg.apache.flume.log.printconfig=true "

# Note that the Flume conf directory is always included in the classpath.
#FLUME_CLASSPATH=""

Testing Flume with an Avro source

First create two files, helloworld.txt and avro.conf.

helloworld.txt

Hello World
Hello Flume

avro.conf

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = avro
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 4141

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

 

Start the agent:

flume-ng agent -c /usr/local/flume-1.9.0/conf -f /usr/local/flume-1.9.0/conf/avro.conf -n a1 -Dflume.root.logger=INFO,console

 

Then, from another terminal, send the file to the source as Avro events; each line of helloworld.txt should be printed by the agent's logger sink:

/usr/local/flume-1.9.0/bin/flume-ng avro-client -H localhost -p 4141 -F /home/bill/test/helloworld.txt
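Since the point of this exercise is Spark Streaming, it is worth sketching how a Flume agent could feed Spark instead of a logger sink. The example below is a minimal sketch, assuming the spark-streaming-flume integration module (shipped with Spark through the 2.x line and removed in Spark 3); the avro sink settings in the comment and port 9988 are hypothetical, not part of the configuration above.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.flume.FlumeUtils

// Minimal sketch, assuming spark-streaming-flume (Spark 2.x).
// The agent would need an avro sink pointed at the Spark receiver, e.g.:
//   a1.sinks.k1.type = avro
//   a1.sinks.k1.hostname = localhost
//   a1.sinks.k1.port = 9988   (hypothetical port)
object FlumeEventPrinter {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("FlumeEventPrinter")
    val ssc = new StreamingContext(conf, Seconds(5))

    // Push-based model: Spark starts an Avro server on this host/port
    // and receives the events that the agent's avro sink delivers to it.
    val stream = FlumeUtils.createStream(ssc, "localhost", 9988)

    // Each SparkFlumeEvent wraps a Flume event; the body is a ByteBuffer.
    stream.map(e => new String(e.event.getBody.array())).print()

    ssc.start()
    ssc.awaitTermination()
  }
}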

 

 

Testing Flume with a netcat source

Create netcat.conf:

 

# netcat.conf: A single-node Flume configuration

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

 

yum list telnet*           # list telnet-related packages
yum install telnet-server  # install the telnet server
yum install telnet.*       # install the telnet client

 

Start the agent:

flume-ng agent -c /usr/local/flume-1.9.0/conf -f /usr/local/flume-1.9.0/conf/netcat.conf -n a1 -Dflume.root.logger=INFO,console

 

Then connect from another terminal:

telnet localhost 44444

Each line typed at the telnet prompt is acknowledged with "OK" and printed by the agent's logger sink.
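Spark Streaming ships a built-in socket text source that pairs naturally with this kind of test. The sketch below is a standalone counterpart, not wired to the Flume agent above: Spark connects as a client and reads text, so it cannot consume from Flume's netcat source (which only receives data); instead it assumes a plain text server such as nc -lk 9999, where port 9999 is arbitrary.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Minimal sketch of Spark Streaming's built-in socket source.
// Start a text server first, e.g.: nc -lk 9999
object SocketWordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("SocketWordCount")
    val ssc = new StreamingContext(conf, Seconds(5))

    // Connect to localhost:9999 and read newline-delimited text.
    val lines = ssc.socketTextStream("localhost", 9999)

    // Classic streaming word count over each 5-second batch.
    lines.flatMap(_.split(" "))
         .map(word => (word, 1))
         .reduceByKey(_ + _)
         .print()

    ssc.start()
    ssc.awaitTermination()
  }
}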

 

 

From: https://www.cnblogs.com/liyiyang/p/18026232
