Installing SeaTunnel 2.3.6 on Ubuntu

Published: 2024-08-13 08:56:50

Environment

  • SeaTunnel 2.3.6
  • Ubuntu 24.04 LTS
  • sudo user: seatunnel
  • Installation directory: /opt/apache-seatunnel-2.3.6

Environment variables

export SEATUNNEL_HOME=/opt/apache-seatunnel-2.3.6

Download the software

Download the SeaTunnel binary package
Download page: https://seatunnel.apache.org/download/

  • apache-seatunnel-2.3.6-bin.tar.gz

Extract the archive:

tar -xvf apache-seatunnel-2.3.6-bin.tar.gz

This yields:

seatunnel@ubuntu24:/tmp$ ll
drwxr-xr-x 10 seatunnel        seatunnel        4096 Nov  8  2023 apache-seatunnel-2.3.6/

Move the extracted directory to /opt:

sudo mv apache-seatunnel-2.3.6 /opt/

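Download the connectors

Connector download configuration

The list of connectors to download is defined in:
apache-seatunnel-2.3.6/config/plugin_config

Suggested initial connector list (if you plan to run the MySQL-CDC to PostgreSQL test later in this article, also add connector-jdbc, which provides the jdbc sink):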

--connectors-v2--
connector-cdc-mysql
connector-fake
connector-console
--end--
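
One way to apply the suggested list without losing the shipped default is to back the file up first and then write the trimmed version. The following is only a sketch: the helper name write_plugin_config and the .bak suffix are illustrative choices, not part of SeaTunnel, and the list should be extended with any other connectors you need.

```shell
# Sketch: replace a plugin_config file with the short connector list above,
# keeping a backup of the original. write_plugin_config is a hypothetical helper.
write_plugin_config() {
  local cfg="$1"
  # Back up the existing file once, so the full default list is not lost.
  if [ -f "$cfg" ] && [ ! -f "$cfg.bak" ]; then
    cp "$cfg" "$cfg.bak"
  fi
  # Write the three connectors suggested in this article.
  cat > "$cfg" <<'EOF'
--connectors-v2--
connector-cdc-mysql
connector-fake
connector-console
--end--
EOF
}

# Example (path assumed from this article):
# write_plugin_config /opt/apache-seatunnel-2.3.6/config/plugin_config
```

Because the backup is only taken when no .bak file exists, running the helper twice does not overwrite the saved default list.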

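Default connector configuration file (config/plugin_config):
The default file lists every supported connector plugin. There is no need to download all of them unless you actually need them.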

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
#
# This mapping is used to resolve the Jar package name without version (or call artifactId)
#
# corresponding to the module in the user Config, helping SeaTunnel to load the correct Jar package.
# Don't modify the delimiter " -- ", just select the plugin you need
--connectors-v2--
connector-amazondynamodb
connector-assert
connector-cassandra
connector-cdc-mysql
connector-cdc-mongodb
connector-cdc-sqlserver
connector-cdc-postgres
connector-cdc-oracle
connector-clickhouse
connector-datahub
connector-dingtalk
connector-doris
connector-elasticsearch
connector-email
connector-file-ftp
connector-file-hadoop
connector-file-local
connector-file-oss
connector-file-jindo-oss
connector-file-s3
connector-file-sftp
connector-file-obs
connector-google-sheets
connector-google-firestore
connector-hive
connector-http-base
connector-http-feishu
connector-http-gitlab
connector-http-github
connector-http-jira
connector-http-klaviyo
connector-http-lemlist
connector-http-myhours
connector-http-notion
connector-http-onesignal
connector-http-wechat
connector-hudi
connector-iceberg
connector-influxdb
connector-iotdb
connector-jdbc
connector-kafka
connector-kudu
connector-maxcompute
connector-mongodb
connector-neo4j
connector-openmldb
connector-pulsar
connector-rabbitmq
connector-redis
connector-druid
connector-s3-redshift
connector-sentry
connector-slack
connector-socket
connector-starrocks
connector-tablestore
connector-selectdb-cloud
connector-hbase
connector-amazonsqs
connector-easysearch
connector-paimon
connector-rocketmq
connector-tdengine
connector-web3j
connector-milvus

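Download the connector plugins

Change to the installation directory:

cd /opt/apache-seatunnel-2.3.6

Start the download:

# Recommended
bash bin/install-plugin.sh
# Or:
./bin/install-plugin.sh
# Or (only if sh is bash on your system; see the note below):
sh bin/install-plugin.sh

Note: make sure the script is interpreted by bash. On Ubuntu, sh is dash by default, which can cause the script to fail.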
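
To see which shell sh resolves to on a given machine, the symlink can be inspected. This is a small sketch, assuming readlink with the -f option is available (it is part of GNU coreutils on Ubuntu); sh_target is a hypothetical helper name.

```shell
# Sketch: report which binary `sh` resolves to on this machine.
sh_target() {
  # command -v locates sh on PATH; readlink -f follows symlinks to the real file.
  readlink -f "$(command -v sh)"
}

echo "sh resolves to: $(sh_target)"
```

If the reported path ends in dash, prefer the explicit `bash bin/install-plugin.sh` form.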

Download location:
apache-seatunnel-2.3.6/connectors/

Note: in testing, the connector download path in SeaTunnel 2.3.4 and later differs from the one in SeaTunnel 2.3.3 and earlier:

2.3.3 : apache-seatunnel-2.3.3/connectors/seatunnel
2.3.4 : apache-seatunnel-2.3.4/connectors/
2.3.6 : apache-seatunnel-2.3.6/connectors/
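
For scripts that have to handle more than one release, the path difference above can be captured in a small function. This is a sketch under assumptions: connector_dir is a hypothetical helper, only 2.3.3, 2.3.4 and 2.3.6 were checked in this article (other versions are extrapolated from the rule), and sort -V from GNU coreutils is assumed to be available.

```shell
# Sketch: map a SeaTunnel version to the connector directory observed above.
# Only 2.3.3, 2.3.4 and 2.3.6 were verified; other versions are extrapolated.
connector_dir() {
  local version="$1"
  local lowest
  # sort -V orders version strings; if 2.3.4 comes out first, version >= 2.3.4.
  lowest="$(printf '%s\n' "2.3.4" "$version" | sort -V | head -n 1)"
  if [ "$lowest" = "2.3.4" ]; then
    echo "apache-seatunnel-$version/connectors/"
  else
    echo "apache-seatunnel-$version/connectors/seatunnel"
  fi
}

connector_dir 2.3.3
connector_dir 2.3.6
```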

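Speeding up connector downloads

With the default setup, the connector plugins are downloaded from the default Apache Maven repository, as the log shows: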

Downloading from central: https://repo.maven.apache.org/maven2/org/apache/seatunnel/connector-cdc-mysql/2.3.6/connector....

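This is slow.
After starting install-plugin.sh for the first time, you can interrupt it with Ctrl+C once it has generated the default Maven wrapper configuration and the .m2 directory.
Then configure a Maven mirror in:
~/.m2/settings.xml
Create the file if it does not exist.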

<?xml version="1.0" encoding="UTF-8"?>
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0 http://maven.apache.org/xsd/settings-1.0.0.xsd">
	<pluginGroups></pluginGroups>
	<proxies></proxies>

	<servers>
	</servers>

<mirrors>
<!-- Aliyun (Alibaba Cloud) repository -->
<mirror>
    <id>alimaven</id>
    <mirrorOf>*</mirrorOf>
    <name>aliyun maven</name>
    <url>https://maven.aliyun.com/repository/central</url>
</mirror>
</mirrors>

<profiles>
</profiles>

</settings>

Then run the script again:

bash bin/install-plugin.sh

The log should now show the plugins being downloaded from the Aliyun repository.

Testing the SeaTunnel sample batch job

Run the sample job:

./bin/seatunnel.sh --config ./config/v2.batch.config.template -e local

Log output of a successful run:

2024-08-12 08:54:05,670 INFO  [o.a.s.e.c.j.ClientJobProxy    ] [main] - Job (875301094702448641) end with state FINISHED
2024-08-12 08:54:05,707 INFO  [s.c.s.s.c.ClientExecuteCommand] [main] -
***********************************************
           Job Statistic Information
***********************************************
Start Time                : 2024-08-12 08:54:03
End Time                  : 2024-08-12 08:54:05
Total Time(s)             :                   2
Total Read Count          :                  32
Total Write Count         :                  32
Total Failed Count        :                   0
***********************************************

2024-08-12 08:54:05,707 INFO  [c.h.c.LifecycleService        ] [main] - hz.client_1 [seatunnel-664865] [5.1] HazelcastClient 5.1 (20220228 - 21f20e7) is SHUTTING_DOWN
2024-08-12 08:54:05,713 INFO  [c.h.i.s.t.TcpServerConnection ] [hz.main.IO.thread-in-1] - [localhost]:5801 [seatunnel-664865] [5.1] Connection[id=1, /127.0.0.1:5801->/127.0.0.1:50189, qualifier=null, endpoint=[127.0.0.1]:50189, remoteUuid=4584e8d2-6b2f-4a10-af64-892d2fa897cb, alive=false, connectionType=JVM, planeIndex=-1] closed. Reason: Connection closed by the other side
2024-08-12 08:54:05,714 INFO  [.c.i.c.ClientConnectionManager] [main] - hz.client_1 [seatunnel-664865] [5.1] Removed connection to endpoint: [localhost]:5801:89ddf390-cb35-4347-ab51-c794b2c6a868, connection: ClientConnection{alive=false, connectionId=1, channel=NioChannel{/127.0.0.1:50189->localhost/127.0.0.1:5801}, remoteAddress=[localhost]:5801, lastReadTime=2024-08-12 08:54:05.701, lastWriteTime=2024-08-12 08:54:05.670, closedTime=2024-08-12 08:54:05.710, connected server version=5.1}
2024-08-12 08:54:05,714 INFO  [c.h.c.LifecycleService        ] [main] - hz.client_1 [seatunnel-664865] [5.1] HazelcastClient 5.1 (20220228 - 21f20e7) is CLIENT_DISCONNECTED
2024-08-12 08:54:05,718 INFO  [c.h.c.i.ClientEndpointManager ] [hz.main.event-5] - [localhost]:5801 [seatunnel-664865] [5.1] Destroying ClientEndpoint{connection=Connection[id=1, /127.0.0.1:5801->/127.0.0.1:50189, qualifier=null, endpoint=[127.0.0.1]:50189, remoteUuid=4584e8d2-6b2f-4a10-af64-892d2fa897cb, alive=false, connectionType=JVM, planeIndex=-1], clientUuid=4584e8d2-6b2f-4a10-af64-892d2fa897cb, clientName=hz.client_1, authenticated=true, clientVersion=5.1, creationTime=1723452843171, latest clientAttributes=lastStatisticsCollectionTime=1723452843212,enterprise=false,clientType=JVM,clientVersion=5.1,clusterConnectionTimestamp=1723452843154,clientAddress=127.0.0.1,clientName=hz.client_1,credentials.principal=null,os.committedVirtualMemorySize=3176402944,os.freePhysicalMemorySize=3446554624,os.freeSwapSpaceSize=2147479552,os.maxFileDescriptorCount=1048576,os.openFileDescriptorCount=51,os.processCpuTime=4630000000,os.systemLoadAverage=0.240234375,os.totalPhysicalMemorySize=8317079552,os.totalSwapSpaceSize=2147479552,runtime.availableProcessors=2,runtime.freeMemory=277072344,runtime.maxMemory=477626368,runtime.totalMemory=330301440,runtime.uptime=3282,runtime.usedMemory=53229096, labels=[]}
2024-08-12 08:54:05,719 INFO  [c.h.c.LifecycleService        ] [main] - hz.client_1 [seatunnel-664865] [5.1] HazelcastClient 5.1 (20220228 - 21f20e7) is SHUTDOWN
2024-08-12 08:54:05,720 INFO  [s.c.s.s.c.ClientExecuteCommand] [main] - Closed SeaTunnel client......
2024-08-12 08:54:05,720 INFO  [c.h.c.LifecycleService        ] [main] - [localhost]:5801 [seatunnel-664865] [5.1] [localhost]:5801 is SHUTTING_DOWN
2024-08-12 08:54:05,724 INFO  [c.h.i.p.i.MigrationManager    ] [hz.main.cached.thread-11] - [localhost]:5801 [seatunnel-664865] [5.1] Shutdown request of Member [localhost]:5801 - 89ddf390-cb35-4347-ab51-c794b2c6a868 this master is handled
2024-08-12 08:54:05,729 INFO  [c.h.i.i.Node                  ] [main] - [localhost]:5801 [seatunnel-664865] [5.1] Shutting down connection manager...
2024-08-12 08:54:05,732 INFO  [c.h.i.i.Node                  ] [main] - [localhost]:5801 [seatunnel-664865] [5.1] Shutting down node engine...
2024-08-12 08:54:05,747 INFO  [.c.c.DefaultClassLoaderService] [main] - close classloader service
2024-08-12 08:54:05,747 INFO  [o.a.s.e.s.TaskExecutionService] [event-forwarder-0] - [localhost]:5801 [seatunnel-664865] [5.1] Event forward thread interrupted
2024-08-12 08:54:08,759 INFO  [c.h.i.i.NodeExtension         ] [main] - [localhost]:5801 [seatunnel-664865] [5.1] Destroying node NodeExtension.
2024-08-12 08:54:08,760 INFO  [c.h.i.i.Node                  ] [main] - [localhost]:5801 [seatunnel-664865] [5.1] Hazelcast Shutdown is completed in 3037 ms.
2024-08-12 08:54:08,760 INFO  [c.h.c.LifecycleService        ] [main] - [localhost]:5801 [seatunnel-664865] [5.1] [localhost]:5801 is SHUTDOWN
2024-08-12 08:54:08,760 INFO  [s.c.s.s.c.ClientExecuteCommand] [main] - Closed HazelcastInstance ......
2024-08-12 08:54:08,761 INFO  [s.c.s.s.c.ClientExecuteCommand] [main] - Closed metrics executor service ......
2024-08-12 08:54:09,726 INFO  [s.c.s.s.c.ClientExecuteCommand] [Thread-26] - run shutdown hook because get close signal
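
When the sample job is run from a script, success can be checked by looking for the final state line in the captured output. The sketch below matches the text seen in the log above; job_finished is a hypothetical helper and the log file path in the usage comment is an assumption.

```shell
# Sketch: decide whether a captured SeaTunnel log shows a finished job.
# The matched text is taken from the sample log above.
job_finished() {
  local logfile="$1"
  # grep -q prints nothing and returns 0 only if the line is present.
  grep -q "end with state FINISHED" "$logfile"
}

# Example usage (log file name is an assumption):
# ./bin/seatunnel.sh --config ./config/v2.batch.config.template -e local 2>&1 | tee /tmp/seatunnel-batch.log
# job_finished /tmp/seatunnel-batch.log && echo "job finished"
```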

Testing MySQL-CDC to PostgreSQL

Create the test table

Connect to the MySQL database and create the table:

create table test.test_001(id int ,name varchar(100));

Edit the job configuration file

config/stream_mysql_postgresql.config

env {
  job.mode = "STREAMING"
  job.name = "streaming-mysql-pg"
}

source {
  MySQL-CDC {
    base-url = "jdbc:mysql://192.168.8.101:3306/test"
    username = "root"
    password = "123456"
    table-names = ["test.test_001"]
  }
}

sink {
  jdbc {
    url = "jdbc:postgresql://192.168.8.101:5432/postgres"
    driver = "org.postgresql.Driver"
    database = "postgres"
    user = "postgres"
    password = "postgres"
    table = "test.test_001"
    generate_sink_sql = true
  }
}

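Note: PostgreSQL does not support referencing tables across databases. For example, when the session is connected to the database postgres, inserting directly into the table test.test.test_001 (that is, a table in the database test) is not allowed.
The database in the sink's JDBC connection string must therefore match the database option in the sink configuration.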
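
The consistency rule above can be checked mechanically before a job is submitted. This is a minimal sketch, assuming the URL has the usual jdbc:postgresql://host:port/database form; both helper names are hypothetical.

```shell
# Sketch: extract the database name from a PostgreSQL JDBC URL and compare it
# with the sink's `database` option. Both helper names are hypothetical.
jdbc_url_database() {
  local url="$1"
  url="${url%%\?*}"   # drop any ?key=value connection parameters
  echo "${url##*/}"   # keep what follows the last slash
}

same_database() {
  [ "$(jdbc_url_database "$1")" = "$2" ]
}

# Values taken from the sink block above.
if same_database "jdbc:postgresql://192.168.8.101:5432/postgres" "postgres"; then
  echo "url and database option agree"
else
  echo "url and database option differ"
fi
```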

Download the database drivers

Download the MySQL and PostgreSQL JDBC drivers and add them to the lib directory.
For example:

mkdir -p ${SEATUNNEL_HOME}/plugins/jdbc/lib/
cp mysql-connector-j-8.2.0.jar ${SEATUNNEL_HOME}/plugins/jdbc/lib/
cp postgresql-42.7.2.jar ${SEATUNNEL_HOME}/plugins/jdbc/lib/

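Notes:

  1. According to plugins/README.md, when using the Zeta engine the JDBC drivers should be placed in $SEATUNNEL_HOME/lib/.
  2. In testing, drivers placed in $SEATUNNEL_HOME/lib/ are only picked up after the cluster is restarted, whereas drivers in plugins/jdbc/lib are loaded dynamically.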
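
Since the drivers can end up in either of the two directories mentioned above, it can help to list what is actually present before starting a job. A sketch, assuming the jar naming patterns of the two drivers used in this article; list_driver_jars is a hypothetical helper.

```shell
# Sketch: list MySQL and PostgreSQL driver jars in the two candidate directories.
# The file name patterns are assumptions based on the jars copied above.
list_driver_jars() {
  local home="$1" dir
  for dir in "$home/lib" "$home/plugins/jdbc/lib"; do
    # Skip directories that do not exist yet.
    [ -d "$dir" ] || continue
    find "$dir" -maxdepth 1 -type f \
      \( -name 'mysql-connector-*.jar' -o -name 'postgresql-*.jar' \)
  done
}

# Falls back to the installation path used in this article if SEATUNNEL_HOME is unset.
list_driver_jars "${SEATUNNEL_HOME:-/opt/apache-seatunnel-2.3.6}"
```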

Start the cluster

./bin/seatunnel-cluster.sh -d

Start the job

bash bin/seatunnel.sh --config config/stream_mysql_postgresql.config

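TODO:

  1. Bug found. PostgreSQL uses a three-level hierarchy (database --> schema --> table), whereas MySQL uses a two-level one (database --> table).
    When syncing test.test_table in MySQL to postgres.test.test_table in PostgreSQL, the automatic table creation statement fails.
    Preconditions:
    The schema postgres.test does not exist.
    The table postgres.test.test_table does not exist.
    Error reported:
ERROR: database "postgres" already exists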

From: https://www.cnblogs.com/nookvoice/p/18355020
