First Time Syncing Doris to Hive with SeaTunnel? Pitfalls You Need to Avoid

Date: 2024-05-16 10:53:24
Tags: lang engine SeaTunnel java seatunnel Hive apache org Doris

I used SeaTunnel 2.3.2 to sync data from Doris to Hive (cdh-6.3.2). The first runs failed with the errors below; the fix for each one follows:

  1. java.lang.NoClassDefFoundError: org/apache/hadoop/hive/metastore/api/MetaException
  2. java.lang.NoClassDefFoundError: org/apache/thrift/TBase
  3. java.lang.NoClassDefFoundError: org/apache/hadoop/hive/conf/HiveConf
  4. java.lang.NoClassDefFoundError: com/facebook/fb303/FacebookService$Iface
  5. java.lang.OutOfMemoryError: Java heap space

Contents:

  1. java.lang.NoClassDefFoundError: org/apache/hadoop/hive/metastore/api/MetaException

    1.1 Solution

  2. java.lang.NoClassDefFoundError: org/apache/thrift/TBase

    2.1 Solution

  3. java.lang.NoClassDefFoundError: org/apache/hadoop/hive/conf/HiveConf

    3.1 Solution

  4. java.lang.NoClassDefFoundError: com/facebook/fb303/FacebookService$Iface

    4.1 Solution

  5. java.lang.OutOfMemoryError: Java heap space

    5.1 Solution

1. java.lang.NoClassDefFoundError: org/apache/hadoop/hive/metastore/api/MetaException

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hive/metastore/api/MetaException
        at org.apache.seatunnel.connectors.seatunnel.hive.config.HiveConfig.getTableInfo(HiveConfig.java:59)
        at org.apache.seatunnel.connectors.seatunnel.hive.sink.HiveSink.prepare(HiveSink.java:123)
        at org.apache.seatunnel.engine.core.parse.JobConfigParser.parseSink(JobConfigParser.java:190)
        at org.apache.seatunnel.engine.core.parse.JobConfigParser.parseSinks(JobConfigParser.java:162)
        at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parseSink(MultipleTableJobConfigParser.java:515)
        at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parse(MultipleTableJobConfigParser.java:170)
        at org.apache.seatunnel.engine.client.job.JobExecutionEnvironment.getLogicalDag(JobExecutionEnvironment.java:155)
        at org.apache.seatunnel.engine.client.job.JobExecutionEnvironment.execute(JobExecutionEnvironment.java:147)
        at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:140)
        at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
        at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.metastore.api.MetaException
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at org.apache.seatunnel.engine.common.loader.SeaTunnelBaseClassLoader.loadClassWithoutExceptionHandling(SeaTunnelBaseClassLoader.java:56)
        at org.apache.seatunnel.engine.common.loader.SeaTunnelChildFirstClassLoader.loadClassWithoutExceptionHandling(SeaTunnelChildFirstClassLoader.java:86)
        at org.apache.seatunnel.engine.common.loader.SeaTunnelBaseClassLoader.loadClass(SeaTunnelBaseClassLoader.java:47)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 11 more

1.1 Solution

The cause is a missing JAR. Copy the following files from Hive's lib directory into SeaTunnel's lib directory:

hive-metastore-2.1.1-cdh6.3.2.jar # on CDH, remember to copy this one as well
hive-metastore.jar
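
The copy step can also be scripted. The following is a minimal sketch, not part of the original fix: the function name copy_jars and both directory paths are hypothetical placeholders that must be adjusted to the local Hive and SeaTunnel installations.

```python
import shutil
from pathlib import Path


def copy_jars(hive_lib, seatunnel_lib, jar_names):
    """Copy the named JARs from hive_lib into seatunnel_lib.

    Returns a (copied, missing) pair of file-name lists.
    """
    src_dir, dst_dir = Path(hive_lib), Path(seatunnel_lib)
    copied, missing = [], []
    for name in jar_names:
        src = src_dir / name
        if src.is_file():
            # copy2 follows symlinks, so if a JAR is a link, the target's
            # contents are copied as a regular file.
            shutil.copy2(src, dst_dir / name)
            copied.append(name)
        else:
            missing.append(name)
    return copied, missing


if __name__ == "__main__":
    # Placeholder paths (assumptions); replace them with the real locations.
    copied, missing = copy_jars(
        "/path/to/hive/lib",
        "/path/to/seatunnel/lib",
        ["hive-metastore-2.1.1-cdh6.3.2.jar", "hive-metastore.jar"],
    )
    print("copied:", copied)
    print("not found:", missing)
```

Reporting the files that were not found, instead of raising, makes it easier to see at a glance which JAR names differ on a given Hive distribution.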

2. java.lang.NoClassDefFoundError: org/apache/thrift/TBase

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/thrift/TBase
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
        at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
        at org.apache.seatunnel.engine.common.loader.SeaTunnelBaseClassLoader.loadClassWithoutExceptionHandling(SeaTunnelBaseClassLoader.java:56)
        at org.apache.seatunnel.engine.common.loader.SeaTunnelChildFirstClassLoader.loadClassWithoutExceptionHandling(SeaTunnelChildFirstClassLoader.java:86)
        at org.apache.seatunnel.engine.common.loader.SeaTunnelBaseClassLoader.loadClass(SeaTunnelBaseClassLoader.java:47)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at org.apache.seatunnel.connectors.seatunnel.hive.config.HiveConfig.getTableInfo(HiveConfig.java:59)
        at org.apache.seatunnel.connectors.seatunnel.hive.sink.HiveSink.prepare(HiveSink.java:123)
        at org.apache.seatunnel.engine.core.parse.JobConfigParser.parseSink(JobConfigParser.java:190)
        at org.apache.seatunnel.engine.core.parse.JobConfigParser.parseSinks(JobConfigParser.java:162)
        at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parseSink(MultipleTableJobConfigParser.java:515)
        at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parse(MultipleTableJobConfigParser.java:170)
        at org.apache.seatunnel.engine.client.job.JobExecutionEnvironment.getLogicalDag(JobExecutionEnvironment.java:155)
        at org.apache.seatunnel.engine.client.job.JobExecutionEnvironment.execute(JobExecutionEnvironment.java:147)
        at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:140)
        at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
        at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)
Caused by: java.lang.ClassNotFoundException: org.apache.thrift.TBase
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)

2.1 Solution

Again a JAR is missing. Copy it from Hive's lib directory into SeaTunnel's lib directory:

libthrift-0.9.3-1.jar
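
Each of these traces ends in a "Caused by: java.lang.ClassNotFoundException" line that names the class to look for. As a small illustration, the sketch below pulls those class names out of saved console output. The function name missing_classes is hypothetical, and the sample text is shortened from the trace above.

```python
import re

# Matches the root-cause line Java prints when a class cannot be loaded.
CAUSED_BY = re.compile(
    r"Caused by: java\.lang\.ClassNotFoundException:\s*(\S+)"
)


def missing_classes(log_text):
    """Return the distinct missing class names, in order of appearance."""
    seen = []
    for name in CAUSED_BY.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


if __name__ == "__main__":
    sample = (
        'Exception in thread "main" java.lang.NoClassDefFoundError: '
        "org/apache/thrift/TBase\n"
        "Caused by: java.lang.ClassNotFoundException: org.apache.thrift.TBase\n"
    )
    print(missing_classes(sample))
```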

3. java.lang.NoClassDefFoundError: org/apache/hadoop/hive/conf/HiveConf

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hive/conf/HiveConf
        at org.apache.seatunnel.connectors.seatunnel.hive.utils.HiveMetaStoreProxy.<init>(HiveMetaStoreProxy.java:48)
        at org.apache.seatunnel.connectors.seatunnel.hive.utils.HiveMetaStoreProxy.getInstance(HiveMetaStoreProxy.java:74)
        at org.apache.seatunnel.connectors.seatunnel.hive.config.HiveConfig.getTableInfo(HiveConfig.java:59)
        at org.apache.seatunnel.connectors.seatunnel.hive.sink.HiveSink.prepare(HiveSink.java:123)
        at org.apache.seatunnel.engine.core.parse.JobConfigParser.parseSink(JobConfigParser.java:190)
        at org.apache.seatunnel.engine.core.parse.JobConfigParser.parseSinks(JobConfigParser.java:162)
        at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parseSink(MultipleTableJobConfigParser.java:515)
        at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parse(MultipleTableJobConfigParser.java:170)
        at org.apache.seatunnel.engine.client.job.JobExecutionEnvironment.getLogicalDag(JobExecutionEnvironment.java:155)
        at org.apache.seatunnel.engine.client.job.JobExecutionEnvironment.execute(JobExecutionEnvironment.java:147)
        at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:140)
        at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
        at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at org.apache.seatunnel.engine.common.loader.SeaTunnelBaseClassLoader.loadClassWithoutExceptionHandling(SeaTunnelBaseClassLoader.java:56)
        at org.apache.seatunnel.engine.common.loader.SeaTunnelChildFirstClassLoader.loadClassWithoutExceptionHandling(SeaTunnelChildFirstClassLoader.java:86)
        at org.apache.seatunnel.engine.common.loader.SeaTunnelBaseClassLoader.loadClass(SeaTunnelBaseClassLoader.java:47)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 13 more

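3.1 Solution
Again JARs are missing. Copy them from Hive's lib directory into SeaTunnel's lib directory:

hive-common-2.1.1-cdh6.3.2.jar # on CDH, remember to copy this one as well
hive-common.jar
hive-exec-2.1.1-cdh6.3.2.jar # on CDH, remember to copy this one as well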
hive-exec.jar

4. java.lang.NoClassDefFoundError: com/facebook/fb303/FacebookService$Iface

Exception in thread "main" java.lang.NoClassDefFoundError: com/facebook/fb303/FacebookService$Iface
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
        at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at org.apache.seatunnel.connectors.seatunnel.hive.utils.HiveMetaStoreProxy.<init>(HiveMetaStoreProxy.java:58)
        at org.apache.seatunnel.connectors.seatunnel.hive.utils.HiveMetaStoreProxy.getInstance(HiveMetaStoreProxy.java:74)
        at org.apache.seatunnel.connectors.seatunnel.hive.config.HiveConfig.getTableInfo(HiveConfig.java:59)
        at org.apache.seatunnel.connectors.seatunnel.hive.sink.HiveSink.prepare(HiveSink.java:123)
        at org.apache.seatunnel.engine.core.parse.JobConfigParser.parseSink(JobConfigParser.java:190)
        at org.apache.seatunnel.engine.core.parse.JobConfigParser.parseSinks(JobConfigParser.java:162)
        at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parseSink(MultipleTableJobConfigParser.java:515)
        at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parse(MultipleTableJobConfigParser.java:170)
        at org.apache.seatunnel.engine.client.job.JobExecutionEnvironment.getLogicalDag(JobExecutionEnvironment.java:155)
        at org.apache.seatunnel.engine.client.job.JobExecutionEnvironment.execute(JobExecutionEnvironment.java:147)
        at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:140)
        at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
        at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)
Caused by: java.lang.ClassNotFoundException: com.facebook.fb303.FacebookService$Iface
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 25 more

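4.1 Solution
Again a JAR is missing. In my Hive installation, libthrift-0.9.3-1.jar does not contain this class, which is why the error appeared. Download the JAR that provides it directly from the address below:

Download URL: https://repo1.maven.org/maven2/org/apache/thrift/libfb303/0.9.3/libfb303-0.9.3.jar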
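
To check which JARs in a directory actually contain a given class, as was done by hand for libthrift above, a JAR can be read as a ZIP archive. This is a rough sketch under that assumption; the function name jars_containing and the directory path are hypothetical.

```python
import zipfile
from pathlib import Path


def jars_containing(lib_dir, class_name):
    """Return the names of JARs in lib_dir that contain class_name.

    class_name may use dots or slashes, for example the form printed in
    the NoClassDefFoundError message. Inner classes keep their '$'.
    """
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar.name)
        except (zipfile.BadZipFile, OSError):
            # Skip unreadable files and broken symlinks.
            continue
    return hits


if __name__ == "__main__":
    # Placeholder path (assumption); point it at the lib directory to inspect.
    print(jars_containing("/path/to/seatunnel/lib",
                          "com/facebook/fb303/FacebookService$Iface"))
```

An empty result for SeaTunnel's lib directory means the class is still not available there and the JAR has to be added.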

5. java.lang.OutOfMemoryError: Java heap space

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:379)
        at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:230)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_set_ugi(ThriftHiveMetastore.java:4129)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.set_ugi(ThriftHiveMetastore.java:4115)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:563)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:303)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:219)
        at org.apache.seatunnel.connectors.seatunnel.hive.utils.HiveMetaStoreProxy.<init>(HiveMetaStoreProxy.java:58)
        at org.apache.seatunnel.connectors.seatunnel.hive.utils.HiveMetaStoreProxy.getInstance(HiveMetaStoreProxy.java:74)
        at org.apache.seatunnel.connectors.seatunnel.hive.config.HiveConfig.getTableInfo(HiveConfig.java:59)
        at org.apache.seatunnel.connectors.seatunnel.hive.sink.HiveSink.prepare(HiveSink.java:123)
        at org.apache.seatunnel.engine.core.parse.JobConfigParser.parseSink(JobConfigParser.java:190)
        at org.apache.seatunnel.engine.core.parse.JobConfigParser.parseSinks(JobConfigParser.java:162)
        at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parseSink(MultipleTableJobConfigParser.java:515)
        at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parse(MultipleTableJobConfigParser.java:170)
        at org.apache.seatunnel.engine.client.job.JobExecutionEnvironment.getLogicalDag(JobExecutionEnvironment.java:155)
        at org.apache.seatunnel.engine.client.job.JobExecutionEnvironment.execute(JobExecutionEnvironment.java:147)
        at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:140)
        at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
        at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)

5.1 Solution
In SeaTunnel's config directory, raise the corresponding JVM heap setting:

Setting the value somewhat larger is enough to resolve the error.
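
As a hedged illustration of such a change, the sketch below rewrites the maximum-heap flag in the text of a JVM options file. It assumes the heap limit is given as an -Xmx line, one flag per line. The function name raise_max_heap and the sample values are hypothetical, and which file under config holds the setting depends on the SeaTunnel version.

```python
import re

# A line consisting only of a maximum-heap flag such as -Xmx512m or -Xmx2g.
XMX = re.compile(r"^-Xmx\d+[kKmMgG]?$")


def raise_max_heap(options_text, new_value):
    """Return options_text with every -Xmx line set to new_value (e.g. "2g").

    Comments and other JVM flags are left untouched. If no -Xmx line
    exists, one is appended.
    """
    lines = options_text.splitlines()
    found = False
    for i, line in enumerate(lines):
        if XMX.match(line.strip()):
            lines[i] = "-Xmx" + new_value
            found = True
    if not found:
        lines.append("-Xmx" + new_value)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Sample content (an assumption, not copied from a real installation).
    sample = "# JVM Heap\n-Xms256m\n-Xmx512m\n"
    print(raise_max_heap(sample, "2g"), end="")
```

To apply it, read the options file, pass its text through the function, and write the result back after keeping a copy of the original file.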

Original article: https://blog.csdn.net/qq_43224174/article/details/131430223

Publication of this article is supported by WhaleOps (白鲸开源).

From: https://www.cnblogs.com/seatunnel/p/18195531
