Symptom
Unable to find config file hive-site.xml
Unable to find config file hivemetastore-site.xml
Unable to find config file metastore-site.xml
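This post records how the problem arises and how to provide hive-site.xml to Hive and Hudi so that it is loaded correctly.
Analysis: how HiveMetaStore searches for its configuration files
Location: org.apache.hadoop.hive.metastore.conf.MetastoreConf#findConfigFile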
private static URL findConfigFile(ClassLoader classLoader, String name) {
  // First, look in the classpath
  URL result = classLoader.getResource(name);
  if (result == null) {
    // Nope, so look to see if our conf dir has been explicitly set
    result = seeIfConfAtThisLocation("METASTORE_CONF_DIR", name, false);
    if (result == null) {
      // Nope, so look to see if our home dir has been explicitly set
      result = seeIfConfAtThisLocation("METASTORE_HOME", name, true);
      if (result == null) {
        // Nope, so look to see if Hive's conf dir has been explicitly set
        result = seeIfConfAtThisLocation("HIVE_CONF_DIR", name, false);
        if (result == null) {
          // Nope, so look to see if Hive's home dir has been explicitly set
          result = seeIfConfAtThisLocation("HIVE_HOME", name, true);
          if (result == null) {
            // Nope, so look to see if we can find a conf file by finding our jar, going up one
            // directory, and looking for a conf directory.
            URI jarUri = null;
            try {
              jarUri = MetastoreConf.class.getProtectionDomain().getCodeSource().getLocation().toURI();
            } catch (Throwable e) {
              LOG.warn("Cannot get jar URI", e);
            }
            result = seeIfConfAtThisLocation(new File(jarUri).getParent(), name, true);
            // At this point if we haven't found it, screw it, we don't know where it is
            if (result == null) {
              LOG.info("Unable to find config file " + name);
            }
          }
        }
      }
    }
  }
  LOG.info("Found configuration file " + result);
  return result;
}
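So the message means the file was found nowhere: not on the classpath, and not under METASTORE_CONF_DIR, METASTORE_HOME, HIVE_CONF_DIR or HIVE_HOME.
It was not even found in the conf directory next to the jar that contains the MetastoreConf class.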
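The environment-variable steps of the lookup above can be flattened into an ordered list of candidate paths. The following is a minimal sketch, not Hive's code. The class ConfLookupSketch and its method candidates are hypothetical names, and the environment is passed in as a map so the logic can be tried without touching real environment variables. It assumes, based on the comments in findConfigFile, that the *_HOME variables are searched in a conf subdirectory while the *_CONF_DIR variables are searched directly. The classpath step and the jar-relative step are left out.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hive: lists the file-system locations that the
// environment-variable steps of findConfigFile would check, in order.
final class ConfLookupSketch {

    // Variable name -> whether the file is expected under a "conf" subdirectory.
    // Assumption: this mirrors the boolean passed to seeIfConfAtThisLocation.
    private static final Map<String, Boolean> STEPS = new LinkedHashMap<>();
    static {
        STEPS.put("METASTORE_CONF_DIR", false);
        STEPS.put("METASTORE_HOME", true);
        STEPS.put("HIVE_CONF_DIR", false);
        STEPS.put("HIVE_HOME", true);
    }

    static List<Path> candidates(Map<String, String> env, String name) {
        List<Path> result = new ArrayList<>();
        for (Map.Entry<String, Boolean> step : STEPS.entrySet()) {
            String dir = env.get(step.getKey());
            if (dir == null) {
                continue; // variable not set: this step is skipped
            }
            Path base = Paths.get(dir);
            result.add(step.getValue() ? base.resolve("conf").resolve(name) : base.resolve(name));
        }
        return result;
    }

    public static void main(String[] args) {
        // Print the candidates for the environment of the current process.
        candidates(System.getenv(), "hive-site.xml").forEach(System.out::println);
    }
}
```

If this prints nothing inside the Flink process, none of the four variables is set there, which matches the log lines in the symptom.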
Finding the cause: why hive-site.xml was not read from any of these locations
Location: org.apache.hadoop.hive.metastore.conf.MetastoreConf#newMetastoreConf
if(hiveSiteURL == null) {
  /*
   * this 'if' is pretty lame - QTestUtil.QTestUtil() uses hiveSiteURL to load a specific
   * hive-site.xml from data/conf/<subdir> so this makes it follow the same logic - otherwise
   * HiveConf and MetastoreConf may load different hive-site.xml ( For example,
   * HiveConf uses data/conf/spark/hive-site.xml and MetastoreConf data/conf/hive-site.xml)
   */
  hiveSiteURL = findConfigFile(classLoader, "hive-site.xml");
}
if (hiveSiteURL != null) {
  conf.addResource(hiveSiteURL);
}
findConfigFile is called only when the static field hiveSiteURL has not been set. This is the normal path.
The Flink side
Location: org.apache.flink.table.catalog.hive.HiveCatalog.createHiveConf
// ignore all the static conf file URLs that HiveConf may have set
HiveConf.setHiveSiteLocation(null);
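Conclusions:
- Flink clears this static field, which is why execution enters findConfigFile.
- MetastoreConf does not appear to support a hive-site.xml stored on HDFS: every step shown above resolves against the classpath or a local directory.
- Whenever Flink creates a HiveCatalog, this lookup is triggered.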
CLASSPATH analysis
hive-site.xml is already provided under Flink's CLASSPATH, so why can it still not be loaded?
lib/hive-site.xml
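Yet classLoader.getResource(name) still does not find it. getResource resolves name relative to each classpath root, so my guess is that name would have to be "lib/hive-site.xml" for the lookup to succeed, or that the lib directory itself would have to be a classpath entry.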
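The behaviour of getResource with respect to classpath roots can be reproduced in isolation. This is a minimal sketch with a throwaway directory layout. ClasspathRootDemo and visible are hypothetical names, and the temporary directory only stands in for a Flink installation; it does not model Flink's real class loaders.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo: shows that ClassLoader.getResource resolves a name
// relative to each classpath root.
final class ClasspathRootDemo {

    // Returns true if `name` is visible through a class loader whose only root is `root`.
    static boolean visible(Path root, String name) {
        try {
            // File.toURI appends a trailing slash for an existing directory, which is
            // what makes URLClassLoader treat the root as a directory and not a jar.
            URL[] roots = { root.toFile().toURI().toURL() };
            // Parent is null so that, apart from the JDK itself, only `root` is searched.
            try (URLClassLoader loader = new URLClassLoader(roots, null)) {
                return loader.getResource(name) != null;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path home = Files.createTempDirectory("flink-home");
        Path lib = Files.createDirectory(home.resolve("lib"));
        Files.createFile(lib.resolve("hive-site.xml"));

        System.out.println(visible(home, "hive-site.xml"));     // false
        System.out.println(visible(home, "lib/hive-site.xml")); // true
        System.out.println(visible(lib, "hive-site.xml"));      // true
    }
}
```

A file placed in lib/ is therefore only reachable as "hive-site.xml" when lib/ itself is a classpath root; being next to jars that are on the classpath is not enough.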
Conclusion:
HIVE_CONF_DIR needs to be set.
Solution
Pass HIVE_CONF_DIR to the Flink program. How exactly? Kyuubi's approach can serve as a reference.
That is:
-Dcontainerized.master.env.HIVE_CONF_DIR=/etc/hive/conf
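To confirm that the variable actually reaches the process that creates the HiveCatalog, a small check can be run at startup. This is only a sketch: HiveConfDirCheck and locateHiveSite are hypothetical names, and the environment is passed in as a map so the check can be tried outside a cluster.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Optional;

// Hypothetical diagnostic, not part of Flink or Hive.
final class HiveConfDirCheck {

    // Returns the path of hive-site.xml under HIVE_CONF_DIR, if the variable is set
    // and the file exists as a regular file.
    static Optional<Path> locateHiveSite(Map<String, String> env) {
        String dir = env.get("HIVE_CONF_DIR");
        if (dir == null || dir.isEmpty()) {
            return Optional.empty();
        }
        Path candidate = Paths.get(dir, "hive-site.xml");
        return Files.isRegularFile(candidate) ? Optional.of(candidate) : Optional.empty();
    }

    public static void main(String[] args) {
        Optional<Path> found = locateHiveSite(System.getenv());
        System.out.println(found
                .map(p -> "hive-site.xml found at " + p)
                .orElse("HIVE_CONF_DIR is unset or contains no hive-site.xml"));
    }
}
```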
From: https://www.cnblogs.com/slankka/p/18456896