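In the big-data era, organizations routinely have to process large volumes of data stored in different systems and formats. Sqoop is an Apache tool for transferring data between Hadoop and relational database servers. It imports and exports data between relational databases and the components of the Apache Hadoop ecosystem.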
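Features:
Import: load data from MySQL, Oracle, and other relational databases into Hadoop storage systems such as HDFS, Hive, and HBase;
Export: write data from the Hadoop file system back to a relational database;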
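As a rough illustration of the two directions, the commands below sketch one import and one export. The JDBC URL, user, password file, table names, and HDFS directory are all hypothetical placeholders, not values from this article; the flags themselves (--connect, --username, --password-file, --table, --target-dir, --export-dir, --num-mappers) are standard Sqoop 1.4.x options. The helper run is made up for this sketch and only prints each command unless DRY_RUN=0 is set, so the example can be read without a cluster.

```shell
#!/bin/sh
# Sketch only: every value below is a hypothetical placeholder.
JDBC_URL="jdbc:mysql://localhost:3306/testdb"   # assumed MySQL database
DB_USER="sqoop_user"                            # assumed account
PW_FILE="/user/sqoop/.mysql.password"           # assumed password file path
HDFS_DIR="/user/sqoop/employees"                # assumed HDFS directory

# Print the command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    echo "$*"
  fi
}

# Import: relational table -> HDFS directory.
run sqoop import \
  --connect "$JDBC_URL" --username "$DB_USER" --password-file "$PW_FILE" \
  --table employees --target-dir "$HDFS_DIR" --num-mappers 1

# Export: HDFS directory -> relational table (the target table must already exist).
run sqoop export \
  --connect "$JDBC_URL" --username "$DB_USER" --password-file "$PW_FILE" \
  --table employees_copy --export-dir "$HDFS_DIR"
```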
Requirements: a working Java and Hadoop environment
Installation steps
1. wget https://archive.apache.org/dist/sqoop/1.4.7/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz # only the JARs inside this package are used
wget https://archive.apache.org/dist/sqoop/1.4.7/sqoop-1.4.7.tar.gz
2. Copy sqoop-1.4.7.jar from the root of sqoop-1.4.7.bin__hadoop-2.6.0 into the root of sqoop-1.4.7, and also place a copy of sqoop-1.4.7.jar in Hadoop's lib directory.
3. Copy the three required JARs from the lib directory into sqoop-1.4.7/lib/. A clean sqoop-1.4.7 package normally has no files in its lib directory. If you do not have the JARs, download them and upload them to the server.
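Steps 2 and 3 amount to copying JARs between the two unpacked packages. Here is a minimal sketch of step 2, assuming both archives were unpacked side by side in the current directory. The helper name copy_jar is illustrative, and the HADOOP_LIB default is an assumption (the article only says "Hadoop's lib directory"), so adjust it to your layout.

```shell
#!/bin/sh
# Assumed layout: both archives unpacked in the current directory.
BIN_DIR="sqoop-1.4.7.bin__hadoop-2.6.0"   # binary package (source of the JARs)
SQOOP_DIR="sqoop-1.4.7"                   # clean package that becomes the install
# Assumption: the article does not give the exact Hadoop lib path;
# share/hadoop/common/lib is one common location on the Hadoop classpath.
HADOOP_LIB="${HADOOP_LIB:-/export/software/hadoop-3.2.4/share/hadoop/common/lib}"

# copy_jar SRC DEST_DIR: copy one JAR, reporting a missing source explicitly.
copy_jar() {
  if [ -f "$1" ]; then
    mkdir -p "$2" && cp "$1" "$2/"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# Step 2: the main Sqoop JAR goes to the Sqoop root and to Hadoop's lib.
copy_jar "$BIN_DIR/sqoop-1.4.7.jar" "$SQOOP_DIR"  || echo "skipped copy to Sqoop root"
copy_jar "$BIN_DIR/sqoop-1.4.7.jar" "$HADOOP_LIB" || echo "skipped copy to Hadoop lib"
```

The same helper can be reused for step 3 once you know which three JARs your setup needs.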
Edit the configuration file
cd /export/software/sqoop-1.4.7/conf
Copy the template:
cp sqoop-env-template.sh sqoop-env.sh
Edit the file: vim sqoop-env.sh
Append the following to the end of the file:
export HADOOP_COMMON_HOME=/export/software/hadoop-3.2.4
export HADOOP_MAPRED_HOME=/export/software/hadoop-3.2.4
export HIVE_HOME=/export/software/hive3.1.3
export ZOOKEEPER_HOME=/export/software/zookeeper-3.4.14
export ZOOCFGDIR=/export/software/zookeeper-3.4.14/conf
Save and exit, then reload the file so the settings take effect:
source sqoop-env.sh
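A mistyped path in sqoop-env.sh tends to surface only later as a confusing job failure, so it can help to check the directories right after sourcing the file. The following is a sketch, not part of Sqoop; the function name check_env_dirs is made up for illustration.

```shell
#!/bin/sh
# check_env_dirs VAR...: report whether each named variable points at an
# existing directory. Returns non-zero if any is unset or missing.
check_env_dirs() {
  status=0
  for name in "$@"; do
    eval "value=\${$name:-}"          # look up the variable by name
    if [ -n "$value" ] && [ -d "$value" ]; then
      echo "ok: $name=$value"
    else
      echo "missing: $name=$value"
      status=1
    fi
  done
  return $status
}

# The five variables exported in sqoop-env.sh.
check_env_dirs HADOOP_COMMON_HOME HADOOP_MAPRED_HOME HIVE_HOME \
  ZOOKEEPER_HOME ZOOCFGDIR || echo "fix the paths marked missing"
```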
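Go to the lib directory under the Sqoop installation directory
cd ../lib
Add the MySQL JDBC driver JAR
It was uploaded earlier; the version used here is mysql-connector-java-5.1.27.jar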
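Sqoop picks up JDBC drivers from its lib directory, so a quick look there confirms the driver is in place before a job is run. This is a small sketch: find_mysql_driver is a hypothetical helper name, and the glob assumes the usual mysql-connector-java-<version>.jar file naming.

```shell
#!/bin/sh
# find_mysql_driver LIB_DIR: print any MySQL connector JARs found there,
# and return non-zero if none exist.
find_mysql_driver() {
  found=1
  for jar in "$1"/mysql-connector-java-*.jar; do
    # An unmatched glob stays literal, so confirm the file really exists.
    if [ -f "$jar" ]; then
      echo "$jar"
      found=0
    fi
  done
  return $found
}

find_mysql_driver "${SQOOP_HOME:-/export/software/sqoop-1.4.7}/lib" \
  || echo "no MySQL driver JAR in lib; copy mysql-connector-java-5.1.27.jar there"
```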
Configure environment variables:
vim /etc/profile
Append the following to the end of the file:
export SQOOP_HOME=/export/software/sqoop-1.4.7
export PATH=$PATH:$SQOOP_HOME/bin
Save and exit, then reload the profile so the settings take effect:
source /etc/profile
Check that the configuration is correct:
sqoop version
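If sqoop version reports that the command cannot be found, a likely cause is that $SQOOP_HOME/bin is not on PATH in the current shell. Below is a sketch of a check; path_has is an illustrative helper, not a standard command, and the SQOOP_HOME default simply repeats the path used in this article.

```shell
#!/bin/sh
# path_has DIR: succeed if DIR is one of the colon-separated entries in PATH.
path_has() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

SQOOP_HOME="${SQOOP_HOME:-/export/software/sqoop-1.4.7}"
if path_has "$SQOOP_HOME/bin"; then
  echo "PATH contains $SQOOP_HOME/bin"
else
  echo "PATH is missing $SQOOP_HOME/bin; re-run: source /etc/profile"
fi
```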
Starting a Sqoop job prints the following warnings:
Warning: /opt/modules/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /opt/modules/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /opt/modules/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
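Fix:
In $SQOOP_HOME/bin, edit the configure-sqoop file and disable the checks below. Here they are wrapped in if false; then ... fi instead of being commented out line by line: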
##user define note#####
## Moved to be a runtime check in sqoop.
if false;then
if [ ! -d "${HCAT_HOME}" ]; then
echo "Warning: $HCAT_HOME does not exist! HCatalog jobs will fail."
echo 'Please set $HCAT_HOME to the root of your HCatalog installation.'
fi
if [ ! -d "${ACCUMULO_HOME}" ]; then
echo "Warning: $ACCUMULO_HOME does not exist! Accumulo imports will fail."
echo 'Please set $ACCUMULO_HOME to the root of your Accumulo installation.'
fi
if [ ! -d "${ZOOKEEPER_HOME}" ]; then
echo "Warning: $ZOOKEEPER_HOME does not exist! Accumulo imports will fail."
echo 'Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.'
fi
##
fi
The next time a job starts, the warnings no longer appear.
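The if false; then ... fi wrapper works because the shell still parses the enclosed lines but never executes them, which switches off a multi-line block without prefixing every line with #. Here is a standalone sketch of the idea, using a made-up variable DEMO_HOME rather than the real configure-sqoop contents:

```shell
#!/bin/sh
DEMO_HOME="/nonexistent/demo"   # hypothetical path that does not exist

# Active check: prints a warning because the directory is missing.
warn_active() {
  if [ ! -d "${DEMO_HOME}" ]; then
    echo "Warning: $DEMO_HOME does not exist!"
  fi
}

# Same check wrapped in "if false": the body is parsed but never runs.
warn_disabled() {
  if false; then
    if [ ! -d "${DEMO_HOME}" ]; then
      echo "Warning: $DEMO_HOME does not exist!"
    fi
  fi
}

warn_active     # prints the warning
warn_disabled   # prints nothing
```

Because the body is still parsed, a syntax error inside the wrapped block would still break the script, so keep the enclosed lines intact.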