1、sqoop定义
Sqoop 是一款开源的工具,主要用于在 Hadoop 和传统数据库(MySQL、PostgreSQL 等)之间进行数据传递,可以将一个关系型数据库(例如 MySQL、Oracle、Postgres 等)中的数据导入到 Hadoop 的 HDFS 中,也可以将 HDFS 的数据导入到关系型数据库中。Sqoop 中一大亮点就是可以通过 Hadoop 的 MapReduce 把数据从关系型数据库中导入数据到 HDFS。
2、sqoop测试过程记录
测试时的全纪录可以供以后使用参考,出于安全省略了一些关键的引用版本输出,不影响看全过程的实录内容。
MySQL [mysql]> use test;标签:master1,sqoop,--,纪录,SQOOP,hadoop,EMR,test,MySQL From: https://blog.51cto.com/keep11/6037303
Database changed
MySQL [test]> show tables;
Empty set (0.00 sec)
MySQL [test]> create table sqoop_test(id int not null primary key auto_increment, title varchar(64), time timestamp, content varchar(255));
Query OK, 0 rows affected (0.01 sec)
MySQL [test]> insert into sqoop_test values(null, 'first', now(), 'hdfs');
Query OK, 1 row affected (0.00 sec)
MySQL [test]> insert into sqoop_test values(null, 'second', now(), 'mr');
Query OK, 1 row affected (0.00 sec)
MySQL [test]> insert into sqoop_test values(null, 'third', now(), 'yarn');
Query OK, 1 row affected (0.00 sec)
MySQL [test]> select * from sqoop_test;
+----+--------+---------------------+---------+
| id | title | time | content |
+----+--------+---------------------+---------+
| 1 | first | 2022-08-29 11:56:19 | hdfs |
| 2 | second | 2022-08-29 11:56:27 | mr |
| 3 | third | 2022-08-29 11:56:34 | yarn |
+----+--------+---------------------+---------+
3 rows in set (0.00 sec)
MySQL [test]> exit
Bye
[root@emr-master1 ~]# su hadoop
[hadoop@emr-master1 root]$ cd
[hadoop@emr-master1 ~]$ sqoop-import --connect jdbc:mysql://x.x.x.x/test --username root -P --table sqoop_test --target-dir /sqoop
Warning: /usr/local/service/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
省略输出.........
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
Enter password: [hadoop@^Cr-master1 ~]$ sqoop-import --connect jdbc:mysql://x.x.x.x/test --username root -P --table sqoop_test --target-dir /sqoop
[hadoop@emr-master1 ~]$ /bin/sqoop-import --connect jdbc:mysql://x.x.x.x/test --username root -P --table sqoop_test --target-dir /sqoop
bash: /bin/sqoop-import: No such file or directory
[hadoop@emr-master1 ~]$ which sqoop-import
/usr/local/service/sqoop/bin/sqoop-import
[hadoop@emr-master1 ~]$ /usr/local/service/sqoop/bin/sqoop-import --connect jdbc:mysql://x.x.x.x/test --username root -P --table sqoop_test --target-dir /sqoop
Warning: /usr/local/service/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
省略输出.........
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
Enter password:
Note: /xxxxxxxxxxxxxxxxxxxxxxxxxxxx
Note: Recompile with -Xlint:deprecation for details.
[hadoop@emr-master1 ~]$ hadoop fs -cat /sqoop/*
1,first,2022-08-29 11:56:19.0,hdfs
2,second,2022-08-29 11:56:27.0,mr
3,third,2022-08-29 11:56:34.0,yarn
[hadoop@emr-master1 ~]$ mysql -h x.x.x.x -p
Enter password:
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MySQL connection id is 191754
Server version: 5.6.28-cdb2016-log 20220302
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MySQL [(none)]> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
MySQL [test]> show tables;
+----------------+
| Tables_in_test |
+----------------+
| sqoop_test |
+----------------+
1 row in set (0.00 sec)
MySQL [test]> create table sqoop_test_back(id int not null primary key auto_increment, title varchar(64), time timestamp, content varchar(255));
Query OK, 0 rows affected (0.00 sec)
MySQL [test]> show tables;
+-----------------+
| Tables_in_test |
+-----------------+
| sqoop_test |
| sqoop_test_back |
+-----------------+
2 rows in set (0.00 sec)
MySQL [test]> exit
Bye
[hadoop@emr-master1 ~]$ which sqoop-export
/usr/local/service/sqoop/bin/sqoop-export
[hadoop@emr-master1 ~]$ /usr/local/service/sqoop/bin/sqoop-export --connect jdbc:mysql://x.x.x.x/test --username root -P --table sqoop_test_back --export-dir /sqoop
Warning: /usr/local/service/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Enter password:
Note: /tmp/sqoop-hadoop/compile/4256a084d8c542ff03767ae114553d13/sqoop_test_back.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
0 [main] INFO com.hadoop.compression.lzo.GPLNativeCodeLoader - Loaded native gpl library from the embedded binaries
25 [main] INFO com.hadoop.compression.lzo.LzoCodec -
[hadoop@emr-master1 ~]$ mysql -h x.x.x.x -uroot -p
Enter password:
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MySQL connection id is 191880
Server version: 5.6.28-cdb2016-log 20220302
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MySQL [(none)]> use test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
MySQL [test]> select * from sqoop_test_back;
+----+--------+---------------------+---------+
| id | title | time | content |
+----+--------+---------------------+---------+
| 1 | first | 2022-08-29 11:56:19 | hdfs |
| 2 | second | 2022-08-29 11:56:27 | mr |
| 3 | third | 2022-08-29 11:56:34 | yarn |
+----+--------+---------------------+---------+
3 rows in set (0.00 sec)
MySQL [test]>