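Importing MySQL Data into HDFS with Sqoop1
This article describes how to import data from MySQL into HDFS with Sqoop1.
How it works:
1. Sqoop connects to the database and reads the table's metadata.
2. Sqoop launches a map-only MapReduce job that uses the metadata to write the data into Hadoop in parallel.
Steps in brief:
① Create the database sqoop in MySQL
② Create the tables DEPT and EMP
③ Insert data
④ Import the data from MySQL into HDFS
⑤ View the data
⑥ Errors encountered during the import and their solutions
Detailed steps:
1. Create the database sqoop in MySQL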
create database sqoop;
use sqoop;
2. Create the tables DEPT and EMP
CREATE TABLE DEPT(
DEPTNO int(2) PRIMARY KEY,
DNAME VARCHAR(14),
LOC VARCHAR(13)
);
CREATE TABLE EMP(
EMPNO int(4) PRIMARY KEY,
ENAME VARCHAR(10),
JOB VARCHAR(9),
MGR int(4),
HIREDATE DATE,
SAL int(7),
COMM int(7),
DEPTNO int(2),
foreign key(deptno) references DEPT(DEPTNO)
);
3. Insert data
use sqoop;
INSERT INTO DEPT VALUES
(10,'ACCOUNTING','NEW YORK');
INSERT INTO DEPT VALUES (20,'RESEARCH','DALLAS');
INSERT INTO DEPT VALUES
(30,'SALES','CHICAGO');
INSERT INTO DEPT VALUES
(40,'OPERATIONS','BOSTON');
INSERT INTO EMP VALUES
(7369,'SMITH','CLERK',7902,'1980-12-17',800,NULL,20);
INSERT INTO EMP VALUES
(7499,'ALLEN','SALESMAN',7698,'1981-2-20',1600,300,30);
INSERT INTO EMP VALUES
(7521,'WARD','SALESMAN',7698,'1981-2-22',1250,500,30);
INSERT INTO EMP VALUES
(7566,'JONES','MANAGER',7839,'1981-4-2',2975,NULL,20);
INSERT INTO EMP VALUES
(7654,'MARTIN','SALESMAN',7698,'1981-9-28',1250,1400,30);
INSERT INTO EMP VALUES
(7698,'BLAKE','MANAGER',7839,'1981-5-1',2850,NULL,30);
INSERT INTO EMP VALUES
(7782,'CLARK','MANAGER',7839,'1981-6-9',2450,NULL,10);
INSERT INTO EMP VALUES
(7788,'SCOTT','ANALYST',7566,'1987-7-13',3000,NULL,20);
INSERT INTO EMP VALUES
(7839,'KING','PRESIDENT',NULL,'1981-11-17',5000,NULL,10);
INSERT INTO EMP VALUES
(7844,'TURNER','SALESMAN',7698,'1981-9-8',1500,0,30);
INSERT INTO EMP VALUES
(7876,'ADAMS','CLERK',7788,'1987-7-13',1100,NULL,20);
INSERT INTO EMP VALUES
(7900,'JAMES','CLERK',7698,'1981-12-3',950,NULL,30);
INSERT INTO EMP VALUES
(7902,'FORD','ANALYST',7566,'1981-12-3',3000,NULL,20);
INSERT INTO EMP VALUES
(7934,'MILLER','CLERK',7782,'1982-1-23',1300,NULL,10);
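4. Import the data from MySQL into HDFS
① Data is imported with the import command. Run the following to see its help text:
sqoop help import
② Run the import command
The -m option (long form --num-mappers) sets the number of map tasks, that is, the degree of parallelism. With -m 1 a single map task runs, so the import is not parallel.
With -m 5, Sqoop runs up to 5 map tasks that import the data concurrently.
Note: a parallel import needs a column to split on. Sqoop uses the table's primary key by default; if the table has no primary key, choose a suitable split column yourself with --split-by, or stay with -m 1.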
sqoop import --connect jdbc:mysql://localhost:3306/sqoop --username root --password [email protected] --table EMP -m 1
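To illustrate what a value of -m greater than 1 does, the sketch below approximates how the key range of the split column could be divided among map tasks. Sqoop first asks the database for the minimum and maximum of the split column and then gives each map task one slice of that range. The helper print_splits is hypothetical and deliberately simplified; it is not Sqoop's actual splitter, and the bounds 7369 and 7934 are simply the smallest and largest EMPNO values inserted above.

```shell
# print_splits COLUMN MIN MAX N
# Hypothetical helper (not part of Sqoop): divides the inclusive range
# MIN..MAX of COLUMN into N contiguous slices, one per map task.
print_splits() {
  col=$1; min=$2; max=$3; n=$4
  span=$(( max - min + 1 ))      # number of candidate key values
  rem=$(( span % n ))            # the first $rem slices get one extra value
  lo=$min
  i=0
  while [ "$i" -lt "$n" ]; do
    size=$(( span / n ))
    if [ "$i" -lt "$rem" ]; then
      size=$(( size + 1 ))
    fi
    hi=$(( lo + size - 1 ))
    echo "map task $i: $col >= $lo AND $col <= $hi"
    lo=$(( hi + 1 ))
    i=$(( i + 1 ))
  done
}

# EMPNO ranges from 7369 to 7934 in the sample data; assume 4 map tasks.
print_splits EMPNO 7369 7934 4
```

Each slice becomes the WHERE condition of one map task's query. An import that should run this way would add --split-by EMPNO -m 4 to the command above; for EMP the --split-by part is optional because EMPNO is already the primary key.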
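Error encountered during the import:
The Sqoop import fails with: No columns to generate for ClassWriter
This message typically means Sqoop could not read the table's column metadata, so check the connection string, the credentials, and the JDBC driver version first.
Command to start the MapReduce JobHistory server:
mr-jobhistory-daemon.sh start historyserver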
5. View the data
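A minimal sketch of how the imported data could be inspected, assuming the import ran as the current OS user and without --target-dir. In that case Sqoop writes to a directory named after the table under the user's HDFS home directory, with one part-m-NNNNN file per map task. The variable TARGET_DIR is introduced here for illustration only.

```shell
# Assumption: default target directory, i.e. /user/<current user>/EMP.
TARGET_DIR="/user/$(whoami)/EMP"

if command -v hdfs >/dev/null 2>&1; then
  # List the files written by the import (part-m-00000 when -m 1 was used).
  hdfs dfs -ls "$TARGET_DIR"
  # Print the imported rows; the glob is quoted so that HDFS expands it,
  # not the local shell.
  hdfs dfs -cat "$TARGET_DIR/part-m-*"
else
  echo "hdfs command not found on PATH; run this on a Hadoop client node" >&2
fi
```

By default each row is stored as one line of comma-separated text.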
6. Errors and solutions
1. Sqoop: java.lang.RuntimeException: Could not load db driver class: com.mysql.jdbc.Driver
2. https://blog.****.net/zjh_746140129/article/details/84962235
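The "Could not load db driver class" error in item 1 usually appears when the MySQL JDBC driver (Connector/J) JAR is not on Sqoop's classpath; the common fix is to copy that JAR into Sqoop's lib directory. The sketch below only checks whether such a JAR is present. The helper has_mysql_driver, the fallback path /usr/lib/sqoop, and the file name pattern mysql-connector-*.jar are assumptions for illustration and should be adjusted to the actual installation.

```shell
# has_mysql_driver DIR
# Hypothetical helper: succeeds if DIR contains a file matching
# mysql-connector-*.jar (the usual Connector/J naming).
has_mysql_driver() {
  for jar in "$1"/mysql-connector-*.jar; do
    # If nothing matches, the pattern stays literal and -e is false.
    if [ -e "$jar" ]; then
      return 0
    fi
  done
  return 1
}

# Assumption: SQOOP_HOME points at the Sqoop installation;
# /usr/lib/sqoop is only a guessed fallback.
SQOOP_LIB="${SQOOP_HOME:-/usr/lib/sqoop}/lib"

if has_mysql_driver "$SQOOP_LIB"; then
  echo "MySQL JDBC driver found in $SQOOP_LIB"
else
  echo "No MySQL JDBC driver in $SQOOP_LIB; copy the Connector/J JAR there"
fi
```

After the JAR is copied into place, rerun the sqoop import command.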