Creating tables
Creating an internal (managed) table
create table customer(
customerId int,
firstName string,
lastName STRING,
birstDay timestamp
) row format delimited fields terminated by ',';
Creating an external table
CREATE EXTERNAL table salaries(
gender string,
age int ,
salary DOUBLE,
zip int
)row format delimited fields terminated by ',' LOCATION '/user/train/salaries/';
Loading data
load DATA LOCAL inpath '/root/user/customer.txt' overwrite into table customer;
load DATA LOCAL inpath '/root/user/salaries.txt' overwrite into table salaries;
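Both statements above use LOCAL, so Hive reads the file from the local file system of the machine running the client and copies it into the table's directory. OVERWRITE replaces whatever the table already holds. A minimal sketch of the other variants follows. The HDFS path /tmp/staging/customer.txt is hypothetical and not part of the original setup.

```sql
-- Append to the table instead of replacing its contents: omit OVERWRITE.
LOAD DATA LOCAL INPATH '/root/user/customer.txt' INTO TABLE customer;

-- Load from HDFS instead of the local file system: omit LOCAL.
-- In this form Hive moves (rather than copies) the file into the table's directory.
-- '/tmp/staging/customer.txt' is a hypothetical HDFS path used only for illustration.
LOAD DATA INPATH '/tmp/staging/customer.txt' INTO TABLE customer;
```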
Inspecting the source text files
[root@centos172 user]# cat /root/user/customer.txt
1,f,jack,,
2,f,luccy,,
[root@centos172 user]# cat /root/user/salaries.txt
male,21,10000,1
female,22,12000,2
Inspecting the data in Hive
hive> desc customer;
OK
customerid int
firstname string
lastname string
birstday timestamp
Time taken: 0.053 seconds, Fetched: 4 row(s)
hive> desc salaries;
OK
gender string
age int
salary double
zip int
Time taken: 0.041 seconds, Fetched: 4 row(s)
hive> select * from customer;
OK
1 f jack NULL
2 f luccy NULL
Time taken: 0.067 seconds, Fetched: 2 row(s)
hive> select * from salaries;
OK
male 21 10000.0 1
female 22 12000.0 2
Time taken: 0.066 seconds, Fetched: 2 row(s)
hive>
Note: birstday shows NULL for both customers because the fourth field in customer.txt is empty, and an empty value cannot be parsed as a timestamp. The extra trailing field in each line is ignored.
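Differences
I have only just started learning Hive, so this covers only some of the differences.
1. An internal table's data is stored under Hive's default directory in HDFS. An external table's data stays at the path given in its LOCATION clause.
2. If you drop an internal table and then create an identical one, the data is gone, because dropping an internal table also deletes its data files.
If you drop an external table and then create an identical one with the same LOCATION, the data is visible again without reloading, because dropping an external table removes only the metadata and leaves the files in place.
Dropping and recreating the external table looks like this: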
hive> drop table salaries;
OK
Time taken: 0.092 seconds
hive> select * from salaries;
FAILED: SemanticException [Error 10001]: Line 1:14 Table not found 'salaries'
hive> show tables;
OK
customer
Time taken: 0.035 seconds, Fetched: 1 row(s)
hive> CREATE EXTERNAL table salaries(
> gender string,
> age int ,
> salary DOUBLE,
> zip int
> )row format delimited fields terminated by ',' LOCATION '/user/train/salaries/';
OK
Time taken: 0.058 seconds
hive> show tables;
OK
customer
salaries
Time taken: 0.025 seconds, Fetched: 2 row(s)
hive> select * from salaries;
OK
male 21 10000.0 1
female 22 12000.0 2
Time taken: 0.058 seconds, Fetched: 2 row(s)
hive>
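For comparison, the same drop-and-recreate sequence for the internal table could look like the sketch below. It was not run as part of this post, so no output is shown. It assumes the customer table defined earlier. If HDFS trash is enabled, the dropped files may be moved to the trash rather than removed immediately.

```sql
-- Dropping a managed (internal) table removes the metadata and the data files.
DROP TABLE customer;

-- Recreate the table with the same definition as before.
CREATE TABLE customer(
  customerId int,
  firstName string,
  lastName string,
  birstDay timestamp
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- The table exists again, but it is expected to be empty at this point.
SELECT * FROM customer;

-- The data has to be loaded again explicitly.
LOAD DATA LOCAL INPATH '/root/user/customer.txt' OVERWRITE INTO TABLE customer;
```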
Regarding difference 1:
[Figure: default path of the internal table]
[Figure: path specified for the external table]
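The two paths can also be checked from the Hive CLI. This is a minimal sketch, assuming the customer and salaries tables created earlier. The default warehouse directory is commonly /user/hive/warehouse, but an installation may override it.

```sql
-- Print the configured warehouse directory used for managed (internal) tables.
SET hive.metastore.warehouse.dir;

-- In the output, "Table Type" distinguishes MANAGED_TABLE from EXTERNAL_TABLE,
-- and "Location" shows where the table's data files live.
DESCRIBE FORMATTED customer;
DESCRIBE FORMATTED salaries;

-- List the external table's files without leaving the Hive CLI.
dfs -ls /user/train/salaries/;
```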
What Hive is
From what I have worked with so far:
1. It maps files in HDFS to tables
2. It is used for querying, with statements similar to SQL
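As an illustration of the second point, queries over the salaries table read like ordinary SQL. This is a sketch only. The column alias avg_salary and the filter value are chosen for illustration.

```sql
-- Average salary per gender, computed over the files under the table's LOCATION.
SELECT gender, AVG(salary) AS avg_salary
FROM salaries
GROUP BY gender;

-- Filtering and sorting work as they do in SQL.
SELECT gender, age, salary
FROM salaries
WHERE age > 21
ORDER BY salary DESC;
```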