Hive提供Hive Query language(HQL)用来进行数据操作
Hive查询提供下面的特性
- 数据模型,例如数据库和表的创建
- ETL方法,例如提取,转变,加载数据到表里
- join来合并不同的数据表
- 用户自定义的脚本
- 基于Hadoop快速查询工具
首先创建表
hive> > create table employee(id int, name string, price int, area string) > row format delimited > fields terminated by '|' > location '/yandufeng/employee'; hive> local data local inpath '/home/hive/test.txt' into table employee; hive> > > select * from employee; OK 16 john 4000 USA 17 robert 2000 USA 18 andrew 4000 USA 19 katty 2000 USA 27 edward 4000 UK 29 alan 3000 UK 31 kerry 4000 UK 34 tom 3000 UK 35 zack 2000 UK Time taken: 0.106 seconds, Fetched: 9 row(s)
排序查询
hive> select * from employee order by id desc; Query ID = hive_20170223061813_9d7b31aa-7375-4773-b324-0214e6f55778 Total jobs = 1 Launching Job 1 out of 1 Status: Running (Executing on YARN cluster with App id application_1479783640101_0035) -------------------------------------------------------------------------------- VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED -------------------------------------------------------------------------------- Map 1 .......... SUCCEEDED 1 1 0 0 0 0 Reducer 2 ...... SUCCEEDED 1 1 0 0 0 0 -------------------------------------------------------------------------------- VERTICES: 02/02 [==========================>>] 100% ELAPSED TIME: 1.02 s -------------------------------------------------------------------------------- OK 35 zack 2000 UK 34 tom 3000 UK 31 kerry 4000 UK 29 alan 3000 UK 27 edward 4000 UK 19 katty 2000 USA 18 andrew 4000 USA 17 robert 2000 USA 16 john 4000 USA Time taken: 1.541 seconds, Fetched: 9 row(s)
分组查询
hive> > select count(*), area from employee group by area; Query ID = hive_20170223062041_5e496181-62d3-40e5-9202-4183b2facab7 Total jobs = 1 Launching Job 1 out of 1 Status: Running (Executing on YARN cluster with App id application_1479783640101_0035) -------------------------------------------------------------------------------- VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED -------------------------------------------------------------------------------- Map 1 .......... SUCCEEDED 1 1 0 0 0 0 Reducer 2 ...... SUCCEEDED 1 1 0 0 0 0 -------------------------------------------------------------------------------- VERTICES: 02/02 [==========================>>] 100% ELAPSED TIME: 4.04 s -------------------------------------------------------------------------------- OK 5 UK 4 USA Time taken: 5.331 seconds, Fetched: 2 row(s)
Sort by:
hive> select * from employee sort by id desc; Query ID = hive_20170223062352_c02cbd86-9e49-4473-a951-5bdad33ca4b9 Total jobs = 1 Launching Job 1 out of 1 Status: Running (Executing on YARN cluster with App id application_1479783640101_0035) -------------------------------------------------------------------------------- VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED -------------------------------------------------------------------------------- Map 1 .......... SUCCEEDED 1 1 0 0 0 0 Reducer 2 ...... SUCCEEDED 1 1 0 0 0 0 -------------------------------------------------------------------------------- VERTICES: 02/02 [==========================>>] 100% ELAPSED TIME: 4.87 s -------------------------------------------------------------------------------- OK 35 zack 2000 UK 34 tom 3000 UK 31 kerry 4000 UK 29 alan 3000 UK 27 edward 4000 UK 19 katty 2000 USA 18 andrew 4000 USA 17 robert 2000 USA 16 john 4000 USA Time taken: 5.445 seconds, Fetched: 9 row(s)
Cluster By:
hive> > select * from employee cluster by id; Query ID = hive_20170223062557_c6c1b261-645d-4785-b017-0f4ceda9bb0b Total jobs = 1 Launching Job 1 out of 1 Status: Running (Executing on YARN cluster with App id application_1479783640101_0035) -------------------------------------------------------------------------------- VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED -------------------------------------------------------------------------------- Map 1 .......... SUCCEEDED 1 1 0 0 0 0 Reducer 2 ...... SUCCEEDED 1 1 0 0 0 0 -------------------------------------------------------------------------------- VERTICES: 02/02 [==========================>>] 100% ELAPSED TIME: 3.85 s -------------------------------------------------------------------------------- OK 16 john 4000 USA 17 robert 2000 USA 18 andrew 4000 USA 19 katty 2000 USA 27 edward 4000 UK 29 alan 3000 UK 31 kerry 4000 UK 34 tom 3000 UK 35 zack 2000 UK Time taken: 4.496 seconds, Fetched: 9 row(s)
Distribute By:
hive> select * from employee distribute by id; Query ID = hive_20170223062709_3c81e53e-fbaf-4360-85d5-8573a2c17a7d Total jobs = 1 Launching Job 1 out of 1 Status: Running (Executing on YARN cluster with App id application_1479783640101_0035) -------------------------------------------------------------------------------- VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED -------------------------------------------------------------------------------- Map 1 .......... SUCCEEDED 1 1 0 0 0 0 Reducer 2 ...... SUCCEEDED 1 1 0 0 0 0 -------------------------------------------------------------------------------- VERTICES: 02/02 [==========================>>] 100% ELAPSED TIME: 4.68 s -------------------------------------------------------------------------------- OK 16 john 4000 USA 18 andrew 4000 USA 34 tom 3000 UK 17 robert 2000 USA 19 katty 2000 USA 27 edward 4000 UK 29 alan 3000 UK 31 kerry 4000 UK 35 zack 2000 UK Time taken: 5.211 seconds, Fetched: 9 row(s)
Join查询
hive > select * from employee e join employee p on(e.id=p.id); Query ID = hive_20170223062925_ac84adc4-1daf-417c-8eee-f13ec1a1657e Total jobs = 1 Launching Job 1 out of 1 Status: Running (Executing on YARN cluster with App id application_1479783640101_0035) -------------------------------------------------------------------------------- VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED -------------------------------------------------------------------------------- Map 1 .......... SUCCEEDED 1 1 0 0 0 0 Map 2 .......... SUCCEEDED 1 1 0 0 0 0 -------------------------------------------------------------------------------- VERTICES: 02/02 [==========================>>] 100% ELAPSED TIME: 4.64 s -------------------------------------------------------------------------------- OK 16 john 4000 USA 16 john 4000 USA 17 robert 2000 USA 17 robert 2000 USA 18 andrew 4000 USA 18 andrew 4000 USA 19 katty 2000 USA 19 katty 2000 USA 27 edward 4000 UK 27 edward 4000 UK 29 alan 3000 UK 29 alan 3000 UK 31 kerry 4000 UK 31 kerry 4000 UK 34 tom 3000 UK 34 tom 3000 UK 35 zack 2000 UK 35 zack 2000 UK Time taken: 5.907 seconds, Fetched: 9 row(s)
子查询
hive> > select * from employee e where id not in (select id from employee p); Warning: Map Join MAPJOIN[28][bigTable=e] in task 'Map 1' is a cross product Query ID = hive_20170223063132_141d729e-eb03-40ec-a290-4885e79060ec Total jobs = 1 Launching Job 1 out of 1 Status: Running (Executing on YARN cluster with App id application_1479783640101_0035) -------------------------------------------------------------------------------- VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED -------------------------------------------------------------------------------- Map 1 .......... SUCCEEDED 1 1 0 0 0 0 Map 2 .......... SUCCEEDED 1 1 0 0 0 0 Map 3 .......... SUCCEEDED 1 1 0 0 0 0 Reducer 4 ...... SUCCEEDED 1 1 0 0 0 0 -------------------------------------------------------------------------------- VERTICES: 04/04 [==========================>>] 100% ELAPSED TIME: 7.29 s -------------------------------------------------------------------------------- OK Time taken: 8.419 seconds