• Hive 外部表的练习(多表关联查询,以及分组,子查询)


                                        Hive 外部表的练习

    hive创建库和表操作

       hive加载数据,4种发放

             1.数据放在Linux创建表结构 进行加载

            2.先定义表结构、将一个查询select语句结果插入到表中

            3.数据先放在hdfs 创建表结构 进行加载(hive 只能加载目标文件的上级目录)

            4.外部数据 external 内部表和外部表 使用上没有任何区别,删除时则有差别


    数据:


    创建表,以及添加数据:

    create external table if not exists my_course(
    courseid string,
    couresename string
    )
    row format delimited fields terminated by ','
    stored as textfile
    location '/hive/my_course';
    
    create external table if not exists my_source(
    userid string
    courseid string,
    score string
    )
    
    row format delimited fields terminated by ','
    stored as textfile 
    location '/hive/my_sourcet'
    
    
    create external table if not exists my_student(
    userid string
    name string
    sex string
    age string,
    xi string
    )
    
    row format delimited fields terminated by ','
    stored as textfile 
    location '/hive/my_student'


    1.问题:查询全体学生的学号与姓名

    select 
     userid,name
    from my_student;



    2.问题:查询选修了课程的学生姓名和课程名称

    select student.name,t.couresename
    from(
    select course.couresename couresename,score.userid userid
    from my_score score,my_course course
    where score.courseid=course.courseid) t,my_student student
    where t.userid=student.userid;


    3.问题:查询每个选修的课程共选了多少人

    select t.couresename,count(*) num
    from(
    select course.couresename couresename,score.userid userid
    from my_score score,my_course course
    where score.courseid=course.courseid) t,my_student student
    where t.userid=student.userid
    group by t.couresename
    order by num desc
    limit 3;


    4.问题:查询学生的总人数

    select count(distinct(userid))
    from my_student;
    


    5.问题:计算数据库课程的学生平均成绩

    select course.couresename,avg(score.score)
    from my_score score,my_course course
    where course.couresename='数据库' and score.courseid=course.courseid
    group by course.couresename;



    6.问题:查询选修数学课程的学生最高分数

    select max(score.score)
    from my_score score,my_course course
    where course.couresename='数学' and score.courseid=course.courseid;


    7.问题:查询选修了3门以上的课程的学生姓名

    select student.name,t.num
    from(
    select userid,count(*) num 
    from my_score 
    group by userid ) t,my_student student
    where t.userid =student.userid and t.num>=3;



    8.问题:按照年龄排序并直接输出到不同的文件中

    create table if not exists result2(
    userid string,
    name string,
    sex string,
    age string,
    xi string
    )
    row format delimited fields terminated by ','
    stored as textfile;
    
    insert into result2
    select *
    from my_student
    order by age desc;



    9.问题:查询学生的得分情况。

    select student.name,score.score
    from my_score score,my_student student 
    where  score.userid=student.userid;



    10.问题:查询选修信息系统课程且成绩在90分以上的所有学生。

    select distinct(t2.name)
    from(
    select score.userid userid
    from my_score score,my_course course
    where score.score>=90 and score.courseid=course.courseid) t1,my_student t2
    where t1.userid=t2.userid;



    11.问题:查询与“刘晨”在同一个系学习的其他学生

    select t3.name
    from(
    select t2.name name
    from my_student t1,my_student t2
    where t1.name='刘晨' and t1.xi=t2.xi) t3
    where t3.name<>'刘晨';
    


  • 相关阅读:
    一系列视频教程 收藏
    生成一个4位整数
    spring 实现定时任务
    判断字符串是否包含汉字
    pmd代码安全扫描工具
    IntelliJ IDEA
    李小龙传奇
    checkmarx使用笔记、原理
    pmd 使用笔记
    Mysql的安装(视频+部分视频截图)
  • 原文地址:https://www.cnblogs.com/meiLinYa/p/9302993.html
Copyright © 2020-2023  润新知