• 创建视图


    1.创建视图,视图是个虚表
    create view v1 as select a.id aid,a.name,b.id bid,b.orderno from customers a left outer join default.ttt b on a.id=b.cid;//创建视图字段不要重复,如果有重复需要b.id bid进行重复命名
    2.查看视图
    show tables;
    3.子查询,将查询出来的结果再进行一次过滤
    select aid from v1 ;
    4.笛卡尔积:两个表之间不加任何成分做连接,

    map端连接:
    $hive>set hive.auto.convert.join=true; //设置自动转换连接,默认机试true
    //使用mapjoin连接按时实现map连接
    $hive> select /+mapjoin(customers)/ a.,b. from customers aleft outer join orders b on a.id = b.cid;
    调优

    1.使用explain查看查询计划
    OK
    

    STAGE DEPENDENCIES:
    Stage-1 is a root stage
    Stage-0 depends on stages: Stage-1

    STAGE PLANS:
    Stage: Stage-1
    Map Reduce
    Map Operator Tree:
    TableScan
    alias: customers
    Statistics: Num rows: 1 Data size: 38 Basic stats: COMPLETE Column stats: COMPLETE
    Select Operator
    Statistics: Num rows: 1 Data size: 38 Basic stats: COMPLETE Column stats: COMPLETE
    Group By Operator
    aggregations: count()
    mode: hash
    outputColumnNames: _col0
    Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
    Reduce Output Operator
    sort order:
    Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
    value expressions: _col0 (type: bigint)
    Reduce Operator Tree:
    Group By Operator
    aggregations: count(VALUE._col0)
    mode: mergepartial
    outputColumnNames: _col0
    Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
    File Output Operator
    compressed: false
    Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
    table:
    input format: org.apache.hadoop.mapred.SequenceFileInputFormat
    output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
    serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

    Stage: Stage-0
    Fetch Operator
    limit: -1
    Processor Tree:
    ListSink

    2、设置limit优化策略,避免全部查询
    hive>set hive.limit.optimize.enable=true;
    
    设置本地模式运行
    $hive>set mapred.job.tracker=local;
    //设置为自动本地模式
    $hive>set hive.exec.mode.local.auto=true;
    //并行执行
    $hive>set hive.exec.parallel=true;    //设置并行查询,只是执行不存在依赖关系的阶段
    //严格模式:至少指定一个静态分区,不能都是动态的
    $hive>set hive.mapred.mode =strict    //1.严格模式一定要指定分区表,否则不让查询
                                                                //2.order by时必须使用limit子句
                                                                //不允许笛卡尔积查询
    //设置MR的数量
  • 相关阅读:
    WPF自定义RoutedEvent事件代码段
    微信聊天和通讯录按钮样式
    【转】c#、wpf 字符串,color,brush之间的转换
    模块式开发
    通过属性进行浅拷贝
    【转】string.Format对C#字符串格式化
    酷学习笔记——前言
    C# 连蒙带骗不知所以然的搞定USB下位机读写
    C# Ioc、DI、Unity、TDD的一点想法和实践
    虹软人脸识别速度测试征集
  • 原文地址:https://www.cnblogs.com/stone-learning/p/9284009.html
Copyright © 2020-2023  润新知