• Hive数据类型


    1.hive常用的数据类型包括:

    2.类型转换

      隐式转换规则:任何整数类型都可以隐式转换为一个范围更广的类型。所有整数类型 + float + String都可以转换为Double类型

      可以使用cast操作进行数据类型显示转换。例如cast('1' as int)把字符串'1'转换成整数值1,转换失败则表达式返回空值NULL。 

    --将Double类型的工资sal转换成Int类型
    select eno,enam,cast(sal as int) from emp limit 5; 

    3.复杂类型

    hive> create table if not exists employees(
        >  name STRING COMMENT 'employee name' ,
        >  salary FLOAT COMMENT 'employee salary',
        >  subordinates ARRAY<STRING> COMMENT 'names of subordinates',
        >  deductions MAP<STRING,FLOAT> COMMENT 'keys are deductions names,values are precentages',
        >  address STRUCT<street:STRING,city:STRING,state:STRING,zip:INT> COMMENT 'home address'
        > )row format delimited
        > fields terminated by '|'
        > collection items terminated by ','
        > map keys terminated by ':';
    OK
    Time taken: 0.174 seconds
    hive> load data local inpath '/root/hive/employees' --导入数据
        > overwrite into table employees;
    Copying data from file:/root/hive/employees
    Copying file: file:/root/hive/employees
    Loading data to table test.employees
    rmr: DEPRECATED: Please use 'rm -r' instead.
    Deleted hdfs://hadoop01:9000/user/hive/warehouse/test.db/employees
    Table test.employees stats: [numFiles=1, numRows=0, totalSize=177, rawDataSize=0]
    OK
    Time taken: 0.465 seconds
    hive> dfs -text hdfs://hadoop01:9000/user/hive/warehouse/test.db/employees/employees; --查看数据文件
    yxy|1000000.0|zhangsan,lisi|federal taxes:.2,state taxes:.5|changanjie,bj,bj,00000
    yangzhen|1000000.0|zhangsan,lisi|federal taxes:.2,state taxes:.5|changanjie,bj,chaoyang,00001
    hive> select name,salary,
        > subordinates[0], --Array类型通过下标取值,字段类型必须相同
        > deductions['state taxes'], --Map类型通过key取值
        > address.state  --Struct类型通过定义时的名称取值,字段类型可以不同
        > from employees;
    OK
    name    salary  _c2     _c3     state
    yxy     1000000.0       zhangsan        0.5     bj
    yangzhen        1000000.0       zhangsan        0.5     chaoyang
    Time taken: 4.607 seconds, Fetched: 2 row(s) 
    --array_contains()函数作用:验证第一个参数集合中是否包含第二个参数
    --由于两条数据的subordinates集合中都包含'zhangsan',所以返回全部
    select * from employees where array_contains(subordinates, 'zhangsan'); 

    4.内置函数

    • 与SQL相同的:等值判断 x='a',空值判断 x is NULL,模式匹配 x like 'A%',算术操作 x+1,逻辑操作 x or y。
    • 其中和SQL-92有点区别的是:||是逻辑或(OR),而不是字符串"连接"。在mysql和hive中字符串拼接是concat()函数。
    • 可以使用:show functions获取函数列表。describe function获得某个特定函数的使用帮助。
  • 相关阅读:
    博客园如何统计个人博客的访问量
    博客园博客如何设置不显示日历,公告,相册,收藏夹等
    [Functional Programming] From simple implementation to Currying to Partial Application
    [Nginx] Configuration for SPA
    [Unit Testing] Using Mockito Annotations
    [Functional Programming] Using Lens to update nested object
    [Functional Programming + React] Provide a reasonable default value for mapStateToProps in case initial state is undefined
    [Angular CLI] Build application without remove dist folder for Docker Volume
    [Spring Boot] Introduce to Mockito
    [Tool] Enable Prettier in VSCode as Format on Save and add config files to gitingore
  • 原文地址:https://www.cnblogs.com/skyl/p/4736164.html
Copyright © 2020-2023  润新知