• Using the struct data type in Hive


    hive (student_test)> use student;
    OK
    Time taken: 0.015 seconds
    hive (student)> create table if not exists student_test(
                  > id int comment 'the number of a student',
                  > basic_info struct<name:string,age:int> comment 'the basic information of a student')
                  > row format delimited fields terminated by ','
                  > collection items terminated by ':';
    OK
    Time taken: 0.664 seconds
    hive (student)> show tables;
    OK
    student_test
    Time taken: 0.051 seconds
    hive (student)> describe student_test;
    OK
    id	int	the number of a student
    basic_info	struct<name:string,age:int>	the basic information of a student
    Time taken: 0.129 seconds
    
    hive (student)> describe extended student_test;
    OK
    id	int	the number of a student
    basic_info	struct<name:string,age:int>	the basic information of a student
    	 	 
    Detailed Table Information	Table(tableName:student_test, dbName:student, owner:landen, createTime:1364541810, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id, type:int, comment:the number of a student), FieldSchema(name:basic_info, type:struct<name:string,age:int>, comment:the basic information of a student)], location:hdfs://localhost:9000/home/landen/UntarFile/hive-0.10.0/user/hive/warehouse/student.db/student_test, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{colelction.delim=:, serialization.format=,, field.delim=,}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{transient_lastDdlTime=1364541810}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE)	
    Time taken: 0.092 seconds
    
    hive (student)> load data local inpath '/home/landen/文档/student.txt'
                  > overwrite into table student_test;
    Copying data from file:/home/landen/文档/student.txt
    Copying file: file:/home/landen/文档/student.txt
    Loading data to table student.student_test
    Deleted hdfs://localhost:9000/home/landen/UntarFile/hive-0.10.0/user/hive/warehouse/student.db/student_test
    Table student.student_test stats: [num_partitions: 0, num_files: 1, num_rows: 0, total_size: 67, raw_data_size: 0]
    OK
    Time taken: 0.581 seconds
    hive (student)> select * from student_test;
    OK
    1	{"name":"KaiLee","age":24}
    2	{"name":"DuoPing","age":24}
    3	{"name":"JiangTao","age":25}
    4	{"name":"LiuRiJi","age":23}
    5	{"name":"GuangYuan","age":25}
    Time taken: 0.097 seconds
    
    hive (student)> select id,basic_info.name from student_test;
    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks is set to 0 since there's no reduce operator
    Starting Job = job_201303271617_0010, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201303271617_0010
    Kill Command = /home/landen/UntarFile/hadoop-1.0.4/libexec/../bin/hadoop job  -kill job_201303271617_0010
    Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
    2013-03-29 15:40:17,082 Stage-1 map = 0%,  reduce = 0%
    2013-03-29 15:40:23,112 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.73 sec
    2013-03-29 15:40:24,117 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.73 sec
    2013-03-29 15:40:25,123 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.73 sec
    2013-03-29 15:40:26,127 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.73 sec
    2013-03-29 15:40:27,132 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.73 sec
    2013-03-29 15:40:28,138 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.73 sec
    2013-03-29 15:40:29,142 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 0.73 sec
    MapReduce Total cumulative CPU time: 730 msec
    Ended Job = job_201303271617_0010
    MapReduce Jobs Launched: 
    Job 0: Map: 1   Cumulative CPU: 0.73 sec   HDFS Read: 332 HDFS Write: 52 SUCCESS
    Total MapReduce CPU Time Spent: 730 msec
    OK
    1	KaiLee
    2	DuoPing
    3	JiangTao
    4	LiuRiJi
    5	GuangYuan
    Time taken: 22.811 seconds
    hive (student)> 

    The contents of student.txt are as follows:
    1,KaiLee:24
    2,DuoPing:24
    3,JiangTao:25
    4,LiuRiJi:23
    5,GuangYuan:25
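
To make the link between the file and the table definition explicit: each line is first split on the field delimiter ',' into the columns id and basic_info, and basic_info is then split on the collection-item delimiter ':' into the struct members name and age. The Python sketch below only imitates that splitting outside of Hive. It is an illustration, not Hive's actual SerDe code, and the names parse_line, FIELD_DELIM, COLLECTION_DELIM and STRUCT_FIELDS are made up for this example.

```python
import json

FIELD_DELIM = ","        # from: fields terminated by ','
COLLECTION_DELIM = ":"   # from: collection items terminated by ':'

# Struct layout taken from the table definition: struct<name:string,age:int>
STRUCT_FIELDS = [("name", str), ("age", int)]


def parse_line(line):
    """Split one text row into (id, basic_info) using the table's two delimiters."""
    # First level: separate the id column from the struct column.
    id_text, struct_text = line.rstrip("\n").split(FIELD_DELIM, 1)
    # Second level: separate the struct members, in declaration order.
    parts = struct_text.split(COLLECTION_DELIM)
    basic_info = {
        name: cast(value)
        for (name, cast), value in zip(STRUCT_FIELDS, parts)
    }
    return int(id_text), basic_info


if __name__ == "__main__":
    for text in ["1,KaiLee:24", "3,JiangTao:25"]:
        row_id, info = parse_line(text)
        # Render the struct in the same JSON-like form that select * shows.
        print(row_id, json.dumps(info, separators=(",", ":")))
```

Looking up info["name"] on a parsed row corresponds to basic_info.name in the query above.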
  • Original article: https://www.cnblogs.com/likai198981/p/2989020.html