• Exporting table data from Hive


    Method 1:

    hive (stuchoosecourse) > insert overwrite local directory '/home/landen/文档/exportDir'
                                       > select * from hiddenipinfo;
    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks is set to 0 since there's no reduce operator
    Starting Job = job_201312042044_0026, Tracking URL = http://Master:50030/jobdetails.jsp?jobid=job_201312042044_0026
    Kill Command = /home/landen/UntarFile/hadoop-1.0.4/libexec/../bin/hadoop job  -kill job_201312042044_0026
    Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
    2013-12-09 19:33:35,962 Stage-1 map = 0%,  reduce = 0%
    2013-12-09 19:33:41,937 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.4 sec
    2013-12-09 19:33:43,008 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.4 sec
    2013-12-09 19:33:44,093 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.4 sec
    2013-12-09 19:33:45,146 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.4 sec
    2013-12-09 19:33:46,233 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.4 sec
    2013-12-09 19:33:47,271 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.4 sec
    MapReduce Total cumulative CPU time: 400 msec
    Ended Job = job_201312042044_0026
    Copying data to local directory /home/landen/文档/exportDir
    Copying data to local directory /home/landen/文档/exportDir
    3 Rows loaded to /home/landen/文档/exportDir
    MapReduce Jobs Launched:
    Job 0: Map: 1   Cumulative CPU: 0.4 sec   HDFS Read: 490 HDFS Write: 233 SUCCESS
    Total MapReduce CPU Time Spent: 400 msec
    OK
    ip    countrycode    countryname    region    regionname    city    latitude    longitude    timezone
    Time taken: 80.784 seconds

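    However, Hive uses the ^A character (\x01) as the field delimiter by default. The character is non-printing, so the fields in the exported file appear to run together: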

    221.12.10.218CNChina02ZhejiangHangzhou30.293594120.16141Asia/Shanghai
    60.180.248.201CNChina02ZhejiangWenzhou27.999405120.66681Asia/Shanghai
    125.111.251.118CNChina02ZhejiangNingbo29.878204121.5495Asia/Shanghai
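
    You can therefore use sed to replace it with whatever field delimiter you need (a tab here). The trailing g flag tells sed to replace every match on each line:

    landen@Master:~/文档/exportDir$ sed -e 's/\x01/\t/g' 000000_0

    This only prints the transformed content; the file 000000_0 itself is unchanged, so redirect the output to a new file:

    sed -e 's/\x01/\t/g' 000000_0 > ipInfo.txt

    With GNU sed, a flag of the form Ng skips the first N-1 matches and replaces the Nth match and every one after it.
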
    landen@Master:~/文档/exportDir$ cat ipInfo.txt
    221.12.10.218    CN    China    02    Zhejiang    Hangzhou    30.293594    120.16141    Asia/Shanghai
    60.180.248.201    CN    China    02    Zhejiang    Wenzhou    27.999405    120.66681    Asia/Shanghai
    125.111.251.118    CN    China    02    Zhejiang    Ningbo    29.878204    121.5495    Asia/Shanghai
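
    The delimiter replacement can be tried without a Hive installation. The sketch below is an illustration only: it builds a sample file with printf (a shortened version of the first row), and the file names under the temporary directory are made up for the example. It assumes GNU sed, since \x01, \t and the Ng flag are GNU extensions; tr is shown as a portable alternative.

```shell
# Build a small sample file whose fields are separated by ^A (octal 001),
# mimicking Hive's default export format. The row is a shortened sample.
workdir=$(mktemp -d)
printf '221.12.10.218\001CN\001China\001Asia/Shanghai\n' > "$workdir/000000_0"

# Replace every ^A with a tab (GNU sed escapes).
sed -e 's/\x01/\t/g' "$workdir/000000_0" > "$workdir/ipInfo.txt"

# tr performs the same single-character translation and is portable.
tr '\001' '\t' < "$workdir/000000_0" > "$workdir/ipInfo_tr.txt"

# With GNU sed, 2g leaves the first match untouched and replaces
# the second match and every later one.
sed -e 's/\x01/\t/2g' "$workdir/000000_0" > "$workdir/ipInfo_2g.txt"

# Show the fully converted line.
cat "$workdir/ipInfo.txt"
```

    In the 2g output the first ^A is still present between the IP address and the country code, while the remaining delimiters have become tabs.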

    Method 2:

    landen@Master:~/UntarFile/hive-0.10.0$ bin/hive --database 'stuchoosecourse' -e 'select * from hiddenipinfo' >> /home/landen/文档/exportDir/ip.tsv
    WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
    Logging initialized using configuration in jar:file:/home/landen/UntarFile/hive-0.10.0/lib/hive-common-0.10.0.jar!/hive-log4j.properties
    Hive history file=/home/landen/UntarFile/hive-0.10.0/logs/hive_job_log_landen_201312091934_46210224.txt
    OK
    Time taken: 17.15 seconds
    OK
    Time taken: 6.904 seconds

    The contents of ip.tsv are as follows (including the table's column names):

    ip    countrycode    countryname    region    regionname    city    latitude    longitude    timezone
    221.12.10.218    CN    China    02    Zhejiang    Hangzhou    30.293594    120.16141    Asia/Shanghai
    60.180.248.201    CN    China    02    Zhejiang    Wenzhou    27.999405    120.66681    Asia/Shanghai
    125.111.251.118    CN    China    02    Zhejiang    Ningbo    29.878204    121.5495    Asia/Shanghai
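
    If the header row is not wanted downstream, it can be dropped after the export. The header is presumably printed because hive.cli.print.header is enabled in this setup; that is an assumption, since the configuration is not shown here. The sketch below works on a small stand-in file with shortened columns, and the temporary file names are made up for the example.

```shell
# Recreate a two-line TSV with a header row, shaped like ip.tsv (columns shortened).
tsv_dir=$(mktemp -d)
printf 'ip\tcountrycode\tcity\n221.12.10.218\tCN\tHangzhou\n' > "$tsv_dir/ip.tsv"

# tail -n +2 starts output at line 2, which drops the header row.
tail -n +2 "$tsv_dir/ip.tsv" > "$tsv_dir/ip_noheader.tsv"

# cut splits on tabs by default, so -f1,3 selects the ip and city columns.
cut -f1,3 "$tsv_dir/ip_noheader.tsv"
```

    Because the export is tab-separated, standard tools such as cut and awk can work on it directly without any further conversion.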

    Method 3:

    landen@Master:~/UntarFile/hive-0.10.0$ bin/hive --database 'stuchoosecourse' -f '/home/landen/文档/testSql.q' >> ~/ip.tsv

    WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
    Logging initialized using configuration in jar:file:/home/landen/UntarFile/hive-0.10.0/lib/hive-common-0.10.0.jar!/hive-log4j.properties
    Hive history file=/home/landen/UntarFile/hive-0.10.0/logs/hive_job_log_landen_201312091450_505292945.txt
    OK
    Time taken: 4.939 seconds
    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks is set to 0 since there's no reduce operator
    Starting Job = job_201312042044_0024, Tracking URL = http://Master:50030/jobdetails.jsp?jobid=job_201312042044_0024
    Kill Command = /home/landen/UntarFile/hadoop-1.0.4/libexec/../bin/hadoop job  -kill job_201312042044_0024
    Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
    2013-12-09 14:51:19,055 Stage-1 map = 0%,  reduce = 0%
    2013-12-09 14:51:25,127 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.21 sec
    2013-12-09 14:51:26,133 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.21 sec
    2013-12-09 14:51:27,156 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.21 sec
    2013-12-09 14:51:28,160 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.21 sec
    2013-12-09 14:51:29,164 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.21 sec
    2013-12-09 14:51:30,168 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.21 sec
    2013-12-09 14:51:31,172 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 1.21 sec
    MapReduce Total cumulative CPU time: 1 seconds 210 msec
    Ended Job = job_201312042044_0024
    MapReduce Jobs Launched:
    Job 0: Map: 1   Cumulative CPU: 1.21 sec   HDFS Read: 306 HDFS Write: 188 SUCCESS
    Total MapReduce CPU Time Spent: 1 seconds 210 msec
    OK
    _c0
    CN    China    02    Zhejiang    Hangzhou    30.293594    120.16141    Asia/Shanghai
    CN    China    02    Zhejiang    Wenzhou    27.999405    120.66681    Asia/Shanghai
    CN    China    02    Zhejiang    Ningbo    29.878204    121.5495    Asia/Shanghai
    Time taken: 47.517 seconds
    OK
    ip    countrycode    countryname    region    regionname    city    latitude    longitude    timezone
    221.12.10.218    CN    China    02    Zhejiang    Hangzhou    30.293594    120.16141    Asia/Shanghai
    60.180.248.201    CN    China    02    Zhejiang    Wenzhou    27.999405    120.66681    Asia/Shanghai
    125.111.251.118    CN    China    02    Zhejiang    Ningbo    29.878204    121.5495    Asia/Shanghai
    Time taken: 0.441 seconds
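
    When redirecting the output of hive -f, the redirection has to sit outside the quoted script path. Inside the quotes, the shell passes the whole string to the program as a single file name and performs no redirection. The sketch below illustrates this with show_args, a hypothetical stand-in for bin/hive that simply prints the arguments it receives; the script and output file names are placeholders.

```shell
# show_args is a hypothetical stand-in for bin/hive: it prints each argument it receives.
show_args() { printf '[%s]\n' "$@"; }

out_dir=$(mktemp -d)

# Redirection inside the quotes: the whole string arrives as one argument,
# and the shell writes no file.
show_args -f 'testSql.q >> ip.tsv'

# Redirection outside the quotes: the program sees only the script name,
# and the shell appends the program's output to the file.
show_args -f 'testSql.q' >> "$out_dir/ip.tsv"
cat "$out_dir/ip.tsv"
```

    Note that >> appends, so running the export repeatedly accumulates rows in the file; > would overwrite it on each run.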

  • Original source: https://www.cnblogs.com/likai198981/p/3466066.html