• Hive cluster first look: comparing queries


    Install MySQL on slave3 to serve as the Hive metastore:

    [root@slave3 hadoop]# yum search mysql

    [root@slave3 hadoop]# yum install mysql-server.x86_64

    MySQL installed successfully:

    Installed:

    mysql-server.x86_64 0:5.1.73-5.el6_6

    Dependency Installed:

    mysql.x86_64 0:5.1.73-5.el6_6 perl.x86_64 4:5.10.1-136.el6_6.1 perl-DBD-MySQL.x86_64 0:4.013-3.el6

    perl-DBI.x86_64 0:1.609-4.el6 perl-Module-Pluggable.x86_64 1:3.90-136.el6_6.1 perl-Pod-Escapes.x86_64 1:1.04-136.el6_6.1

    perl-Pod-Simple.x86_64 1:3.13-136.el6_6.1 perl-libs.x86_64 4:5.10.1-136.el6_6.1 perl-version.x86_64 3:0.77-136.el6_6.1

    Dependency Updated:

    mysql-libs.x86_64 0:5.1.73-5.el6_6

    Complete!

    [root@slave3 hadoop]# service mysqld start

    MySQL is now running.

    Change the MySQL login password:

    mysql> set password = password('root');

    Query OK, 0 rows affected (0.00 sec)

    Node hardware info

    Memory:

    [hadoop@slave4 ~]$ free -m
    total used free shared buffers cached
    Mem: 1866 1798 68 0 7 1500
    -/+ buffers/cache: 290 1575
    Swap: 3999 0 3999

    Units are MB.

    Detailed memory info:

    [hadoop@slave4 ~]$ cat /proc/meminfo
    MemTotal: 1911400 kB
    MemFree: 66904 kB
    Buffers: 7308 kB
    Cached: 1539760 kB
    SwapCached: 0 kB
    Active: 135492 kB
    Inactive: 1603924 kB
    Active(anon): 93060 kB
    Inactive(anon): 99604 kB
    Active(file): 42432 kB
    Inactive(file): 1504320 kB
    Unevictable: 0 kB
    Mlocked: 0 kB
    SwapTotal: 4095992 kB
    SwapFree: 4095992 kB
    Dirty: 48 kB
    Writeback: 0 kB
    AnonPages: 192444 kB
    Mapped: 15264 kB
    Shmem: 220 kB
    Slab: 73204 kB
    SReclaimable: 17448 kB
    SUnreclaim: 55756 kB
    KernelStack: 1264 kB
    PageTables: 2504 kB
    NFS_Unstable: 0 kB
    Bounce: 0 kB
    WritebackTmp: 0 kB
    CommitLimit: 5051692 kB
    Committed_AS: 292260 kB
    VmallocTotal: 34359738367 kB
    VmallocUsed: 540012 kB
    VmallocChunk: 34359184520 kB
    HardwareCorrupted: 0 kB
    AnonHugePages: 159744 kB
    HugePages_Total: 0
    HugePages_Free: 0
    HugePages_Rsvd: 0
    HugePages_Surp: 0
    Hugepagesize: 2048 kB
    DirectMap4k: 9792 kB
    DirectMap2M: 2076672 kB

    Disk

    View the disks and partition layout:

    [hadoop@slave4 ~]$ lsblk
    NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
    sda 8:0 0 232.9G 0 disk
    ├─sda1 8:1 0 500M 0 part /boot
    └─sda2 8:2 0 232.4G 0 part
      ├─vg_slave4-lv_root (dm-0) 253:0 0 50G 0 lvm /
      ├─vg_slave4-lv_swap (dm-1) 253:1 0 3.9G 0 lvm [SWAP]
      └─vg_slave4-lv_home (dm-2) 253:2 0 178.5G 0 lvm /home

    [root@slave4 hadoop]# fdisk -l

    Disk /dev/sda: 250.1 GB, 250058268160 bytes
    255 heads, 63 sectors/track, 30401 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x2c63be37

    Device Boot Start End Blocks Id System
    /dev/sda1 * 1 64 512000 83 Linux
    Partition 1 does not end on cylinder boundary.
    /dev/sda2 64 30402 243684352 8e Linux LVM

    Disk /dev/mapper/vg_slave4-lv_root: 53.7 GB, 53687091200 bytes
    255 heads, 63 sectors/track, 6527 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x00000000

    Disk /dev/mapper/vg_slave4-lv_swap: 4194 MB, 4194304000 bytes
    255 heads, 63 sectors/track, 509 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x00000000

    Disk /dev/mapper/vg_slave4-lv_home: 191.7 GB, 191650332672 bytes
    255 heads, 63 sectors/track, 23300 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x00000000

    HDFS datanode report:

    [hadoop@slave4 ~]$ hadoop dfsadmin -report
    Configured Capacity: 985770651648 (918.07 GB)
    Present Capacity: 934353133568 (870.18 GB)
    DFS Remaining: 590443302912 (549.89 GB)
    DFS Used: 343909830656 (320.29 GB)
    DFS Used%: 36.81%
    Under replicated blocks: 0
    Blocks with corrupt replicas: 0
    Missing blocks: 0


    Datanodes available: 7 (7 total, 0 dead)

    Name: 192.168.2.6:50010
    Decommission Status : Normal
    Configured Capacity: 188643102720 (175.69 GB)
    DFS Used: 75069222912 (69.91 GB)
    Non DFS Used: 9779052544 (9.11 GB)
    DFS Remaining: 103794827264(96.67 GB)
    DFS Used%: 39.79%
    DFS Remaining%: 55.02%
    Last contact: Sun Jul 05 10:52:17 CST 2015

    Name: 192.168.2.13:50010
    Decommission Status : Normal
    Configured Capacity: 21277569024 (19.82 GB)
    DFS Used: 136781824 (130.45 MB)
    Non DFS Used: 1261211648 (1.17 GB)
    DFS Remaining: 19879575552(18.51 GB)
    DFS Used%: 0.64%
    DFS Remaining%: 93.43%
    Last contact: Sun Jul 05 10:52:17 CST 2015

    Name: 192.168.2.9:50010
    Decommission Status : Normal
    Configured Capacity: 188643102720 (175.69 GB)
    DFS Used: 58468474880 (54.45 GB)
    Non DFS Used: 9779011584 (9.11 GB)
    DFS Remaining: 120395616256(112.13 GB)
    DFS Used%: 30.99%
    DFS Remaining%: 63.82%
    Last contact: Sun Jul 05 10:52:18 CST 2015

    Name: 192.168.2.10:50010
    Decommission Status : Normal
    Configured Capacity: 188643102720 (175.69 GB)
    DFS Used: 74225582080 (69.13 GB)
    Non DFS Used: 9778978816 (9.11 GB)
    DFS Remaining: 104638541824(97.45 GB)
    DFS Used%: 39.35%
    DFS Remaining%: 55.47%
    Last contact: Sun Jul 05 10:52:17 CST 2015

    Name: 192.168.2.11:50010
    Decommission Status : Normal
    Configured Capacity: 188643102720 (175.69 GB)
    DFS Used: 63778144256 (59.4 GB)
    Non DFS Used: 9779015680 (9.11 GB)
    DFS Remaining: 115085942784(107.18 GB)
    DFS Used%: 33.81%
    DFS Remaining%: 61.01%
    Last contact: Sun Jul 05 10:52:15 CST 2015

    Name: 192.168.2.12:50010
    Decommission Status : Normal
    Configured Capacity: 21277569024 (19.82 GB)
    DFS Used: 966615040 (921.84 MB)
    Non DFS Used: 1261236224 (1.17 GB)
    DFS Remaining: 19049717760(17.74 GB)
    DFS Used%: 4.54%
    DFS Remaining%: 89.53%
    Last contact: Sun Jul 05 10:52:17 CST 2015

    Name: 192.168.2.7:50010
    Decommission Status : Normal
    Configured Capacity: 188643102720 (175.69 GB)
    DFS Used: 71265009664 (66.37 GB)
    Non DFS Used: 9779011584 (9.11 GB)
    DFS Remaining: 107599081472(100.21 GB)
    DFS Used%: 37.78%
    DFS Remaining%: 57.04%
    Last contact: Sun Jul 05 10:52:17 CST 2015

    After installing Hive, edit its configuration files.

    [hadoop@slave3 bin]$ ./hive
    15/07/06 10:58:20 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.

    Logging initialized using configuration in file:/opt/hadoop/hive-0.12.0/conf/hive-log4j.properties
    hive> set javax.jdo.option.ConnectionURL;
    javax.jdo.option.ConnectionURL=jdbc:mysql://slave3:3306/hive?createDatabaseIfNoExist=true
    hive>

    hive> show tables;

    FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

    hive>

    The cause: the MySQL JDBC driver jar was missing from Hive's classpath.

    After adding the driver it still failed:

    2015-07-06 11:04:16,786 WARN bonecp.BoneCPConfig (BoneCPConfig.java:sanitize(1537)) - Max Connections < 1. Setting to 20

    2015-07-06 11:04:16,911 ERROR Datastore.Schema (Log4JLogger.java:error(125)) - Failed initialising database.

    Unable to open a test connection to the given database. JDBC url = jdbc:mysql://slave3:3306/hive?createDatabaseIfNoExist=true, username = root. Terminating connection pool. Original Exception: ------

    java.sql.SQLException: Access denied for user 'root'@'slave3' (using password: YES)

    MySQL could not be connected to.

    https://hadooptutorial.info/unable-open-test-connection-given-database/

    Following that article I located the problem: it was simply a mistake in the MySQL setup.
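    The usual fix from that article is to grant the metastore user access from remote hosts. A sketch of the statements, run in the MySQL client on slave3 (the user name and password here are assumptions matching this setup):

    ```sql
    -- Allow connections from any host; adjust user/password as needed.
    GRANT ALL PRIVILEGES ON hive.* TO 'root'@'%' IDENTIFIED BY 'root';
    FLUSH PRIVILEGES;
    ```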

    Next comes the NCDC data analysis stage.

    Create the table following the procedure written up in my earlier blog post:

    http://blog.csdn.net/mrcharles/article/details/46514359

    create table ncdc (
      year string,
      month string,
      data string,
      time string,
      air string,
      a string,
      b string,
      c string,
      d string,
      e string,
      f string,
      g string
    ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE;

    The NCDC data was first run through a simple Java program that converts the runs of spaces into tab characters.
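    The author's preprocessor was written in Java; the same idea can be sketched in a few lines of Python (a hypothetical stand-in, not the original program):

    ```python
    import re

    def spaces_to_tabs(line: str) -> str:
        """Collapse each run of spaces between fields into a single tab."""
        return re.sub(r" +", "\t", line.strip())

    # Example: a whitespace-separated NCDC-style record becomes tab-separated.
    print(spaces_to_tabs("1901 01  01 0600  -49"))
    ```

    Applied line by line over the raw summary files, this yields records that match the table's tab delimiter.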

    1. Load the data into Hive

    load data local inpath '/opt/software/ncdc/summary' into table ncdc;

    hive> load data local inpath '/opt/hadoop/hadoopDATA/summary' into table ncdc;

    Copying data from file:/opt/hadoop/hadoopDATA/summary

    Copying file: file:/opt/hadoop/hadoopDATA/summary

    Loading data to table default.ncdc

    Table default.ncdc stats: [num_partitions: 0, num_files: 1, num_rows: 0, total_size: 17013314, raw_data_size: 0]

    OK

    Time taken: 2.231 seconds

    FAILED: RuntimeException java.net.UnknownHostException: unknown host: node1

    The error above appeared.

    Fix the configuration: these config files had been copied directly from another machine, and the HDFS URL had never been updated.
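    The setting in question is the namenode URL in Hadoop 1.x's core-site.xml. A sketch of the corrected entry, assuming the namenode runs on the host master on port 9000 (both values are assumptions for this cluster):

    ```xml
    <!-- core-site.xml: point clients at this cluster's namenode, not node1 -->
    <property>
      <name>fs.default.name</name>
      <value>hdfs://master:9000</value>
    </property>
    ```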

    2. Query the data

    You can query each year's average, maximum, and minimum temperature, use grouping functions, and so on; the operations are much like MySQL's.

    select year,avg(air) from ncdc group by year;
    hive> select year,avg(air) from ncdc group by year;
    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks not specified. Estimated from input data size: 1
    In order to change the average load for a reducer (in bytes):
    set hive.exec.reducers.bytes.per.reducer=
    In order to limit the maximum number of reducers:
    set hive.exec.reducers.max=
    In order to set a constant number of reducers:
    set mapred.reduce.tasks=
    Starting Job = job_201507050950_0001, Tracking URL = http://master:50030/jobdetails.jsp?jobid=job_201507050950_0001
    Kill Command = /opt/hadoop/hadoop-1.2.1/libexec/../bin/hadoop job -kill job_201507050950_0001
    Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
    2015-07-06 13:05:52,921 Stage-1 map = 0%, reduce = 0%
    2015-07-06 13:05:59,965 Stage-1 map = 50%, reduce = 0%, Cumulative CPU 7.02 sec
    2015-07-06 13:06:00,972 Stage-1 map = 50%, reduce = 0%, Cumulative CPU 7.02 sec
    2015-07-06 13:06:01,978 Stage-1 map = 50%, reduce = 0%, Cumulative CPU 7.02 sec
    2015-07-06 13:06:02,985 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 11.62 sec
    2015-07-06 13:06:03,992 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 11.62 sec
    2015-07-06 13:06:04,998 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 11.62 sec
    2015-07-06 13:06:06,005 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 11.62 sec
    2015-07-06 13:06:07,012 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 11.62 sec
    2015-07-06 13:06:08,019 Stage-1 map = 100%, reduce = 33%, Cumulative CPU 11.62 sec
    2015-07-06 13:06:09,025 Stage-1 map = 100%, reduce = 33%, Cumulative CPU 11.62 sec
    2015-07-06 13:06:10,033 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 14.53 sec
    2015-07-06 13:06:11,040 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 14.53 sec
    2015-07-06 13:06:12,046 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 14.53 sec
    MapReduce Total cumulative CPU time: 14 seconds 530 msec
    Ended Job = job_201507050950_0001
    MapReduce Jobs Launched:
    Job 0: Map: 2 Reduce: 1 Cumulative CPU: 14.53 sec HDFS Read: 51040494 HDFS Write: 537 SUCCESS
    Total MapReduce CPU Time Spent: 14 seconds 530 msec
    OK
    1901 45.16831683168317
    1902 21.659558263518658
    1903 -17.67699115044248
    1904 33.32224247948952
    1905 43.3322664228014
    1906 47.0834855681403
    1907 28.09189090243456
    1908 28.80607441154138
    1909 25.24907112526539
    1910 29.00013071895425
    1911 28.088644112247575
    1912 16.801145236855803
    1913 8.191569568197396
    1914 26.378301131816624
    1915 2.811635615498914
    1916 21.42393787117405
    1917 22.895140080045742
    1918 27.712506047411708
    1919 23.67520250849229
    1920 43.508667830133795
    1921 31.834957020057306
    1922 -44.03716409376787
    1923 26.79247747159462
    Time taken: 33.348 seconds, Fetched: 23 row(s)

    select year,max(air) from ncdc group by year;
    Result:

    hive> select year,max(air) from ncdc group by year;
    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks not specified. Estimated from input data size: 1
    In order to change the average load for a reducer (in bytes):
    set hive.exec.reducers.bytes.per.reducer=
    In order to limit the maximum number of reducers:
    set hive.exec.reducers.max=
    In order to set a constant number of reducers:
    set mapred.reduce.tasks=
    Starting Job = job_201507050950_0002, Tracking URL = http://master:50030/jobdetails.jsp?jobid=job_201507050950_0002
    Kill Command = /opt/hadoop/hadoop-1.2.1/libexec/../bin/hadoop job -kill job_201507050950_0002
    Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
    2015-07-06 13:07:28,648 Stage-1 map = 0%, reduce = 0%
    2015-07-06 13:07:34,675 Stage-1 map = 50%, reduce = 0%, Cumulative CPU 6.57 sec
    2015-07-06 13:07:35,681 Stage-1 map = 50%, reduce = 0%, Cumulative CPU 6.57 sec
    2015-07-06 13:07:36,687 Stage-1 map = 50%, reduce = 0%, Cumulative CPU 6.57 sec
    2015-07-06 13:07:37,693 Stage-1 map = 50%, reduce = 0%, Cumulative CPU 6.57 sec
    2015-07-06 13:07:38,699 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 11.26 sec
    2015-07-06 13:07:39,705 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 11.26 sec
    2015-07-06 13:07:40,711 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 11.26 sec
    2015-07-06 13:07:41,716 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 11.26 sec
    2015-07-06 13:07:42,722 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 11.26 sec
    2015-07-06 13:07:43,727 Stage-1 map = 100%, reduce = 33%, Cumulative CPU 11.26 sec
    2015-07-06 13:07:44,734 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 14.04 sec
    2015-07-06 13:07:45,740 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 14.04 sec
    2015-07-06 13:07:46,746 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 14.04 sec
    MapReduce Total cumulative CPU time: 14 seconds 40 msec
    Ended Job = job_201507050950_0002
    MapReduce Jobs Launched:
    Job 0: Map: 2 Reduce: 1 Cumulative CPU: 14.04 sec HDFS Read: 51040494 HDFS Write: 184 SUCCESS
    Total MapReduce CPU Time Spent: 14 seconds 40 msec
    OK
    1901 94
    1902 94
    1903 94
    1904 94
    1905 94
    1906 94
    1907 94
    1908 94
    1909 94
    1910 94
    1911 94
    1912 94
    1913 94
    1914 94
    1915 94
    1916 94
    1917 94
    1918 94
    1919 94
    1920 94
    1921 94
    1922 94
    1923 94
    Time taken: 26.002 seconds, Fetched: 23 row(s)
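    The minimum temperature mentioned earlier can be queried the same way (this query was not run in the session above). Since air is declared as string, min()/max() compare values lexicographically; casting to int gives a numeric comparison:

    ```sql
    -- Hypothetical follow-up query, not part of the original session.
    select year, min(cast(air as int)) from ncdc group by year;
    ```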

    select count(*) from ncdc;

    hive> select count(*) from ncdc;
    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks determined at compile time: 1
    In order to change the average load for a reducer (in bytes):
    set hive.exec.reducers.bytes.per.reducer=
    In order to limit the maximum number of reducers:
    set hive.exec.reducers.max=
    In order to set a constant number of reducers:
    set mapred.reduce.tasks=
    Starting Job = job_201507050950_0004, Tracking URL = http://master:50030/jobdetails.jsp?jobid=job_201507050950_0004
    Kill Command = /opt/hadoop/hadoop-1.2.1/libexec/../bin/hadoop job -kill job_201507050950_0004
    Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
    2015-07-06 13:08:56,762 Stage-1 map = 0%, reduce = 0%
    2015-07-06 13:09:03,803 Stage-1 map = 50%, reduce = 0%, Cumulative CPU 6.26 sec
    2015-07-06 13:09:04,809 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 10.21 sec
    2015-07-06 13:09:05,816 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 10.21 sec
    2015-07-06 13:09:06,825 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 10.21 sec
    2015-07-06 13:09:07,831 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 10.21 sec
    2015-07-06 13:09:08,838 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 10.21 sec
    2015-07-06 13:09:09,844 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 10.21 sec
    2015-07-06 13:09:10,850 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 10.21 sec
    2015-07-06 13:09:11,857 Stage-1 map = 100%, reduce = 33%, Cumulative CPU 10.21 sec
    2015-07-06 13:09:12,866 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 13.14 sec
    2015-07-06 13:09:13,872 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 13.14 sec
    2015-07-06 13:09:14,878 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 13.14 sec
    MapReduce Total cumulative CPU time: 13 seconds 140 msec
    Ended Job = job_201507050950_0004
    MapReduce Jobs Launched:
    Job 0: Map: 2 Reduce: 1 Cumulative CPU: 13.14 sec HDFS Read: 51040494 HDFS Write: 8 SUCCESS
    Total MapReduce CPU Time Spent: 13 seconds 140 msec
    OK
    1006038
    Time taken: 28.984 seconds, Fetched: 1 row(s)

  • Original post: https://www.cnblogs.com/mrcharles/p/11879902.html