• hive learing_1


    1、什么是hive

    hive是基于Hadoop的一种数据库仓库工具,将结构化数据映射为一张数据表,并提供SQL查询,分析等功能

    2、hive的特点

    1. 将数据结构(shema)存储在数据库中,数据存储在HDFS中
    2. 适用于联机分析处理(OLAP:on-line analytic process)
    3. 支持HQL查询

    3、hive的缺点

    1. 不支持关系数据库
    2. 不支持联机事务处理(OLTP:on-line transaction process):如银行系统
    3. 不支持实时查询和行级更新

    4、hive的安装配置

    • 下载安装包,解压到需要安装的目录
    • 配置环境变量 vi /etc/profile,并立即生效 source /etc/profile

    • 修改配置文件

    • 修改hive-site.xml

    </property>
    
    <property>
     <name>datanucleus.autoCreateTables</name>
     <value>True</value>
     </property>
    
    
      <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>root</value>
        <description>password to use against metastore database</description>
      </property>
      <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
        <description>Username to use against metastore database</description>
      </property>
     <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://192.168.109.150:3306/hive_metastore?createDatabaseIfNotExist=true</value>
        <description>
          JDBC connect string for a JDBC metastore.
          To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
          For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
        </description>
      </property>
     <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
        <description>Driver class name for a JDBC metastore</description>
      </property>
      <property>
        <name>hive.metastore.ds.connection.url.hook</name>
        <value/>
        <description>Name of the hook to use for retrieving the JDO connection URL. If empty, the value in javax.jdo.option.ConnectionURL is used</description>
      </property>
    
    

     

     

    • 将${system的地方改成具体的地址

    • 将MySQL的驱动包mysql-connector-java-5.1.46.jar复制到/soft/hive/lib下面

    • 启动hive,初始化元数据

    [xiaoqiu@s150 /soft/hive/conf]$ schematool -initSchema -dbType mysql
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/soft/apache-hive-2.3.3-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/soft/hadoop-2.7.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
    Metastore connection URL:        jdbc:mysql://192.168.109.150:3306/hive_metastore?createDatabaseIfNotExist=true
    Metastore Connection Driver :    com.mysql.jdbc.Driver
    Metastore connection User:       root
    Starting metastore schema initialization to 2.3.0
    Initialization script hive-schema-2.3.0.mysql.sql
    Initialization script completed
    schemaTool completed
    
    • 创建数据库

    hive> create database incubator;
    OK
    Time taken: 0.338 seconds
    hive> show databases;
    OK
    default
    incubator
    mydb
    Time taken: 0.014 seconds, Fetched: 3 row(s)
    
    • 查看HDFS(可见hive创建的数据库直接同步于HDFS的hive数据仓库中了)

    [xiaoqiu@s150 /soft/hive/conf]$ hadoop fs -lsr /
    lsr: DEPRECATED: Please use 'ls -R' instead.
    drwx-wx-wx   - xiaoqiu supergroup          0 2018-07-02 02:21 /tmp
    drwx-wx-wx   - xiaoqiu supergroup          0 2018-07-02 10:20 /tmp/hive
    drwx------   - root    supergroup          0 2018-07-02 10:41 /tmp/hive/root
    drwx------   - root    supergroup          0 2018-07-02 10:41 /tmp/hive/root/f5b7d613-fb3b-4125-b2d7-ea643b199c2f
    drwx------   - root    supergroup          0 2018-07-02 10:41 /tmp/hive/root/f5b7d613-fb3b-4125-b2d7-ea643b199c2f/_tmp_space.db
    drwx------   - xiaoqiu supergroup          0 2018-07-04 10:58 /tmp/hive/xiaoqiu
    drwx------   - xiaoqiu supergroup          0 2018-07-04 10:58 /tmp/hive/xiaoqiu/9f4397b7-db00-4ad5-b5de-cc3fd0d72eb6
    drwx------   - xiaoqiu supergroup          0 2018-07-04 10:58 /tmp/hive/xiaoqiu/9f4397b7-db00-4ad5-b5de-cc3fd0d72eb6/_tmp_space.db
    drwx------   - xiaoqiu supergroup          0 2018-07-02 17:19 /tmp/hive/xiaoqiu/ae6a90bd-558d-482c-b408-dc0273bce93e
    drwx------   - xiaoqiu supergroup          0 2018-07-02 17:19 /tmp/hive/xiaoqiu/ae6a90bd-558d-482c-b408-dc0273bce93e/_tmp_space.db
    drwxr-xr-x   - xiaoqiu supergroup          0 2018-07-04 11:32 /user
    drwxr-xr-x   - xiaoqiu supergroup          0 2018-07-04 11:32 /user/hive
    drwxr-xr-x   - xiaoqiu supergroup          0 2018-07-04 11:34 /user/hive/warehouse
    drwxr-xr-x   - xiaoqiu supergroup          0 2018-07-04 11:34 /user/hive/warehouse/incubator.db
    drwxr-xr-x   - xiaoqiu supergroup          0 2018-07-04 11:32 /user/hive/warehouse/mydb.db
    
    • 创建table

    hive> create table t(id int,name string,age int);
    OK
    Time taken: 1.641 seconds
    hive> show tables;
    OK
    t
    Time taken: 1.312 seconds, Fetched: 1 row(s)
    
    • 在MySQL中查看(其中TBL_TYPE为MANAGED_TABLE即托管表的意思,也就是删除表的时候,表的数据也会被删除掉,还有一种类型的表为EXTERNAL即外部表,删除表的时候不会删除数据)

    MariaDB [hive_metastore]> select * from TBLS;
    +--------+-------------+-------+------------------+---------+-----------+-------+----------+---------------+--------------------+--------------------+--------------------+
    | TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER   | RETENTION | SD_ID | TBL_NAME | TBL_TYPE      | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT | IS_REWRITE_ENABLED |
    +--------+-------------+-------+------------------+---------+-----------+-------+----------+---------------+--------------------+--------------------+--------------------+
    |      1 |  1530721536 |     3 |                0 | xiaoqiu |         0 |     1 | t        | MANAGED_TABLE | NULL               | NULL               |                    |
    +--------+-------------+-------+------------------+---------+-----------+-------+----------+---------------+--------------------+--------------------+--------------------+
    1 row in set (0.00 sec)
    
    • 这个时候还可以在HDFS查看表

    [xiaoqiu@s150 /soft/hive/conf]$ hadoop fs -lsr /
    lsr: DEPRECATED: Please use 'ls -R' instead.
    drwx-wx-wx   - xiaoqiu supergroup          0 2018-07-02 02:21 /tmp
    drwx-wx-wx   - xiaoqiu supergroup          0 2018-07-02 10:20 /tmp/hive
    drwx------   - root    supergroup          0 2018-07-02 10:41 /tmp/hive/root
    drwx------   - root    supergroup          0 2018-07-02 10:41 /tmp/hive/root/f5b7d613-fb3b-4125-b2d7-ea643b199c2f
    drwx------   - root    supergroup          0 2018-07-02 10:41 /tmp/hive/root/f5b7d613-fb3b-4125-b2d7-ea643b199c2f/_tmp_space.db
    drwx------   - xiaoqiu supergroup          0 2018-07-04 10:58 /tmp/hive/xiaoqiu
    drwx------   - xiaoqiu supergroup          0 2018-07-04 10:58 /tmp/hive/xiaoqiu/9f4397b7-db00-4ad5-b5de-cc3fd0d72eb6
    drwx------   - xiaoqiu supergroup          0 2018-07-04 10:58 /tmp/hive/xiaoqiu/9f4397b7-db00-4ad5-b5de-cc3fd0d72eb6/_tmp_space.db
    drwx------   - xiaoqiu supergroup          0 2018-07-02 17:19 /tmp/hive/xiaoqiu/ae6a90bd-558d-482c-b408-dc0273bce93e
    drwx------   - xiaoqiu supergroup          0 2018-07-02 17:19 /tmp/hive/xiaoqiu/ae6a90bd-558d-482c-b408-dc0273bce93e/_tmp_space.db
    drwxr-xr-x   - xiaoqiu supergroup          0 2018-07-04 11:32 /user
    drwxr-xr-x   - xiaoqiu supergroup          0 2018-07-04 11:32 /user/hive
    drwxr-xr-x   - xiaoqiu supergroup          0 2018-07-04 11:34 /user/hive/warehouse
    drwxr-xr-x   - xiaoqiu supergroup          0 2018-07-04 12:25 /user/hive/warehouse/incubator.db
    drwxr-xr-x   - xiaoqiu supergroup          0 2018-07-04 12:25 /user/hive/warehouse/incubator.db/t
    drwxr-xr-x   - xiaoqiu supergroup          0 2018-07-04 11:32 /user/hive/warehouse/mydb.db
    
    • 向表t中插入数据(直接在hive中使用insert into语句进行插入,会先调用mapreduce)

     

     

    欢迎关注我的公众号:小秋的博客 CSDN博客:https://blog.csdn.net/xiaoqiu_cr github:https://github.com/crr121 联系邮箱:rongchen633@gmail.com 有什么问题可以给我留言噢~
  • 相关阅读:
    TP5.0 excel 导入导出
    整理关于web项目如何防止CSRF和XSS攻击的方法
    vue ajax获取数据的时候,如何保证传递参数的安全或者说如何保护api的安全
    vue.js打包后,接口安全问题
    JSON API免费接口
    tp5 $_ENV获取不到数据
    java cannot find the tag library descriptor for "http://java.sun.com/jsp/jstl/core"
    如何查找到文件以后,带目录一起拷贝到新的目录? cp --parents source destination
    台湾各个大学硕博论文链接,很全,有的可以全文下载。
    delete
  • 原文地址:https://www.cnblogs.com/flyingcr/p/10326882.html
Copyright © 2020-2023  润新知