• Hive Installation and Interaction Modes


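    This post records how to install Hive and how to use its three common interaction modes, drawing on the blog posts listed at the end and on Lao Wang. MySQL must be installed and a Hadoop cluster configured beforehand.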

    Versions used:

    (1) MySQL: 5.7.28

    (2) Hadoop: 2.6.0-cdh5.14.2

    (3) Hive: 1.1.0-cdh5.14.2

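    Hive Installation

    Installing Hive involves four things: installing MySQL, configuring Hadoop, editing the files in Hive's conf directory, and setting Hive's log path.

    Installing MySQL

    See this blog post: https://www.cnblogs.com/youngchaolin/p/13702019.html

    Installing Hadoop

    See this blog post: https://www.cnblogs.com/youngchaolin/p/11444518.html

    Installing Hive

    (1) Download the Hive package from http://archive.cloudera.com/cdh5/cdh/5/hive-1.1.0-cdh5.14.2.tar.gz

    # Download with wget into the current directory, /kkb/install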
    [hadoop@node01 /kkb/install]$ wget http://archive.cloudera.com/cdh5/cdh/5/hive-1.1.0-cdh5.14.2.tar.gz

    (2) Extract the package into the target directory

    # Extract into the installation directory
    tar -zxvf hive-1.1.0-cdh5.14.2.tar.gz -C /kkb/install/

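    (3) Edit hive/conf/hive-env.sh

    Go to the conf directory under the Hive installation directory and edit hive-env.sh to set the HADOOP_HOME and HIVE_CONF_DIR paths. If hive-env.sh does not exist yet, copy it from hive-env.sh.template in the same directory.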

    # Set HADOOP_HOME to point to a specific hadoop install directory
    export HADOOP_HOME=/kkb/install/hadoop-2.6.0-cdh5.14.2/
    
    # Hive Configuration Directory can be controlled by:
    export HIVE_CONF_DIR=/kkb/install/hive-1.1.0-cdh5.14.2/conf
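
    A quick sanity check of these two paths can catch typos before Hive is started. The following is only a sketch: check_dir is a hypothetical helper name, not part of Hive, and the paths are the ones used in this post.

    ```shell
    # Hypothetical helper: report whether the directory behind a setting exists.
    # $1 = setting name (for the message), $2 = directory path
    check_dir() {
        if [ -d "$2" ]; then
            echo "OK: $1=$2"
        else
            echo "MISSING: $1=$2"
            return 1
        fi
    }

    # Paths taken from the hive-env.sh settings in this post.
    check_dir HADOOP_HOME /kkb/install/hadoop-2.6.0-cdh5.14.2/ || true
    check_dir HIVE_CONF_DIR /kkb/install/hive-1.1.0-cdh5.14.2/conf || true
    ```

    A MISSING line means the corresponding export in hive-env.sh points at a directory that is not there.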

    (4) Create hive/conf/hive-site.xml with vim

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
            <property>
                    <name>javax.jdo.option.ConnectionURL</name>
                    <value>jdbc:mysql://node03:3306/hive?createDatabaseIfNotExist=true&amp;characterEncoding=latin1&amp;useSSL=false</value>
            </property>
    
            <property>
                    <name>javax.jdo.option.ConnectionDriverName</name>
                    <value>com.mysql.jdbc.Driver</value>
            </property>
            <property>
                    <name>javax.jdo.option.ConnectionUserName</name>
                    <value>root</value>
            </property>
            <property>
                    <name>javax.jdo.option.ConnectionPassword</name>
                    <value>123456</value>
            </property>
            <property>
                    <name>hive.cli.print.current.db</name>
                    <value>true</value>
            </property>
            <property>
                    <name>hive.cli.print.header</name>
                    <value>true</value>
            </property>
            <property>
                    <name>hive.server2.thrift.bind.host</name>
                    <value>node01.kaikeba.com</value>
            </property>
    </configuration>
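
    Since a missing metastore property usually only shows up as an error when Hive starts, it can help to check the file first. This is a rough sketch that merely greps for the property names; has_property is a hypothetical helper, and it does not validate the XML or the values.

    ```shell
    # Hypothetical helper: succeed if the file $1 declares a property named $2.
    has_property() {
        grep -qF "<name>$2</name>" "$1"
    }

    # Location of hive-site.xml as used in this post.
    site=/kkb/install/hive-1.1.0-cdh5.14.2/conf/hive-site.xml

    # The four metastore connection properties configured above.
    for prop in javax.jdo.option.ConnectionURL \
                javax.jdo.option.ConnectionDriverName \
                javax.jdo.option.ConnectionUserName \
                javax.jdo.option.ConnectionPassword; do
        if [ -f "$site" ] && has_property "$site" "$prop"; then
            echo "found:   $prop"
        else
            echo "missing: $prop"
        fi
    done
    ```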

    (5) Edit hive/conf/hive-log4j.properties

    Copy hive/conf/hive-log4j.properties.template to hive-log4j.properties, then set the directory where Hive stores its log files.

    # Directory where Hive log files are stored
    hive.log.dir=/kkb/install/hive-1.1.0-cdh5.14.2/logs/
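
    The copy-and-edit step can also be scripted. The sketch below assumes the template already contains a line starting with hive.log.dir= (if it does not, nothing is changed); set_hive_log_dir is a hypothetical helper name.

    ```shell
    # Hypothetical helper: copy the template and point hive.log.dir at a new directory.
    # $1 = Hive conf directory, $2 = log directory
    set_hive_log_dir() {
        cp "$1/hive-log4j.properties.template" "$1/hive-log4j.properties" || return 1
        # Rewrite the existing hive.log.dir line; '|' is the sed delimiter
        # because the path itself contains '/'.
        sed "s|^hive\.log\.dir=.*|hive.log.dir=$2|" \
            "$1/hive-log4j.properties" > "$1/hive-log4j.properties.tmp" || return 1
        mv "$1/hive-log4j.properties.tmp" "$1/hive-log4j.properties"
    }

    # Paths as used in this post; only runs if the template is actually there.
    conf=/kkb/install/hive-1.1.0-cdh5.14.2/conf
    if [ -f "$conf/hive-log4j.properties.template" ]; then
        set_hive_log_dir "$conf" /kkb/install/hive-1.1.0-cdh5.14.2/logs/
    fi
    ```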

    (6) Copy the MySQL driver JAR into Hive's lib directory.

    [hadoop@node01 /kkb/soft]$ cp mysql-connector-java-5.1.38.jar /kkb/install/hive-1.1.0-cdh5.14.2/lib/

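    That completes the Hive installation. Next, test it with the three interaction modes.

    Hive Interaction Modes

    Hive offers three interaction modes. Start the Hadoop cluster before using any of them.

    Hive CLI

    Run the hive/bin/hive script directly. As the WARNING in the output shows, this mode is deprecated and generally not recommended.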

    [hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ ./hive
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/kkb/install/hbase-1.2.0-cdh5.14.2/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/kkb/install/hadoop-2.6.0-cdh5.14.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    2020-09-30 16:55:28,709 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    20/09/30 16:55:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    
    Logging initialized using configuration in file:/kkb/install/hive-1.1.0-cdh5.14.2/conf/hive-log4j.properties
    # The Hive CLI mode is not recommended
    WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
    # List the databases
    hive> show databases;
    OK
    default
    Time taken: 8.535 seconds, Fetched: 1 row(s)
    hive>

    In addition, from the Hive CLI prompt you can inspect both the local file system and HDFS directly.

    [hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ ./hive
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/kkb/install/hbase-1.2.0-cdh5.14.2/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/kkb/install/hadoop-2.6.0-cdh5.14.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    2020-09-30 17:24:31,509 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    20/09/30 17:24:32 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    
    Logging initialized using configuration in file:/kkb/install/hive-1.1.0-cdh5.14.2/conf/hive-log4j.properties
    WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
    # List the local file system
    hive> !ls /kkb/install;
    hadoop-2.6.0-cdh5.14.2
    hbase-1.2.0-cdh5.14.2
    hive-1.1.0-cdh5.14.2
    hive-1.1.0-cdh5.14.2.tar.gz
    hive.sql
    jdk1.8.0_181
    zookeeper-3.4.5-cdh5.14.2
    # List the HDFS file system
    hive> dfs -ls /;
    Found 8 items
    -rw-r--r--   3 hadoop supergroup      13612 2020-03-09 10:42 /dataskew.txt
    drwxr-xr-x   - root   supergroup          0 2020-03-09 10:48 /dataskewOutput
    drwxr-xr-x   - hadoop supergroup          0 2020-03-09 16:13 /ncdcDataWithTotalOrder
    drwxr-xr-x   - root   supergroup          0 2020-03-09 15:35 /ncdcDataclean
    drwxr-xr-x   - hadoop supergroup          0 2020-03-09 15:32 /ncdcdata
    -rw-r--r--   3 root   supergroup     212005 2020-03-09 14:21 /sequencefile
    drwx------   - hadoop supergroup          0 2020-09-30 16:52 /tmp
    drwx------   - hadoop supergroup          0 2020-09-30 17:19 /user
    hive> 

    beeline

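    To connect with beeline, hiveserver2 must be running first; once started, it appears in the jps output as a RunJar process. This mode is the more commonly used one and is used heavily during development.

    # Started in the background here; to start it in the foreground instead, run ./hive --service hiveserver2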
    [hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ nohup ./hive --service hiveserver2 &
    [1] 10103
    [hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ nohup: ignoring input and appending output to ‘nohup.out’
    # A RunJar process means hiveserver2 has started; its PID is 10103
    [hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ jps
    8705 ResourceManager
    8228 NameNode
    10103 RunJar
    8476 SecondaryNameNode
    8812 NodeManager
    8334 DataNode
    10271 Jps
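
    Picking the RunJar entry out of the jps listing can be scripted as well. This is only a sketch with a hypothetical helper, runjar_pids. Keep in mind that RunJar is the generic name for anything launched through Hadoop's jar runner, so a match here does not prove that the process is hiveserver2.

    ```shell
    # Hypothetical helper: read `jps` output on stdin and print the PID of
    # every line whose process name is exactly RunJar.
    runjar_pids() {
        awk '$2 == "RunJar" { print $1 }'
    }

    if command -v jps >/dev/null 2>&1; then
        jps | runjar_pids
    else
        echo "jps not found on PATH"
    fi
    ```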

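    Connect to hiveserver2 with beeline. If hiveserver2 was started in the background as above, you can connect from the same terminal; if it was started in the foreground, open a second terminal to connect.

    # Start beeline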
    [hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ ./beeline 
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/kkb/install/hbase-1.2.0-cdh5.14.2/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/kkb/install/hadoop-2.6.0-cdh5.14.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    2020-09-30 17:07:31,569 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Beeline version 1.1.0-cdh5.14.2 by Apache Hive
    # Connect over JDBC
    beeline> !connect jdbc:hive2://node01:10000
    scan complete in 2ms
    Connecting to jdbc:hive2://node01:10000
    # Just press Enter; no username is needed
    Enter username for jdbc:hive2://node01:10000: 
    # Just press Enter; no password is needed
    Enter password for jdbc:hive2://node01:10000: 
    Connected to: Apache Hive (version 1.1.0-cdh5.14.2)
    Driver: Hive JDBC (version 1.1.0-cdh5.14.2)
    Transaction isolation: TRANSACTION_REPEATABLE_READ
    # List the databases
    0: jdbc:hive2://node01:10000> show databases;
    INFO  : Compiling command(queryId=hadoop_20200930170808_9f4ee6e8-2a37-4416-8c43-9bf8726a2868): show databases
    INFO  : Semantic Analysis Completed
    INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:database_name, type:string, comment:from deserializer)], properties:null)
    INFO  : Completed compiling command(queryId=hadoop_20200930170808_9f4ee6e8-2a37-4416-8c43-9bf8726a2868); Time taken: 1.72 seconds
    INFO  : Concurrency mode is disabled, not creating a lock manager
    INFO  : Executing command(queryId=hadoop_20200930170808_9f4ee6e8-2a37-4416-8c43-9bf8726a2868): show databases
    INFO  : Starting task [Stage-0:DDL] in serial mode
    INFO  : Completed executing command(queryId=hadoop_20200930170808_9f4ee6e8-2a37-4416-8c43-9bf8726a2868); Time taken: 0.157 seconds
    INFO  : OK
    +----------------+--+
    | database_name  |
    +----------------+--+
    | default        |
    +----------------+--+
    1 row selected (8.47 seconds)
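
    The same query can also be run without the interactive prompt by passing the JDBC URL and the statement on the command line. The sketch below assumes beeline is on PATH (in this post it is started as ./beeline from Hive's bin directory) and that hiveserver2 listens on node01:10000 as above; hs2_url is a hypothetical helper.

    ```shell
    # Hypothetical helper: build the HiveServer2 JDBC URL for a host ($1)
    # and an optional port ($2, defaulting to 10000 as used in this post).
    hs2_url() {
        echo "jdbc:hive2://$1:${2:-10000}"
    }

    url=$(hs2_url node01)
    if command -v beeline >/dev/null 2>&1; then
        # -u: JDBC URL to connect to, -e: statement to execute before exiting
        beeline -u "$url" -e "show databases;"
    else
        echo "beeline not on PATH; would connect to $url"
    fi
    ```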

    The hive command

    (1) hive -e "hql"

    This form runs an HQL statement directly from the shell.

    # Query the databases directly
    [hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ ./hive -e "show databases;"
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/kkb/install/hbase-1.2.0-cdh5.14.2/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/kkb/install/hadoop-2.6.0-cdh5.14.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    2020-09-30 17:15:00,248 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    20/09/30 17:15:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    
    Logging initialized using configuration in file:/kkb/install/hive-1.1.0-cdh5.14.2/conf/hive-log4j.properties
    OK
    default
    Time taken: 7.221 seconds, Fetched: 1 row(s)

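    (2) hive -f <SQL script file>

    Once development is finished, you can put the HQL into a script file and run it this way.

    For example, use vim to create a script under /kkb/install (here /kkb/install/hive.sql) with the following content.

    create database if not exists youngchaolin_hive;

    Then run the script with hive -f.

    # Run the script; the file does not need to be executable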
    [hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ ./hive -f /kkb/install/hive.sql 
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/kkb/install/hbase-1.2.0-cdh5.14.2/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/kkb/install/hadoop-2.6.0-cdh5.14.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    2020-09-30 17:19:48,072 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    20/09/30 17:19:49 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    
    Logging initialized using configuration in file:/kkb/install/hive-1.1.0-cdh5.14.2/conf/hive-log4j.properties
    OK
    Time taken: 5.016 seconds
    # Check the result: the database has been created successfully
    [hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ ./hive -e "show databases;"
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/kkb/install/hbase-1.2.0-cdh5.14.2/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/kkb/install/hadoop-2.6.0-cdh5.14.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    2020-09-30 17:20:12,228 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    20/09/30 17:20:13 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    
    Logging initialized using configuration in file:/kkb/install/hive-1.1.0-cdh5.14.2/conf/hive-log4j.properties
    OK
    default
    youngchaolin_hive
    Time taken: 4.062 seconds, Fetched: 2 row(s)

    By default, the new database is stored in HDFS at /user/hive/warehouse/youngchaolin_hive.db.
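
    To confirm the location, the directory can be listed in HDFS. This is a sketch assuming the default warehouse directory /user/hive/warehouse mentioned above and an hdfs command on PATH; warehouse_path is a hypothetical helper and does not cover the built-in default database.

    ```shell
    # Hypothetical helper: default HDFS directory of a user-created database $1,
    # assuming the warehouse root is /user/hive/warehouse.
    warehouse_path() {
        echo "/user/hive/warehouse/$1.db"
    }

    path=$(warehouse_path youngchaolin_hive)
    if command -v hdfs >/dev/null 2>&1; then
        hdfs dfs -ls "$path"
    else
        echo "hdfs not on PATH; expected location: $path"
    fi
    ```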

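    My understanding above may not be entirely correct. Learning is a continuous process of discovering and correcting mistakes, so corrections are welcome.

    References

    (1) https://www.cnblogs.com/youngchaolin/p/13702019.html Installing MySQL

    (2) https://www.cnblogs.com/youngchaolin/p/11444518.html Configuring a 3-node Hadoop cluster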

  • Original article: https://www.cnblogs.com/youngchaolin/p/13754463.html