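These are my notes on installing Hive and using its three common interaction modes, based on the posts listed at the end and on Lao Wang. MySQL must already be installed and the Hadoop cluster already configured.
Versions used:
(1) MySQL: 5.7.28
(2) Hadoop: 2.6.0-cdh5.14.2
(3) Hive: 1.1.0-cdh5.14.2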
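Installing Hive
Installing Hive involves four things: installing MySQL, configuring Hadoop, editing the files in Hive's conf directory, and setting the path for Hive's logs.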
Installing MySQL
See https://www.cnblogs.com/youngchaolin/p/13702019.html .
Installing Hadoop
See https://www.cnblogs.com/youngchaolin/p/11444518.html .
Installing Hive
(1) Download the Hive package from http://archive.cloudera.com/cdh5/cdh/5/hive-1.1.0-cdh5.14.2.tar.gz
# Download with wget into the current directory, /kkb/install
[hadoop@node01 /kkb/install]$ wget http://archive.cloudera.com/cdh5/cdh/5/hive-1.1.0-cdh5.14.2.tar.gz
(2) Extract the package into the target directory
# Extract into the install directory
tar -zxvf hive-1.1.0-cdh5.14.2.tar.gz -C /kkb/install/
(3) Edit hive/conf/hive-env.sh
Go to the conf directory under the Hive install directory and edit hive-env.sh, setting the HADOOP_HOME and HIVE_CONF_DIR paths.
# Set HADOOP_HOME to point to a specific hadoop install directory
export HADOOP_HOME=/kkb/install/hadoop-2.6.0-cdh5.14.2/
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/kkb/install/hive-1.1.0-cdh5.14.2/conf
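A mistyped path in hive-env.sh usually shows up only later as a confusing startup error, so it can be worth checking both directories right away. The following is a minimal sketch; check_dir is a helper name I made up for illustration, not something Hive or Hadoop provides.

```shell
# check_dir is a hypothetical helper, not part of Hive or Hadoop.
# It prints "ok: <path>" when the directory exists; otherwise it prints
# "missing: <path>" and returns a non-zero status.
check_dir() {
  if [ -d "$1" ]; then
    echo "ok: $1"
  else
    echo "missing: $1"
    return 1
  fi
}
```

With the paths from this post, the calls would be check_dir /kkb/install/hadoop-2.6.0-cdh5.14.2/ and check_dir /kkb/install/hive-1.1.0-cdh5.14.2/conf; adjust them to your own layout.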
(4) Create hive/conf/hive-site.xml with vim
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://node03:3306/hive?createDatabaseIfNotExist=true&amp;characterEncoding=latin1&amp;useSSL=false</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
  </property>
  <property>
    <name>hive.cli.print.current.db</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.cli.print.header</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.server2.thrift.bind.host</name>
    <value>node01.kaikeba.com</value>
  </property>
</configuration>
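One detail that is easy to get wrong here: hive-site.xml is XML, so each & separating the JDBC URL parameters has to be written as &amp; inside the <value> element, otherwise the file cannot be parsed. The sketch below shows one way to escape a URL before pasting it in; xml_escape_amp is a hypothetical helper name, and it only handles the ampersand.

```shell
# xml_escape_amp is a hypothetical helper: it reads text on stdin and
# replaces every "&" with "&amp;" so the text can go inside an XML <value>.
# It is deliberately naive: input that is already escaped gets escaped again.
xml_escape_amp() {
  sed 's/&/\&amp;/g'
}

# The metastore URL from this post, with raw ampersands.
url='jdbc:mysql://node03:3306/hive?createDatabaseIfNotExist=true&characterEncoding=latin1&useSSL=false'
printf '%s\n' "$url" | xml_escape_amp
```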
(5) Edit hive/conf/hive-log4j.properties
Copy hive/conf/hive-log4j.properties.template to hive-log4j.properties and set the directory where Hive stores its log files.
# Directory for the log files
hive.log.dir=/kkb/install/hive-1.1.0-cdh5.14.2/logs/
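Depending on the setup, the log directory may not exist yet when Hive first starts, and creating it up front is harmless either way. A minimal sketch, where ensure_dir is a hypothetical helper name:

```shell
# ensure_dir is a hypothetical helper: create the directory (and any
# missing parents), then confirm the current user can write to it.
ensure_dir() {
  mkdir -p "$1" && [ -w "$1" ]
}
```

For the path configured in this post the call would be ensure_dir /kkb/install/hive-1.1.0-cdh5.14.2/logs/, run as the same user that starts Hive.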
(6) Copy the MySQL driver jar into Hive's lib directory.
[hadoop@node01 /kkb/soft]$ cp mysql-connector-java-5.1.38.jar /kkb/install/hive-1.1.0-cdh5.14.2/lib/
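Hive needs this jar on its classpath to reach the MySQL metastore, so a quick check that it really landed in lib can save a failed first start. A small sketch; has_mysql_driver is a hypothetical helper name.

```shell
# has_mysql_driver is a hypothetical helper: succeed if the given lib
# directory contains at least one mysql-connector-java jar.
has_mysql_driver() {
  ls "$1"/mysql-connector-java-*.jar >/dev/null 2>&1
}
```

Usage with the directory from this post: has_mysql_driver /kkb/install/hive-1.1.0-cdh5.14.2/lib && echo "driver found".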
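That completes the Hive installation. Next, test it with the three interaction modes.
Hive interaction modes
Hive has three interaction modes. Start the Hadoop cluster before using any of them.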
Hive CLI
Run the hive/bin/hive script directly. As the WARNING in the output shows, this mode is deprecated and generally not recommended.
[hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ ./hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/kkb/install/hbase-1.2.0-cdh5.14.2/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/kkb/install/hadoop-2.6.0-cdh5.14.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2020-09-30 16:55:28,709 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/09/30 16:55:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Logging initialized using configuration in file:/kkb/install/hive-1.1.0-cdh5.14.2/conf/hive-log4j.properties
# The Hive CLI mode is not recommended
WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
# List the databases
hive> show databases;
OK
default
Time taken: 8.535 seconds, Fetched: 1 row(s)
hive>
In addition, the local file system and HDFS can both be browsed directly from the Hive CLI prompt.
[hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ ./hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/kkb/install/hbase-1.2.0-cdh5.14.2/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/kkb/install/hadoop-2.6.0-cdh5.14.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2020-09-30 17:24:31,509 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/09/30 17:24:32 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Logging initialized using configuration in file:/kkb/install/hive-1.1.0-cdh5.14.2/conf/hive-log4j.properties
WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
# Browse the local file system
hive> !ls /kkb/install;
hadoop-2.6.0-cdh5.14.2
hbase-1.2.0-cdh5.14.2
hive-1.1.0-cdh5.14.2
hive-1.1.0-cdh5.14.2.tar.gz
hive.sql
jdk1.8.0_181
zookeeper-3.4.5-cdh5.14.2
# Browse the HDFS file system
hive> dfs -ls /;
Found 8 items
-rw-r--r-- 3 hadoop supergroup 13612 2020-03-09 10:42 /dataskew.txt
drwxr-xr-x - root supergroup 0 2020-03-09 10:48 /dataskewOutput
drwxr-xr-x - hadoop supergroup 0 2020-03-09 16:13 /ncdcDataWithTotalOrder
drwxr-xr-x - root supergroup 0 2020-03-09 15:35 /ncdcDataclean
drwxr-xr-x - hadoop supergroup 0 2020-03-09 15:32 /ncdcdata
-rw-r--r-- 3 root supergroup 212005 2020-03-09 14:21 /sequencefile
drwx------ - hadoop supergroup 0 2020-09-30 16:52 /tmp
drwx------ - hadoop supergroup 0 2020-09-30 17:19 /user
hive>
beeline
To connect with beeline, first start hiveserver2; once it is up, jps lists it as a RunJar process. This mode is the more common one and is used heavily during development.
# Started in the background here; to run it in the foreground, use ./hive --service hiveserver2
[hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ nohup ./hive --service hiveserver2 &
[1] 10103
[hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ nohup: ignoring input and appending output to ‘nohup.out’
# A RunJar entry means hiveserver2 has started; its PID is 10103
[hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ jps
8705 ResourceManager
8228 NameNode
10103 RunJar
8476 SecondaryNameNode
8812 NodeManager
8334 DataNode
10271 Jps
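When scripting the startup, the PID can be pulled out of the jps listing instead of being read by eye. The sketch below assumes the two-column "PID name" format shown in this post; runjar_pids is a hypothetical helper name. As far as I know, RunJar is the generic name for anything launched through Hadoop's jar runner, so a running Hive CLI would be listed under the same name.

```shell
# runjar_pids is a hypothetical helper: read `jps` output on stdin and
# print the PID of every process whose name is exactly RunJar.
runjar_pids() {
  awk '$2 == "RunJar" { print $1 }'
}
```

Typical usage would be jps | runjar_pids. If several RunJar processes are running, jps -m, which also prints the arguments passed to main, may help tell them apart.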
Connect to hiveserver2 with beeline. If hiveserver2 was started in the background as above, you can connect from the same terminal; if it was started in the foreground, open another terminal to connect.
# Start beeline
[hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ ./beeline
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/kkb/install/hbase-1.2.0-cdh5.14.2/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/kkb/install/hadoop-2.6.0-cdh5.14.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2020-09-30 17:07:31,569 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Beeline version 1.1.0-cdh5.14.2 by Apache Hive
# Connect over JDBC
beeline> !connect jdbc:hive2://node01:10000
scan complete in 2ms
Connecting to jdbc:hive2://node01:10000
# Just press Enter; no username is needed
Enter username for jdbc:hive2://node01:10000:
# Just press Enter; no password is needed
Enter password for jdbc:hive2://node01:10000:
Connected to: Apache Hive (version 1.1.0-cdh5.14.2)
Driver: Hive JDBC (version 1.1.0-cdh5.14.2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
# List the databases
0: jdbc:hive2://node01:10000> show databases;
INFO : Compiling command(queryId=hadoop_20200930170808_9f4ee6e8-2a37-4416-8c43-9bf8726a2868): show databases
INFO : Semantic Analysis Completed
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:database_name, type:string, comment:from deserializer)], properties:null)
INFO : Completed compiling command(queryId=hadoop_20200930170808_9f4ee6e8-2a37-4416-8c43-9bf8726a2868); Time taken: 1.72 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Executing command(queryId=hadoop_20200930170808_9f4ee6e8-2a37-4416-8c43-9bf8726a2868): show databases
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hadoop_20200930170808_9f4ee6e8-2a37-4416-8c43-9bf8726a2868); Time taken: 0.157 seconds
INFO : OK
+----------------+--+
| database_name  |
+----------------+--+
| default        |
+----------------+--+
1 row selected (8.47 seconds)
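The interactive !connect step can also be skipped: beeline accepts the JDBC URL with -u and a statement with -e. The sketch below only assembles the URL; hive2_url is a hypothetical helper name, the default port 10000 matches the session in this post, and the trailing database name is an optional part of the URL that was not used there.

```shell
# hive2_url is a hypothetical helper that assembles a HiveServer2 JDBC URL.
# Arguments: host, port (defaults to 10000), database (defaults to "default").
hive2_url() {
  printf 'jdbc:hive2://%s:%s/%s\n' "$1" "${2:-10000}" "${3:-default}"
}

# Print the URL for the node used in this post.
hive2_url node01
```

A one-shot query would then look like ./beeline -u "$(hive2_url node01)" -e "show databases;". Whether a username must be passed with -n depends on how authentication is configured on your hiveserver2.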
The hive command
(1) hive -e "hql"
This form executes an HQL statement directly from the shell.
# Query the databases directly
[hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ ./hive -e "show databases;"
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/kkb/install/hbase-1.2.0-cdh5.14.2/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/kkb/install/hadoop-2.6.0-cdh5.14.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2020-09-30 17:15:00,248 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/09/30 17:15:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Logging initialized using configuration in file:/kkb/install/hive-1.1.0-cdh5.14.2/conf/hive-log4j.properties
OK
default
Time taken: 7.221 seconds, Fetched: 1 row(s)
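(2) hive -f <SQL script file>
Once development is finished, the HQL is usually written into a script file and run this way.
Create a script under /kkb/install with vim (named hive.sql here) with the following content.
create database if not exists youngchaolin_hive;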
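If you would rather not open an editor, the same file can be written from the shell with a here-document. This is only a sketch: write_hql is a hypothetical helper name, and the statement is the one shown above.

```shell
# write_hql is a hypothetical helper: write the example HQL statement
# to the file given as the first argument, overwriting it if it exists.
write_hql() {
  cat > "$1" <<'EOF'
create database if not exists youngchaolin_hive;
EOF
}
```

For this post the call would be write_hql /kkb/install/hive.sql. Quoting the EOF marker keeps the shell from expanding anything inside the statement.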
Then run the script with hive -f.
# Run the script; the file does not need to be executable
[hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ ./hive -f /kkb/install/hive.sql
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/kkb/install/hbase-1.2.0-cdh5.14.2/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/kkb/install/hadoop-2.6.0-cdh5.14.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2020-09-30 17:19:48,072 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/09/30 17:19:49 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Logging initialized using configuration in file:/kkb/install/hive-1.1.0-cdh5.14.2/conf/hive-log4j.properties
OK
Time taken: 5.016 seconds
# Check the result: the database has been created
[hadoop@node01 /kkb/install/hive-1.1.0-cdh5.14.2/bin]$ ./hive -e "show databases;"
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/kkb/install/hbase-1.2.0-cdh5.14.2/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/kkb/install/hadoop-2.6.0-cdh5.14.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2020-09-30 17:20:12,228 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/09/30 17:20:13 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Logging initialized using configuration in file:/kkb/install/hive-1.1.0-cdh5.14.2/conf/hive-log4j.properties
OK
default
youngchaolin_hive
Time taken: 4.062 seconds, Fetched: 2 row(s)
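In a script, it is easier to test for the database than to read the listing by eye. The sketch below assumes the listing arrives one database name per line, as in the output above; db_exists is a hypothetical helper name.

```shell
# db_exists is a hypothetical helper: read one database name per line on
# stdin and succeed only if some line matches the given name exactly.
db_exists() {
  grep -qxF -- "$1"
}
```

Combined with the CLI's silent flag, the check could read ./hive -S -e "show databases;" | db_exists youngchaolin_hive && echo "created". The exact match (-x) matters: a name that is only a prefix of an existing database does not count.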
By default, the new database is stored on HDFS at /user/hive/warehouse/youngchaolin_hive.db.
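The location follows a simple pattern: the warehouse root (controlled by hive.metastore.warehouse.dir, which is /user/hive/warehouse unless overridden), then the database name with a .db suffix. A sketch of that rule, with warehouse_path as a hypothetical helper name:

```shell
# warehouse_path is a hypothetical helper: print the default HDFS location
# of a database, given its name and optionally the warehouse root.
warehouse_path() {
  printf '%s/%s.db\n' "${2:-/user/hive/warehouse}" "$1"
}

# Print the location of the database created in this post.
warehouse_path youngchaolin_hive
```

The directory can then be listed with hdfs dfs -ls "$(warehouse_path youngchaolin_hive)". The rule does not apply to the built-in default database, whose tables sit directly under the warehouse root.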
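That is all for now. My understanding may not be entirely correct; learning is a continual process of understanding and correcting mistakes, so please point out anything that is wrong.
References
(1) https://www.cnblogs.com/youngchaolin/p/13702019.html Installing MySQL
(2) https://www.cnblogs.com/youngchaolin/p/11444518.html Configuring a 3-node Hadoop cluster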