How to fix Hive MapReduce jobs that get stuck
A fix found via Google is to add the following parameter-setting statements before the Hive statement you are running (a usage sketch follows the list):
set mapreduce.job.reduces=512;
set hive.groupby.skewindata=true;
set hive.optimize.skewjoin=true;
set hive.skewjoin.key=5000;
set hive.groupby.mapaggr.checkinterval=5000;
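For context, here is a minimal sketch of how these settings would be prepended to a query that skews on a hot key. The settings raise the reducer count, enable two-stage aggregation for skewed GROUP BY keys, and handle skewed join keys separately; the orders table and user_id column are hypothetical names used only for illustration, not part of this note's setup.

-- Sketch only: orders / user_id are made-up names.
-- The set statements apply to the current session and take effect
-- for the skewed aggregation below.
set mapreduce.job.reduces=512;
set hive.groupby.skewindata=true;
set hive.optimize.skewjoin=true;
set hive.skewjoin.key=5000;
set hive.groupby.mapaggr.checkinterval=5000;
select user_id, count(*) as cnt
from orders
group by user_id;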
Link: https://pan.baidu.com/s/10h4wyq5aKbnPgXaS0KhBoA
Extraction code: gxcw
Step 1: Install Hadoop.
Step 2: Install the MySQL JDBC driver (JDBC Driver for MySQL): https://www.mysql.com/products/connector/
Hive download address: http://mirrors.hust.edu.cn/apache/
Choose a suitable Hive version to download; in the stable-2 folder you can see that the stable 2.x release is 2.3.4.
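A minimal download-and-unpack sketch; the exact file path under the mirror and the /usr/local install location are assumptions (check the mirror listing first), but the target directory matches the paths used later in this note.

# Assumed mirror path and install location; verify in a browser before running.
wget http://mirrors.hust.edu.cn/apache/hive/stable-2/apache-hive-2.3.4-bin.tar.gz
tar -zxvf apache-hive-2.3.4-bin.tar.gz -C /usr/local
cd /usr/local/apache-hive-2.3.4-bin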
cd apache-hive-2.3.4-bin/conf/
touch hive-site.xml
vi hive-site.xml
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://hadoop1:3306/hivedb?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
    <!-- If MySQL and Hive are on the same server node, change hadoop1 to localhost -->
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>root</value>
    <description>password to use against metastore database</description>
  </property>
</configuration>
The following configuration is optional; it specifies the directory on HDFS where the Hive data warehouse stores its data:
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/hive/warehouse</value>
  <description>hive default warehouse, if necessary, change it</description>
</property>
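If you point the warehouse at a custom HDFS path like the /hive/warehouse value above, the directory usually needs to exist and be writable. A sketch, assuming HDFS is already running:

# Create the warehouse directory on HDFS and make it group-writable (path taken from the property above).
hadoop fs -mkdir -p /hive/warehouse
hadoop fs -chmod g+w /hive/warehouse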
Download mysql-connector-java-8.0.16-1.el7.noarch.rpm
yum -y install mysql-connector-java-8.0.16-1.el7.noarch.rpm
[root@localhost ~]# rpm -ql mysql-connector-java-8.0.16-1.el7.noarch
/usr/share/doc/mysql-connector-java-8.0.16
/usr/share/doc/mysql-connector-java-8.0.16/CHANGES
/usr/share/doc/mysql-connector-java-8.0.16/INFO_BIN
/usr/share/doc/mysql-connector-java-8.0.16/INFO_SRC
/usr/share/doc/mysql-connector-java-8.0.16/LICENSE
/usr/share/doc/mysql-connector-java-8.0.16/README
/usr/share/java/mysql-connector-java.jar
# Copy the jar into the lib directory under the Hive root directory.
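With Hive installed under /usr/local/apache-hive-2.3.4-bin as elsewhere in this note (adjust if your path differs), the copy would look like:

cp /usr/share/java/mysql-connector-java.jar /usr/local/apache-hive-2.3.4-bin/lib/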
vim ~/.bashrc
export HIVE_HOME=/usr/local/apache-hive-2.3.4-bin
export HADOOP_HOME=/usr/local/hadoop-3.1.2
export PATH=$PATH:$HIVE_HOME/bin
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre
source ~/.bashrc
Verify the Hive installation:
[root@localhost ~]# hive --help
Usage ./hive <parameters> --service serviceName <service parameters>
Service List: beeline cleardanglingscratchdir cli hbaseimport hbaseschematool help hiveburninclient hiveserver2 hplsql jar lineage llapdump llap llapstatus metastore metatool orcfiledump rcfilecat schemaTool version
Parameters parsed:
  --auxpath : Auxiliary jars
  --config : Hive configuration directory
  --service : Starts specific service/component. cli is default
Parameters used:
  HADOOP_HOME or HADOOP_PREFIX : Hadoop install directory
  HIVE_OPT : Hive options
For help on a particular service:
  ./hive --service serviceName --help
Debug help:
  ./hive --debug --help
Initialize the metastore database:
[root@localhost ~]# schematool -dbType mysql -initSchema
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL:        jdbc:mysql://localhost:3306/hivedb?createDatabaseIfNotExist=true
Metastore Connection Driver :    com.mysql.jdbc.Driver
Metastore connection User:       root
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.mysql.sql
Initialization script completed
schemaTool completed
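Optionally, you can confirm in MySQL that the metastore schema was created; a sketch, assuming the hivedb database and the root/root credentials from hive-site.xml above:

# Should list metastore tables such as DBS, TBLS, COLUMNS_V2.
mysql -u root -proot -e "USE hivedb; SHOW TABLES;"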
Start the Hive client:
hive --service cli has the same effect as plain hive.
[root@localhost ~]# hive
which: no hbase in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin:/usr/local/apache-hive-2.3.4-bin/bin:/usr/local/apache-hive-2.3.4-bin/bin:/usr/local/apache-hive-2.3.4-bin/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Logging initialized using configuration in jar:file:/usr/local/apache-hive-2.3.4-bin/lib/hive-common-2.3.4.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>
Basic usage:
Prepare a sample data file, /home/hadoop/student.txt, with the following comma-separated records:
95002,刘晨,女,19,IS
95017,王风娟,女,18,IS
95018,王一,女,19,IS
95013,冯伟,男,21,CS
95014,王小丽,女,19,CS
95019,邢小丽,女,19,IS
95020,赵钱,男,21,IS
95003,王敏,女,22,MA
95004,张立,男,19,IS
95012,孙花,女,20,CS
95010,孔小涛,男,19,CS
95005,刘刚,男,18,MA
95006,孙庆,男,23,CS
95007,易思玲,女,19,MA
95008,李娜,女,18,CS
95021,周二,男,17,MA
95022,郑明,男,20,MA
95001,李勇,男,20,CS
95011,包小柏,男,18,MA
95009,梦圆圆,女,18,MA
95015,王君,男,18,MA
hive> create database myhive;   # create a database
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
OK
Time taken: 5.433 seconds
hive> show databases;   # list the databases
OK
default
myhive
Time taken: 0.182 seconds, Fetched: 2 row(s)
hive> use myhive;   # switch to the database
OK
Time taken: 0.082 seconds
hive> select current_database();
OK
myhive
Time taken: 0.163 seconds, Fetched: 1 row(s)
hive> create table student(id int, name string, sex string, age int, department string) row format delimited fields terminated by ",";
hive> load data local inpath "/home/hadoop/student.txt" into table student;
hive> select * from student;
OK
95002   刘晨    女      19      IS
95017   王风娟  女      18      IS
......
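As a quick check that MapReduce jobs actually run (the problem the settings at the top of this note address), an aggregation over the loaded table will launch one under Hive-on-MR; a sketch, using only the table created above. With the sample data shown earlier, the result should be CS 7, IS 6, MA 8.

hive> select department, count(*) from student group by department;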
Hive usage notes
desc student;                    # describe the table
desc extended student;           # show detailed table information (table type: managed or external, whether compressed)
desc formatted student;          # show table information in a formatted layout
show create table student;       # show the CREATE TABLE statement
show functions;                  # list the functions available in Hive
desc function upper;             # describe a function
desc function extended upper;    # describe how to use the function in detail
show tables;                     # list all tables
show databases;                  # list all databases
set hive.cli.print.header=true;  # set a parameter; takes effect only for the current session
# "minimal": whether a simple query runs MapReduce is configurable (the hive.fetch.task.conversion setting, which accepts minimal)