hive


    Fixing Hive MapReduce jobs that get stuck

    The fix turned up by a Google search is to add the following parameter settings before the Hive statement being run:

    set mapreduce.job.reduces=512;
    set hive.groupby.skewindata=true;
    set hive.optimize.skewjoin=true;
    set hive.skewjoin.key=5000;
    set hive.groupby.mapaggr.checkinterval=5000;
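    Rather than retyping these each session, the settings can be bundled with the query in a script file; a minimal sketch (the file name skew_query.hql and the GROUP BY query are placeholders, here reusing the student table built later in this post):

```shell
# Bundle the skew-mitigation settings with the query so they apply to this
# job only (file name and query are placeholders for your own job).
cat > skew_query.hql <<'EOF'
set mapreduce.job.reduces=512;
set hive.groupby.skewindata=true;
set hive.optimize.skewjoin=true;
set hive.skewjoin.key=5000;
set hive.groupby.mapaggr.checkinterval=5000;
select department, count(*) from student group by department;
EOF
# Run it non-interactively:
# hive -f skew_query.hql
```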

    Link: https://pan.baidu.com/s/10h4wyq5aKbnPgXaS0KhBoA
    Extraction code: gxcw

    Step 1: install Hadoop.

    Step 2: install the MySQL JDBC driver (JDBC Driver for MySQL: https://www.mysql.com/products/connector/).

    Hive download address: http://mirrors.hust.edu.cn/apache/

    Pick a suitable Hive version to download; in the stable-2 folder the stable 2.x release is 2.3.4. Download apache-hive-2.3.4-bin.tar.gz and unpack it (this walkthrough unpacks it under /usr/local).

    cd apache-hive-2.3.4-bin/conf/
    touch hive-site.xml
    vi hive-site.xml
    

      

    <configuration>
            <property>
                    <name>javax.jdo.option.ConnectionURL</name>
                    <value>jdbc:mysql://hadoop1:3306/hivedb?createDatabaseIfNotExist=true</value>
                    <description>JDBC connect string for a JDBC metastore</description>
                    <!-- If MySQL and Hive run on the same server node, change hadoop1 to localhost -->
            </property>
            <property>
                    <name>javax.jdo.option.ConnectionDriverName</name>
                    <value>com.mysql.jdbc.Driver</value>
                    <description>Driver class name for a JDBC metastore</description>
            </property>
            <property>
                    <name>javax.jdo.option.ConnectionUserName</name>
                    <value>root</value>
                    <description>username to use against metastore database</description>
            </property>
            <property>
                    <name>javax.jdo.option.ConnectionPassword</name>
                    <value>root</value>
            <description>password to use against metastore database</description>
            </property>
    </configuration>
    

    The following configuration is optional; it sets the HDFS directory where the Hive warehouse data is stored. If you set it, you may need to create the directory on HDFS first and make it writable (e.g. hdfs dfs -mkdir -p /hive/warehouse).

            <property>
                    <name>hive.metastore.warehouse.dir</name>
                    <value>/hive/warehouse</value>
                    <description>hive default warehouse; change it if necessary</description>
            </property> 
    

      

    Download mysql-connector-java-8.0.16-1.el7.noarch.rpm and install it:

    yum -y install mysql-connector-java-8.0.16-1.el7.noarch.rpm

    [root@localhost ~]# rpm -ql mysql-connector-java-8.0.16-1.el7.noarch
    /usr/share/doc/mysql-connector-java-8.0.16
    /usr/share/doc/mysql-connector-java-8.0.16/CHANGES
    /usr/share/doc/mysql-connector-java-8.0.16/INFO_BIN
    /usr/share/doc/mysql-connector-java-8.0.16/INFO_SRC
    /usr/share/doc/mysql-connector-java-8.0.16/LICENSE
    /usr/share/doc/mysql-connector-java-8.0.16/README
    /usr/share/java/mysql-connector-java.jar

    Copy that jar into the lib directory under the Hive root:

    cp /usr/share/java/mysql-connector-java.jar /usr/local/apache-hive-2.3.4-bin/lib/

    vim ~/.bashrc

    export HIVE_HOME=/usr/local/apache-hive-2.3.4-bin
    export HADOOP_HOME=/usr/local/hadoop-3.1.2
    export PATH=$PATH:$HIVE_HOME/bin
    export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre

    source ~/.bashrc
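    After sourcing ~/.bashrc, a quick sanity check that the variables took effect (the paths are the ones used in this walkthrough; adjust to your own install locations):

```shell
# Re-export the walkthrough's paths and confirm Hive's bin directory is on PATH.
export HIVE_HOME=/usr/local/apache-hive-2.3.4-bin
export HADOOP_HOME=/usr/local/hadoop-3.1.2
export PATH="$PATH:$HIVE_HOME/bin"
echo "$PATH" | grep -o "$HIVE_HOME/bin" | head -n 1
```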

    Verify the Hive installation:

    [root@localhost ~]# hive --help
    Usage ./hive <parameters> --service serviceName <service parameters>
    Service List: beeline cleardanglingscratchdir cli hbaseimport hbaseschematool help hiveburninclient hiveserver2 hplsql jar lineage llapdump llap llapstatus metastore metatool orcfiledump rcfilecat schemaTool version 
    Parameters parsed:
      --auxpath : Auxiliary jars 
      --config : Hive configuration directory
      --service : Starts specific service/component. cli is default
    Parameters used:
      HADOOP_HOME or HADOOP_PREFIX : Hadoop install directory
      HIVE_OPT : Hive options
    For help on a particular service:
      ./hive --service serviceName --help
    Debug help:  ./hive --debug --help
    

      Initialize the metastore database:

    [root@localhost ~]# schematool -dbType mysql -initSchema
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/usr/local/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
    Metastore connection URL:	 jdbc:mysql://localhost:3306/hivedb?createDatabaseIfNotExist=true
    Metastore Connection Driver :	 com.mysql.jdbc.Driver
    Metastore connection User:	 root
    Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
    Starting metastore schema initialization to 2.3.0
    Initialization script hive-schema-2.3.0.mysql.sql
    Initialization script completed
    schemaTool completed
    

      

    Start the Hive client:

      hive --service cli has the same effect as plain hive

    [root@localhost ~]# hive
    which: no hbase in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin:/usr/local/apache-hive-2.3.4-bin/bin:/usr/local/apache-hive-2.3.4-bin/bin:/usr/local/apache-hive-2.3.4-bin/bin)
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/usr/local/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
    
    Logging initialized using configuration in jar:file:/usr/local/apache-hive-2.3.4-bin/lib/hive-common-2.3.4.jar!/hive-log4j2.properties Async: true
    Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
    hive>
    

      

     Basic usage. The sample data file student.txt:

    95002,刘晨,女,19,IS
    95017,王风娟,女,18,IS
    95018,王一,女,19,IS
    95013,冯伟,男,21,CS
    95014,王小丽,女,19,CS
    95019,邢小丽,女,19,IS
    95020,赵钱,男,21,IS
    95003,王敏,女,22,MA
    95004,张立,男,19,IS
    95012,孙花,女,20,CS
    95010,孔小涛,男,19,CS
    95005,刘刚,男,18,MA
    95006,孙庆,男,23,CS
    95007,易思玲,女,19,MA
    95008,李娜,女,18,CS
    95021,周二,男,17,MA
    95022,郑明,男,20,MA
    95001,李勇,男,20,CS
    95011,包小柏,男,18,MA
    95009,梦圆圆,女,18,MA
    95015,王君,男,18,MA
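    The file is plain comma-delimited text, one row per student, with five fields matching the (id, name, sex, age, department) columns used below. A quick local check of the field count (a sketch; only a few sample rows are recreated inline):

```shell
# Recreate a few rows of student.txt locally and verify each line splits
# into exactly five comma-separated fields.
cat > student.txt <<'EOF'
95002,刘晨,女,19,IS
95001,李勇,男,20,CS
95011,包小柏,男,18,MA
EOF
awk -F',' 'NF != 5 { bad++ } END { print (bad ? "malformed rows: " bad : "all rows have 5 fields") }' student.txt
# → all rows have 5 fields
```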
    

      

    hive> create database myhive;   # create a database
    Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
    OK
    Time taken: 5.433 seconds
    
    hive> show databases;   # list the databases
    OK
    default
    myhive
    Time taken: 0.182 seconds, Fetched: 2 row(s)
    hive> use myhive;   # switch to the database
    OK
    Time taken: 0.082 seconds
    hive> select current_database();
    OK
    myhive
    Time taken: 0.163 seconds, Fetched: 1 row(s)
    hive> create table student(id int, name string, sex string, age int, department string) row format delimited fields terminated by ",";
    hive> load data local inpath "/home/hadoop/student.txt" into table student;
    hive> select * from student;
    OK
    95002    刘晨    女    19    IS
    95017    王风娟    女    18    IS
    ......
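    The same session can also be run non-interactively; a sketch that collects the statements above into one script for hive -f (the script name demo.hql is a placeholder):

```shell
# Collect the interactive statements into a single script.
cat > demo.hql <<'EOF'
create database if not exists myhive;
use myhive;
create table if not exists student(
  id int, name string, sex string, age int, department string)
  row format delimited fields terminated by ',';
load data local inpath '/home/hadoop/student.txt' into table student;
select * from student limit 5;
EOF
# hive -f demo.hql
```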

      

    
    

    Hive usage notes

    desc student;                 # describe the table
    desc extended student;        # detailed table info (managed vs. external table, compression, etc.)
    desc formatted student;       # table info in a formatted layout
    show create table student;    # show the CREATE TABLE statement
    show functions;               # list Hive's built-in functions
    desc function upper;          # describe a function
    desc function extended upper; # describe a function, with usage examples
    show tables;                  # list all tables
    show databases;               # list all databases
    set hive.cli.print.header=true;   # set a parameter; takes effect for the current session only
    set hive.fetch.task.conversion=minimal;   # whether a simple query runs MapReduce is configurable; with "minimal", plain SELECT *, partition-column filters, and LIMIT skip MapReduce
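    Several of these inspection commands can likewise be batched into a script and run with hive -f; a sketch (inspect.hql is a placeholder name):

```shell
# Batch a few inspection commands into one script.
cat > inspect.hql <<'EOF'
set hive.cli.print.header=true;
use myhive;
desc formatted student;
show create table student;
desc function extended upper;
EOF
# hive -f inspect.hql
```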
    

      

     
  • Original post: https://www.cnblogs.com/linuxws/p/10780966.html