• Procedure for Operating Hive from Spark


    1. Install MySQL on Ubuntu

    2. Log in to MySQL:

    3. mysql> create database hive;  (this database will store the metadata, i.e. the definitions of the databases and tables you later create in Hive; MySQL holds only this metadata, not the actual table data)

    4. mysql> grant all on *.* to hive@localhost identified by 'hive';  # grant all privileges on all tables of all databases to the hive user; the quoted 'hive' is the connection password configured in hive-site.xml

    5. mysql> flush privileges;  # reload MySQL's privilege tables
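
    (Optional) To sanity-check that the hive user can actually reach the new metastore database, a small JDBC probe can be run. This is only a sketch: it assumes the Connector/J jar shipped under /usr/local2/hive/lib is on the classpath, and MetastoreCheck is a hypothetical name, not part of the setup.

    import java.sql.DriverManager

    // Connects to the metastore database created in step 3 using the
    // credentials granted in step 4 (user "hive", password "hive").
    object MetastoreCheck extends App {
      val conn = DriverManager.getConnection(
        "jdbc:mysql://localhost:3306/hive", "hive", "hive")
      try {
        val rs = conn.createStatement().executeQuery("SELECT VERSION()")
        while (rs.next()) println("Connected to MySQL " + rs.getString(1))
      } finally {
        conn.close()
      }
    }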

    Before starting Hive, start Hadoop: Hive is a data warehouse built on top of Hadoop, and queries written in HiveQL are automatically compiled by Hive into MapReduce jobs that Hadoop then executes. So bring up Hadoop first, then Hive:

    start-dfs.sh  (starts Hadoop's HDFS daemons)

    hive  (with hive already on your PATH via ~/.bashrc, you can invoke it directly from the shell)

    6. If all of that succeeded, create a database inside Hive: create database if not exists hive;

    use hive;

    7. Create a table in Hive:

    hive> create table if not exists student(

    > id int,

    > name string,

    > gender string,

    > age int);                         

    8. Insert a row: insert into student values(1,'xiaodou','B',28);

    9. select * from student;

    10. Connect Spark to Hive to read and write data:

    11. cd /usr/local2/spark/conf

    vim spark-env.sh:

    export SPARK_DIST_CLASSPATH=$(/usr/local2/hadoop/bin/hadoop classpath)
    export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
    export CLASSPATH=$CLASSPATH:/usr/local2/hive/lib
    export SCALA_HOME=/usr/local/scala
    export HADOOP_CONF_DIR=/usr/local2/hadoop/etc/hadoop
    export HIVE_CONF_DIR=/usr/local2/hive/conf
    export SPARK_CLASSPATH=$SPARK_CLASSPATH:/usr/local2/hive/lib/mysql-connector-java-5.1.40-bin.jar  # this line had no effect; most likely because SPARK_CLASSPATH is deprecated since Spark 1.0, and the jar is instead passed via --driver-class-path in step 13

    12. To let Spark access Hive, copy Hive's configuration file hive-site.xml into Spark's conf directory (e.g. cp /usr/local2/hive/conf/hive-site.xml /usr/local2/spark/conf/).
    hive-site.xml: (this is a file you create yourself under Hive's conf directory; the distribution does not ship it. Remember also to rename hive-default.xml.template to hive-default.xml)
    
    
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
      <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
        <description>JDBC connect string for a JDBC metastore</description>
      </property>
      <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
        <description>Driver class name for a JDBC metastore</description>
      </property>
      <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>hive</value>
        <description>username to use against metastore database</description>
      </property>
      <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>hive</value>
        <description>password to use against metastore database</description>
      </property>
    </configuration>
    13. With that in place, you can work with Hive from spark-shell:
     ./spark-shell --driver-class-path /usr/local2/hive/lib/mysql-connector-java-5.1.44-bin.jar
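
    Once the shell is up, reading and writing the student table from step 7 is plain Spark SQL. A minimal sketch, assuming Spark 2.x built with Hive support (on Spark 1.x you would go through a HiveContext instead); the inserted row is just made-up sample data:

    // Entered at the scala> prompt: `spark` is the SparkSession that
    // spark-shell pre-creates; with hive-site.xml on Spark's conf path it
    // is backed by the MySQL metastore configured above.
    spark.sql("use hive")                                // the database from step 6
    spark.sql("select * from student").show()            // read the Hive table
    spark.sql("insert into student values (2, 'xiaomei', 'G', 27)")  // sample row
    spark.sql("select count(*) from student").show()     // confirms the write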