1. Unpack the tarball
[root@cluster3 hadoop]# tar -zxvf apache-hive-0.13.1-bin.tar.gz
2. Set the environment variables
export HIVE_HOME=/usr/local/hadoop/apache-hive-0.13.1-bin
export PATH=$HIVE_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

Then create working copies of the configuration templates:
cd /usr/local/hadoop/apache-hive-0.13.1-bin/conf
cp hive-default.xml.template hive-default.xml
cp hive-default.xml.template hive-site.xml
cp hive-env.sh.template hive-env.sh
cp hive-log4j.properties.template hive-log4j.properties
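To make the variables survive a new login shell, they are normally appended to /etc/profile (or a file under /etc/profile.d/) and then sourced. A minimal sketch, using /tmp as a stand-in for /etc/profile.d and showing only the Hive entry (add the $HADOOP_HOME entries the same way):

```shell
# Sketch: persist the Hive environment variables in a profile snippet.
# /tmp stands in for /etc/profile.d here; on a real node write to
# /etc/profile.d/hive.sh instead.
cat > /tmp/hive-profile.sh <<'EOF'
export HIVE_HOME=/usr/local/hadoop/apache-hive-0.13.1-bin
export PATH=$HIVE_HOME/bin:$PATH
EOF

# Source it so the current shell picks the variables up immediately.
. /tmp/hive-profile.sh
echo "$HIVE_HOME"
```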
3. Install MySQL and start its service
yum install mysql
yum install mysql-server
yum install mysql-devel
service mysqld start
4. Create a MySQL account for Hive and grant it sufficient privileges
1. Open the MySQL command line as root: mysql -uroot -p
2. Create the hive database:
mysql> create database hive;
Query OK, 1 row affected (0.00 sec)
3. Create a user named hive that may connect only from cluster3 and has full access to the hive database:
mysql> grant all on hive.* to hive@cluster3 identified by '123456';
Query OK, 0 rows affected (0.00 sec)
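The same bootstrap SQL can be kept in a small script so the metastore database can be recreated on a rebuilt node. A sketch, using the host (cluster3) and password ('123456') from this guide; `flush privileges` is added so the grant takes effect immediately:

```shell
# Sketch: capture the metastore bootstrap SQL in a replayable file.
cat > /tmp/hive-metastore-setup.sql <<'EOF'
create database if not exists hive;
grant all on hive.* to hive@cluster3 identified by '123456';
flush privileges;
EOF

# On the MySQL host, feed it to the root account:
#   mysql -uroot -p < /tmp/hive-metastore-setup.sql
```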
5. Configure hive-env.sh, hive-log4j.properties, and hive-default.xml
Add the Hadoop installation directory and related settings to hive-env.sh:

HADOOP_HOME=/usr/local/hadoop/hadoop-2.5.1
export HIVE_CONF_DIR=/usr/local/hadoop/apache-hive-0.13.1-bin/conf
# Folder containing extra libraries required for hive compilation/execution can be controlled by:
export HIVE_AUX_JARS_PATH=/usr/local/hadoop/apache-hive-0.13.1-bin/lib

In hive-log4j.properties, change the value of log4j.appender.EventCounter to org.apache.hadoop.log.metrics.EventCounter. This silences the warning "WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files."

[root@cluster3 conf]# vi hive-default.xml
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/usr/local/hadoop/apache-hive-0.13.1-bin/warehouse</value>
  <description>location of default database for the warehouse</description>
</property>
<property>
  <name>hive.exec.scratchdir</name>
  <value>/usr/local/hadoop/apache-hive-0.13.1-bin/tmp</value>
  <description>Scratch space for Hive jobs</description>
</property>
<property>
  <name>hive.querylog.location</name>
  <value>/usr/local/hadoop/apache-hive-0.13.1-bin/log</value>
  <description>Location of Hive run time structured log file</description>
</property>
6. Configure hive-site.xml
<property>
<name>hive.metastore.warehouse.dir</name>
<value>hdfs://cluster3:9000/hadoop/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
This corresponds to the HDFS directory created with hadoop fs -mkdir -p in step 7 below.
<property>
<name>hive.exec.scratchdir</name>
<value>hdfs://cluster3:9000/hadoop/hive/scratchdir</value>
<description>Scratch space for Hive jobs</description>
</property>
<property>
<name>hive.querylog.location</name>
<value>/usr/local/hadoop/apache-hive-0.13.1-bin/log</value>
<description>
Location of Hive run time structured log file
</description>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://192.168.300.3:3306/hive?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>username to use against metastore database</description>
</property>
javax.jdo.option.ConnectionUserName sets the user name for the database (MySQL here) in which Hive stores its metadata.
The value 'hive' can be changed to whatever account name you created in step 4.
--------------------------------------
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
<description>password to use against metastore database</description>
</property>
javax.jdo.option.ConnectionPassword sets the password that user supplies when logging in to the metastore database.
The value '123456' should likewise match the password chosen in step 4.
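One step that is easy to miss: Hive does not bundle the MySQL JDBC driver named in javax.jdo.option.ConnectionDriverName, so the connector jar must be placed in Hive's lib directory before the metastore can connect. A guarded sketch; the connector path and version below are assumptions, use whatever you downloaded:

```shell
# Hypothetical location/version of the downloaded MySQL connector jar;
# adjust both to match your environment.
MYSQL_JDBC_JAR=/usr/local/src/mysql-connector-java-5.1.34-bin.jar
HIVE_LIB=/usr/local/hadoop/apache-hive-0.13.1-bin/lib

# Guarded so the snippet is a no-op on machines where the jar is not present.
if [ -f "$MYSQL_JDBC_JAR" ]; then
  cp "$MYSQL_JDBC_JAR" "$HIVE_LIB/"
fi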
<property>
<name>hive.aux.jars.path</name>
<value>file:///usr/local/hadoop/apache-hive-0.13.1-bin/lib/hive-hbase-handler-0.13.1.jar,file:///usr/local/hadoop/apache-hive-0.13.1-bin/lib/protobuf-java-2.5.0.jar,file:///usr/local/hadoop/apache-hive-0.13.1-bin/lib/hbase-client-0.98.7-hadoop2.jar,file:///usr/local/hadoop/apache-hive-0.13.1-bin/lib/hbase-common-0.98.7-hadoop2.jar,file:///usr/local/hadoop/apache-hive-0.13.1-bin/lib/zookeeper-3.4.5.jar,file:///usr/local/hadoop/apache-hive-0.13.1-bin/lib/guava-12.0.1.jar</value>
</property>
Four jars need to be copied from HBase's lib directory into hive/lib:
[root@cluster3 lib]# ll guava-12.0.1.jar hbase-client-0.98.7-hadoop2.jar hbase-common-0.98.7-hadoop2.jar protobuf-java-2.5.0.jar
-rw-r--r--. 1 root root 1795932 Oct  2 02:39 guava-12.0.1.jar
-rw-r--r--. 1 root root  940735 Oct  9 06:58 hbase-client-0.98.7-hadoop2.jar
-rw-r--r--. 1 root root  443500 Oct  9 06:58 hbase-common-0.98.7-hadoop2.jar
-rw-r--r--. 1 root root  533455 Oct  2 02:18 protobuf-java-2.5.0.jar
cp /usr/local/hadoop/hbase-0.98.7-hadoop2/lib/guava-12.0.1.jar /usr/local/hadoop/apache-hive-0.13.1-bin/lib/
cp /usr/local/hadoop/hbase-0.98.7-hadoop2/lib/hbase-client-0.98.7-hadoop2.jar /usr/local/hadoop/apache-hive-0.13.1-bin/lib/
cp /usr/local/hadoop/hbase-0.98.7-hadoop2/lib/hbase-common-0.98.7-hadoop2.jar /usr/local/hadoop/apache-hive-0.13.1-bin/lib/
cp /usr/local/hadoop/hbase-0.98.7-hadoop2/lib/protobuf-java-2.5.0.jar /usr/local/hadoop/apache-hive-0.13.1-bin/lib/
The following property is optional and is left commented out here; it is only needed when Hive clients connect to a standalone remote metastore service:
/*********************************
<property>
<name>hive.metastore.uris</name>
<value>thrift://192.168.300.3:9083</value>
<description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
******************************************/
7. Start Hive
Before starting, create the HDFS directory where Hive stores its data:
[root@cluster3 conf]# hadoop fs -mkdir -p /hadoop/hive/scratchdir
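The hive-site.xml above also points the warehouse at hdfs://cluster3:9000/hadoop/hive/warehouse, so that directory should exist as well. A sketch that creates both paths, guarded with `command -v` so it only runs where the hadoop CLI is installed:

```shell
# Both HDFS directories referenced by hive-site.xml.
HIVE_HDFS_DIRS="/hadoop/hive/scratchdir /hadoop/hive/warehouse"

# No-op on machines without the hadoop CLI.
if command -v hadoop >/dev/null 2>&1; then
  for d in $HIVE_HDFS_DIRS; do
    hadoop fs -mkdir -p "$d"
  done
fi
```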
8. Debugging
hive -hiveconf hive.root.logger=DEBUG,console
9. Test
hive> show tables;
hive> CREATE TABLE my(id INT, name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ' ';
hive> show tables;
OK
my
Time taken: 0.653 seconds, Fetched: 1 row(s)
hive> select name from my;
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1416022192895_0001, Tracking URL = http://cluster3:8088/proxy/application_1416022192895_0001/
Kill Command = /usr/local/hadoop/hadoop-2.5.1/bin/hadoop job -kill job_1416022192895_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2014-11-17 15:52:02,989 Stage-1 map = 0%, reduce = 0%
2014-11-17 15:52:41,663 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.08 sec
MapReduce Total cumulative CPU time: 1 seconds 80 msec
Ended Job = job_1416022192895_0001
MapReduce Jobs Launched:
Job 0: Map: 1 Cumulative CPU: 1.08 sec HDFS Read: 270 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 80 msec
OK
Time taken: 102.677 seconds

The new table is also visible in the MySQL metastore:

mysql> select * from TBLS;
+--------+-------------+-------+------------------+-------+-----------+-------+----------+---------------+--------------------+--------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER | RETENTION | SD_ID | TBL_NAME | TBL_TYPE      | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT |
+--------+-------------+-------+------------------+-------+-----------+-------+----------+---------------+--------------------+--------------------+
|      1 |  1416210559 |     1 |                0 | root  |         0 |     1 | my       | MANAGED_TABLE | NULL               | NULL               |
+--------+-------------+-------+------------------+-------+-----------+-------+----------+---------------+--------------------+--------------------+
1 row in set (0.02 sec)