1 System environment
Ubuntu 14.10
JDK-7
Hadoop 2.6.0
2 Installation steps
2.1 Download Hive
The first time around I downloaded Hive 1.2.1, and after finishing the configuration it kept failing with the following error:
[ERROR] Terminal initialization failed; falling back to unsupported
java.lang.IncompatibleClassChangeError: Found class jline.Terminal, but interface was expected
        at jline.TerminalFactory.create(TerminalFactory.java:101)
        at jline.TerminalFactory.get(TerminalFactory.java:158)
        at jline.console.ConsoleReader.<init>(ConsoleReader.java:229)
        at jline.console.ConsoleReader.<init>(ConsoleReader.java:221)
        at jline.console.ConsoleReader.<init>(ConsoleReader.java:209)
        at org.apache.hadoop.hive.cli.CliDriver.getConsoleReader(CliDriver.java:773)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:715)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Exception in thread "main" java.lang.IncompatibleClassChangeError: Found class jline.Terminal, but interface was expected
        at jline.console.ConsoleReader.<init>(ConsoleReader.java:230)
        at jline.console.ConsoleReader.<init>(ConsoleReader.java:221)
        at jline.console.ConsoleReader.<init>(ConsoleReader.java:209)
        at org.apache.hadoop.hive.cli.CliDriver.getConsoleReader(CliDriver.java:773)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:715)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
The usual fix for this error is to replace the jline jar under hadoop/share/hadoop/yarn/lib with a newer version, e.g. the 2.x jar shipped with Hive (a command sketch follows the list below):
1. Delete jline from the Hadoop lib directory (it's only pulled in transitively from ZooKeeper).
2. export HADOOP_USER_CLASSPATH_FIRST=true
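A minimal sketch of that commonly suggested fix; the exact jar versions below are assumptions (Hadoop 2.6.0 pulls in an old jline via ZooKeeper, and Hive 1.2.x ships jline 2.12), so check the actual file names in your directories:
# remove the old jline that ZooKeeper drags into YARN's lib directory
rm $HADOOP_HOME/share/hadoop/yarn/lib/jline-0.9.94.jar
# copy in the newer jline shipped with Hive
cp $HIVE_HOME/lib/jline-2.12.jar $HADOOP_HOME/share/hadoop/yarn/lib/
# and/or let Hive's own classpath take precedence
export HADOOP_USER_CLASSPATH_FIRST=true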
However, even after replacing the jar I still got the error. Switching to the older Hive 1.0.1 made it go away, so if you hit the error above it is not necessarily a jar conflict; it may simply be a version mismatch between Hive and Hadoop.
Download link: http://mirrors.hust.edu.cn/apache/hive/hive-1.0.1/apache-hive-1.0.1-bin.tar.gz
2.2 Configure Hive
2.2.1 Make a copy of hive-env.sh and set HADOOP_HOME
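A minimal sketch of this step; the installation paths are assumptions, adjust them to your own layout:
cd $HIVE_HOME/conf
cp hive-env.sh.template hive-env.sh
# then point hive-env.sh at your Hadoop installation, e.g.:
# HADOOP_HOME=/home/hadoop/software/cloud/hadoop-2.6.0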
2.2.2 Make a copy of hive-site.xml
The main parameters to change are the following:
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hive</value>
</property>
<property>
  <name>hive.hwi.listen.port</name>
  <value>10000</value>
  <description>This is the port the Hive Web Interface will listen on</description>
</property>
<property>
  <name>datanucleus.autoCreateSchema</name>
  <value>true</value>
</property>
<property>
  <name>datanucleus.fixedDatastore</name>
  <value>false</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
  <description>Username to use against metastore database</description>
</property>
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/home/hdpsrc/hive/iotmp</value>
  <description>Local scratch space for Hive jobs</description>
</property>
<property>
  <name>hive.downloaded.resources.dir</name>
  <value>/home/hdpsrc/hive/iotmp</value>
  <description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
  <name>hive.querylog.location</name>
  <value>/home/hdpsrc/hive/iotmp</value>
  <description>Location of Hive run time structured log file</description>
</property>
2.2.3 Make a copy of hive-log4j.properties
Here I only changed the path of the log file.
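A sketch of what this looks like, assuming you keep the template's property names and only repoint the directory (the directory itself is an assumption):
cd $HIVE_HOME/conf
cp hive-log4j.properties.template hive-log4j.properties
# in hive-log4j.properties, point the logs at a writable directory, e.g.:
# hive.log.dir=/home/hdpsrc/hive/log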
2.2.4 Create the local directories configured in 2.2.2 above
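The directories have to exist locally; with the values from the hive-site.xml above (plus the assumed log directory from the previous sketch) that is simply:
mkdir -p /home/hdpsrc/hive/iotmp
mkdir -p /home/hdpsrc/hive/log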
2.2.5 Create the corresponding directories in HDFS
$ $HADOOP_HOME/bin/hadoop fs -mkdir /tmp
$ $HADOOP_HOME/bin/hadoop fs -mkdir /user/hive/warehouse
$ $HADOOP_HOME/bin/hadoop fs -chmod g+w /tmp
$ $HADOOP_HOME/bin/hadoop fs -chmod g+w /user/hive/warehouse
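Note that on Hadoop 2.x, hadoop fs -mkdir does not create missing parent directories, so if /user/hive does not exist yet you may need the -p flag:
$ $HADOOP_HOME/bin/hadoop fs -mkdir -p /user/hive/warehouse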
2.3 Install MySQL
Since MySQL serves as the metastore database here, it has to be installed, and the hive user and hive database referenced in the configuration above have to be created.
MySQL installation reference: http://www.cnblogs.com/liuchangchun/p/4099003.html
Creating a new user, reference: http://www.cnblogs.com/liuchangchun/p/4431426.html
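A minimal sketch of the MySQL side, assuming the hive/hive credentials and the localhost connection URL used in hive-site.xml above:
mysql -u root -p <<'SQL'
CREATE DATABASE hive;
CREATE USER 'hive'@'localhost' IDENTIFIED BY 'hive';
GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'localhost';
FLUSH PRIVILEGES;
SQL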
Because I had earlier set MySQL's default character set to utf8, Hive reported the following error on startup:
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was too long; max key length is 767 bytes
To fix this error, change the character set of the hive database:
alter database hive character set latin1;
2.4 Copy the MySQL JDBC driver jar into hive/lib
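A sketch of the copy; the connector file name and version here are assumptions, use whichever mysql-connector-java jar you downloaded:
cp mysql-connector-java-5.1.32-bin.jar $HIVE_HOME/lib/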
If you skip this step, Hive fails because it cannot find the driver:
Caused by: org.datanucleus.store.rdbms.connectionpool.DatastoreDriverNotFoundException: The specified datastore driver ("com.mysql.jdbc.Driver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.
        at org.datanucleus.store.rdbms.connectionpool.AbstractConnectionPoolFactory.loadDriver(AbstractConnectionPoolFactory.java:58)
        at org.datanucleus.store.rdbms.connectionpool.BoneCPConnectionPoolFactory.createConnectionPool(BoneCPConnectionPoolFactory.java:54)
        at org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:238)
        ... 67 more
2.5 Start Hive
Before starting Hive, make sure HDFS is already up.
Run hive on the command line to enter the Hive CLI. If no errors appear, congratulations, you are done.
If HDFS is not running, Hive will most likely fail with connection errors.
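A quick sanity check before entering the CLI, assuming a standard Hadoop 2.6.0 layout:
# start HDFS if it is not already running
$HADOOP_HOME/sbin/start-dfs.sh
# NameNode and DataNode should show up in the process list
jps
# then enter the Hive CLI
hive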
2.6 Start the HWI web client
Hive offers several ways to interact with it: the command line and a web interface, among others. To start the web service, run the following command:
hive --service hwi
But before that, a few properties need to be set in hive-site.xml:
<property>
  <name>hive.hwi.listen.host</name>
  <value>192.168.1.102</value>
  <description>This is the host address the Hive Web Interface will listen on</description>
</property>
<property>
  <name>hive.hwi.listen.port</name>
  <value>10000</value>
  <description>This is the port the Hive Web Interface will listen on</description>
</property>
<property>
  <name>hive.hwi.war.file</name>
  <value>lib/hive-hwi-1.0.1.war</value>
  <description>This sets the path to the HWI war file, relative to ${HIVE_HOME}.</description>
</property>
Once it is up, open http://hostname:10000/hwi/. However, I ran into a few problems here.
a. The war file cannot be found
ls: cannot access /home/grid2/apache-hive-0.13.1-bin/lib/hive-hwi-*.war: No such file or directory
14/09/14 21:07:10 INFO hwi.HWIServer: HWI is starting up
14/09/14 21:07:11 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
14/09/14 21:07:11 WARN conf.HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect. Use hive.hmshandler.retry.* instead
14/09/14 21:07:11 FATAL hwi.HWIServer: HWI WAR file not found at /home/grid2/apache-hive-0.13.1-bin/lib/hive-hwi-@VERSION@.war
There is indeed no hive-hwi-*.war under hive/lib, so you can download the Hive source code, package the HWI web files into a war yourself, and place it in hive/lib (a packaging sketch follows the reference links below).
References:
http://blog.csdn.net/bluishglc/article/details/41652111
http://blog.csdn.net/wulantian/article/details/38271803
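A rough sketch of packaging the war yourself; the source tarball name and the hwi/web layout are assumptions based on the Hive 1.0.1 source tree, so adjust to what you actually download:
# unpack the Hive source (fetch apache-hive-1.0.1-src.tar.gz from a mirror first)
tar -xzf apache-hive-1.0.1-src.tar.gz
cd apache-hive-1.0.1-src/hwi/web
# package the web resources into a war whose name matches hive.hwi.war.file above
jar cvf hive-hwi-1.0.1.war ./*
cp hive-hwi-1.0.1.war $HIVE_HOME/lib/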
b. HWI starts without errors, but visiting http://hostname:10000/hwi/ throws an error
15/08/27 14:13:40 ERROR mortbay.log: /hwi/
java.lang.IllegalStateException: No Java compiler available
        at org.apache.jasper.JspCompilationContext.createCompiler(JspCompilationContext.java:225)
        at org.apache.jasper.JspCompilationContext.compile(JspCompilationContext.java:560)
        at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:299)
        at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:315)
        …………………………………………………………………………
I have not solved this problem yet o(╯□╰)o, so my workaround is to use HUE instead.
a. Configuring Hive in HUE
In hue.ini, find the beeswax section (note that it is not called hive) and set the following properties; the port number must match the one in hive-site.xml.
Configure the Thrift port in hive-site.xml:
<property>
  <name>hive.server2.thrift.port</name>
  <value>19999</value>
  <description>Port number of HiveServer2 Thrift interface when hive.server2.transport.mode is 'binary'.</description>
</property>
Then, in hue.ini under the [beeswax] section:
hive_server_host=192.168.1.102
# Port where HiveServer2 Thrift server runs on.
hive_server_port=19999
# Hive configuration directory, where hive-site.xml is located
hive_conf_dir=/home/hadoop/software/cloud/apache-hive-1.0.1-bin/conf
Before using HUE, start the Hive Thrift server (HiveServer2) first:
hive --service hiveserver2
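To check that HiveServer2 is actually listening on the configured port, you can try connecting with Beeline; the host and port here follow the example values above and are assumptions for your environment:
beeline -u jdbc:hive2://192.168.1.102:19999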
After getting all of this configured I hit one more error, and it took quite a while to track down the cause:
java.lang.RuntimeException: java.lang.IllegalAccessError: tried to access method com.google.common.base.Stopwatch.<init>()V from class org.apache.hadoop.mapred.FileInputFormat
        at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:84)
        at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:37)
        at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:64)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:536)
        at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:60)
        at com.sun.proxy.$Proxy21.fetchResults(Unknown Source)
        at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:450)
        at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:587)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1553)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1538)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalAccessError: tried to access method com.google.common.base.Stopwatch.<init>()V from class org.apache.hadoop.mapred.FileInputFormat
        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:312)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:442)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:588)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:561)
        at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
        at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1621)
        at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:337)
        at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:248)
        at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:654)
        at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:79)
        ... 19 more
Tracing the error above, I found that the class com.google.common.base.Stopwatch lives in guava.jar. Both Hadoop and Hive ship guava 11.0.2, but my JAVA_HOME also had a guava 18.0 jar sitting in jre/lib/ext (probably left there when I installed Ambari, which needed a newer guava to work around another issue). Jars in the JRE extension directory are loaded ahead of the application classpath, so the 18.0 copy shadowed 11.0.2 and produced the IllegalAccessError above. The fix is to delete that stray jar (or replace it with 11.0.2); class-loading order is worth keeping in mind here.
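A quick way to hunt for conflicting guava copies; the paths are assumptions, adjust them to your environment:
# the guava jars bundled with Hadoop and Hive
find $HADOOP_HOME $HIVE_HOME -name 'guava-*.jar'
# any guava jar in the JRE extension directory is loaded ahead of them
ls $JAVA_HOME/jre/lib/ext/ | grep -i guava
# remove the stray copy, or replace it with the 11.0.2 jar used by Hadoop/Hive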
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
References:
https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-InstallationandConfiguration
http://blog.csdn.net/an342647823/article/details/46048403
http://blog.csdn.net/keljony/article/details/43371995
http://sunjia-704471770-qq-com.iteye.com/blog/1631430
http://blog.csdn.net/jdplus/article/details/46493553