Beeline,它其实是HiveServer2的JDBC客户端,基于SQLLine命令行接口。Beeline Shell可以工作在嵌入式模式和远程模式,在嵌入式模式中,它运行一个嵌入式的Hive(类似于Hive CLI),在远程模式中,通过Thrift连接到一个单独的HiveServer2进程,从Hive 0.14开始,当Beeline和HiveServer2一起使用时,它会从HiveServer2打印执行查询的日志信息到STDERR。建议在生产环境使用远程HiveServer2模式,因为这样更安全,不需要为用户授予直接的HDFS/Metastore访问权限。
一 Hive环境
hive> select version();
OK
2.3.3 r8a511e3f79b43d4be41cd231cf5c99e43b248383
Time taken: 11.166 seconds, Fetched: 1 row(s)
二 运行Beeline
Hive的运行依赖于Hadoop,所以首先启动Hadoop,而Beeline的运行必须首先启动HiveServer2,下面将分别介绍:
1 启动HDFS
[hadoop@strong ~]$ start-all.sh
注:该脚本将会被弃用,建议使用start-dfs.sh and start-yarn.sh启动HADOOP。
2 启动HiveServer2
1)方法一
[hadoop@strong ~]$ hiveserver2
2)方法二
[hadoop@strong ~]$ hive --service hiveserver2
3 启动Beeline
1)方法一
[hadoop@strong ~]$ beeline
2)方法二
[hadoop@strong ~]$ hive --service beeline
beeline> !connect jdbc:hive2://localhost:10000/default
Connecting to jdbc:hive2://localhost:10000/default
Enter username for jdbc:hive2://localhost:10000/default: hadoop
Enter password for jdbc:hive2://localhost:10000/default: ******
Connected to: Apache Hive (version 2.3.3)
Driver: Hive JDBC (version 2.3.3)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://localhost:10000/default>
注:进行连接出现如下错误时,需要对/usr/local/hadoop/etc/hadoop/core-site.xml进行配置
apache.hadoop.security.authorize.AuthorizationException): User: hadoop is not allowed to impersonate root (state=08S01,code=0)
增加配置内容为:
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
</property>
三 Beeline介绍
1 Beeline示例
[hadoop@strong ~]$ beeline
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/apache-hive-2.3.3-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.7.6/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Beeline version 2.3.3 by Apache Hive
beeline> !connect jdbc:hive2://localhost:10000/hive
Connecting to jdbc:hive2://localhost:10000/hive
Enter username for jdbc:hive2://localhost:10000/hive: hadoop
Enter password for jdbc:hive2://localhost:10000/hive: ******
Connected to: Apache Hive (version 2.3.3)
Driver: Hive JDBC (version 2.3.3)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://localhost:10000/hive> show tables;
+-----------+
| tab_name |
+-----------+
| city |
| emp |
| t_emp |
| test |
+-----------+
4 rows selected (0.432 seconds)
0: jdbc:hive2://localhost:10000/hive>
注:也可以通过如下方式进行连接和访问。
[hadoop@strong ~]$ beeline -u jdbc:hive2://localhost:10000/hive -n hadoop -p hadoop
2 Beeline命令
通过!<SQLLine 命令>的方式执行,本篇将演示部分SQLLine命令。
1)查看帮助信息
0: jdbc:hive2://localhost:10000/hive> !help
!addlocaldriverjar Add driver jar file in the beeline client side.
!addlocaldrivername Add driver name that needs to be supported in the beeline
client side.
!all Execute the specified SQL against all the current connections
!autocommit Set autocommit mode on or off
!batch Start or execute a batch of statements
!brief Set verbose mode off
!call Execute a callable statement
!close Close the current connection to the database
!closeall Close all current open connections
!columns List all the columns for the specified table
!commit Commit the current transaction (if autocommit is off)
!connect Open a new connection to the database.
!dbinfo Give metadata information about the database
!describe Describe a table
!dropall Drop all tables in the current database
!exportedkeys List all the exported keys for the specified table
!go Select the current connection
!help Print a summary of command usage
!history Display the command history
!importedkeys List all the imported keys for the specified table
!indexes List all the indexes for the specified table
!isolation Set the transaction isolation for this connection
!list List the current connections
!manual Display the BeeLine manual
!metadata Obtain metadata information
!nativesql Show the native SQL for the specified statement
!nullemptystring Set to true to get historic behavior of printing null as
empty string. Default is false.
!outputformat Set the output format for displaying results
(table,vertical,csv2,dsv,tsv2,xmlattrs,xmlelements, and
deprecated formats(csv, tsv))
!primarykeys List all the primary keys for the specified table
!procedures List all the procedures
!properties Connect to the database specified in the properties file(s)
!quit Exits the program
!reconnect Reconnect to the database
!record Record all output to the specified file
!rehash Fetch table and column names for command completion
!rollback Roll back the current transaction (if autocommit is off)
!run Run a script from the specified file
!save Save the current variabes and aliases
!scan Scan for installed JDBC drivers
!script Start saving a script to a file
!set Set a beeline variable
!sh Execute a shell command
!sql Execute a SQL command
!tables List all the tables in the database
!typeinfo Display the type map for the current connection
!verbose Set verbose mode on
2)列出当前的连接信息
0: jdbc:hive2://localhost:10000/hive> !list
1 active connection:
#0 open jdbc:hive2://localhost:10000/hive
3)执行清屏命令
0: jdbc:hive2://localhost:10000/hive> !sh clear
0: jdbc:hive2://localhost:10000/hive>
4)格式化输出
0: jdbc:hive2://localhost:10000/hive> !outputformat vertical
0: jdbc:hive2://localhost:10000/hive> show tables;
tab_name city
tab_name emp
tab_name t_emp
tab_name test
4 rows selected (0.251 seconds)
0: jdbc:hive2://localhost:10000/hive> !outputformat table
0: jdbc:hive2://localhost:10000/hive> show tables;
+-----------+
| tab_name |
+-----------+
| city |
| emp |
| t_emp |
| test |
+-----------+
4 rows selected (0.205 seconds)
3 Beeline Hive命令
当使用Hive JDBC驱动时,可以在Beeline运行Hive特定的命令(和Hive CLI命令一样)。
具体参考:Hive CLI初探
4 Beeline命令选项
[hadoop@strong ~]$ beeline --help
Usage: java org.apache.hive.cli.beeline.BeeLine
-u <database url> the JDBC URL to connect to
-r reconnect to last saved connect url (in conjunction with !save)
-n <username> the username to connect as
-p <password> the password to connect as
-d <driver class> the driver class to use
-i <init file> script file for initialization
-e <query> query that should be executed
-f <exec file> script file that should be executed
-w (or) --password-file <password file> the password file to read password from
--hiveconf property=value Use value for given property
--hivevar name=value hive variable name and value
This is Hive specific settings in which variables
can be set at session level and referenced in Hive
commands or queries.
--property-file=<property-file> the file to read connection properties (url, driver, user, password) from
--color=[true/false] control whether color is used for display
--showHeader=[true/false] show column names in query results
--headerInterval=ROWS; the interval between which heades are displayed
--fastConnect=[true/false] skip building table/column list for tab-completion
--autoCommit=[true/false] enable/disable automatic transaction commit
--verbose=[true/false] show verbose error messages and debug info
--showWarnings=[true/false] display connection warnings
--showDbInPrompt=[true/false] display the current database name in the prompt
--showNestedErrs=[true/false] display nested errors
--numberFormat=[pattern] format numbers using DecimalFormat pattern
--force=[true/false] continue running script even after errors
--maxWidth=MAXWIDTH the maximum width of the terminal
--maxColumnWidth=MAXCOLWIDTH the maximum width to use when displaying columns
--silent=[true/false] be more silent
--autosave=[true/false] automatically save preferences
--outputformat=[table/vertical/csv2/tsv2/dsv/csv/tsv] format mode for result display
Note that csv, and tsv are deprecated - use csv2, tsv2 instead
--incremental=[true/false] Defaults to false. When set to false, the entire result set
is fetched and buffered before being displayed, yielding optimal
display column sizing. When set to true, result rows are displayed
immediately as they are fetched, yielding lower latency and
memory usage at the price of extra display column padding.
Setting --incremental=true is recommended if you encounter an OutOfMemory
on the client side (due to the fetched result set size being large).
Only applicable if --outputformat=table.
--incrementalBufferRows=NUMROWS the number of rows to buffer when printing rows on stdout,
defaults to 1000; only applicable if --incremental=true
and --outputformat=table
--truncateTable=[true/false] truncate table column when it exceeds length
--delimiterForDSV=DELIMITER specify the delimiter for delimiter-separated values output format (default: |)
--isolation=LEVEL set the transaction isolation level
--nullemptystring=[true/false] set to true to get historic behavior of printing null as empty string
--maxHistoryRows=MAXHISTORYROWS The maximum number of rows to store beeline history.
--help display this message
Example:
1. Connect using simple authentication to HiveServer2 on localhost:10000
$ beeline -u jdbc:hive2://localhost:10000 username password
2. Connect using simple authentication to HiveServer2 on hs.local:10000 using -n for username and -p for password
$ beeline -n username -p password -u jdbc:hive2://hs2.local:10012
3. Connect using Kerberos authentication with hive/localhost@mydomain.com as HiveServer2 principal
$ beeline -u "jdbc:hive2://hs2.local:10013/default;principal=hive/localhost@mydomain.com"
4. Connect using SSL connection to HiveServer2 on localhost at 10000
$ beeline "jdbc:hive2://localhost:10000/default;ssl=true;sslTrustStore=/usr/local/truststore;trustStorePassword=mytruststorepassword"
5. Connect using LDAP authentication
$ beeline -u jdbc:hive2://hs2.local:10013/default <ldap-username> <ldap-password>