These notes follow the articles below and record the problems I ran into along the way.
https://blog.csdn.net/qq_31922231/article/details/98056113
https://baijiahao.baidu.com/s?id=1619270849703818962&wfr=spider&for=pc
I. Environment
Component | Version |
---|---|
python | 2.7 |
cdh | 5.13 |
kerberos | true |
II. Install the packages and configure Kerberos
- Python dependencies
pip install krbcontext==0.9
pip install thrift==0.9.3
pip install thrift-sasl==0.2.1
pip install impyla==0.14.1
pip install impala==0.2.0
pip install pykerberos==1.2.1
- Configure krb5.conf
Once the Kerberos packages are installed, there is a krb5.conf file under /etc. Replace it with the krb5.conf from your own cluster.
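A quick way to confirm that the replaced file really is the cluster's is to read the default realm back out of it. The following is a minimal sketch, assuming the file sets `default_realm` under `[libdefaults]`; the helper name `read_default_realm` is made up for illustration.

```python
import re


def read_default_realm(path='/etc/krb5.conf'):
    """Return the default_realm value from a krb5.conf file, or None."""
    # Commented-out lines start with '#' or ';' and therefore do not match.
    pattern = re.compile(r'^\s*default_realm\s*=\s*(\S+)')
    with open(path) as conf:
        for line in conf:
            match = pattern.match(line)
            if match:
                return match.group(1)
    return None
```

Calling `read_default_realm()` on the server should return the realm of your own cluster; if it returns None or an unfamiliar realm, the file was probably not replaced.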
III. Sample code for connecting to Hive:
import os
from impala.dbapi import connect
from krbcontext import krbcontext

def explain_sql():
    # Connect to Impala
    try:
        krbcontext(using_keytab=True,
                   principal='hive/bi-hadoop02.xxxx.com',
                   keytab_file='/opt/hive.keytab')
        conn = connect(host='impala-jdbc.xxxx.com',
                       port=21050,
                       auth_mechanism='GSSAPI',
                       kerberos_service_name='hive')
        cur = conn.cursor()
        cur.execute('show tables')
        fetchData = cur.fetchall()
        # Close the connection
        cur.close()
        conn.close()
    except Exception as e:
        print(e)
IV. The error
I assumed that following the reference articles would be enough, but the script failed. Here is the error output:
Could not start SASL: Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Credentials cache file '/tmp/krb5cc_0' not found)
Traceback (most recent call last):
File "./impala_jdbc.py", line 72, in <module>
parse_database(fetchData)
File "./impala_jdbc.py", line 38, in parse_database
for data in fetchData:
TypeError: 'NoneType' object is not iterable
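The first message says the credentials cache file '/tmp/krb5cc_0' was not found, so the likely cause is that the server running the Python script had no Kerberos ticket, or its ticket had expired. The TypeError that follows is most likely a knock-on effect: the connection failed, so fetchData was still None when it was iterated.
Reference: https://baijiahao.baidu.com/s?id=1619270849703818962&wfr=spider&for=pc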
V. Solution
1. klist
Check whether the server holds a Kerberos ticket.
Sure enough, it did not.
2. Add code
import os
os.system("kinit -kt /opt/hive.keytab hive/admin")
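With the code above, the script authenticates every time before it connects. This guards against the ticket having expired or having been replaced by a ticket for a different principal, either of which would leave the script without permission to access Impala.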
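The two steps above can also be done from Python with exit-status checks, so that a failed kinit is not silently ignored the way it is with os.system. This is only a sketch: the function names `has_valid_ticket` and `kinit_with_keytab` and the injectable `run` parameter are my own additions for illustration, and it assumes the MIT Kerberos `klist` and `kinit` commands are on PATH.

```python
import subprocess


def has_valid_ticket(run=subprocess.call):
    """Return True if `klist -s` exits with status 0.

    `klist -s` prints nothing and reports through its exit status whether
    the credentials cache can be read and has not expired.
    """
    try:
        return run(['klist', '-s']) == 0
    except OSError:
        # klist is not installed or not on PATH.
        return False


def kinit_with_keytab(keytab, principal, run=subprocess.call):
    """Run `kinit -kt <keytab> <principal>` and raise if it fails."""
    status = run(['kinit', '-kt', keytab, principal])
    if status != 0:
        raise RuntimeError('kinit failed with exit status %d' % status)
```

For the setup in this post the call would be `kinit_with_keytab('/opt/hive.keytab', 'hive/admin')`, optionally followed by `has_valid_ticket()` as a sanity check. Like the os.system version, it runs kinit unconditionally, so a ticket left behind for a different principal is replaced as well.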
== Full code ==:
import os
from impala.dbapi import connect
from krbcontext import krbcontext

def explain_sql():
    # Obtain a Kerberos ticket from the keytab before connecting
    os.system("kinit -kt /opt/hive.keytab hive/admin")
    # Connect to Impala
    try:
        krbcontext(using_keytab=True,
                   principal='hive/bi-hadoop02.xxxx.com',
                   keytab_file='/opt/hive.keytab')
        conn = connect(host='impala-jdbc.xxxx.com',
                       port=21050,
                       auth_mechanism='GSSAPI',
                       kerberos_service_name='hive')
        cur = conn.cursor()
        cur.execute('show tables')
        fetchData = cur.fetchall()
        # Close the connection
        cur.close()
        conn.close()
    except Exception as e:
        print(e)
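One thing worth noting about the code above: krbcontext is a context manager, and calling `krbcontext(...)` without a `with` statement only creates the object without initializing any credentials. That may be why the original script found no credentials cache. Below is an untested sketch of the `with` form, using the same host, principal and keytab as above. The function name `show_tables_with_krbcontext` is hypothetical, and whether this removes the need for the kinit call on this particular cluster is an assumption that has not been verified.

```python
def show_tables_with_krbcontext():
    """Run `show tables` while the krbcontext credentials are active."""
    # Third-party imports live inside the function so that this module can
    # still be loaded where impyla and krbcontext are not installed.
    from impala.dbapi import connect
    from krbcontext import krbcontext

    # Credentials are initialized on entering the with block.
    with krbcontext(using_keytab=True,
                    principal='hive/bi-hadoop02.xxxx.com',
                    keytab_file='/opt/hive.keytab'):
        conn = connect(host='impala-jdbc.xxxx.com',
                       port=21050,
                       auth_mechanism='GSSAPI',
                       kerberos_service_name='hive')
        try:
            cur = conn.cursor()
            cur.execute('show tables')
            rows = cur.fetchall()
            cur.close()
            return rows
        finally:
            # Close the connection even if the query fails.
            conn.close()
```

Unlike the original, this version returns the rows and lets exceptions propagate, so a failed connection surfaces as the real error instead of a later TypeError on None.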
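VI. Summary
I had mostly written Java before. In Java, Kerberos authentication is usually done inside the application code, whereas this Python setup picks up the ticket from the server's credentials cache, so the ticket has to be obtained first (here with kinit). I still have a lot to learn about Python.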