• Uncle's Troubleshooting Notes (46): spark-sql fails after upgrading Spark from 2.4 to 3.1


    Background

    Spark was upgraded from 2.4 to 3.1 on two nodes. After the upgrade, spark-sql runs normally on one node but fails on the other with the following error:

    spark-sql> select * from $table where dt = '$dt' limit 5;
    Error in query: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to fetch table $table. Invalid method name: 'get_table_req'
    org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to fetch table $table. Invalid method name: 'get_table_req'
            at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:112)
            at org.apache.spark.sql.hive.HiveExternalCatalog.tableExists(HiveExternalCatalog.scala:854)
            at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.tableExists(ExternalCatalogWithListener.scala:146)
            at org.apache.spark.sql.catalyst.catalog.SessionCatalog.tableExists(SessionCatalog.scala:462)
            at org.apache.spark.sql.catalyst.catalog.SessionCatalog.requireTableExists(SessionCatalog.scala:197)
            at org.apache.spark.sql.catalyst.catalog.SessionCatalog.getTableRawMetadata(SessionCatalog.scala:488)
            at org.apache.spark.sql.catalyst.catalog.SessionCatalog.getTableMetadata(SessionCatalog.scala:474)
            at org.apache.spark.sql.execution.datasources.v2.V2SessionCatalog.loadTable(V2SessionCatalog.scala:65)
            at org.apache.spark.sql.connector.catalog.CatalogV2Util$.loadTable(CatalogV2Util.scala:282)
    ...
    Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to fetch table $table. Invalid method name: 'get_table_req'
            at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1282)
            at org.apache.spark.sql.hive.client.HiveClientImpl.getRawTableOption(HiveClientImpl.scala:392)
            at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$tableExists$1(HiveClientImpl.scala:406)
            at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
            at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$withHiveState$1(HiveClientImpl.scala:291)
            at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:224)
            at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:223)
            at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:273)
            at org.apache.spark.sql.hive.client.HiveClientImpl.tableExists(HiveClientImpl.scala:406)
            at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$tableExists$1(HiveExternalCatalog.scala:854)
            at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
            at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:102)
            ... 126 more
    Caused by: org.apache.thrift.TApplicationException: Invalid method name: 'get_table_req'
            at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79)
            at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table_req(ThriftHiveMetastore.java:1567)
            at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table_req(ThriftHiveMetastore.java:1554)
            at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1350)
            at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTable(SessionHiveMetaStoreClient.java:127)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:498)
            at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:173)
            at com.sun.proxy.$Proxy31.getTable(Unknown Source)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:498)
            at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2336)
            at com.sun.proxy.$Proxy31.getTable(Unknown Source)
            at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1274)
            ... 137 more
    

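    Diagnosis

    The error indicates a version incompatibility between Spark and the Hive metastore: the Hive client that Spark uses calls the Thrift method get_table_req, and the metastore server rejects it as an invalid method name.
    Running the same SQL with spark-sql on both nodes and comparing the two Spark applications showed that one configuration entry differed.

    The working node has the following configuration line: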

    spark.sql.hive.metastore.jars=/opt/cloudera/parcels/CDH/lib/hive/lib/*

    After adding this setting to spark-defaults.conf on the failing node, the query ran normally.
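
    The comparison step can also be done directly on the configuration files. The following is a minimal sketch, assuming each node's spark-defaults.conf has been copied somewhere readable; the sample contents (including spark.master) are hypothetical, and the parser is simplified (it handles whitespace and "=" separators only). The original comparison was made between the two running Spark applications, not with this script.

```python
import re


def parse_spark_conf(text):
    """Parse spark-defaults.conf style text into a dict (simplified parser)."""
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Key and value are separated by whitespace or '='.
        parts = re.split(r"\s*[=\s]\s*", line, maxsplit=1)
        conf[parts[0]] = parts[1] if len(parts) > 1 else ""
    return conf


def diff_conf(working, failing):
    """Return {key: (working_value, failing_value)} for keys that differ.

    None marks a key that is missing on that side.
    """
    keys = sorted(set(working) | set(failing))
    return {
        k: (working.get(k), failing.get(k))
        for k in keys
        if working.get(k) != failing.get(k)
    }


# Hypothetical file contents standing in for the two nodes' configs.
working_text = """
# working node
spark.master yarn
spark.sql.hive.metastore.jars=/opt/cloudera/parcels/CDH/lib/hive/lib/*
"""
failing_text = """
# failing node
spark.master yarn
"""

differences = diff_conf(parse_spark_conf(working_text), parse_spark_conf(failing_text))
for key, (working_value, failing_value) in differences.items():
    print(f"{key}: working={working_value!r} failing={failing_value!r}")
```

    In practice the two strings would be read from the files on each node; only keys whose values differ, or that exist on one side only, are reported.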

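    Cause

    Spark ships with a built-in Hive client; in 3.1.1 that is Hive 2.3.7. To connect to a Hive metastore of a different version you do not need to rebuild Spark. You only need to change the configuration so that Spark loads Hive client jars that match the metastore.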

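    Reference: the official Spark documentation, section "Interacting with Different Versions of Hive Metastore".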
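
    Before editing spark-defaults.conf, the same setting can be tried for a single session with --conf overrides. The sketch below only builds the command line. The helper name spark_sql_command and the metastore version "2.1.1" are assumptions for illustration (the actual Hive version of the CDH install is not stated here), and the working node in this case set only spark.sql.hive.metastore.jars. The Spark documentation describes spark.sql.hive.metastore.version as the companion setting, so it is included as an optional argument.

```python
import shlex


def spark_sql_command(metastore_jars, query, metastore_version=None):
    """Build a spark-sql invocation that overrides the Hive metastore client.

    Hypothetical helper: it only assembles arguments, it does not run Spark.
    """
    conf = {"spark.sql.hive.metastore.jars": metastore_jars}
    if metastore_version is not None:
        # Assumption: set this to the Hive version the metastore actually runs.
        conf["spark.sql.hive.metastore.version"] = metastore_version

    args = ["spark-sql"]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    args += ["-e", query]
    return args


cmd = spark_sql_command(
    "/opt/cloudera/parcels/CDH/lib/hive/lib/*",
    "show databases",
    metastore_version="2.1.1",  # hypothetical value, check your cluster
)
# shlex.join quotes the arguments so the line can be pasted into a shell.
print(shlex.join(cmd))
```

    If the query succeeds with the overrides, the same key/value pairs can then be written to spark-defaults.conf on the failing node.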

  • Original article: https://www.cnblogs.com/barneywill/p/16289189.html