• Configuring Kerberos Cross-Realm Trust Between Two Hadoop Clusters


     

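    Once Kerberos authentication is enabled on two Hadoop clusters, the clusters can no longer access each other. To let a client of Hadoop cluster A access services in Hadoop cluster B, cross-realm trust must be established between the two Kerberos realms (in effect, a ticket issued by Kerberos realm A is used to access services in realm B).
    Prerequisites:
    1) Both clusters (IDC.COM and HADOOP.COM) have Kerberos authentication enabled.
    2) The Kerberos realms are set to IDC.COM and HADOOP.COM, respectively.
    Steps: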

    1 Configure the trust ticket between the KDCs

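    To establish cross-realm trust between IDC.COM and HADOOP.COM, for example so that a client in IDC.COM can access services in HADOOP.COM, both realms must hold a principal named krbtgt/HADOOP.COM@IDC.COM, and the keys stored in the two KDCs must have the same password, key version number (kvno), and encryption types. Trust is one-way by default: for a client in HADOOP.COM to access services in IDC.COM, both realms also need the principal krbtgt/IDC.COM@HADOOP.COM.
    Add the krbtgt principals to both clusters: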

      # IDC cluster (enter the same password for each principal on both KDCs)
      kadmin.local: addprinc -e "aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal camellia256-cts:normal camellia128-cts:normal des-hmac-sha1:normal des-cbc-md5:normal" krbtgt/HADOOP.COM@IDC.COM
      kadmin.local: addprinc -e "aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal camellia256-cts:normal camellia128-cts:normal des-hmac-sha1:normal des-cbc-md5:normal" krbtgt/IDC.COM@HADOOP.COM

      # HADOOP cluster
      kadmin.local: addprinc -e "aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal camellia256-cts:normal camellia128-cts:normal des-hmac-sha1:normal des-cbc-md5:normal" krbtgt/HADOOP.COM@IDC.COM
      kadmin.local: addprinc -e "aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal camellia256-cts:normal camellia128-cts:normal des-hmac-sha1:normal des-cbc-md5:normal" krbtgt/IDC.COM@HADOOP.COM
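The naming convention for the trust principals is easy to get backwards, so here is a minimal Python sketch of it. The helper name `cross_realm_principals` is hypothetical and only illustrates the rule described above: `krbtgt/<service realm>@<client realm>` is the principal that lets clients of the client realm reach services in the service realm.

```python
def cross_realm_principals(realm_a, realm_b):
    """Return the krbtgt principals needed for two-way trust between two realms.

    The first entry lets clients in realm_a reach services in realm_b,
    the second entry covers the opposite direction.
    """
    return [
        f"krbtgt/{realm_b}@{realm_a}",
        f"krbtgt/{realm_a}@{realm_b}",
    ]


if __name__ == "__main__":
    for principal in cross_realm_principals("IDC.COM", "HADOOP.COM"):
        print(principal)
```

Both principals must exist in both KDCs, which is why the same two addprinc commands are run on each cluster.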
    
    To verify that the two entries have matching kvno values and encryption types, inspect them with getprinc <principal_name>:
    
    kadmin.local:  getprinc  krbtgt/IDC.COM@HADOOP.COM
    Principal: krbtgt/IDC.COM@HADOOP.COM
    Expiration date: [never]
    Last password change: Wed Jul 05 14:18:11 CST 2017
    Password expiration date: [none]
    Maximum ticket life: 1 day 00:00:00
    Maximum renewable life: 30 days 00:00:00
    Last modified: Wed Jul 05 14:18:11 CST 2017 (admin/admin@IDC.COM)
    Last successful authentication: [never]
    Last failed authentication: [never]
    Failed password attempts: 0
    Number of keys: 7
    Key: vno 1, aes128-cts-hmac-sha1-96
    Key: vno 1, des3-cbc-sha1
    Key: vno 1, arcfour-hmac
    Key: vno 1, camellia256-cts-cmac
    Key: vno 1, camellia128-cts-cmac
    Key: vno 1, des-hmac-sha1
    Key: vno 1, des-cbc-md5
    MKey: vno 1
    Attributes:
    Policy: [none]
    kadmin.local:  getprinc  krbtgt/HADOOP.COM@IDC.COM
    Principal: krbtgt/HADOOP.COM@IDC.COM
    Expiration date: [never]
    Last password change: Wed Jul 05 14:17:47 CST 2017
    Password expiration date: [none]
    Maximum ticket life: 1 day 00:00:00
    Maximum renewable life: 30 days 00:00:00
    Last modified: Wed Jul 05 14:17:47 CST 2017 (admin/admin@IDC.COM)
    Last successful authentication: [never]
    Last failed authentication: [never]
    Failed password attempts: 0
    Number of keys: 7
    Key: vno 1, aes128-cts-hmac-sha1-96
    Key: vno 1, des3-cbc-sha1
    Key: vno 1, arcfour-hmac
    Key: vno 1, camellia256-cts-cmac
    Key: vno 1, camellia128-cts-cmac
    Key: vno 1, des-hmac-sha1
    Key: vno 1, des-cbc-md5
    MKey: vno 1
    Attributes:
    Policy: [none]
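Comparing the key lists by eye across two KDCs is error-prone. The following is a minimal sketch, assuming the getprinc output from each KDC has been saved as text; the function names `parse_keys` and `keys_match` are hypothetical helpers, not part of any Kerberos tooling.

```python
import re


def parse_keys(getprinc_output):
    """Extract (kvno, enctype) pairs from the 'Key: vno N, enctype' lines."""
    keys = set()
    for line in getprinc_output.splitlines():
        # 'MKey: vno 1' is skipped because the line must start with 'Key:'.
        match = re.match(r"\s*Key: vno (\d+), (\S+)", line)
        if match:
            keys.add((int(match.group(1)), match.group(2)))
    return keys


def keys_match(output_a, output_b):
    """True when both outputs list the same non-empty set of kvno/enctype pairs."""
    keys_a = parse_keys(output_a)
    return bool(keys_a) and keys_a == parse_keys(output_b)


if __name__ == "__main__":
    idc = "Key: vno 1, aes128-cts-hmac-sha1-96\nKey: vno 1, des3-cbc-sha1\nMKey: vno 1\n"
    hadoop = "Key: vno 1, des3-cbc-sha1\nKey: vno 1, aes128-cts-hmac-sha1-96\n"
    print(keys_match(idc, hadoop))
```

This only checks kvno and encryption types. It cannot confirm that the same password was used on both sides.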
    
    
    
     
    
    2 Configure principal-to-user mapping rules in core-site.xml
    


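    Set the hadoop.security.auth_to_local parameter, which maps a Kerberos principal to a local user name. One caveat: the SASL RPC client requires the remote server's Kerberos principal to match the principal in its own configuration, so the services in the source and destination clusters must be assigned the same principal name. For example, if the NameNode's Kerberos principal in the source cluster is nn/h***@IDC.COM, the NameNode principal in the destination cluster must be nn/h***@HADOOP.COM (it cannot be nn2/h***@HADOOP.COM).
    Add the following to core-site.xml on both the IDC cluster and the HADOOP cluster: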
    <property>
    <name>hadoop.security.auth_to_local</name>
    <value>
    RULE:[1:$1@$0](^.*@HADOOP\.COM$)s/^(.*)@HADOOP\.COM$/$1/g
    RULE:[2:$1@$0](^.*@HADOOP\.COM$)s/^(.*)@HADOOP\.COM$/$1/g
    RULE:[1:$1@$0](^.*@IDC\.COM$)s/^(.*)@IDC\.COM$/$1/g
    RULE:[2:$1@$0](^.*@IDC\.COM$)s/^(.*)@IDC\.COM$/$1/g 
    DEFAULT             
    </value>
    </property>
    

    Use hadoop org.apache.hadoop.security.HadoopKerberosName to verify the mapping, for example:

    [root@node1a141 ~]#  hadoop org.apache.hadoop.security.HadoopKerberosName hdfs/nodea1a141@IDC.COM
    
    Name: hdfs/nodea1a141@IDC.COM to hdfs 
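To make the rule syntax concrete, here is a rough Python emulation of how the four rules above are evaluated. It is a simplified sketch, not Hadoop's implementation: `map_principal` and the `RULES` tuple layout are hypothetical, the DEFAULT rule is not modeled, and only the `[n:format](filter)s/pattern/replacement/` shape used in this article is handled.

```python
import re

# (component count, format, filter regex, sed pattern, sed replacement)
RULES = [
    (1, "$1@$0", r"^.*@HADOOP\.COM$", r"^(.*)@HADOOP\.COM$", r"\1"),
    (2, "$1@$0", r"^.*@HADOOP\.COM$", r"^(.*)@HADOOP\.COM$", r"\1"),
    (1, "$1@$0", r"^.*@IDC\.COM$", r"^(.*)@IDC\.COM$", r"\1"),
    (2, "$1@$0", r"^.*@IDC\.COM$", r"^(.*)@IDC\.COM$", r"\1"),
]


def map_principal(principal, rules=RULES):
    """Return the short user name for a principal, or None if no rule applies."""
    if "@" not in principal:
        return None
    name, _, realm = principal.rpartition("@")
    components = name.split("/")
    for count, fmt, filter_regex, pattern, replacement in rules:
        # A rule only applies to principals with the stated number of components.
        if len(components) != count:
            continue
        # Build the candidate string: $0 is the realm, $1..$n are the components.
        candidate = fmt.replace("$0", realm)
        for index, component in enumerate(components, start=1):
            candidate = candidate.replace(f"${index}", component)
        if re.match(filter_regex, candidate):
            return re.sub(pattern, replacement, candidate)
    return None


if __name__ == "__main__":
    print(map_principal("hdfs/nodea1a141@IDC.COM"))
```

For the principal checked above, the fourth rule builds hdfs@IDC.COM and strips the realm, which agrees with the HadoopKerberosName result of hdfs.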
    

    3 Configure the trust relationship in krb5.conf

    3.1 Configure capaths

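    There are two ways to do this. The first is to rely on a shared hierarchy of names, which is the default and the simpler option. The second is to set capaths in krb5.conf, which is more complex but more flexible. This article uses the second approach.
    In /etc/krb5.conf on the nodes of both clusters, configure the authentication path between the two realms. For example, on the IDC cluster: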

    [capaths]
           IDC.COM = {
                  HADOOP.COM = .
           }
    

    On the HADOOP cluster:

    [capaths]
           HADOOP.COM = {
                  IDC.COM = .
           }
    

    A value of '.' means there are no intermediate realms.
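As an illustration of how a capaths entry is read, the sketch below models the section as a nested Python dict: the outer key is the client realm, the inner key is the server realm, and the value lists the intermediate realms. The function `intermediate_realms` and the `CAPATHS` dict are hypothetical and only mirror the two stanzas above.

```python
def intermediate_realms(capaths, client_realm, server_realm):
    """Return the intermediate realms on the path, [] if direct, None if unset."""
    hops = capaths.get(client_realm, {}).get(server_realm)
    if hops is None:
        return None
    # A single "." means the client realm trusts the server realm directly.
    return [] if hops == ["."] else list(hops)


# The two stanzas from this section, merged into one dict for illustration.
CAPATHS = {
    "IDC.COM": {"HADOOP.COM": ["."]},
    "HADOOP.COM": {"IDC.COM": ["."]},
}

if __name__ == "__main__":
    print(intermediate_realms(CAPATHS, "IDC.COM", "HADOOP.COM"))
```

An empty list here corresponds to the direct trust configured in this article. A non-empty list would name the realms a client has to pass through.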

    3.2 Configure realms

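    So that the IDC cluster can reach the HADOOP KDC, add the HADOOP KDC server to the IDC cluster's configuration as shown below, and do the reverse on the HADOOP cluster: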

    [realms]
      IDC.COM = {
        kdc = {host}.IDC.COM:88
        admin_server = {host}.IDC.COM:749
        default_domain = IDC.COM
      }
      HADOOP.COM = {
        kdc = {host}.HADOOP.COM:88
        admin_server = {host}.HADOOP.COM:749
        default_domain = HADOOP.COM
      }
    

    3.3 Configure domain_realm

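    In domain_realm, entries are usually written in both the '.IDC.COM' and 'IDC.COM' forms; the '.' prefix makes Kerberos map every host under IDC.COM to the IDC.COM realm. If host names in the cluster do not end in IDC.COM, explicit host-to-realm mappings must be added to domain_realm. For example, to map IDC.nn.local to IDC.COM, add IDC.nn.local = IDC.COM.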

    [domain_realm]
    .hadoop.com=HADOOP.COM
     hadoop.com=HADOOP.COM
     .IDC.com=IDC.COM
     IDC.com=IDC.COM
     node1a141 = IDC.COM
     node1a143 = IDC.COM
     node1a210 = HADOOP.COM
     node1a202 = HADOOP.COM
     node1a203 = HADOOP.COM 
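The lookup order explains why both the dotted and the plain entries are useful. Below is a rough Python approximation of how a host name is resolved against domain_realm: the full name is tried first, then each parent domain with and without the leading dot. The function `realm_for_host` is hypothetical, and the lower-casing of the host name is an assumption of this sketch rather than something taken from the article.

```python
def realm_for_host(host, domain_realm):
    """Return the realm for a host, or None when no entry matches."""
    name = host.lower()
    while name:
        if name in domain_realm:
            return domain_realm[name]
        if name.startswith("."):
            # ".hadoop.com" did not match, so try "hadoop.com" next.
            name = name[1:]
        else:
            # Drop the leftmost label and keep the leading dot of the parent domain.
            dot = name.find(".")
            name = name[dot:] if dot != -1 else ""
    return None


if __name__ == "__main__":
    mapping = {
        ".hadoop.com": "HADOOP.COM",
        "hadoop.com": "HADOOP.COM",
        "node1a141": "IDC.COM",
        "node1a202": "HADOOP.COM",
    }
    print(realm_for_host("node1a202", mapping))
    print(realm_for_host("nn1.hadoop.com", mapping))  # nn1 is a made-up host
```

Short host names such as node1a202 carry no domain suffix, so only the explicit per-host entries can match them.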
    

    Restart the Kerberos services.

    3.4 Configure hdfs-site.xml

    In hdfs-site.xml, set the allowed realms by setting dfs.namenode.kerberos.principal.pattern to "*".


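    This is a client-side matching rule that controls which realms are accepted for authentication. If the parameter is not set, the following exception is raised: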
    java.io.IOException: Failed on local exception: java.io.IOException:
    java.lang.IllegalArgumentException:
           Server has invalid Kerberosprincipal:nn/ HADOOP.COM@ IDC.COM;
           Host Details : local host is: "host1.IDC.COM/10.181.22.130";
                            destination host is: "host2.HADOOP.COM":8020;
    

    4 Testing

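    1) Use hdfs commands to test data access between the IDC and HADOOP clusters.
    For example, on the IDC cluster run kinit admin@IDC.COM, then run hdfs commands to list the HDFS directories of both the local cluster and the remote cluster.
    If cross-realm trust is not enabled, accessing the remote HDFS directory fails with an authentication error.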

    [root@node1a141 ~]# kdestroy
    
    Log in as the admin user on a client of the local cluster and authenticate with Kerberos
    [root@node1a141 ~]# kinit admin
    Password for admin@IDC.COM:
    
    Access HDFS on the local cluster
    [root@node1a141 ~]# hdfs dfs -ls /
    Found 3 items
    drwxrwxrwx+  - hdfs supergroup          0 2017-06-13 15:13 /tmp
    drwxrwxr-x+  - hdfs supergroup          0 2017-06-22 15:55 /user
    drwxrwxr-x+  - hdfs supergroup          0 2017-06-14 14:11 /wa
    
    Access HDFS on the remote cluster
    [root@node1a141 ~]# hdfs dfs -ls hdfs://node1a202:8020/
    Found 9 items
    drwxr-xr-x   - root  supergroup          0 2017-05-27 18:55 hdfs://node1a202:8020/cdtest
    drwx------   - hbase hbase               0 2017-05-22 18:51 hdfs://node1a202:8020/hbase
    drwx------   - hbase hbase               0 2017-07-05 19:16 hdfs://node1a202:8020/hbase1
    drwxr-xr-x   - hbase hbase               0 2017-05-11 10:46 hdfs://node1a202:8020/hbase2
    drwxr-xr-x   - root  supergroup          0 2016-12-01 17:30 hdfs://node1a202:8020/home
    drwxr-xr-x   - mdss  supergroup          0 2016-12-13 18:30 hdfs://node1a202:8020/idfs
    drwxr-xr-x   - hdfs  supergroup          0 2017-05-22 18:51 hdfs://node1a202:8020/system
    drwxrwxrwt   - hdfs  supergroup          0 2017-05-31 17:37 hdfs://node1a202:8020/tmp
    drwxrwxr-x+  - hdfs  supergroup          0 2017-05-04 15:48 hdfs://node1a202:8020/user
    

    Perform the same operations in HADOOP.COM.
    2) Run distcp to copy data from the IDC cluster to the HADOOP cluster:

    [root@node1a141 ~]# hadoop distcp hdfs://node1a141:8020/tmp/test.sh  hdfs://node1a202:8020/tmp/
    

    5 Appendix

    The complete /etc/krb5.conf for the two clusters is as follows (shown here from an IDC cluster node):

    [root@node1a141 IDC]# cat /etc/krb5.conf
    [logging]
     default = FILE:/var/log/krb5libs.log
     kdc = FILE:/var/log/krb5kdc.log
     admin_server = FILE:/var/log/kadmind.log
    
    [libdefaults]
     default_realm = IDC.COM
     dns_lookup_realm = false
     dns_lookup_kdc = false
     ticket_lifetime = 7d
     renew_lifetime = 30
     forwardable = true
     renewable=true
     #default_ccache_name = KEYRING:persistent:%{uid}
    
    [realms]
     HADOOP.COM = {
       kdc = node1a198
       admin_server = node1a198
       default_realm = HADOOP.COM
       supported_enctypes = aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal camellia256-cts:normal camellia128-cts:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal
     }
     IDC.COM = {
       kdc = node1a141
       admin_server = node1a141
       default_realm = IDC.COM
       supported_enctypes = aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal camellia256-cts:normal camellia128-cts:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal
     }
    
    [domain_realm]
     .hadoop.com=HADOOP.COM
     hadoop.com=HADOOP.COM
     .IDC.com=IDC.COM
     IDC.com=IDC.COM
     node1a141 = IDC.COM
     node1a143 = IDC.COM
     node1a210 = HADOOP.COM
     node1a202 = HADOOP.COM
     node1a203 = HADOOP.COM
    
    [capaths]
    IDC.COM = {
     HADOOP.COM = .
    }
    
  • Original article: https://www.cnblogs.com/erlou96/p/16878481.html