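A Brief Look at the Hadoop File System
1. What is a distributed file system?
A file system that manages storage across multiple computers in a network is called a distributed file system.
2. Why do we need a distributed file system?
The reason is simple: when a dataset outgrows the storage capacity of a single physical machine, it has to be partitioned and stored across several separate machines.
3. Distributed file systems are more complex than traditional file systems
A distributed file system is built on top of a network, so it brings in all the complexity of network programming. That makes it more complex than an ordinary file system.
4. Hadoop's file systems
Many readers equate HDFS with "the Hadoop file system". In fact, Hadoop has a general file system abstraction, and HDFS is Hadoop's flagship file system. Besides HDFS, Hadoop can integrate other file systems, which is a good illustration of Hadoop's extensibility.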
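Hadoop defines an abstract notion of a file system, concretely the Java abstract class org.apache.hadoop.fs.FileSystem. This abstract class defines the file system interface in Hadoop; any file system that implements it can be used as a Hadoop-supported file system. The table below lists the file systems that currently implement it: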
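| File system | URI scheme | Java implementation (org.apache.hadoop) | Description |
| --- | --- | --- | --- |
| Local | file | fs.LocalFileSystem | A local file system with client-side checksums. The local file system without checksums is implemented in fs.RawLocalFileSystem. |
| HDFS | hdfs | hdfs.DistributedFileSystem | Hadoop's distributed file system. |
| HFTP | hftp | hdfs.HftpFileSystem | Provides read-only access to HDFS over HTTP. Often used with distcp to copy data between different HDFS clusters. |
| HSFTP | hsftp | hdfs.HsftpFileSystem | Provides read-only access to HDFS over HTTPS. |
| HAR | har | fs.HarFileSystem | Layered on top of another Hadoop file system to archive files. Hadoop archives are mainly used to reduce NameNode memory usage. |
| KFS | kfs | fs.kfs.KosmosFileSystem | CloudStore (formerly the Kosmos file system) is a file system similar to HDFS and Google's GFS, written in C++. |
| FTP | ftp | fs.ftp.FTPFileSystem | A file system backed by an FTP server. |
| S3 (native) | s3n | fs.s3native.NativeS3FileSystem | A file system backed by Amazon S3. |
| S3 (block-based) | s3 | fs.s3.S3FileSystem | A file system backed by Amazon S3 that stores files in blocks to get around S3's 5 GB file size limit. |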
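To make the abstraction more concrete, here is a minimal sketch of the idea, not Hadoop's actual code. The hypothetical MiniFileSystem class stands in for org.apache.hadoop.fs.FileSystem, and a hypothetical registry picks an implementation from the URI scheme, in the spirit of the table above. All class names, the describe method, and the registry are assumptions made purely for illustration.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

public class SchemeDispatchSketch {

    /** Hypothetical stand-in for the abstract class org.apache.hadoop.fs.FileSystem. */
    abstract static class MiniFileSystem {
        /** Returns a short description of how this file system would serve the path. */
        abstract String describe(URI path);
    }

    /** Hypothetical implementation bound to the "file" scheme. */
    static class MiniLocalFileSystem extends MiniFileSystem {
        @Override
        String describe(URI path) {
            return "local:" + path.getPath();
        }
    }

    /** Hypothetical implementation bound to the "hdfs" scheme. */
    static class MiniDistributedFileSystem extends MiniFileSystem {
        @Override
        String describe(URI path) {
            return "hdfs@" + path.getHost() + ":" + path.getPath();
        }
    }

    // Illustrative registry (an assumption): URI scheme -> factory for the implementation.
    private static final Map<String, Supplier<MiniFileSystem>> REGISTRY = new HashMap<>();
    static {
        REGISTRY.put("file", MiniLocalFileSystem::new);
        REGISTRY.put("hdfs", MiniDistributedFileSystem::new);
    }

    /** Picks an implementation from the URI scheme and fails loudly on unknown schemes. */
    static MiniFileSystem get(URI uri) {
        Supplier<MiniFileSystem> factory = REGISTRY.get(uri.getScheme());
        if (factory == null) {
            throw new IllegalArgumentException("No file system for scheme: " + uri.getScheme());
        }
        return factory.get();
    }

    public static void main(String[] args) {
        URI local = URI.create("file:///tmp/data.txt");
        URI remote = URI.create("hdfs://namenode/user/data.txt");
        System.out.println(get(local).describe(local));
        System.out.println(get(remote).describe(remote));
    }
}
```

The point of the design is that callers only ever see the abstract type; supporting another storage backend means registering one more implementation, without touching the calling code.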
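One last point to emphasize: Hadoop has a file system concept, embodied by the FileSystem abstract class above. It lives in Hadoop's Common project and defines a set of general-purpose I/O components and interfaces for distributed file systems. Strictly speaking, Hadoop's file system layer should be called Hadoop I/O. HDFS is the distributed file system project that ships with Hadoop and implements this interface; in other words, HDFS is one implementation of the Hadoop I/O interfaces.
The table below maps the FileSystem API operations to their Java and Linux counterparts: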
| Hadoop FileSystem | Java operation | Linux operation | Description |
| --- | --- | --- | --- |
| URL.openStream, FileSystem.open, FileSystem.create, FileSystem.append | URL.openStream | open | Open a file |
| FSDataInputStream.read | InputStream.read | read | Read data from a file |
| FSDataOutputStream.write | OutputStream.write | write | Write data to a file |
| FSDataInputStream.close, FSDataOutputStream.close | InputStream.close, OutputStream.close | close | Close a file |
| FSDataInputStream.seek | RandomAccessFile.seek | lseek | Change the read/write position in a file |
| FileSystem.getFileStatus, FileSystem.get* | File.get* | stat | Get the attributes of a file or directory |
| FileSystem.set* | File.set* | chmod, etc. | Change the attributes of a file |
| FileSystem.createNewFile | File.createNewFile | creat | Create a file |
| FileSystem.delete | File.delete | remove | Delete a file from the file system |
| FileSystem.rename | File.renameTo | rename | Rename a file or directory |
| FileSystem.mkdirs | File.mkdir | mkdir | Create a subdirectory under a given directory |
| FileSystem.delete | File.delete | rmdir | Remove an empty subdirectory from a directory |
| FileSystem.listStatus | File.list | readdir | List the entries in a directory |
| FileSystem.getWorkingDirectory | | getcwd/getwd | Return the current working directory |
| FileSystem.setWorkingDirectory | | chdir | Change the current working directory |
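With this table, FileSystem should be much easier to understand.
The table also shows two classes in Hadoop's FileSystem API, FSDataInputStream and FSDataOutputStream. They play the role of InputStream and OutputStream in Java I/O, and in fact they extend java.io.DataInputStream and java.io.DataOutputStream.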
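As a rough illustration of the Java column of the table, the sketch below uses only the standard library to create, write, seek within, read, and delete a file. It does not touch Hadoop at all; the class name, the helper method, and the sample text are made up for this example.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class JavaFileOpsSketch {

    /** Writes text to a new temp file, then seeks to the byte offset and reads the rest. */
    static String writeThenReadFrom(String text, long offset) {
        try {
            File file = File.createTempFile("fs-sketch", ".txt");    // create a file
            try {
                try (OutputStream out = new FileOutputStream(file)) { // open for writing
                    out.write(text.getBytes(StandardCharsets.UTF_8)); // OutputStream.write
                }                                                     // OutputStream.close
                try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
                    raf.seek(offset);                                 // RandomAccessFile.seek
                    byte[] rest = new byte[(int) (raf.length() - offset)];
                    raf.readFully(rest);                              // read the remaining bytes
                    return new String(rest, StandardCharsets.UTF_8);
                }
            } finally {
                file.delete();                                        // File.delete
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Skips the first six bytes ("hello ") and prints what follows.
        System.out.println(writeThenReadFrom("hello hadoop", 6));
    }
}
```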
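I won't cover Hadoop I/O in this article; I may write a dedicated post on it later. To give a clearer picture in the meantime, here are two articles I found on cnblogs that interested readers may want to read: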
http://www.cnblogs.com/xuqiang/archive/2011/06/03/2042526.html
http://www.cnblogs.com/xia520pi/archive/2012/05/28/2520813.html
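5. Data integrity
Data integrity refers to techniques for detecting whether data has been corrupted. Hadoop users naturally expect that no data is lost or corrupted during storage and processing. Any single disk or network I/O operation is unlikely to introduce an error into the data it reads or writes, but when the volume of data flowing through the system approaches the limits of what Hadoop can handle, the chance that some data gets corrupted becomes high. Hadoop therefore includes data integrity checking, which works as follows:
The usual way to detect corruption is to compute a checksum when the data first enters the system, and to compute it again whenever the data has been transmitted across an unreliable channel. If the two checksums do not match, the data is considered corrupted. This technique cannot repair the data; it can only detect errors. A commonly used error-detecting code is CRC-32 (cyclic redundancy check), which computes a 32-bit integer checksum for input of any size.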
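The compare-two-checksums idea can be sketched with the JDK's java.util.zip.CRC32. This only illustrates the principle; it is not how Hadoop stores or verifies its checksums internally. The class name, method names, and sample data are assumptions of this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class ChecksumSketch {

    /** Computes the CRC-32 checksum of the given bytes. */
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    /** Returns true when the data still matches the checksum recorded earlier. */
    static boolean isIntact(byte[] data, long expectedChecksum) {
        return crc32(data) == expectedChecksum;
    }

    public static void main(String[] args) {
        byte[] original = "some block of data".getBytes(StandardCharsets.UTF_8);
        long recorded = crc32(original);      // computed when the data enters the system

        byte[] received = original.clone();
        received[3] ^= 0x01;                  // simulate a single flipped bit in transit

        System.out.println("original intact: " + isIntact(original, recorded));
        System.out.println("received intact: " + isIntact(received, recorded));
    }
}
```

As the text says, a mismatch only tells you that something is wrong; recovering the data requires another copy of it.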
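6. Compression and input splits
File compression has two main benefits: it reduces the disk space needed to store files, and it speeds up data transfer across the network and to or from disk. For Hadoop, which handles massive amounts of data, both benefits matter a great deal, so it is worth understanding how Hadoop deals with compression. The table below lists the compression formats Hadoop supports: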
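| Compression format | Tool | Algorithm | File extension | Multiple files | Splittable |
| --- | --- | --- | --- | --- | --- |
| DEFLATE | N/A | DEFLATE | .deflate | No | No |
| gzip | gzip | DEFLATE | .gz | No | No |
| ZIP | zip | DEFLATE | .zip | Yes | Yes, at file boundaries |
| bzip2 | bzip2 | bzip2 | .bz2 | No | Yes |
| LZO | lzop | LZO | .lzo | No | No, unless the file has been indexed in a preprocessing step |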
Two metrics matter most for compression in Hadoop: compression ratio and compression speed. The table below shows how some formats perform on both:
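| Compression algorithm | Original file size | Compressed file size | Compression speed | Decompression speed |
| --- | --- | --- | --- | --- |
| gzip | 8.3 GB | 1.8 GB | 17.5 MB/s | 58 MB/s |
| bzip2 | 8.3 GB | 1.1 GB | 2.4 MB/s | 9.5 MB/s |
| LZO-best | 8.3 GB | 2 GB | 4 MB/s | 60.6 MB/s |
| LZO | 8.3 GB | 2.9 GB | 49.3 MB/s | 74.6 MB/s |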
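To get a feel for the trade-off between size and speed on your own data, here is a minimal sketch that uses the JDK's Deflater (DEFLATE is the algorithm behind gzip in the table above) at two compression levels. It makes no attempt to reproduce the numbers in the table; bzip2 and LZO are not in the standard library and are left out. The class name, method names, and sample input are assumptions of this example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class CompressionSketch {

    /** Compresses data with DEFLATE at the given level (e.g. Deflater.BEST_SPEED). */
    static byte[] compress(byte[] data, int level) {
        Deflater deflater = new Deflater(level);
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Decompresses DEFLATE data produced by compress(). */
    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("Truncated or unsupported DEFLATE stream");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("Not valid DEFLATE data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Build a repetitive sample so that compression has something to work with.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 10_000; i++) {
            sb.append("hadoop compression sample line ").append(i % 10).append('\n');
        }
        byte[] data = sb.toString().getBytes(StandardCharsets.UTF_8);

        for (int level : new int[] {Deflater.BEST_SPEED, Deflater.BEST_COMPRESSION}) {
            long start = System.nanoTime();
            byte[] packed = compress(data, level);
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println("level " + level + ": " + data.length + " -> "
                    + packed.length + " bytes in " + micros + " us");
        }
    }
}
```

Timings from a single run like this are noisy, so treat them as a rough indication only.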
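Whether a compression format supports splitting is also very important in Hadoop. The rest of this section discusses splitting, which is the "input splits" part of the section title.
Whether a format is splittable matters for MapReduce. Suppose we have a file whose compressed size is 1 GB, and the HDFS block size is set to 64 MB. (I haven't covered HDFS blocks yet; readers unfamiliar with them can look them up for now, and I'll explain them in detail when I write about HDFS.) The file is then stored in 16 blocks. If this file is used as input to a MapReduce job and the compression format supports splitting, MapReduce creates 16 map tasks from the 16 blocks, each block being the input of one map task, and the job runs very efficiently. If the compression format does not support splitting, MapReduce still processes the file correctly, but it feeds all 16 blocks to a single map task. With fewer map tasks and coarser job granularity, efficiency drops considerably.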
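The arithmetic in this example can be written down as a small sketch. It follows the simplified model of the paragraph above (one map task per block for a splittable format, one map task in total otherwise) and is not how Hadoop's input formats actually compute splits; the class and method names are assumptions of this example.

```java
public class SplitCountSketch {

    /** Number of blocks needed to hold a file: ceil(fileSize / blockSize). */
    static long blockCount(long fileSizeBytes, long blockSizeBytes) {
        return (fileSizeBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    /**
     * Map tasks for one input file under the simplified model in the text:
     * one per block if the format is splittable, otherwise a single task.
     */
    static long mapTasks(long fileSizeBytes, long blockSizeBytes, boolean splittable) {
        return splittable ? blockCount(fileSizeBytes, blockSizeBytes) : 1;
    }

    public static void main(String[] args) {
        long oneGb = 1L << 30;        // 1 GB compressed file
        long blockSize = 64L << 20;   // 64 MB HDFS block size

        System.out.println("blocks: " + blockCount(oneGb, blockSize));
        System.out.println("map tasks (splittable): " + mapTasks(oneGb, blockSize, true));
        System.out.println("map tasks (not splittable): " + mapTasks(oneGb, blockSize, false));
    }
}
```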
My own knowledge here is limited, so I'll stop at this point on compression and input splits. Here is a related article for interested readers:
http://www.cnblogs.com/ggjucheng/archive/2012/04/22/2465580.html
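7. Hadoop serialization
Let's start with two definitions:
Serialization: converting structured objects into a byte stream so they can be transmitted over a network or written to disk for persistent storage.
Deserialization: the reverse process of converting a byte stream back into structured objects.
Serialization appears in two major areas of distributed data processing: interprocess communication and persistent storage.
In Hadoop, nodes communicate through remote procedure calls (RPC). RPC serializes the data into a binary stream and sends it to the remote node, which deserializes the stream back into the original data. A serialization format used for RPC should have the following properties:
- Compact: a compact format makes the best use of network bandwidth, which is the scarcest resource in a data center;
- Fast: interprocess communication forms the backbone of a distributed system, so the overhead of serialization and deserialization must be kept as low as possible;
- Extensible: protocols change over time to meet new requirements, so it should be straightforward to evolve the protocol between clients and servers in a controlled way, and the existing serialization format should be able to carry the new protocol messages;
- Interoperable: clients and servers written in different languages should be able to interact.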
Hadoop defines its own serialization format, Writable, which is one of Hadoop's core components.
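To show what such a format looks like in practice, here is a minimal sketch built only on java.io. The local MiniWritable interface is a hypothetical stand-in that mirrors the two methods of Hadoop's Writable, write(DataOutput) and readFields(DataInput), so the example runs without Hadoop on the classpath. IntPair and the helper methods are likewise made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class WritableSketch {

    /** Hypothetical local interface mirroring the two methods of Hadoop's Writable. */
    interface MiniWritable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    /** Illustrative value type: a pair of ints serialized as 8 raw bytes. */
    static class IntPair implements MiniWritable {
        int first;
        int second;

        IntPair() { }

        IntPair(int first, int second) {
            this.first = first;
            this.second = second;
        }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeInt(first);
            out.writeInt(second);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            first = in.readInt();
            second = in.readInt();
        }
    }

    /** Serializes the object into a byte array. */
    static byte[] serialize(MiniWritable w) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                w.write(out);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Fills the target object from bytes produced by serialize(). */
    static void deserialize(MiniWritable target, byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            target.readFields(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] wire = serialize(new IntPair(7, 42));
        IntPair copy = new IntPair();
        deserialize(copy, wire);
        System.out.println(wire.length + " bytes -> (" + copy.first + ", " + copy.second + ")");
    }
}
```

The object writes only its field values, with no class names or field names in the stream, which is what keeps the format compact and fast.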
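Writable is an interface, and a class has to implement it to take part in Hadoop serialization. For lack of time I won't go any deeper into serialization here. I recommend the article below instead; it is fairly brief and not comprehensive, but after reading it you will have a basic idea of how Hadoop serialization is implemented. The link is as follows: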
[Solved] Problems Deploying Solr on Weblogic
1. Environment
- Solr version: Solr 4.2.1
- Weblogic version: Weblogic 10.3.6
2. Problem 1
- The following error appears when deploying to Weblogic:
weblogic.descriptor.DescriptorException: VALIDATION PROBLEMS WERE FOUND problem: cvc-complex-type.2.4a: Expected elements 'mapped-name@http://java.sun.com/xml/ns/javaee injection-target@http://java.sun.com/xml/ns/javaee' instead of 'env-entry-type@http://java.sun.com/xml/ns/javaee' here in element env-entry@http://java.sun.com/xml/ns/javaee:<null>
- The error is caused by the <env-entry> element in web.xml, shown here:
<env-entry>
    <env-entry-name>solr/home</env-entry-name>
    <env-entry-value>/put/your/solr/home/here</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
</env-entry>
- Solution: move <env-entry-type> before <env-entry-value>, as follows:
<env-entry>
    <env-entry-name>solr/home</env-entry-name>
    <env-entry-type>java.lang.String</env-entry-type>
    <env-entry-value>/put/your/solr/home/here</env-entry-value>
</env-entry>
3. Problem 2
- A "no such method" error is reported:
java.lang.NoSuchMethodError: org.apache.commons.lang.StringUtils.replaceEach(Ljava.lang.String;[Ljava.lang.String;[Ljava.lang.String;)Ljava.lang.String;
- Solution: edit WEB-INF/weblogic.xml and add the <prefer-web-inf-classes>true</prefer-web-inf-classes> setting, as follows:
<?xml version='1.0' encoding='UTF-8'?>
<weblogic-web-app xmlns="http://www.bea.com/ns/weblogic/90"
                  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                  xsi:schemaLocation="http://www.bea.com/ns/weblogic/90 http://www.bea.com/ns/weblogic/90/weblogic-web-app.xsd">
    <container-descriptor>
        <prefer-web-inf-classes>true</prefer-web-inf-classes>
        <filter-dispatched-requests-enabled>false</filter-dispatched-requests-enabled>
    </container-descriptor>
    <context-root>solr</context-root>
</weblogic-web-app>
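A NoSuchMethodError like the one above typically means that a different copy of the library is being loaded ahead of the one in WEB-INF/lib. One way to check which copy a class actually comes from is to ask its protection domain for its code source, as in the sketch below. The class and method names are assumptions of this example, and to be meaningful the check has to run inside the web application (for instance from a temporary servlet or JSP), which is not shown here.

```java
import java.security.CodeSource;

public class ClassOriginSketch {

    /** Returns where a class was loaded from, or a marker when no code source is available. */
    static String originOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // Classes loaded by the bootstrap class loader (e.g. java.lang.String) land here.
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a class name as the first argument, for example
        // org.apache.commons.lang.StringUtils, when that class is on the classpath.
        String name = args.length > 0 ? args[0] : ClassOriginSketch.class.getName();
        System.out.println(name + " <- " + originOf(Class.forName(name)));
    }
}
```

If the printed location is a server-level jar rather than the one under WEB-INF/lib, that is consistent with the class loading order being the cause, which is what prefer-web-inf-classes changes.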
4. Problem 3
- When the Solr application starts, the console keeps printing the following:
2013-5-11 15:51:09 org.apache.zookeeper.ClientCnxn$SendThread run
警告: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.lang.IllegalArgumentException: No Configuration was registered that can handle the configuration named Client
at com.bea.common.security.jdkutils.JAASConfiguration.getAppConfigurationEntry(JAASConfiguration.java:130)
at org.apache.zookeeper.client.ZooKeeperSaslClient.<init>(ZooKeeperSaslClient.java:97)
at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:943)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:993)
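- Solution: modify the source file org.apache.zookeeper.client.ZooKeeperSaslClient.java, compile it, and put the resulting class file into the corresponding directory inside zookeeper-3.4.5.jar, overwriting the original class file. Then replace WEB-INF/lib/zookeeper-3.4.5.jar with the modified jar. The modified source follows (search for "by duanbo" to find the changes):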
org.apache.zookeeper.client.ZooKeeperSaslClient
/**
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.zookeeper.client;

import org.apache.zookeeper.AsyncCallback;
import org.apache.zookeeper.ClientCnxn;
import org.apache.zookeeper.Login;
import org.apache.zookeeper.Watcher.Event.KeeperState;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.Environment;
import org.apache.zookeeper.data.Stat;
import org.apache.zookeeper.proto.GetSASLRequest;
import org.apache.zookeeper.proto.SetSASLResponse;
import org.apache.zookeeper.server.auth.KerberosName;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.IOException;
import java.security.Principal;
import java.security.PrivilegedActionException;
import java.security.PrivilegedExceptionAction;
import javax.security.auth.Subject;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.auth.callback.UnsupportedCallbackException;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.Configuration;
import javax.security.auth.login.LoginException;
import javax.security.sasl.AuthorizeCallback;
import javax.security.sasl.RealmCallback;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;
import javax.security.sasl.SaslException;

/**
 * This class manages SASL authentication for the client. It
 * allows ClientCnxn to authenticate using SASL with a Zookeeper server.
 */
public class ZooKeeperSaslClient {
    public static final String LOGIN_CONTEXT_NAME_KEY = "zookeeper.sasl.clientconfig";
    private static final Logger LOG = LoggerFactory.getLogger(ZooKeeperSaslClient.class);
    private static Login login = null;
    private SaslClient saslClient;

    private byte[] saslToken = new byte[0];

    public enum SaslState {
        INITIAL,INTERMEDIATE,COMPLETE,FAILED
    }

    private SaslState saslState = SaslState.INITIAL;

    private boolean gotLastPacket = false;
    /** informational message indicating the current configuration status */
    private final String configStatus;

    public SaslState getSaslState() {
        return saslState;
    }

    public String getLoginContext() {
        if (login != null)
            return login.getLoginContextName();
        return null;
    }

    public ZooKeeperSaslClient(final String serverPrincipal) throws LoginException {
        /**
         * ZOOKEEPER-1373: allow system property to specify the JAAS
         * configuration section that the zookeeper client should use.
         * Default to "Client".
         */
        String clientSection = System.getProperty(ZooKeeperSaslClient.LOGIN_CONTEXT_NAME_KEY, "Client");
        // Note that 'Configuration' here refers to javax.security.auth.login.Configuration.
        AppConfigurationEntry entries[] = null;
        // SecurityException securityException = null; // delete by duanbo 2013-05-11
        RuntimeException securityException = null; // add by duanbo 2013-05-11
        try {
            entries = Configuration.getConfiguration().getAppConfigurationEntry(clientSection);
        } catch (/*SecurityException*/RuntimeException e) { // modify by duanbo 2013-05-11
            // handle below: might be harmless if the user doesn't intend to use JAAS authentication.
            securityException = e;
        }
        if (entries != null) {
            this.configStatus = "Will attempt to SASL-authenticate using Login Context section '" + clientSection + "'";
            this.saslClient = createSaslClient(serverPrincipal, clientSection);
        } else {
            // Handle situation of clientSection's being null: it might simply because the client does not intend to
            // use SASL, so not necessarily an error.
            saslState = SaslState.FAILED;
            String explicitClientSection = System.getProperty(ZooKeeperSaslClient.LOGIN_CONTEXT_NAME_KEY);
            if (explicitClientSection != null) {
                // If the user explicitly overrides the default Login Context, they probably expected SASL to
                // succeed. But if we got here, SASL failed.
                if (securityException != null) {
                    throw new LoginException("Zookeeper client cannot authenticate using the " + explicitClientSection +
                            " section of the supplied JAAS configuration: '" +
                            System.getProperty(Environment.JAAS_CONF_KEY) + "' because of a " +
                            "SecurityException: " + securityException);
                } else {
                    throw new LoginException("Client cannot SASL-authenticate because the specified JAAS configuration " +
                            "section '" + explicitClientSection + "' could not be found.");
                }
            } else {
                // The user did not override the default context. It might be that they just don't intend to use SASL,
                // so log at INFO, not WARN, since they don't expect any SASL-related information.
                String msg = "Will not attempt to authenticate using SASL ";
                if (securityException != null) {
                    msg += "(" + securityException.getLocalizedMessage() + ")";
                } else {
                    msg += "(unknown error)";
                }
                this.configStatus = msg;
            }
            if (System.getProperty(Environment.JAAS_CONF_KEY) != null) {
                // Again, the user explicitly set something SASL-related, so they probably expected SASL to succeed.
                if (securityException != null) {
                    throw new LoginException("Zookeeper client cannot authenticate using the '" +
                            System.getProperty(ZooKeeperSaslClient.LOGIN_CONTEXT_NAME_KEY, "Client") +
                            "' section of the supplied JAAS configuration: '" +
                            System.getProperty(Environment.JAAS_CONF_KEY) + "' because of a " +
                            "SecurityException: " + securityException);
                } else {
                    throw new LoginException("No JAAS configuration section named '" +
                            System.getProperty(ZooKeeperSaslClient.LOGIN_CONTEXT_NAME_KEY, "Client") +
                            "' was found in specified JAAS configuration file: '" +
                            System.getProperty(Environment.JAAS_CONF_KEY) + "'.");
                }
            }
        }
    }

    /**
     * @return informational message indicating the current configuration status.
     */
    public String getConfigStatus() {
        return configStatus;
    }

    public boolean isComplete() {
        return (saslState == SaslState.COMPLETE);
    }

    public boolean isFailed() {
        return (saslState == SaslState.FAILED);
    }

    public static class ServerSaslResponseCallback implements AsyncCallback.DataCallback {
        public void processResult(int rc, String path, Object ctx, byte data[], Stat stat) {
            // processResult() is used by ClientCnxn's sendThread to respond to
            // data[] contains the Zookeeper Server's SASL token.
            // ctx is the ZooKeeperSaslClient object. We use this object's respondToServer() method
            // to reply to the Zookeeper Server's SASL token
            ZooKeeperSaslClient client = ((ClientCnxn)ctx).zooKeeperSaslClient;
            if (client == null) {
                LOG.warn("sasl client was unexpectedly null: cannot respond to Zookeeper server.");
                return;
            }
            byte[] usedata = data;
            if (data != null) {
                LOG.debug("ServerSaslResponseCallback(): saslToken server response: (length="+usedata.length+")");
            } else {
                usedata = new byte[0];
                LOG.debug("ServerSaslResponseCallback(): using empty data[] as server response (length="+usedata.length+")");
            }
            client.respondToServer(usedata, (ClientCnxn)ctx);
        }
    }

    synchronized private SaslClient createSaslClient(final String servicePrincipal,
                                                     final String loginContext) throws LoginException {
        try {
            if (login == null) {
                if (LOG.isDebugEnabled()) {
                    LOG.debug("JAAS loginContext is: " + loginContext);
                }
                // note that the login object is static: it's shared amongst all zookeeper-related connections.
                // createSaslClient() must be declared synchronized so that login is initialized only once.
                login = new Login(loginContext, new ClientCallbackHandler(null));
                login.startThreadIfNeeded();
            }
            Subject subject = login.getSubject();
            SaslClient saslClient;
            // Use subject.getPrincipals().isEmpty() as an indication of which SASL mechanism to use:
            // if empty, use DIGEST-MD5; otherwise, use GSSAPI.
            if (subject.getPrincipals().isEmpty()) {
                // no principals: must not be GSSAPI: use DIGEST-MD5 mechanism instead.
                LOG.info("Client will use DIGEST-MD5 as SASL mechanism.");
                String[] mechs = {"DIGEST-MD5"};
                String username = (String)(subject.getPublicCredentials().toArray()[0]);
                String password = (String)(subject.getPrivateCredentials().toArray()[0]);
                // "zk-sasl-md5" is a hard-wired 'domain' parameter shared with zookeeper server code (see ServerCnxnFactory.java)
                saslClient = Sasl.createSaslClient(mechs, username, "zookeeper", "zk-sasl-md5", null, new ClientCallbackHandler(password));
                return saslClient;
            } else {
                // GSSAPI.
                final Object[] principals = subject.getPrincipals().toArray();
                // determine client principal from subject.
                final Principal clientPrincipal = (Principal)principals[0];
                final KerberosName clientKerberosName = new KerberosName(clientPrincipal.getName());
                // assume that server and client are in the same realm (by default; unless the system property
                // "zookeeper.server.realm" is set).
                String serverRealm = System.getProperty("zookeeper.server.realm",clientKerberosName.getRealm());
                KerberosName serviceKerberosName = new KerberosName(servicePrincipal+"@"+serverRealm);
                final String serviceName = serviceKerberosName.getServiceName();
                final String serviceHostname = serviceKerberosName.getHostName();
                final String clientPrincipalName = clientKerberosName.toString();
                try {
                    saslClient = Subject.doAs(subject,new PrivilegedExceptionAction<SaslClient>() {
                        public SaslClient run() throws SaslException {
                            LOG.info("Client will use GSSAPI as SASL mechanism.");
                            String[] mechs = {"GSSAPI"};
                            LOG.debug("creating sasl client: client="+clientPrincipalName+";service="+serviceName+";serviceHostname="+serviceHostname);
                            SaslClient saslClient = Sasl.createSaslClient(mechs,clientPrincipalName,serviceName,serviceHostname,null,new ClientCallbackHandler(null));
                            return saslClient;
                        }
                    });
                    return saslClient;
                } catch (Exception e) {
                    LOG.error("Error creating SASL client:" + e);
                    e.printStackTrace();
                    return null;
                }
            }
        } catch (LoginException e) {
            // We throw LoginExceptions...
            throw e;
        } catch (Exception e) {
            // ..but consume (with a log message) all other types of exceptions.
            LOG.error("Exception while trying to create SASL client: " + e);
            return null;
        }
    }

    public void respondToServer(byte[] serverToken, ClientCnxn cnxn) {
        if (saslClient == null) {
            LOG.error("saslClient is unexpectedly null. Cannot respond to server's SASL message; ignoring.");
            return;
        }

        if (!(saslClient.isComplete())) {
            try {
                saslToken = createSaslToken(serverToken);
                if (saslToken != null) {
                    sendSaslPacket(saslToken, cnxn);
                }
            } catch (SaslException e) {
                LOG.error("SASL authentication failed using login context '" +
                        this.getLoginContext() + "'.");
                saslState = SaslState.FAILED;
                gotLastPacket = true;
            }
        }

        if (saslClient.isComplete()) {
            // GSSAPI: server sends a final packet after authentication succeeds
            // or fails.
            if ((serverToken == null) && (saslClient.getMechanismName() == "GSSAPI"))
                gotLastPacket = true;
            // non-GSSAPI: no final packet from server.
            if (saslClient.getMechanismName() != "GSSAPI") {
                gotLastPacket = true;
            }
            // SASL authentication is completed, successfully or not:
            // enable the socket's writable flag so that any packets waiting for authentication to complete in
            // the outgoing queue will be sent to the Zookeeper server.
            cnxn.enableWrite();
        }
    }

    private byte[] createSaslToken() throws SaslException {
        saslState = SaslState.INTERMEDIATE;
        return createSaslToken(saslToken);
    }

    private byte[] createSaslToken(final byte[] saslToken) throws SaslException {
        if (saslToken == null) {
            // TODO: introspect about runtime environment (such as jaas.conf)
            saslState = SaslState.FAILED;
            throw new SaslException("Error in authenticating with a Zookeeper Quorum member: the quorum member's saslToken is null.");
        }

        Subject subject = login.getSubject();
        if (subject != null) {
            synchronized(login) {
                try {
                    final byte[] retval =
                        Subject.doAs(subject, new PrivilegedExceptionAction<byte[]>() {
                            public byte[] run() throws SaslException {
                                LOG.debug("saslClient.evaluateChallenge(len="+saslToken.length+")");
                                return saslClient.evaluateChallenge(saslToken);
                            }
                        });
                    return retval;
                } catch (PrivilegedActionException e) {
                    String error = "An error: (" + e + ") occurred when evaluating Zookeeper Quorum Member's " +
                            " received SASL token.";
                    // Try to provide hints to use about what went wrong so they can fix their configuration.
                    // TODO: introspect about e: look for GSS information.
                    final String UNKNOWN_SERVER_ERROR_TEXT =
                            "(Mechanism level: Server not found in Kerberos database (7) - UNKNOWN_SERVER)";
                    if (e.toString().indexOf(UNKNOWN_SERVER_ERROR_TEXT) > -1) {
                        error += " This may be caused by Java's being unable to resolve the Zookeeper Quorum Member's" +
                                " hostname correctly. You may want to try to adding" +
                                " '-Dsun.net.spi.nameservice.provider.1=dns,sun' to your client's JVMFLAGS environment.";
                    }
                    error += " Zookeeper Client will go to AUTH_FAILED state.";
                    LOG.error(error);
                    saslState = SaslState.FAILED;
                    throw new SaslException(error);
                }
            }
        } else {
            throw new SaslException("Cannot make SASL token without subject defined. " +
                    "For diagnosis, please look for WARNs and ERRORs in your log related to the Login class.");
        }
    }

    private void sendSaslPacket(byte[] saslToken, ClientCnxn cnxn) throws SaslException{
        if (LOG.isDebugEnabled()) {
            LOG.debug("ClientCnxn:sendSaslPacket:length="+saslToken.length);
        }

        GetSASLRequest request = new GetSASLRequest();
        request.setToken(saslToken);
        SetSASLResponse response = new SetSASLResponse();
        ServerSaslResponseCallback cb = new ServerSaslResponseCallback();

        try {
            cnxn.sendPacket(request,response,cb, ZooDefs.OpCode.sasl);
        } catch (IOException e) {
            throw new SaslException("Failed to send SASL packet to server.", e);
        }
    }

    private void sendSaslPacket(ClientCnxn cnxn) throws SaslException {
        if (LOG.isDebugEnabled()) {
            LOG.debug("ClientCnxn:sendSaslPacket:length="+saslToken.length);
        }
        GetSASLRequest request = new GetSASLRequest();
        request.setToken(createSaslToken());
        SetSASLResponse response = new SetSASLResponse();
        ServerSaslResponseCallback cb = new ServerSaslResponseCallback();
        try {
            cnxn.sendPacket(request,response,cb, ZooDefs.OpCode.sasl);
        } catch (IOException e) {
            throw new SaslException("Failed to send SASL packet to server due " +
                    "to IOException:", e);
        }
    }

    // used by ClientCnxn to know whether to emit a SASL-related event: either AuthFailed or SaslAuthenticated,
    // or none, if not ready yet. Sets saslState to COMPLETE as a side-effect.
    public KeeperState getKeeperState() {
        if (saslClient != null) {
            if (saslState == SaslState.FAILED) {
                return KeeperState.AuthFailed;
            }
            if (saslClient.isComplete()) {
                if (saslState == SaslState.INTERMEDIATE) {
                    saslState = SaslState.COMPLETE;
                    return KeeperState.SaslAuthenticated;
                }
            }
        }
        // No event ready to emit yet.
        return null;
    }

    // Initialize the client's communications with the Zookeeper server by sending the server the first
    // authentication packet.
    public void initialize(ClientCnxn cnxn) throws SaslException {
        if (saslClient == null) {
            saslState = SaslState.FAILED;
            throw new SaslException("saslClient failed to initialize properly: it's null.");
        }
        if (saslState == SaslState.INITIAL) {
            if (saslClient.hasInitialResponse()) {
                sendSaslPacket(cnxn);
            } else {
                byte[] emptyToken = new byte[0];
                sendSaslPacket(emptyToken, cnxn);
            }
            saslState = SaslState.INTERMEDIATE;
        }
    }

    // The CallbackHandler interface here refers to
    // javax.security.auth.callback.CallbackHandler.
    // It should not be confused with Zookeeper packet callbacks like
    // org.apache.zookeeper.server.auth.SaslServerCallbackHandler.
    public static class ClientCallbackHandler implements CallbackHandler {
        private String password = null;

        public ClientCallbackHandler(String password) {
            this.password = password;
        }

        public void handle(Callback[] callbacks) throws UnsupportedCallbackException {
            for (Callback callback : callbacks) {
                if (callback instanceof NameCallback) {
                    NameCallback nc = (NameCallback) callback;
                    nc.setName(nc.getDefaultName());
                } else {
                    if (callback instanceof PasswordCallback) {
                        PasswordCallback pc = (PasswordCallback)callback;
                        if (password != null) {
                            pc.setPassword(this.password.toCharArray());
                        } else {
                            LOG.warn("Could not login: the client is being asked for a password, but the Zookeeper" +
                                    " client code does not currently support obtaining a password from the user." +
                                    " Make sure that the client is configured to use a ticket cache (using" +
                                    " the JAAS configuration setting 'useTicketCache=true)' and restart the client. If" +
                                    " you still get this message after that, the TGT in the ticket cache has expired and must" +
                                    " be manually refreshed. To do so, first determine if you are using a password or a" +
                                    " keytab. If the former, run kinit in a Unix shell in the environment of the user who" +
                                    " is running this Zookeeper client using the command" +
                                    " 'kinit <princ>' (where <princ> is the name of the client's Kerberos principal)." +
                                    " If the latter, do" +
                                    " 'kinit -k -t <keytab> <princ>' (where <princ> is the name of the Kerberos principal, and" +
                                    " <keytab> is the location of the keytab file). After manually refreshing your cache," +
                                    " restart this client. If you continue to see this message after manually refreshing" +
                                    " your cache, ensure that your KDC host's clock is in sync with this host's clock.");
                        }
                    } else {
                        if (callback instanceof RealmCallback) {
                            RealmCallback rc = (RealmCallback) callback;
                            rc.setText(rc.getDefaultText());
                        } else {
                            if (callback instanceof AuthorizeCallback) {
                                AuthorizeCallback ac = (AuthorizeCallback) callback;
                                String authid = ac.getAuthenticationID();
                                String authzid = ac.getAuthorizationID();
                                if (authid.equals(authzid)) {
                                    ac.setAuthorized(true);
                                } else {
                                    ac.setAuthorized(false);
                                }
                                if (ac.isAuthorized()) {
                                    ac.setAuthorizedID(authzid);
                                }
                            } else {
                                throw new UnsupportedCallbackException(callback,"Unrecognized SASL ClientCallback");
                            }
                        }
                    }
                }
            }
        }
    }

    public boolean clientTunneledAuthenticationInProgress() {
        // TODO: Rather than checking a disjunction here, should be a single member
        // variable or method in this class to determine whether the client is
        // configured to use SASL. (see also ZOOKEEPER-1455).
        try {
            if ((System.getProperty(Environment.JAAS_CONF_KEY) != null) ||
                ((javax.security.auth.login.Configuration.getConfiguration() != null) &&
                 (javax.security.auth.login.Configuration.getConfiguration().
                      getAppConfigurationEntry(System.
                      getProperty(ZooKeeperSaslClient.LOGIN_CONTEXT_NAME_KEY,"Client"))
                      != null))) {
                // Client is configured to use a valid login Configuration, so
                // authentication is either in progress, successful, or failed.

                // 1. Authentication hasn't finished yet: we must wait for it to do so.
                if ((isComplete() == false) &&
                    (isFailed() == false)) {
                    return true;
                }

                // 2. SASL authentication has succeeded or failed..
                if (isComplete() || isFailed()) {
                    if (gotLastPacket == false) {
                        // ..but still in progress, because there is a final SASL
                        // message from server which must be received.
                        return true;
                    }
                }
            }
            // Either client is not configured to use a tunnelled authentication
            // scheme, or tunnelled authentication has completed (successfully or
            // not), and all server SASL messages have been received.
            return false;
        } catch (/*SecurityException*/RuntimeException e) { // modify by duanbo 2013-05-11
            // Thrown if the caller does not have permission to retrieve the Configuration.
            // In this case, simply returning false is correct.
            if (LOG.isDebugEnabled() == true) {
                LOG.debug("Could not retrieve login configuration: " + e);
            }
            return false;
        }
    }
}
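The essence of the changes marked "by duanbo" can be shown in isolation. The stack trace reports a java.lang.IllegalArgumentException, which is a RuntimeException but not a SecurityException, so a catch clause for SecurityException lets it escape, while a catch clause for RuntimeException handles it. In the sketch below, lookup is a hypothetical stand-in for the JAAS configuration call that fails under Weblogic; the class and method names are assumptions of this example.

```java
public class CatchWideningSketch {

    /** Hypothetical stand-in for a JAAS lookup that fails the way the stack trace shows. */
    static Object lookup(String section) {
        throw new IllegalArgumentException(
                "No Configuration was registered that can handle the configuration named " + section);
    }

    /** Original behavior: only SecurityException is caught, so IllegalArgumentException escapes. */
    static String narrowCatch(String section) {
        try {
            lookup(section);
            return "configured";
        } catch (SecurityException e) {
            return "skipping SASL: " + e.getMessage();
        }
    }

    /** Patched behavior: RuntimeException also covers IllegalArgumentException. */
    static String wideCatch(String section) {
        try {
            lookup(section);
            return "configured";
        } catch (RuntimeException e) {
            return "skipping SASL: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        try {
            System.out.println(narrowCatch("Client"));
        } catch (IllegalArgumentException e) {
            System.out.println("escaped the narrow catch: " + e.getMessage());
        }
        System.out.println(wideCatch("Client"));
    }
}
```

Widening the catch to RuntimeException is a blunt fix, since it also swallows unrelated runtime errors raised by the lookup. It works here because the surrounding code already treats a failed lookup as "SASL is not configured".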