http://www.cnblogs.com/bonelee/p/6078947.html analyzes the ES bulk implementation, which contains this routing code:
ShardId shardId = clusterService.operationRouting().indexShards(clusterState, concreteIndex, request.id(), request.routing()).shardId();
It is implemented in https://github.com/elastic/elasticsearch/blob/master/core/src/main/java/org/elasticsearch/cluster/routing/OperationRouting.java
public ShardIterator indexShards(ClusterState clusterState, String index, String id, @Nullable String routing) {
    return shards(clusterState, index, id, routing).shardsIt();
}

protected IndexShardRoutingTable shards(ClusterState clusterState, String index, String id, String routing) {
    int shardId = generateShardId(indexMetaData(clusterState, index), id, routing);
    return clusterState.getRoutingTable().shardRoutingTable(index, shardId);
}

static int generateShardId(IndexMetaData indexMetaData, String id, @Nullable String routing) {
    final int hash;
    if (routing == null) {
        hash = Murmur3HashFunction.hash(id);
    } else {
        hash = Murmur3HashFunction.hash(routing);
    }
    // we don't use IMD#getNumberOfShards since the index might have been shrunk such that we need to use the size
    // of original index to hash documents
    return Math.floorMod(hash, indexMetaData.getRoutingNumShards()) / indexMetaData.getRoutingFactor();
}
So in the latest ES code the shard is chosen with:
Math.floorMod(hash, indexMetaData.getRoutingNumShards()) / indexMetaData.getRoutingFactor();
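To make the key selection concrete, here is a minimal sketch, assuming an index that was never shrunk (routing factor 1). The class RoutingKeySketch and the method standInHash are hypothetical names, and String.hashCode only stands in for Murmur3HashFunction.hash so that the example runs without Elasticsearch on the classpath. It shows that the routing value, when present, replaces the document id as the hash key.

```java
final class RoutingKeySketch {
    // Stand-in for Murmur3HashFunction.hash; String.hashCode is NOT what Elasticsearch uses.
    static int standInHash(String key) {
        return key.hashCode();
    }

    // Same shape as generateShardId with a routing factor of 1:
    // hash the routing value if there is one, otherwise the id.
    static int shardId(String id, String routing, int numShards) {
        String key = (routing == null) ? id : routing;
        // floorMod keeps the result in [0, numShards) even for negative hashes.
        return Math.floorMod(standInHash(key), numShards);
    }

    public static void main(String[] args) {
        System.out.println(shardId("doc1", null, 5));      // routed by id
        System.out.println(shardId("doc1", "user42", 5));  // routed by routing value
        System.out.println(shardId("doc2", "user42", 5));  // same shard as the line above
    }
}
```

Documents that share a routing value always land on the same shard, whatever their ids are.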
The getRoutingFactor implementation is in https://github.com/elastic/elasticsearch/blob/master/core/src/main/java/org/elasticsearch/cluster/metadata/IndexMetaData.java:
/**
 * Returns the routing factor for this index. The default is <tt>1</tt>.
 *
 * @see #getRoutingFactor(IndexMetaData, int) for details
 */
public int getRoutingFactor() {
    return routingFactor;
}
The constructor contains:
assert numberOfShards * routingFactor == routingNumShards : routingNumShards + " must be a multiple of " + numberOfShards;
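The default is 1, so routingNumShards equals numberOfShards and the formula reduces to Math.floorMod(hash, numberOfShards): every shard of the index is a possible routing target.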
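The effect of the routing factor is easier to see with numbers. The sketch below is an illustration under stated assumptions, not Elasticsearch code: RoutingFactorSketch and its helper routingFactor are hypothetical, and the factor is simply derived from the invariant in the assert above (numberOfShards * routingFactor == routingNumShards). Hash values are passed in directly.

```java
final class RoutingFactorSketch {
    // Hypothetical helper: derives the factor from the invariant
    // numberOfShards * routingFactor == routingNumShards.
    static int routingFactor(int routingNumShards, int numberOfShards) {
        if (routingNumShards % numberOfShards != 0) {
            throw new IllegalArgumentException(routingNumShards + " must be a multiple of " + numberOfShards);
        }
        return routingNumShards / numberOfShards;
    }

    // Same arithmetic as the return statement of generateShardId.
    static int shardId(int hash, int routingNumShards, int numberOfShards) {
        int factor = routingFactor(routingNumShards, numberOfShards);
        return Math.floorMod(hash, routingNumShards) / factor;
    }

    public static void main(String[] args) {
        for (int hash = 0; hash < 8; hash++) {
            int unshrunk = shardId(hash, 8, 8); // factor 1: plain floorMod
            int shrunk = shardId(hash, 8, 4);   // factor 2: two original shards per target shard
            System.out.println(hash + " -> " + unshrunk + " / " + shrunk);
        }
    }
}
```

With 8 routing shards and 4 actual shards, original shards 2k and 2k+1 both map to shard k, which is why hashing has to keep using the original shard count after a shrink.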
Note that ES 2.4 implements routing differently: https://github.com/elastic/elasticsearch/blob/2.4/core/src/main/java/org/elasticsearch/cluster/routing/
@SuppressForbidden(reason = "Math#abs is trappy")
private int generateShardId(ClusterState clusterState, String index, String type, String id, @Nullable String routing) {
    IndexMetaData indexMetaData = clusterState.metaData().index(index);
    if (indexMetaData == null) {
        throw new IndexNotFoundException(index);
    }
    final Version createdVersion = indexMetaData.getCreationVersion();
    final HashFunction hashFunction = indexMetaData.getRoutingHashFunction();
    final boolean useType = indexMetaData.getRoutingUseType();

    final int hash;
    if (routing == null) {
        if (!useType) {
            hash = hash(hashFunction, id);
        } else {
            hash = hash(hashFunction, type, id);
        }
    } else {
        hash = hash(hashFunction, routing);
    }
    if (createdVersion.onOrAfter(Version.V_2_0_0_beta1)) {
        return MathUtils.mod(hash, indexMetaData.getNumberOfShards());
    } else {
        return Math.abs(hash % indexMetaData.getNumberOfShards());
    }
}

@Deprecated
protected int hash(HashFunction hashFunction, String type, String id) {
    if (type == null || "_all".equals(type)) {
        throw new IllegalArgumentException("Can't route an operation with no type and having type part of the routing (for backward comp)");
    }
    return hashFunction.hash(type, id);
}
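The version check at the end matters because the two branches disagree for negative hashes. A small sketch of the difference follows; LegacyModSketch is a hypothetical class, and it assumes MathUtils.mod behaves like Math.floorMod for a positive shard count.

```java
final class LegacyModSketch {
    // Branch used for indices created before 2.0.0-beta1.
    static int legacyShard(int hash, int numberOfShards) {
        return Math.abs(hash % numberOfShards);
    }

    // Branch used for newer indices; Math.floorMod is assumed to match MathUtils.mod here.
    static int currentShard(int hash, int numberOfShards) {
        return Math.floorMod(hash, numberOfShards);
    }

    public static void main(String[] args) {
        // Non-negative hash: both branches agree.
        System.out.println(legacyShard(7, 5) + " " + currentShard(7, 5));   // 2 2
        // Negative hash: -7 % 5 is -2, so abs gives 2 while floorMod gives 3.
        System.out.println(legacyShard(-7, 5) + " " + currentShard(-7, 5)); // 2 3
    }
}
```

Applying the newer formula to an old index would therefore look for some existing documents on the wrong shard, so the creation version decides which branch is used.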
The hash function has three implementations:
DjbHashFunction.java
SimpleHashFunction.java
Murmur3HashFunction.java
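For a feel of what such a hash function looks like, here is a sketch of the classic DJB2 string hash that DjbHashFunction is named after. It is written from the well-known algorithm, not copied from the Elasticsearch source, so treat the class name Djb2Sketch and the exact arithmetic as assumptions.

```java
final class Djb2Sketch {
    // Classic DJB2: start at 5381, then hash = hash * 33 + c for each character.
    // int overflow simply wraps around, which is fine for a hash value.
    static int djb2(String value) {
        int hash = 5381;
        for (int i = 0; i < value.length(); i++) {
            hash = hash * 33 + value.charAt(i);
        }
        return hash;
    }

    public static void main(String[] args) {
        int hash = djb2("a");                         // 5381 * 33 + 97 = 177670
        System.out.println(hash);
        System.out.println(Math.floorMod(hash, 5));   // shard for a 5-shard index
    }
}
```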
The hash-related settings are:
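# Number of shards
index.number_of_shards
# Number of replicas
index.number_of_replicas
# Routing rule for the index: which hash function to use. Murmur3 is the default; a plain hash algorithm is also available
index.legacy.routing.hash.type
# Whether the routing calculation includes the type. This way of computing the shard id is deprecated; leave it unset so the default of false applies
index.legacy.routing.use_type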