• REDIS源码中一些值得学习的技术细节02


    Redis中散列函数的实现:

    Redis针对整数key和字符串key,采用了不同的散列函数

    对于整数key,redis使用了 Thomas Wang的 32 bit Mix Function,实现了dict.c/dictIntHashFunction函数:

     1 /* Thomas Wang's 32 bit Mix Function */
     2 unsigned int dictIntHashFunction(unsigned int key)
     3 {
     4     key += ~(key << 15);
     5     key ^=  (key >> 10);
     6     key +=  (key << 3);
     7     key ^=  (key >> 6);
     8     key += ~(key << 11);
     9     key ^=  (key >> 16);
    10     return key;
    11 }

    这段代码的妙处我还没来得及仔细研究,等研究好了会在这里补上,不过找到了两个初看还不错的链接:

    首先是Thomas Wang大神本人的链接:

    http://web.archive.org/web/20071223173210/http://www.concentric.net/~Ttwang/tech/inthash.htm

    再者是他人根据上面链接和其他资料写的总结

    http://blog.csdn.net/jasper_xulei/article/details/18364313

    对于字符串形式的key,redis使用了MurmurHash2算法和djb算法:

    MurmurHash2算法对于key是大小写敏感的,而且在大端机器和小端机器上生成结果不一致

    redis的dict.c/dictGenHashFunction是MurmurHash2算法的C语言实现:

     1 unsigned int dictGenHashFunction(const void *key, int len) {
     2     /* 'm' and 'r' are mixing constants generated offline.
     3      They're not really 'magic', they just happen to work well.  */
     4     uint32_t seed = dict_hash_function_seed;
     5     const uint32_t m = 0x5bd1e995;
     6     const int r = 24;
     7 
     8     /* Initialize the hash to a 'random' value */
     9     uint32_t h = seed ^ len;
    10 
    11     /* Mix 4 bytes at a time into the hash */
    12     const unsigned char *data = (const unsigned char *)key;
    13 
    14     while(len >= 4) {
    15         uint32_t k = *(uint32_t*)data;
    16 
    17         k *= m;
    18         k ^= k >> r;
    19         k *= m;
    20 
    21         h *= m;
    22         h ^= k;
    23 
    24         data += 4;
    25         len -= 4;
    26     }
    27 
    28     /* Handle the last few bytes of the input array  */
    29     switch(len) {
    30     case 3: h ^= data[2] << 16;
    31     case 2: h ^= data[1] << 8;
    32     case 1: h ^= data[0]; h *= m;
    33     };
    34 
    35     /* Do a few final mixes of the hash to ensure the last few
    36      * bytes are well-incorporated. */
    37     h ^= h >> 13;
    38     h *= m;
    39     h ^= h >> 15;
    40 
    41     return (unsigned int)h;
    42 }

    而redis则借助djb函数实现了不区分大小写的散列函数dict.c/dictGenCaseHashFunction:

    1 unsigned int dictGenCaseHashFunction(const unsigned char *buf, int len) {
    2     unsigned int hash = (unsigned int)dict_hash_function_seed;
    3 
    4     while (len--)
    5         hash = ((hash << 5) + hash) + (tolower(*buf++)); /* hash * 33 + c */
    6     return hash;
    7 }

    以上三个散列函数(dictIntHashFunction, dictIntHashFunction, dictGenCaseHashFunction)分别用在了redis的不同地方,用以实现了不同场合下的散列需求,接下来的文章将会详细介绍。

     

  • 相关阅读:
    什么是 rel="shortlink" ?
    HTML5 怎么兼容 XHTML
    Connection to `ssl://pecl.php.net:443' failed: mac系统
    记录小程序中经常犯的错误---在wxml中使用js方法(打脸不)
    从js对象数组中删除某一个对象
    彻底搞懂字符串提取方法 slice,substr,substring
    vue 项目中笔记-持续更新
    CF526G Spiders Evil Plan
    BZOJ2178 圆的面积并
    计算几何入门
  • 原文地址:https://www.cnblogs.com/viggoxskingdom/p/5183659.html
Copyright © 2020-2023  润新知