Java Data Structures: A Study of the java.util.BitSet Source

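Following up on the previous post, "An Interview Question, Java Bit Operations, and Using the BitSet Library", this post walks through the source of the BitSet class in the JDK.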

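A bitmap, i.e. a collection of bits, is a common data structure for recording a large number of 0/1 states. It shows up in many places, such as the Linux kernel (inode and disk-block bitmaps) and Bloom filters, and its strength is that it stores those states with very high space efficiency. In Java, the smallest unit of data a programmer can address directly is the byte; there is no built-in way to operate on a single bit, but you can wrap bit operations yourself with the bitwise operators (& | ~ << >> and so on). If you would rather not do that by hand, Java ships the BitSet class, which implements a bitmap and provides a set of useful methods.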
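
To make "wrapping bit operations yourself" concrete, here is a minimal sketch of a hand-rolled bitmap over a byte[]. TinyBitmap and its methods are hypothetical names made up for illustration, and there is no bounds checking or growth; it only shows the shift-and-mask idea that BitSet applies to long words.

```java
// Hypothetical hand-rolled bitmap, for illustration only (not part of the JDK).
final class TinyBitmap {
    private final byte[] bytes;

    TinyBitmap(int nbits) {
        // Round up to a whole number of bytes.
        bytes = new byte[(nbits + 7) >>> 3];
    }

    // i >>> 3 selects the byte, i & 7 selects the bit inside that byte.
    void set(int i)    { bytes[i >>> 3] |= (byte) (1 << (i & 7)); }
    void clear(int i)  { bytes[i >>> 3] &= (byte) ~(1 << (i & 7)); }
    boolean get(int i) { return (bytes[i >>> 3] & (1 << (i & 7))) != 0; }

    public static void main(String[] args) {
        TinyBitmap bitmap = new TinyBitmap(20);
        bitmap.set(10);
        System.out.println(bitmap.get(10)); // true
        bitmap.clear(10);
        System.out.println(bitmap.get(10)); // false
    }
}
```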

java.util.BitSet is not a large class: the code is under 1,200 lines and is not hard to follow. Below I go through a few key pieces of it. (Note: the code below is taken from Oracle jdk1.7.0_45.)

1. Some fields

    /*
     * BitSets are packed into arrays of "words."  Currently a word is
     * a long, which consists of 64 bits, requiring 6 address bits.
     * The choice of word size is determined purely by performance concerns.
     */
    private final static int ADDRESS_BITS_PER_WORD = 6;
    private final static int BITS_PER_WORD = 1 << ADDRESS_BITS_PER_WORD;
    private final static int BIT_INDEX_MASK = BITS_PER_WORD - 1;

    /* Used to shift left or right for a partial word mask */
    private static final long WORD_MASK = 0xffffffffffffffffL;

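The comment already says it clearly: BitSet stores its data in a long[]. A long has 64 bits, so ADDRESS_BITS_PER_WORD is 6 (2^6 = 64, i.e. it takes 6 address bits to index 64 positions). BITS_PER_WORD is 1 shifted left by 6, i.e. 1 × 2^6 = 64, meaning one "word" (a long) holds 64 bits. BIT_INDEX_MASK is 63, or 0x3f in hex, i.e. the low 6 bits are all 1. WORD_MASK is all ones; it is shifted left or right to build masks that cover part of a word.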
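
As a rough illustration of what "shift left or right for a partial word mask" can mean, the sketch below rebuilds the constants and derives a mask covering bits [from, to) inside one word. rangeMask is a hypothetical helper written for this post, not a BitSet method, and it assumes 0 <= from < to <= 64.

```java
// Illustrative sketch; WordMaskDemo and rangeMask are hypothetical names.
final class WordMaskDemo {
    static final int ADDRESS_BITS_PER_WORD = 6;
    static final int BITS_PER_WORD = 1 << ADDRESS_BITS_PER_WORD; // 64
    static final int BIT_INDEX_MASK = BITS_PER_WORD - 1;         // 63 == 0x3f
    static final long WORD_MASK = 0xffffffffffffffffL;           // all ones == -1L

    // Mask with bits [from, to) set, assuming 0 <= from < to <= 64.
    static long rangeMask(int from, int to) {
        long first = WORD_MASK << from; // ones from bit 'from' upward
        long last = WORD_MASK >>> -to;  // ones below bit 'to' (shift distance is -to & 63)
        return first & last;
    }

    public static void main(String[] args) {
        System.out.println(BITS_PER_WORD);                         // 64
        System.out.println(Integer.toHexString(BIT_INDEX_MASK));   // 3f
        System.out.println(Long.toBinaryString(rangeMask(2, 5)));  // 11100
    }
}
```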

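As for why long was chosen, the comment attributes it to performance: 64-bit CPUs are now everywhere and can load a whole 64-bit long into a register and operate on it in one go.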

    /**
     * The internal field corresponding to the serialField "bits".
     */
    private long[] words;

The words field is where the data is actually stored.

2. Some shared helper functions

    /**
     * Given a bit index, return word index containing it.
     */
    private static int wordIndex(int bitIndex) {
        return bitIndex >> ADDRESS_BITS_PER_WORD;
    }

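This static function is used by many other methods. Given a bit index bitIndex, it returns the index within the long[] of the long that holds that bit. It does so by shifting bitIndex right (arithmetically) by 6 bits, which for a non-negative index is the same as dividing by 64, since a long is 64 bits wide. For example, the bit at index 50 lives in word 50 / 64 = 0, i.e. words[0].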
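
A small sketch of how a bit index splits into a word index and a position inside that word. wordIndex mirrors the JDK method quoted above; bitOffset is a hypothetical helper added here only to show the other half of the split.

```java
// Illustrative sketch; bitOffset is a hypothetical helper, not a BitSet method.
final class WordIndexDemo {
    private static final int ADDRESS_BITS_PER_WORD = 6;
    private static final int BIT_INDEX_MASK = (1 << ADDRESS_BITS_PER_WORD) - 1;

    // Same expression as BitSet.wordIndex: which long holds the bit.
    static int wordIndex(int bitIndex) {
        return bitIndex >> ADDRESS_BITS_PER_WORD;
    }

    // Position of the bit inside its word (0..63).
    static int bitOffset(int bitIndex) {
        return bitIndex & BIT_INDEX_MASK;
    }

    public static void main(String[] args) {
        for (int bitIndex : new int[] {0, 50, 63, 64, 130}) {
            System.out.println(bitIndex + " -> word " + wordIndex(bitIndex)
                    + ", offset " + bitOffset(bitIndex));
        }
        // 0 -> word 0, offset 0
        // 50 -> word 0, offset 50
        // 63 -> word 0, offset 63
        // 64 -> word 1, offset 0
        // 130 -> word 2, offset 2
    }
}
```

Because the shift is arithmetic, wordIndex(-1) is -1 rather than a large positive number.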

3. Constructors

    /**
     * Creates a new bit set. All bits are initially {@code false}.
     */
    public BitSet() {
        initWords(BITS_PER_WORD);
        sizeIsSticky = false;
    }

    /**
     * Creates a bit set whose initial size is large enough to explicitly
     * represent bits with indices in the range {@code 0} through
     * {@code nbits-1}. All bits are initially {@code false}.
     *
     * @param  nbits the initial size of the bit set
     * @throws NegativeArraySizeException if the specified initial size
     *         is negative
     */
    public BitSet(int nbits) {
        // nbits can't be negative; size 0 is OK
        if (nbits < 0)
            throw new NegativeArraySizeException("nbits < 0: " + nbits);

        initWords(nbits);
        sizeIsSticky = true;
    }

    private void initWords(int nbits) {
        words = new long[wordIndex(nbits-1) + 1];
    }

The no-argument constructor allocates a BitSet of 64 bits. BitSet(int nbits) allocates the smallest multiple of 64 bits that is at least nbits; for example, BitSet(100) allocates 128 bits (i.e. 2 longs).
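
The allocation rule can be observed through size(), which reports the number of bits of space currently backing the set. The sketch below assumes the initWords logic shown above; BitSetSizeDemo is a hypothetical name, and the numbers describe this implementation rather than a guarantee of the API.

```java
import java.util.BitSet;

// Illustrative sketch; the printed sizes assume the initWords logic quoted above.
final class BitSetSizeDemo {
    static int defaultSize() {
        return new BitSet().size();
    }

    static int sizeFor(int nbits) {
        return new BitSet(nbits).size();
    }

    public static void main(String[] args) {
        System.out.println(defaultSize()); // 64
        System.out.println(sizeFor(0));    // 0   (wordIndex(-1) + 1 == 0 words)
        System.out.println(sizeFor(64));   // 64
        System.out.println(sizeFor(65));   // 128
        System.out.println(sizeFor(100));  // 128
    }
}
```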

public static BitSet valueOf(long[] longs)
public static BitSet valueOf(LongBuffer lb)
public static BitSet valueOf(byte[] bytes)
public static BitSet valueOf(ByteBuffer bb)

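BitSet also provides static factory methods that build a BitSet directly from existing data. Note that all four methods above copy the argument for the BitSet's own use, so they never modify the data passed in, and later changes to that data do not affect the BitSet.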
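
A quick way to convince yourself of the copying behaviour is to mutate the source array after the call. This is only a sketch; ValueOfCopyDemo and its method names are hypothetical.

```java
import java.util.BitSet;

// Illustrative sketch of valueOf's copy semantics.
final class ValueOfCopyDemo {
    static boolean survivesSourceMutation() {
        long[] source = {0b101L};            // bits 0 and 2 set
        BitSet bits = BitSet.valueOf(source);
        source[0] = 0L;                      // change the caller's array afterwards
        return bits.get(0) && !bits.get(1) && bits.get(2);
    }

    static String fromBytes() {
        // 0x03 has its two lowest bits set.
        return BitSet.valueOf(new byte[] {0x03}).toString();
    }

    public static void main(String[] args) {
        System.out.println(survivesSourceMutation()); // true
        System.out.println(fromBytes());              // {0, 1}
    }
}
```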

4. Growing capacity dynamically

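As mentioned in the previous post, when an operation such as set() is given an index beyond the BitSet's current capacity, the BitSet grows automatically to the required length. This mainly relies on the following function: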

    /**
     * Ensures that the BitSet can hold enough words.
     * @param wordsRequired the minimum acceptable number of words.
     */
    private void ensureCapacity(int wordsRequired) {
        if (words.length < wordsRequired) {
            // Allocate larger of doubled size or required size
            int request = Math.max(2 * words.length, wordsRequired);
            words = Arrays.copyOf(words, request);
            sizeIsSticky = false;
        }
    }

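The parameter wordsRequired is the number of words needed. It is compared with the current length of words; if wordsRequired is larger, Arrays.copyOf allocates a new long[] whose length is the larger of twice the current length and wordsRequired, copies the current contents into it, and words is then pointed at the new array. That completes the dynamic growth, which is very similar to how ArrayList grows. It also shows that this code is not thread-safe: under multi-threaded contention the caller has to synchronize externally.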
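
The growth policy can be observed from outside through size(). The sketch below assumes the ensureCapacity logic quoted above (other JDK versions could in principle grow differently); GrowthDemo and sizeAfterSet are hypothetical names.

```java
import java.util.BitSet;

// Illustrative sketch; expected values assume the max(2 * length, required) policy above.
final class GrowthDemo {
    static int sizeAfterSet(int initialBits, int bitToSet) {
        BitSet bits = new BitSet(initialBits);
        bits.set(bitToSet);
        return bits.size(); // capacity in bits, i.e. words.length * 64
    }

    public static void main(String[] args) {
        // 2 words, bit 100 already fits: no growth.
        System.out.println(sizeAfterSet(128, 100)); // 128
        // 2 words, bit 128 needs 3 words: doubling wins, max(4, 3) = 4 words.
        System.out.println(sizeAfterSet(128, 128)); // 256
        // 1 word, bit 500 needs 8 words: the requirement wins, max(2, 8) = 8 words.
        System.out.println(sizeAfterSet(64, 500));  // 512
    }
}
```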

5. flip: inverting a single bit

    /**
     * Sets the bit at the specified index to the complement of its
     * current value.
     *
     * @param  bitIndex the index of the bit to flip
     * @throws IndexOutOfBoundsException if the specified index is negative
     * @since  1.4
     */
    public void flip(int bitIndex) {
        if (bitIndex < 0)
            throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);

        int wordIndex = wordIndex(bitIndex);
        expandTo(wordIndex);

        words[wordIndex] ^= (1L << bitIndex);

        recalculateWordsInUse();
        checkInvariants();
    }

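flip inverts a single bit. It first finds the long that contains bitIndex (growing the array via expandTo if needed), then XORs that long with (1L << bitIndex). Note that bitIndex may be greater than 63. This is not a circular shift: when the left operand is a long, Java uses only the low 6 bits of the shift distance, so the actual shift is bitIndex & 0x3f, which equals bitIndex % 64 for a non-negative index. Suppose the user calls flip(66). The code first finds wordIndex = 1, i.e. the long words[1]. Then (1L << bitIndex) is (1L << (66 % 64)), i.e. (1L << 2) = 0b0100, where the third bit counting from the low end is 1 and all others are 0. Finally words[1] is XORed with 0b0100. XOR with 1 inverts a bit and XOR with 0 leaves it unchanged, so only the third bit of words[1] is flipped.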
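
The shift behaviour is easy to check directly. The sketch below contrasts << with a true rotation (Long.rotateLeft) and then looks at the word produced by flip(66) through toLongArray(). FlipDemo and lastWordAfterFlip are hypothetical names used for illustration.

```java
import java.util.BitSet;

// Illustrative sketch of shift-distance masking and of flip.
final class FlipDemo {
    // Flips one bit in an empty BitSet and returns the highest word in use.
    static long lastWordAfterFlip(int bitIndex) {
        BitSet bits = new BitSet();
        bits.flip(bitIndex);
        long[] words = bits.toLongArray();
        return words[words.length - 1];
    }

    public static void main(String[] args) {
        // Only the low 6 bits of the distance are used: 66 & 63 == 2.
        System.out.println((1L << 66) == (1L << 2)); // true

        // << is not a rotation: the top bit falls off instead of wrapping around.
        long top = 0x8000000000000000L;
        System.out.println(top << 1);                // 0
        System.out.println(Long.rotateLeft(top, 1)); // 1

        // flip(66) sets bit 2 of words[1].
        System.out.println(lastWordAfterFlip(66));   // 4

        // Flipping the same bit twice restores the empty set.
        BitSet bits = new BitSet();
        bits.flip(66);
        bits.flip(66);
        System.out.println(bits.isEmpty());          // true
    }
}
```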

6. clear: clearing a single bit

    /**
     * Sets the bit specified by the index to {@code false}.
     *
     * @param  bitIndex the index of the bit to be cleared
     * @throws IndexOutOfBoundsException if the specified index is negative
     * @since  JDK1.0
     */
    public void clear(int bitIndex) {
        if (bitIndex < 0)
            throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);

        int wordIndex = wordIndex(bitIndex);
        if (wordIndex >= wordsInUse)
            return;

        words[wordIndex] &= ~(1L << bitIndex);

        recalculateWordsInUse();
        checkInvariants();
    }

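In other words, clear sets a single bit to 0. The process is similar to flip, with two differences. First, if the word lies beyond wordsInUse the method returns immediately: that bit is already 0, so there is no need to grow the array. Second, the bit operation is different: (1L << bitIndex) is inverted and then ANDed with words[wordIndex]. The principle is simple: ANDing a bit with 1 leaves it unchanged, while ANDing it with 0 yields 0. For example: 1100 & ~(0100) equals 1100 & 1011 = 1000.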
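
The same mask arithmetic on a plain long, plus a check that clearing a far-away bit does not grow the set. This is a sketch: clearBit is a hypothetical helper that reuses the expression from clear, and the size value assumes the early-return logic quoted above.

```java
import java.util.BitSet;

// Illustrative sketch of the AND-with-inverted-mask idiom used by clear.
final class ClearDemo {
    static long clearBit(long word, int bitIndex) {
        return word & ~(1L << bitIndex);
    }

    static int sizeAfterClearingFarBit() {
        BitSet bits = new BitSet(); // 64 bits of capacity, nothing set
        bits.clear(1000);           // word 15 is beyond wordsInUse: early return
        return bits.size();
    }

    public static void main(String[] args) {
        System.out.println(Long.toBinaryString(clearBit(0b1100L, 2))); // 1000
        System.out.println(sizeAfterClearingFarBit());                 // 64
    }
}
```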

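BitSet also provides get and set, as well as methods that AND/OR/XOR the set with another BitSet. These are all built on the same bit operations and are easy to follow, so I will not go through them here; please refer to the API documentation.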
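
For completeness, a short usage sketch of those methods. Note that and, or and xor modify the receiver in place, so the helpers clone first. SetAlgebraDemo and its helper names are hypothetical.

```java
import java.util.BitSet;

// Illustrative usage sketch of set/get and the in-place and/or/xor methods.
final class SetAlgebraDemo {
    static BitSet of(int... indices) {
        BitSet bits = new BitSet();
        for (int index : indices) {
            bits.set(index);
        }
        return bits;
    }

    static String and(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone(); // keep 'a' untouched
        result.and(b);
        return result.toString();
    }

    static String or(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.or(b);
        return result.toString();
    }

    static String xor(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.xor(b);
        return result.toString();
    }

    public static void main(String[] args) {
        BitSet a = of(1, 2, 3);
        BitSet b = of(2, 3, 4);
        System.out.println(a.get(1) + " " + a.get(4)); // true false
        System.out.println(and(a, b));                 // {2, 3}
        System.out.println(or(a, b));                  // {1, 2, 3, 4}
        System.out.println(xor(a, b));                 // {1, 4}
    }
}
```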

Original article: https://www.cnblogs.com/yellowb/p/3661476.html