参考：

查看类 ArrayList 中 removeIf 方法源码时，发现其使用 BitSet 类来存储待删除的元素下标

之前没有接触过这个类，了解之后发现其在数据查询和存储方面有很大用处

主要内容：

BitSet 浅析
类变量和常量
构造器
set
clear -（2）
get
flip - (3)
valueOf
位运算（and, andNot, or, xor）
next
previous - (4)
判空 / 判断交集
大小（length, size, cardinality）
toByteArray 和 toLongArray
BitSet 分析
Java 素数查找
小结

`clear`

在前面的 set 方法中，如果设置位值为 false，那么会调用函数 clear

函数 clear 的功能是设置位值为 false，类 BitSet 提供了 3 种重载方式：

public void clear(int bitIndex)
public void clear(int fromIndex, int toIndex)
public void clear()

clear(int bitIndex)

源码如下：

public void clear(int bitIndex) {
    if (bitIndex < 0)
        throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);

    int wordIndex = wordIndex(bitIndex);
    if (wordIndex >= wordsInUse)
        return;

    words[wordIndex] &= ~(1L << bitIndex);

    recalculateWordsInUse();
    checkInvariants();
}

输入参数 bitIndex 表示位集下标

该函数功能是设置位集中下标为 bitIndex 的位值为 false

第一步：判断参数 bitIndex 是否小于 0，如果成立，抛出 IndexOutOfBoundsException 异常

第二步：定义变量 wordIndex 计算数组 words 中相对应的下标；并判断是否变量 wordIndex 在当前数组已使用个数范围内，如果不是，直接退出（因为默认为 false）

第三步：利用位左移运算符和位取反运算符计算掩码，然后对数组中下标 wordIndex 的值进行位与操作，设置该位值为 false

第四步：调用函数 recalculateWordsInUse 重新计算当前数组已使用个数；调用函数 checkInvariants 判断类变量是否符合条件

clear(int fromIndex, int toIndex)

源码如下：

public void clear(int fromIndex, int toIndex) {
    checkRange(fromIndex, toIndex);

    if (fromIndex == toIndex)
        return;

    int startWordIndex = wordIndex(fromIndex);
    if (startWordIndex >= wordsInUse)
        return;

    int endWordIndex = wordIndex(toIndex - 1);
    if (endWordIndex >= wordsInUse) {
        toIndex = length();
        endWordIndex = wordsInUse - 1;
    }

    long firstWordMask = WORD_MASK << fromIndex;
    long lastWordMask  = WORD_MASK >>> -toIndex;
    if (startWordIndex == endWordIndex) {
        // Case 1: One word
        words[startWordIndex] &= ~(firstWordMask & lastWordMask);
    } else {
        // Case 2: Multiple words
        // Handle first word
        words[startWordIndex] &= ~firstWordMask;

        // Handle intermediate words, if any
        for (int i = startWordIndex+1; i < endWordIndex; i++)
            words[i] = 0;

        // Handle last word
        words[endWordIndex] &= ~lastWordMask;
    }

    recalculateWordsInUse();
    checkInvariants();
}

该函数用于设置一段范围内的位值为 false

Note：该范围为 [fromIndex, toIndex)，不包括下标 toIndex 的位

使用方式和 set(int fromIndex, int toIndex) 类似

第一步：调用函数 checkRange 判断输入参数是否符合条件

第二步：判断输入参数 fromIndex 和 toIndex 是否相等，如果相等，直接退出

第三步：计算数组 words 中的起始下标 startWordIndex，如果该下标已超出当前数组 words 中已使用的个数，那么直接退出

第四步：计算数组 words 中的结束下标 endWordIndex，同样的，如果该下标已超出当前数组 words 中已使用的个数，那么调用函数 length() 计算当前已使用位集长度，赋值给 toIndex，同时设置 endWordIndex 值为数组已使用的最大下标

第五步：计算起始掩码 firstWordMask 和结束掩码 lastWordMask，将其使用于位操作中

第六步：调用函数 recalculateWordsInUse 重新计算当前数组已使用个数；调用函数 checkInvariants 判断类变量是否符合条件

clear()

源码如下：

public void clear() {
    while (wordsInUse > 0)
        words[--wordsInUse] = 0;
}

该函数功能是设置位集所有位为 false

`recalculateWordsInUse`

源码如下：

private void recalculateWordsInUse() {
    // Traverse the bitset until a used word is found
    int i;
    for (i = wordsInUse-1; i >= 0; i--)
        if (words[i] != 0)
            break;

    wordsInUse = i+1; // The new logical size
}

该函数用于重新计算数组 words 中已使用的个数

`get`

如果想要获取位值，可以调用函数 get，类 BitSet 提供了 2 种重载方式：

public boolean get(int bitIndex)
public BitSet get(int fromIndex, int toIndex)

get(int bitIndex)

源码如下：

public boolean get(int bitIndex) {
    if (bitIndex < 0)
        throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);

    checkInvariants();

    int wordIndex = wordIndex(bitIndex);
    return (wordIndex < wordsInUse)
        && ((words[wordIndex] & (1L << bitIndex)) != 0);
}

该函数功能是得到位集中对应下标 bitIndex 的位值

第一步：判断输入参数是否符合条件，以及调用函数 checkInvariants 判断类变量是否符合条件（不太明白为啥要在这里弄这个）

第二步：计算位集下标对应的数组下标 wordIndex，进行逻辑与操作，如果下标 wordIndex 超出数组已使用范围，那么直接返回 false；否则，返回该位值

get(int fromIndex, int toIndex)

源码如下：

public BitSet get(int fromIndex, int toIndex) {
    checkRange(fromIndex, toIndex);

    checkInvariants();

    int len = length();

    // If no set bits in range return empty bitset
    if (len <= fromIndex || fromIndex == toIndex)
        return new BitSet(0);

    // An optimization
    if (toIndex > len)
        toIndex = len;

    BitSet result = new BitSet(toIndex - fromIndex);
    int targetWords = wordIndex(toIndex - fromIndex - 1) + 1;
    int sourceIndex = wordIndex(fromIndex);
    boolean wordAligned = ((fromIndex & BIT_INDEX_MASK) == 0);

    // Process all words but the last word
    for (int i = 0; i < targetWords - 1; i++, sourceIndex++)
        result.words[i] = wordAligned ? words[sourceIndex] :
            (words[sourceIndex] >>> fromIndex) |
            (words[sourceIndex+1] << -fromIndex);

    // Process the last word
    long lastWordMask = WORD_MASK >>> -toIndex;
    result.words[targetWords - 1] =
        ((toIndex-1) & BIT_INDEX_MASK) < (fromIndex & BIT_INDEX_MASK)
        ? /* straddles source words */
        ((words[sourceIndex] >>> fromIndex) |
         (words[sourceIndex+1] & lastWordMask) << -fromIndex)
        :
        ((words[sourceIndex] & lastWordMask) >>> fromIndex);

    // Set wordsInUse correctly
    result.wordsInUse = targetWords;
    result.recalculateWordsInUse();
    result.checkInvariants();

    return result;
}

该函数返回一个新的 BitSet 对象，用于保存当前位集中范围在 [fromIndex, toIndex) 的位值

第一步：调用函数 checkRange 检查输入参数是否符合条件，同时调用函数 checkInvariants 判断类变量是否符合条件

第二步：定义变量 len，计算位集大小，如果当前位集长度不大于起始下标 fromIndex 或者两个输入参数相等，那么创建一个空的 BitSet 对象并返回；如果终止下标大于当前位集大小，则设置终止下标 toIndex 大小为当前位集大小 len

第三步：创建类 BitSet 对象 result，初始位集大小为待取值的位集范围；定义变量 targetWords，赋值为待取值的位集转换成的长整型整数个数；定义变量 sourceIndex，计算当前数组 words 中对应的起始下标；定义变量 wordAligned，计算起始下标是否是 64 的整数倍

第四步：前面已经计算得出，待取值的位集可转换为 targetWords 个 long 型整数。先从当前数组 words 中取出前 targetWors-1 个位值，然后再取出剩余部分的位值

第五步：设置 result 的变量并返回

实例测试

下面输入具体参数来验证函数 get(int fromIndex, int toIndex) 的计算过程

首先创建类对象 bitSet，设置位集 [3, 200) 为 true

调用函数 get，输入参数 fromIndex = 5, toIndex = 147

0 < fromIndex < toIndex，所以符合函数 checkRange 判断条件

变量 len = 200，经过判断后，fromIndex 和 toIndex 的值不变

创建类对象 result = new BitSet(142)，定义变量 targetWords = wordIndex(147-5-1)+1 = 3，定义变量 sourceIndex = 0，定义变量 wordAligned = false

for 循环如下所示：

for (int i=0; i<2; i++,sourceIndex ++)
    ...

其中

result.words[0] = (words[0] >>> fromIndex) | (words[1] << -fromIndex) 
    = (0xfffffffffffffff8L >>> 00 0101) | (0xffffffffffffffffL << 11 1011) 
    = (words[0] >>> 5) | (words[1] << 59)
    = 0x7fffffffffffffffL | 0xf800000000000000L
    = 0xffffffffffffffffL

result.words[1] = (words[1] >>> fromIndex) | (words[2] << -fromIndex) 
    = (0xffffffffffffffffL >>> 00 0101) | (0xffffffffffffffffL << 11 1011) 
    = (words[0] >>> 5) | (words[1] << 59)
    = 0xffffffffffffffffL

现在处理最后一个 word

定义变量

lastWordMask = WORD_MASK >>> -toIndex = 0xffffffffffffffffL >>> 10 1101 = WORD_MASK >>> 45 = 0x00000000000effffL

(toIndex-1) & BIT_INDEX_MASK) = 146 & 63 = 2^7+2^4+2 & 111111 = 1001 0010 & 0011 1111 = 0001 0010 = 18

fromIndex & BIT_INDEX_MASK = 5 & 63 = 0000 0101 & 0011 1111 = 0000 0101 = 5

所以最后一个值的计算公式为

((words[sourceIndex] & lastWordMask) >>> fromIndex) = (0xffffffffffffffffL & 0x00000000000effffL) >>> 5
 = 0x00000000000effffL >>> 5 = 0x000000000003fffL

完整代码如下：

public static void main(String args[]) {
    BitSet bitSet = new BitSet();

    bitSet.set(3, 200);
    System.out.println(bitSet.cardinality());

    BitSet result = bitSet.get(5, 147);
    System.out.println(result.cardinality());

    long[] res = result.toLongArray();
    for (Long l : res) {
        System.out.println(Long.toHexString(l));
    }
}

秒客网

Java BitSet 源码解析(2)

`clear`

`recalculateWordsInUse`

`get`

实例测试

相关文章