最快的方法在一个字符数组中找到6个连续的0位。

时间:2022-03-14 03:12:24

This is what I'm currently doing:

这就是我目前正在做的事情:

int dataLen = 500;
char data[dataLen];
int desired = 1; // between 1 and 6, inclusive
...
char bits[dataLen * 8];
for (int32 j = 0; j < dataLen; j++) {
    for (int i = 0; i < 8; i++) {
        bits[j*8+i] = ( (data[j] & (1 << (7-i))) ? '1' : '0' );
    }
}
int offset = std::search_n(bits, bits + dataLen*8, desired, '0') - bits;

Really nasty, I know, and it's killing performance.

真的很恶心,我知道,而且还会杀了人。

What's the fastest way to find the bit offset of the first set of x consecutive 0 bits in a char array, where 0 < x < 7? I'm on GCC with SSE 4.2 so builtins like __builtin_ctz, __builtin_popcountl are an option, I just can't figure out the best way to use them.

在一个char数组中,找到第一个x连续0位的位偏移量的最快方法是什么,在这里0 < x < 7?我在GCC中使用SSE 4.2,所以构建像__builtin_ctz, __builtin_popcountl是一个选项,我只是想不出使用它们的最佳方法。

8 个解决方案

#1


5  

How many numbers have 6 consecutive 0 bits (even when considering 2 consecutive bytes)?

有多少个数有6个连续的0位(即使考虑2个连续的字节)?

Byte 1
XXXXXXXX
000000??             0/1/2/3
?000000?             0/1/128/129
??000000             0/64/128/192

So if we consider 1 byte at a time then 0/1/2/3/64/128/192

如果我们考虑1个字节,那么0/1/2/3/64/128/192。

Byte a   Byte b
XXXXXXXX XXXXXXXX
??100000 0???????    (a & 31 == 0) && (b & 128 == 0)
???10000 00??????    (a & 15 == 0) && (b & 192 == 0)
????1000 000?????    (a & 7  == 0) && (b & 224 == 0)
?????100 0000????    (a & 3  == 0) && (b & 240 == 0)
??????10 00000???    (a & 1  == 0) && (b & 248 == 0)

So an absolute maximum of 12 tests gives you all combinations.
I am sure if you do the comparisons smartly you can reduce that.

所以12个测试的绝对最大值给了你所有的组合。我相信,如果你做的比较聪明,你可以减少它。

If we steel @Michael Burr solution below for using a table driven approach. Then we can organize it so that you can do one or two comparisons per byte.

如果我们使用一个表驱动的方法来解决下面的@Michael Burr解决方案。然后我们可以组织它,这样你就可以对每个字节进行一到两个比较。

struct TestStruct
{
    bool    is6Consecutive;
    bool    hasTrailing;
    int     maskNextByte;
    int     offset;
};
TestStruct   testData[256];

std::size_t findOffsetOf_6ConsecutiveZero(char const* data, size_t size)
{
    for(int loop = 0;loop < (size-1); ++loop)
    {
        char const&           val  = data[loop];
        TestStructure const&  test = testData[val];

        if (test.is6Consecutive)
        {   return loop*8 + test.offset;
        }

        if (test.hasTrailing)
        {
            if ((data[loop + 1] & test.maskNextByte) == 0)
            {   return loop*8 + test.offset;
            }
        }
    }
    // Test last byte
    TestStructure const&  test = testData[data[size-1]];
    if (test.is6Consecutive)
    {   return (size-1)*8 + test.offset;
    }

    return -1;
}

First few table entries:

前几个表条目:

TestStruct   testData[256] =
{
    {true,  false, 0x00, 0},
    {true,  false, 0x00, 0},
    {true,  false, 0x00, 0},
    {true,  false, 0x00, 0},
    {false, true,  0xf0, 6},  // 4 => 00000100 <Next 4 bytes are zero we hit>
    {false, false, 0x00, 0},  // 5 => 00000101 <Ignore and move on>
    {false, true,  0xf8, 7},  // 6 => 00000110 <Next 5 bytes are zero we hit>
    etc...

};

#2


3  

Here's a function that matches the output of the one provided in the question (at least under limited testing). It uses a table lookup , with the table having been generated by a one-off script. I'm honestly not sure if its performance is competitive with the suggestions that use bit testing hackery or GCC builtins, but I'll bet it's not too far off.

这是一个与问题中提供的输出相匹配的函数(至少在有限的测试中)。它使用一个表查找,表是由一个一次性的脚本生成的。老实说,我不确定它的性能是否与使用bit测试hackery或GCC内置的建议相竞争,但我敢打赌它不会太离谱。

struct zeros {
    unsigned char leading;
    unsigned char internal;
    unsigned char trailing;
};

// forward declaration so the long, non-interesting table is at the 
//  end of this 
static struct zeros const zero_table[256];

int find_zero_bits_offset( char const* data, int datalen, int desired)
{
    int offset = -1;
    int found = 0;
    char const* dataptr = &data[0];
    char const* endptr  = &data[datalen];


    // first, find which byte the sequence of zeros begins

    while (!found && (dataptr != endptr)) {
        unsigned char ch1 = *dataptr++;
        unsigned char ch2 = (dataptr != endptr) ? *dataptr : 0xff;

        int internal = zero_table[ch1].internal;
        int trailing = zero_table[ch1].trailing;
        int leading =  zero_table[ch2].leading;

        found = (desired <= internal) || 
                (trailing && (desired <= (trailing + leading)));
    }


    // now zero in on where the sequence starts within the byte

    if (found) {
        char ch = 0;
        unsigned int mask = 0;

        --dataptr;

        // dataptr points to the byte where the sequence of zeros starts.
        //  figure out exactly where the sequence is...

        // there's possibly some opportunity for optimization, if neccesary,
        //  by testing if the sequence was found in the "leading", "internal", or
        //  "trailing" cases. But the explicit loop will only iterate at most
        //  8 times (and will early-out on the first iteration if the match is 
        //  for the "leading" case) that it didn't seem too important

        ch = *dataptr;
        offset = (dataptr - data) * 8;

        // figure out where the appropriate internal run starts
        // note that the offset we need to return isn't necessarily the
        //  offset for the run of zeros counted by zero_table[tmp].internal_offset
        //  since a sufficient shorter run might come first

        // there may be a more efficient bithack for this, but the
        //  loop will iterate at most 8 times...
        mask = ((1 << desired) - 1);
        mask <<= (8 - desired);

        while ((ch & mask) != 0) {
            ++offset;
            mask >>= 1;
        }
    }
    else {
        // however you want to handle the "not found" situation. 
        //  This is equivalent to what was in the question:
        offset = (endptr - data) * 8;
    }

    return offset;
}



static struct zeros const zero_table[256] = {
    { 8, 8, 8 },  // 0000 0000
    { 7, 7, 0 },  // 0000 0001
    { 6, 6, 1 },  // 0000 0010
    { 6, 6, 0 },  // 0000 0011
    { 5, 5, 2 },  // 0000 0100
    { 5, 5, 0 },  // 0000 0101
    { 5, 5, 1 },  // 0000 0110
    { 5, 5, 0 },  // 0000 0111
    { 4, 4, 3 },  // 0000 1000
    { 4, 4, 0 },  // 0000 1001
    { 4, 4, 1 },  // 0000 1010
    { 4, 4, 0 },  // 0000 1011
    { 4, 4, 2 },  // 0000 1100
    { 4, 4, 0 },  // 0000 1101
    { 4, 4, 1 },  // 0000 1110
    { 4, 4, 0 },  // 0000 1111
    { 3, 4, 4 },  // 0001 0000
    { 3, 3, 0 },  // 0001 0001
    { 3, 3, 1 },  // 0001 0010
    { 3, 3, 0 },  // 0001 0011
    { 3, 3, 2 },  // 0001 0100
    { 3, 3, 0 },  // 0001 0101
    { 3, 3, 1 },  // 0001 0110
    { 3, 3, 0 },  // 0001 0111
    { 3, 3, 3 },  // 0001 1000
    { 3, 3, 0 },  // 0001 1001
    { 3, 3, 1 },  // 0001 1010
    { 3, 3, 0 },  // 0001 1011
    { 3, 3, 2 },  // 0001 1100
    { 3, 3, 0 },  // 0001 1101
    { 3, 3, 1 },  // 0001 1110
    { 3, 3, 0 },  // 0001 1111
    { 2, 5, 5 },  // 0010 0000
    { 2, 4, 0 },  // 0010 0001
    { 2, 3, 1 },  // 0010 0010
    { 2, 3, 0 },  // 0010 0011
    { 2, 2, 2 },  // 0010 0100
    { 2, 2, 0 },  // 0010 0101
    { 2, 2, 1 },  // 0010 0110
    { 2, 2, 0 },  // 0010 0111
    { 2, 3, 3 },  // 0010 1000
    { 2, 2, 0 },  // 0010 1001
    { 2, 2, 1 },  // 0010 1010
    { 2, 2, 0 },  // 0010 1011
    { 2, 2, 2 },  // 0010 1100
    { 2, 2, 0 },  // 0010 1101
    { 2, 2, 1 },  // 0010 1110
    { 2, 2, 0 },  // 0010 1111
    { 2, 4, 4 },  // 0011 0000
    { 2, 3, 0 },  // 0011 0001
    { 2, 2, 1 },  // 0011 0010
    { 2, 2, 0 },  // 0011 0011
    { 2, 2, 2 },  // 0011 0100
    { 2, 2, 0 },  // 0011 0101
    { 2, 2, 1 },  // 0011 0110
    { 2, 2, 0 },  // 0011 0111
    { 2, 3, 3 },  // 0011 1000
    { 2, 2, 0 },  // 0011 1001
    { 2, 2, 1 },  // 0011 1010
    { 2, 2, 0 },  // 0011 1011
    { 2, 2, 2 },  // 0011 1100
    { 2, 2, 0 },  // 0011 1101
    { 2, 2, 1 },  // 0011 1110
    { 2, 2, 0 },  // 0011 1111
    { 1, 6, 6 },  // 0100 0000
    { 1, 5, 0 },  // 0100 0001
    { 1, 4, 1 },  // 0100 0010
    { 1, 4, 0 },  // 0100 0011
    { 1, 3, 2 },  // 0100 0100
    { 1, 3, 0 },  // 0100 0101
    { 1, 3, 1 },  // 0100 0110
    { 1, 3, 0 },  // 0100 0111
    { 1, 3, 3 },  // 0100 1000
    { 1, 2, 0 },  // 0100 1001
    { 1, 2, 1 },  // 0100 1010
    { 1, 2, 0 },  // 0100 1011
    { 1, 2, 2 },  // 0100 1100
    { 1, 2, 0 },  // 0100 1101
    { 1, 2, 1 },  // 0100 1110
    { 1, 2, 0 },  // 0100 1111
    { 1, 4, 4 },  // 0101 0000
    { 1, 3, 0 },  // 0101 0001
    { 1, 2, 1 },  // 0101 0010
    { 1, 2, 0 },  // 0101 0011
    { 1, 2, 2 },  // 0101 0100
    { 1, 1, 0 },  // 0101 0101
    { 1, 1, 1 },  // 0101 0110
    { 1, 1, 0 },  // 0101 0111
    { 1, 3, 3 },  // 0101 1000
    { 1, 2, 0 },  // 0101 1001
    { 1, 1, 1 },  // 0101 1010
    { 1, 1, 0 },  // 0101 1011
    { 1, 2, 2 },  // 0101 1100
    { 1, 1, 0 },  // 0101 1101
    { 1, 1, 1 },  // 0101 1110
    { 1, 1, 0 },  // 0101 1111
    { 1, 5, 5 },  // 0110 0000
    { 1, 4, 0 },  // 0110 0001
    { 1, 3, 1 },  // 0110 0010
    { 1, 3, 0 },  // 0110 0011
    { 1, 2, 2 },  // 0110 0100
    { 1, 2, 0 },  // 0110 0101
    { 1, 2, 1 },  // 0110 0110
    { 1, 2, 0 },  // 0110 0111
    { 1, 3, 3 },  // 0110 1000
    { 1, 2, 0 },  // 0110 1001
    { 1, 1, 1 },  // 0110 1010
    { 1, 1, 0 },  // 0110 1011
    { 1, 2, 2 },  // 0110 1100
    { 1, 1, 0 },  // 0110 1101
    { 1, 1, 1 },  // 0110 1110
    { 1, 1, 0 },  // 0110 1111
    { 1, 4, 4 },  // 0111 0000
    { 1, 3, 0 },  // 0111 0001
    { 1, 2, 1 },  // 0111 0010
    { 1, 2, 0 },  // 0111 0011
    { 1, 2, 2 },  // 0111 0100
    { 1, 1, 0 },  // 0111 0101
    { 1, 1, 1 },  // 0111 0110
    { 1, 1, 0 },  // 0111 0111
    { 1, 3, 3 },  // 0111 1000
    { 1, 2, 0 },  // 0111 1001
    { 1, 1, 1 },  // 0111 1010
    { 1, 1, 0 },  // 0111 1011
    { 1, 2, 2 },  // 0111 1100
    { 1, 1, 0 },  // 0111 1101
    { 1, 1, 1 },  // 0111 1110
    { 1, 1, 0 },  // 0111 1111
    { 0, 7, 7 },  // 1000 0000
    { 0, 6, 0 },  // 1000 0001
    { 0, 5, 1 },  // 1000 0010
    { 0, 5, 0 },  // 1000 0011
    { 0, 4, 2 },  // 1000 0100
    { 0, 4, 0 },  // 1000 0101
    { 0, 4, 1 },  // 1000 0110
    { 0, 4, 0 },  // 1000 0111
    { 0, 3, 3 },  // 1000 1000
    { 0, 3, 0 },  // 1000 1001
    { 0, 3, 1 },  // 1000 1010
    { 0, 3, 0 },  // 1000 1011
    { 0, 3, 2 },  // 1000 1100
    { 0, 3, 0 },  // 1000 1101
    { 0, 3, 1 },  // 1000 1110
    { 0, 3, 0 },  // 1000 1111
    { 0, 4, 4 },  // 1001 0000
    { 0, 3, 0 },  // 1001 0001
    { 0, 2, 1 },  // 1001 0010
    { 0, 2, 0 },  // 1001 0011
    { 0, 2, 2 },  // 1001 0100
    { 0, 2, 0 },  // 1001 0101
    { 0, 2, 1 },  // 1001 0110
    { 0, 2, 0 },  // 1001 0111
    { 0, 3, 3 },  // 1001 1000
    { 0, 2, 0 },  // 1001 1001
    { 0, 2, 1 },  // 1001 1010
    { 0, 2, 0 },  // 1001 1011
    { 0, 2, 2 },  // 1001 1100
    { 0, 2, 0 },  // 1001 1101
    { 0, 2, 1 },  // 1001 1110
    { 0, 2, 0 },  // 1001 1111
    { 0, 5, 5 },  // 1010 0000
    { 0, 4, 0 },  // 1010 0001
    { 0, 3, 1 },  // 1010 0010
    { 0, 3, 0 },  // 1010 0011
    { 0, 2, 2 },  // 1010 0100
    { 0, 2, 0 },  // 1010 0101
    { 0, 2, 1 },  // 1010 0110
    { 0, 2, 0 },  // 1010 0111
    { 0, 3, 3 },  // 1010 1000
    { 0, 2, 0 },  // 1010 1001
    { 0, 1, 1 },  // 1010 1010
    { 0, 1, 0 },  // 1010 1011
    { 0, 2, 2 },  // 1010 1100
    { 0, 1, 0 },  // 1010 1101
    { 0, 1, 1 },  // 1010 1110
    { 0, 1, 0 },  // 1010 1111
    { 0, 4, 4 },  // 1011 0000
    { 0, 3, 0 },  // 1011 0001
    { 0, 2, 1 },  // 1011 0010
    { 0, 2, 0 },  // 1011 0011
    { 0, 2, 2 },  // 1011 0100
    { 0, 1, 0 },  // 1011 0101
    { 0, 1, 1 },  // 1011 0110
    { 0, 1, 0 },  // 1011 0111
    { 0, 3, 3 },  // 1011 1000
    { 0, 2, 0 },  // 1011 1001
    { 0, 1, 1 },  // 1011 1010
    { 0, 1, 0 },  // 1011 1011
    { 0, 2, 2 },  // 1011 1100
    { 0, 1, 0 },  // 1011 1101
    { 0, 1, 1 },  // 1011 1110
    { 0, 1, 0 },  // 1011 1111
    { 0, 6, 6 },  // 1100 0000
    { 0, 5, 0 },  // 1100 0001
    { 0, 4, 1 },  // 1100 0010
    { 0, 4, 0 },  // 1100 0011
    { 0, 3, 2 },  // 1100 0100
    { 0, 3, 0 },  // 1100 0101
    { 0, 3, 1 },  // 1100 0110
    { 0, 3, 0 },  // 1100 0111
    { 0, 3, 3 },  // 1100 1000
    { 0, 2, 0 },  // 1100 1001
    { 0, 2, 1 },  // 1100 1010
    { 0, 2, 0 },  // 1100 1011
    { 0, 2, 2 },  // 1100 1100
    { 0, 2, 0 },  // 1100 1101
    { 0, 2, 1 },  // 1100 1110
    { 0, 2, 0 },  // 1100 1111
    { 0, 4, 4 },  // 1101 0000
    { 0, 3, 0 },  // 1101 0001
    { 0, 2, 1 },  // 1101 0010
    { 0, 2, 0 },  // 1101 0011
    { 0, 2, 2 },  // 1101 0100
    { 0, 1, 0 },  // 1101 0101
    { 0, 1, 1 },  // 1101 0110
    { 0, 1, 0 },  // 1101 0111
    { 0, 3, 3 },  // 1101 1000
    { 0, 2, 0 },  // 1101 1001
    { 0, 1, 1 },  // 1101 1010
    { 0, 1, 0 },  // 1101 1011
    { 0, 2, 2 },  // 1101 1100
    { 0, 1, 0 },  // 1101 1101
    { 0, 1, 1 },  // 1101 1110
    { 0, 1, 0 },  // 1101 1111
    { 0, 5, 5 },  // 1110 0000
    { 0, 4, 0 },  // 1110 0001
    { 0, 3, 1 },  // 1110 0010
    { 0, 3, 0 },  // 1110 0011
    { 0, 2, 2 },  // 1110 0100
    { 0, 2, 0 },  // 1110 0101
    { 0, 2, 1 },  // 1110 0110
    { 0, 2, 0 },  // 1110 0111
    { 0, 3, 3 },  // 1110 1000
    { 0, 2, 0 },  // 1110 1001
    { 0, 1, 1 },  // 1110 1010
    { 0, 1, 0 },  // 1110 1011
    { 0, 2, 2 },  // 1110 1100
    { 0, 1, 0 },  // 1110 1101
    { 0, 1, 1 },  // 1110 1110
    { 0, 1, 0 },  // 1110 1111
    { 0, 4, 4 },  // 1111 0000
    { 0, 3, 0 },  // 1111 0001
    { 0, 2, 1 },  // 1111 0010
    { 0, 2, 0 },  // 1111 0011
    { 0, 2, 2 },  // 1111 0100
    { 0, 1, 0 },  // 1111 0101
    { 0, 1, 1 },  // 1111 0110
    { 0, 1, 0 },  // 1111 0111
    { 0, 3, 3 },  // 1111 1000
    { 0, 2, 0 },  // 1111 1001
    { 0, 1, 1 },  // 1111 1010
    { 0, 1, 0 },  // 1111 1011
    { 0, 2, 2 },  // 1111 1100
    { 0, 1, 0 },  // 1111 1101
    { 0, 1, 1 },  // 1111 1110
    { 0, 0, 0 },  // 1111 1111
};

#3


2  

Iterate through the array word by word (32-bit or 64-bit depend on your arch). Use __builtin_clz and __builtin_ctz to calculate the leading and trailing zeros for each word.

逐字迭代数组(32位或64位依赖于您的arch)。使用__builtin_clz和__builtin_ctz来计算每个单词的前导和尾随零。

There are two cases of consecutive zeros:

有两个连续的零的情况:

  • Within a word
  • 在一个词
  • Across adjective words.
  • 在形容词的单词。

The first case is easy to check. The second case requires to check if leading zeros of this item + trailing zeros of previous item is >= 6.

第一个病例很容易检查。第二种情况需要检查这个项的前导零和前一项的尾随零是>= 6。

#4


1  

Note these arithmetic tricks:

注意这些算术技巧:

// remove the trailing one bits
y = x & (x + 1);

x       11100011
        +      1
        --------
x+1     11100100
x&(x+1) 11100000

// remove the trailing zero bits
z = y | (y - 1);

y       11100000
        -      1
        --------
y-1     11011111
y|(y-1) 11111111

// isolate the hole
hole = z - x;
hole = z ^ x;

z       11111111
x       11100011
        --------
z^x     00011100

// Now you can count the set bits of the hole.
length = bitcount(hole);

// Or you can make it by computing highbit only (log2).
length = highbit(z^y) - highbit(x^y);

So a possible algorithm would be to use these tricks with large integer arithmetic and loop until length==0 (no more hole) or length>=n (you start next loop with x=z;).

因此,一种可能的算法是使用大整数算术和循环的这些技巧,直到长度=0(不再有洞)或长度>=n(你从x=z开始下一个循环)。

You can emulate the large integer yourself, acting on the collection byte after byte, and stopping when there is no more carry.

您可以自己模拟大整数,在字节后的收集字节上执行,并且在没有更多进位时停止。

  • x+1 has a carry only if byte==0xFF
  • 当字节==0xFF时,x+1有一个进位。
  • y-1 has a carry only if byte==0x00
  • y-1只有在字节==0x00时才有携带。
  • highbit is easy to program on a single byte
  • highbit可以在单个字节上编程。

This would give something like this:

它会给出这样的东西:

// return 1-based position of highest bit set in a byte
int highbit(unsigned char c)
{
    unsigned char n;
    int position = 0;
    n = c >> 4;
    if( n > 0 ) { c=n; position+=4; };
    n = c >> 2;
    if( n > 0 ) { c=n; position+=2; };
    n = c >> 1;
    if( n > 0 ) { c=n; position+=1; };
    position += c;
    return position;
}

int find_consecutive_zeros( unsigned char *bits , int nbytes , int nzero )
{
    int i,nTrailingOnes,nTrailingZero,sizeOfNextHole;
    unsigned char x,y;
    for(i=0 , x=bits[0]; 1; )
    {
        // perform y=x&(x+1) to count and remove trailing ones
        for(;x==0xFF && i<nbytes-1;x=bits[++i]);
        y = x&(x+1);
        nTrailingOnes = 8*i + highbit( x^y );
        // perform x=y|(y-1); to count and remove trailing zeros
        for(;y==0 && i<nbytes-1;y=bits[++i]);
        x = y|(y-1);
        nTrailingZero = 8*i + highbit( x^y );
        sizeOfNextHole = nTrailingZero - nTrailingOnes;
        // if size of hole is long enough, return it's low bit rank (0-based)
        if( sizeOfNextHole >= nzero ) return nTrailingOnes;
        // or return -1 if no more hole
        if( sizeOfNextHole == 0 ) return -1;
    }
}

You can make it more efficient by using longer than a byte for the base collection...

您可以使用比一个字节更长的字节来提高它的效率。

EDIT: speed up when you have a fixed size for nzero like 6
The algorithm above iterates on all the holes, and may loose time on small holes.
You can avoid that with a precomputed table with small holes filled.

编辑:当你有一个n0的固定大小的时候,你可以加快速度,上面的算法会在所有的孔上迭代,并且可以在小的洞上浪费时间。您可以避免使用预先计算的表,其中填充了小洞。

For example 10010101 has 3 holes that are too short, so you can replace it with 11111111.
But you must keep leading and trailing zeros unchanged.

例如10010101有3个孔太短,所以你可以用11111111替换它。但是你必须保持领先和尾随零不变。

To initialize such a table, you simply take 0xFF and xor with a mask containing 1 bits in place of trailing zeros (x|(x-1))^x and a mask containing 1 bits in place of leading zeros ((1<<highbit(x))-1)^0xFF.
Add an exception for 10000001, the sole byte containing 6 zeros between ones.

初始化这样一个表,你只需用面具把0 xff xor包含1位代替落后0(x |(x - 1))^ x和一个面具包含1位代替前导零((1 < < highbit 1)^ 0 xff(x))。在10000001中添加一个异常,这个唯一字节包含6个0。

EDIT2 : I have treated the bit sequence with least significant bit of firt byte first which fits well the arithmetic approach. The question explicitely use most significant bit of first byte first. So I have to reverse the bits to fit the question, which is done while initializing the table...

EDIT2:首先,我已经对这个位序列进行了最少的处理,这与算法的方法非常吻合。这个问题明确地使用了最重要的第一个字节。因此,我必须倒转位以适应问题,这是在初始化表时完成的…

int reversebits(unsigned char c)
{
    c = ((c & 0x0F) << 4) | ((c & 0xF0) >> 4);
    c = ((c & 0x33) << 2) | ((c & 0xCC) >> 2);
    c = ((c & 0x55) << 1) | ((c & 0xAA) >> 1);
    return c;
}
void initializeFillShortHoles(unsigned char fillShortHoles[256])
    for(unsigned int x=0;x<256;x++) {
        fillShortHoles[reversebits(x)] = ((1<<highbit(x))-1) ^ (x|(x-1)) ^ x;
    }
    fillShortHoles[0]=0;     // no use to reverse bits for those two
    fillShortHoles[129]=129;
}

Then you just replace occurrences of x=bits[ i ] with x=fillShortHoles[ bits[ i ] ], and you're done:

然后将x=bits (i)替换成x=fillShortHoles[bits[i]],你就完成了:

int find_6_consecutive_zeros( unsigned char *bits , int nbytes )
{
    static unsigned char fillShortHoles[256];
    static unsigned char highbitTable[256];
    static first=1;
    int i,nTrailingOnes,nTrailingZero,sizeOfNextHole;
    unsigned char x,y;

    if (first)
    {
        first = 0;
        initializeFillShortHoles( fillShortHoles );
        for(i=0;i<256;i++) highbitTable[i]=highbit(i);
    }
    for(x=fillShortHoles[bits[i=0]]; 1; )
    {
        // perform y=x&(x+1) to count trailing ones
        for(;x==0xFF && i<nbytes-1;x=fillShortHoles[bits[++i]]);
        y = x&(x+1);
        nTrailingOnes = 8*i + highbitTable[ x^y ];
        // perform z=y|(y-1); to count trailing zeros
        for(;y==0 && i<nbytes-1;y=fillShortHoles[bits[++i]]);
        x = y|(y-1);
        nTrailingZero = 8*i + highbitTable[ x^y ];
        sizeOfNextHole = nTrailingZero - nTrailingOnes;
        // if size of hole is long enough, return it's low bit rank (0-based)
        if( sizeOfNextHole >= 6 ) return nTrailingOnes;
        // or return -1 if no more hole
        if( sizeOfNextHole == 0 ) return -1;
    }
}

EDIT3: Finally, for nzero<=9, a faster way would be to cache the position for each pair of bytes.

EDIT3:最后,对于nzero<=9,更快的方法是缓存每一对字节的位置。

int find_n_consecutive_zeros_bypair( unsigned char *bits , int nbytes , int nzero)
{
    static int first=1;
    static signed char holepositionbypair[8][65536];
    signed char position;
    int i;
    unsigned short x;
    if (first)
    {
        first = 0;
        for(i=0;i<8;i++) {
            initializeHolePositionByPair( holepositionbypair[i] , i+1 );
        }
    }
    for (i=0 , x=0xFF; i<nbytes; i++) {
        x = (x << 8) + bits[i];
        if( (position = holepositionbypair[nzero-1][x]) >= 0) return (i-1)*8 + position;
    }
    return -1;
}

Note the initialization x=0xFF will handle the cases of nbytes<2.

注意,初始化x=0xFF将处理nbytes<2的情况。

No matter how you fill the cache holepositionbypair, it will be called only at initialization. I propose the arithmetic way of course:

无论您如何填充缓存holepositionbypair,它将只在初始化时调用。我提出的算法当然是:

int highbit16(unsigned short c)
{
    unsigned short n;
    int position = 0;
    n = c >> 8;
    if( n ) { c=n; position+=8; };
    n = c >> 4;
    if( n ) { c=n; position+=4; };
    n = c >> 2;
    if( n ) { c=n; position+=2; };
    n = c >> 1;
    if( n ) { c=n; position+=1; };
    position += c;
    return position;
}
unsigned short reversebits16(unsigned short c)
{
    c = ((c & 0x00FF) << 8) | ((c & 0xFF00) >> 8);
    c = ((c & 0x0F0F) << 4) | ((c & 0xF0F0) >> 4);
    c = ((c & 0x3333) << 2) | ((c & 0xCCCC) >> 2);
    c = ((c & 0x5555) << 1) | ((c & 0xAAAA) >> 1);
    return c;
}
initializeHolePositionByPair(signed char holepositionbypair[65536],int n)
{
    int i,n1,n0;
    unsigned short x,y;
    signed char position;
    for(i=0;i<65536;i++) {
        position = -1;
        x = i;
        while(x != 0xFFFF) {
            /* remove trailing ones */
            y = x&(x+1);
            n1 = highbit16(x^y);
            /* remove trailing zeros */
            x = y|(y-1);
            n0 = highbit16(x^y);
            if(n0-n1>=n) {
                position = n1; break;
            }
        }
        holepositionbypair[reversebits16(i)] = position;
    }
}

#5


0  

Here, try this code.

在这里,试试这个代码。

int dataLen = 500;
char data[dataLen];
//Get the data in place with whatever is your algorithm.
int i,j;
unsigned int dataSample;
unsigned int mask;
for(i=0;i<dataLen-1;i++){
    dataSample=((unsigned int)(data[i+1]) << 8) | (unsigned int) (data[i]);
    mask=(1<<6) - 1 ; //6 consecutive 1's
    for(j=0;j<8;j++){
        if((dataSample & (mask << j)) == 0){
            printf("Found consecutive 6 zeros at byte %d offset %d\n",i,j);
            j+=5; // Followed by j++ makes it j+=6.
        }
    }
}

#6


0  

For a byte, you only need to do 3 tests:

对于一个字节,您只需要做3个测试:

if( (byte & 0x3F) == 0) { /* Found */ }
if( (byte & 0x7E) == 0) { /* Found */ }
if( (byte & 0xFC) == 0) { /* Found */ }

It should be relatively easy to expand this to wider values. For example, for uint32_t:

将其扩展到更广泛的值应该相对容易些。例如,对于uint32_t:

tempValue1 = value & 0x3F3F3F3F;
tempValue2 = value & 0x7E7E7E7E;
tempValue3 = value & 0xFCFCFCFC;
if( ( (uint8_t)tempValue1 == 0) { /* Found */ }
if( ( (uint8_t)tempValue2 == 0) { /* Found */ }
if( ( (uint8_t)tempValue3 == 0) { /* Found */ }
tempValue1 >>= 8;
tempValue2 >>= 8;
tempValue3 >>= 8;
if( ( (uint8_t)tempValue1 == 0) { /* Found */ }
if( ( (uint8_t)tempValue2 == 0) { /* Found */ }
if( ( (uint8_t)tempValue3 == 0) { /* Found */ }
tempValue1 >>= 8;
tempValue2 >>= 8;
tempValue3 >>= 8;
if( ( (uint8_t)tempValue1 == 0) { /* Found */ }
if( ( (uint8_t)tempValue2 == 0) { /* Found */ }
if( ( (uint8_t)tempValue3 == 0) { /* Found */ }
tempValue1 >>= 8;
tempValue2 >>= 8;
tempValue3 >>= 8;
if( ( (uint8_t)tempValue1 == 0) { /* Found */ }
if( ( (uint8_t)tempValue2 == 0) { /* Found */ }
if( ( (uint8_t)tempValue3 == 0) { /* Found */ }

The above code doesn't look like the best way of doing it (because it probably isn't). I deliberate wrote it like this to make it easier to see how it can be done with SSE. For SSE, it'd be similar to above (but wider).

上面的代码看起来并不是最好的方法(因为它可能不是)。我故意这样写,使它更容易理解如何使用SSE。对于SSE来说,它与上面的相似(但更宽)。

However, for SSE you can do all the comparisons in parallel and get rid of lots of branches. For example, you could AND with each of the 3 masks and use PCMPEQB 3 times and then OR these results together, then do one PMOVMSKB; which would give you a 16 bit value (that represents 16 bytes - one bit per source byte) that can be tested with a single if(result_flags == 0) { /* None of the 16 bytes matched */ }, where this final test could be at the end of a "do/while" loop.

但是,对于SSE,您可以并行地进行所有的比较,并去掉许多分支。例如,您可以使用PCMPEQB 3次,并使用PCMPEQB 3次,然后将这些结果组合在一起,然后执行一个PMOVMSKB;这将给你一个16位值(代表16字节——一位字节),可以用一个测试如果(result_flags = = 0){ / *匹配没有一个16字节* / },这最终的测试可能是最后一个“/“循环。

#7


0  

Suppose you're looking for exactly 6 consecutive zeros. You could use a code snippet like this:

假设你正在寻找6个连续的零。您可以使用如下的代码片段:

unsigned d1 = 0xff;
for (int i = 0; i < dataLen; i += 3) {
  unsigned d1 = (d1 << 24) | (data[i] << 16) | (data [i+1] << 8) | (data[i+2]);
  unsigned d2 = ~d1; // ones
  unsigned d3 = d2 & (d2 << 2) & (d2 << 4);
  unsigned d4 = d3 & (d3 << 1);
  if (!d4) continue;
  doSomethingWithTheSequence(i, d4);
}

d1 will combine the last byte of the previous run with three new bytes. So you can iterate over your data in multiples of 3. You could try multiples of 2, which might be more efficient as you can treat your data as 16 bit atomic quantities. You could also try multiples of 4 and do the subsequent computation on 64 bit numbers, particularly if you're on 64 bit platform. Or you introduce a special case for zero sequences spanning bytes.

d1将前一次运行的最后一个字节与三个新字节结合起来。你可以用3的倍数来迭代你的数据。您可以尝试使用2的倍数,这可能更有效,因为您可以将您的数据视为16位原子数量。您还可以尝试4的倍数,并在64位数字上进行后续计算,特别是在64位平台上。或者你为零序列的字节引入一个特殊的例子。

d2 reverses the bit pattern, which is useful because shifts will introduce artificial zeros but never artificial ones. d3 looks for three matching ones, at offsets 0, 2 and 4. d4 then adds another bit of offset, thus combining all offsets from 0 through 5. So d4 will be non-zero if and only if there is a run of 6 consecutive zeros in d1. You can then use __builtin_clz to identify the location of the most significant one in d4, which is also the position of the first of these 6 bits in d1. From that you can get the position in data.

d2颠倒了位模式,这是有用的,因为转换将引入人工的零,而不是人工的零。d3寻找3个匹配的值,在偏移值0、2和4。然后,d4添加了另一个偏移量,从而将所有偏移量从0到5进行组合。所以d4是非零的如果且仅当d1中有连续6个0。然后可以使用__builtin_clz来确定d4中最重要的位置,也就是d1中第一个6位的位置。这样你就能得到数据的位置。

You can adapt the code to other run lengths, either by adding a loop and hoping that the compiler will optimize it away, or by providing an inline template function which computes d4 from d2 in a way suitable for the desired run length.

您可以通过添加一个循环并希望编译器将其优化,或者通过提供一个内联模板函数(从d2中计算出适合于所需的运行长度的方式)来将代码调整为其他运行长度。

#8


0  

Let me try - how about:

让我试试,怎么样:

class bitIterator{
    unsigned char* farray;
    int foffset;
public:
    bitIterator(unsigned char* array, int offset)
        :farray(array), foffset(offset)
    {}

    bool operator*() const {
        return (farray[foffset/8] >> (7 - foffset%8)) & 1;
    }

    bitIterator& operator++(){
        foffset++;
        return *this;
    }

    int offset() const {
        return foffset; 
    }

};

// Just to demonstrate how to call it
int find_first_n(unsigned char* buffer, int length, int n){
    return std::search_n(bitIterator(buffer, 0), bitIterator(buffer, length*8), n, 0).offset();
}

That was just for fun...

那只是为了好玩……

Now if you really want to squeeze some performance out of it, I will suggset

现在,如果你真的想要挤出一些性能,我会建议你。

int find_first_n(unsigned char* buffer, int length, int n){
    int prev_trail = 0;
    unsigned int* buf = reinterpret_cast<unsigned int*>(buffer);
    int len = length/sizeof(int); 
    // Last few bytes will need special treatment if your array is not multple of int-size length.
    // However last bytes should not influence overall performance of code - assuming it is used on rather long arrays.
    // If you plan using rather short arrays (20-50 bytes?) you will probably be better off just using plain char.
    for (int i=0; i<len; i++){
        if (!buf[i]) return i*sizeof(int)*8-prev_trail; // __builtin_clz and __builtin_ctz are undefined on zero ;
        // EDIT:
        if (!~buf[i]){
            prev_trail = 0;
            continue;
        }
        // END EDIT!


        int shft = __builtin_clz(buf[i]);
        if (shft + prev_trail >= n) return i*sizeof(int)*8-prev_trail; // Leading + previous trailing <= n
                                                           // Not enough leading zeros, continue search.
        prev_trail = __builtin_ctz(buf[i]); // Store it to use for next int in array
        unsigned int tmp =0;               
        while(shft < sizeof(int)*8-prev_trail){   // While we haven't got to trailing zeros in this uint
            tmp = buf[i] << shft;               // Shift-out leading zeros;
            shft += (__builtin_clz(~tmp));
            tmp = buf[i] << shft;               // Shift-out leading ones;

            lead_zeros = __builtin_clz(tmp);
            if (lead_zeros >= n)                // and see if have enough leading zeros.
                return i*sizeof(int)*8+shft;

            shft += lead_zeros;
        }
        return length*8; // Not found :(
    }

This is rather difficult to read, but concept is simple - just iterate over every int-sized block, and for each see if leading zeros + trailing zeros from previous one are >= n. If not just repeatedly shift-out leading zeros and following ones (set bits), and check again for trailing zeroes >= n, as long as we didn't got to trailing bytes.

这是非常难以阅读,但概念很简单——只是遍历每个int-sized块,每个看到如果前导零+来自前一个落后0 > = n。如果不是不断移出前导零后的(比特),并检查再次落后于0 > = n,只要我们不落后于字节。

And just one more idea:

还有一个想法:

int find_first_n(unsigned char* buffer, int length, int n){
    union {
        unsigned char chr[2];
        unsigned int uit;
    };

    unsigned int onemask = 0x8000;
    for (int i = 1 ; i<n; i++) onemask = onemask | (onemask >> 1);

    int* masks = new int[8];
    for (int i = 0; i<8; i++){
        masks[i]=onemask >> i;
    }

    // generating masks - eg. for n == 3 would be:
    // [ 1110 0000 0000 0000 ]
    // [ 0111 0000 0000 0000 ]
    // [ 0011 1000 0000 0000 ]
    // [ 0001 1100 0000 0000 ]
    // [ ... ]
    // [ 0000 0001 1100 0000 ]


    uit = 0;
    for (int i=0; i<length-1; i++){
        chr[0] = buffer[i];
        chr[1] = buffer[i+1];
        for (int j=0; j<8; j++){
            if (!(uit & masks[j])) return i*8+j;
        }
    }
    chr[0] = buffer[length-1];
    chr[1] = 0xff; // Fill with ones at the end.
    for (int j=0; j<8; j++){
        if (!(uit & masks[j])) return (length-1)*8+j;
    }
    return length*8; // Not found :(
}

You might be tempted to cast pointer to variables still in buffer directly to word (reinterpret_cast<unsigned short int*>(buffer[i])) and apply masks without using union. Note however that half of such operations would use unaligned variables, which can degrade performance, and on some platforms even generate exceptions.

您可能会倾向于将指针转换为仍然在缓冲区中的变量(re解释t_cast (buffer[i])),并在不使用union的情况下使用掩码。但是注意,这类操作的一半将使用未对齐的变量,这会降低性能,在某些平台上甚至会产生异常。

#1


5  

How many numbers have 6 consecutive 0 bits (even when considering 2 consecutive bytes)?

有多少个数有6个连续的0位(即使考虑2个连续的字节)?

Byte 1
XXXXXXXX
000000??             0/1/2/3
?000000?             0/1/128/129
??000000             0/64/128/192

So if we consider 1 byte at a time then 0/1/2/3/64/128/192

如果我们考虑1个字节,那么0/1/2/3/64/128/192。

Byte a   Byte b
XXXXXXXX XXXXXXXX
??100000 0???????    (a & 31 == 0) && (b & 128 == 0)
???10000 00??????    (a & 15 == 0) && (b & 192 == 0)
????1000 000?????    (a & 7  == 0) && (b & 224 == 0)
?????100 0000????    (a & 3  == 0) && (b & 240 == 0)
??????10 00000???    (a & 1  == 0) && (b & 248 == 0)

So an absolute maximum of 12 tests gives you all combinations.
I am sure if you do the comparisons smartly you can reduce that.

所以12个测试的绝对最大值给了你所有的组合。我相信,如果你做的比较聪明,你可以减少它。

If we steel @Michael Burr solution below for using a table driven approach. Then we can organize it so that you can do one or two comparisons per byte.

如果我们使用一个表驱动的方法来解决下面的@Michael Burr解决方案。然后我们可以组织它,这样你就可以对每个字节进行一到两个比较。

struct TestStruct
{
    bool    is6Consecutive;
    bool    hasTrailing;
    int     maskNextByte;
    int     offset;
};
TestStruct   testData[256];

std::size_t findOffsetOf_6ConsecutiveZero(char const* data, size_t size)
{
    for(int loop = 0;loop < (size-1); ++loop)
    {
        char const&           val  = data[loop];
        TestStructure const&  test = testData[val];

        if (test.is6Consecutive)
        {   return loop*8 + test.offset;
        }

        if (test.hasTrailing)
        {
            if ((data[loop + 1] & test.maskNextByte) == 0)
            {   return loop*8 + test.offset;
            }
        }
    }
    // Test last byte
    TestStructure const&  test = testData[data[size-1]];
    if (test.is6Consecutive)
    {   return (size-1)*8 + test.offset;
    }

    return -1;
}

First few table entries:

前几个表条目:

TestStruct   testData[256] =
{
    {true,  false, 0x00, 0},
    {true,  false, 0x00, 0},
    {true,  false, 0x00, 0},
    {true,  false, 0x00, 0},
    {false, true,  0xf0, 6},  // 4 => 00000100 <Next 4 bytes are zero we hit>
    {false, false, 0x00, 0},  // 5 => 00000101 <Ignore and move on>
    {false, true,  0xf8, 7},  // 6 => 00000110 <Next 5 bytes are zero we hit>
    etc...

};

#2


3  

Here's a function that matches the output of the one provided in the question (at least under limited testing). It uses a table lookup , with the table having been generated by a one-off script. I'm honestly not sure if its performance is competitive with the suggestions that use bit testing hackery or GCC builtins, but I'll bet it's not too far off.

这是一个与问题中提供的输出相匹配的函数(至少在有限的测试中)。它使用一个表查找,表是由一个一次性的脚本生成的。老实说,我不确定它的性能是否与使用bit测试hackery或GCC内置的建议相竞争,但我敢打赌它不会太离谱。

struct zeros {
    unsigned char leading;
    unsigned char internal;
    unsigned char trailing;
};

// forward declaration so the long, non-interesting table is at the 
//  end of this 
static struct zeros const zero_table[256];

int find_zero_bits_offset( char const* data, int datalen, int desired)
{
    int offset = -1;
    int found = 0;
    char const* dataptr = &data[0];
    char const* endptr  = &data[datalen];


    // first, find which byte the sequence of zeros begins

    while (!found && (dataptr != endptr)) {
        unsigned char ch1 = *dataptr++;
        unsigned char ch2 = (dataptr != endptr) ? *dataptr : 0xff;

        int internal = zero_table[ch1].internal;
        int trailing = zero_table[ch1].trailing;
        int leading =  zero_table[ch2].leading;

        found = (desired <= internal) || 
                (trailing && (desired <= (trailing + leading)));
    }


    // now zero in on where the sequence starts within the byte

    if (found) {
        char ch = 0;
        unsigned int mask = 0;

        --dataptr;

        // dataptr points to the byte where the sequence of zeros starts.
        //  figure out exactly where the sequence is...

        // there's possibly some opportunity for optimization, if neccesary,
        //  by testing if the sequence was found in the "leading", "internal", or
        //  "trailing" cases. But the explicit loop will only iterate at most
        //  8 times (and will early-out on the first iteration if the match is 
        //  for the "leading" case) that it didn't seem too important

        ch = *dataptr;
        offset = (dataptr - data) * 8;

        // figure out where the appropriate internal run starts
        // note that the offset we need to return isn't necessarily the
        //  offset for the run of zeros counted by zero_table[tmp].internal_offset
        //  since a sufficient shorter run might come first

        // there may be a more efficient bithack for this, but the
        //  loop will iterate at most 8 times...
        mask = ((1 << desired) - 1);
        mask <<= (8 - desired);

        while ((ch & mask) != 0) {
            ++offset;
            mask >>= 1;
        }
    }
    else {
        // however you want to handle the "not found" situation. 
        //  This is equivalent to what was in the question:
        offset = (endptr - data) * 8;
    }

    return offset;
}



static struct zeros const zero_table[256] = {
    { 8, 8, 8 },  // 0000 0000
    { 7, 7, 0 },  // 0000 0001
    { 6, 6, 1 },  // 0000 0010
    { 6, 6, 0 },  // 0000 0011
    { 5, 5, 2 },  // 0000 0100
    { 5, 5, 0 },  // 0000 0101
    { 5, 5, 1 },  // 0000 0110
    { 5, 5, 0 },  // 0000 0111
    { 4, 4, 3 },  // 0000 1000
    { 4, 4, 0 },  // 0000 1001
    { 4, 4, 1 },  // 0000 1010
    { 4, 4, 0 },  // 0000 1011
    { 4, 4, 2 },  // 0000 1100
    { 4, 4, 0 },  // 0000 1101
    { 4, 4, 1 },  // 0000 1110
    { 4, 4, 0 },  // 0000 1111
    { 3, 4, 4 },  // 0001 0000
    { 3, 3, 0 },  // 0001 0001
    { 3, 3, 1 },  // 0001 0010
    { 3, 3, 0 },  // 0001 0011
    { 3, 3, 2 },  // 0001 0100
    { 3, 3, 0 },  // 0001 0101
    { 3, 3, 1 },  // 0001 0110
    { 3, 3, 0 },  // 0001 0111
    { 3, 3, 3 },  // 0001 1000
    { 3, 3, 0 },  // 0001 1001
    { 3, 3, 1 },  // 0001 1010
    { 3, 3, 0 },  // 0001 1011
    { 3, 3, 2 },  // 0001 1100
    { 3, 3, 0 },  // 0001 1101
    { 3, 3, 1 },  // 0001 1110
    { 3, 3, 0 },  // 0001 1111
    { 2, 5, 5 },  // 0010 0000
    { 2, 4, 0 },  // 0010 0001
    { 2, 3, 1 },  // 0010 0010
    { 2, 3, 0 },  // 0010 0011
    { 2, 2, 2 },  // 0010 0100
    { 2, 2, 0 },  // 0010 0101
    { 2, 2, 1 },  // 0010 0110
    { 2, 2, 0 },  // 0010 0111
    { 2, 3, 3 },  // 0010 1000
    { 2, 2, 0 },  // 0010 1001
    { 2, 2, 1 },  // 0010 1010
    { 2, 2, 0 },  // 0010 1011
    { 2, 2, 2 },  // 0010 1100
    { 2, 2, 0 },  // 0010 1101
    { 2, 2, 1 },  // 0010 1110
    { 2, 2, 0 },  // 0010 1111
    { 2, 4, 4 },  // 0011 0000
    { 2, 3, 0 },  // 0011 0001
    { 2, 2, 1 },  // 0011 0010
    { 2, 2, 0 },  // 0011 0011
    { 2, 2, 2 },  // 0011 0100
    { 2, 2, 0 },  // 0011 0101
    { 2, 2, 1 },  // 0011 0110
    { 2, 2, 0 },  // 0011 0111
    { 2, 3, 3 },  // 0011 1000
    { 2, 2, 0 },  // 0011 1001
    { 2, 2, 1 },  // 0011 1010
    { 2, 2, 0 },  // 0011 1011
    { 2, 2, 2 },  // 0011 1100
    { 2, 2, 0 },  // 0011 1101
    { 2, 2, 1 },  // 0011 1110
    { 2, 2, 0 },  // 0011 1111
    { 1, 6, 6 },  // 0100 0000
    { 1, 5, 0 },  // 0100 0001
    { 1, 4, 1 },  // 0100 0010
    { 1, 4, 0 },  // 0100 0011
    { 1, 3, 2 },  // 0100 0100
    { 1, 3, 0 },  // 0100 0101
    { 1, 3, 1 },  // 0100 0110
    { 1, 3, 0 },  // 0100 0111
    { 1, 3, 3 },  // 0100 1000
    { 1, 2, 0 },  // 0100 1001
    { 1, 2, 1 },  // 0100 1010
    { 1, 2, 0 },  // 0100 1011
    { 1, 2, 2 },  // 0100 1100
    { 1, 2, 0 },  // 0100 1101
    { 1, 2, 1 },  // 0100 1110
    { 1, 2, 0 },  // 0100 1111
    { 1, 4, 4 },  // 0101 0000
    { 1, 3, 0 },  // 0101 0001
    { 1, 2, 1 },  // 0101 0010
    { 1, 2, 0 },  // 0101 0011
    { 1, 2, 2 },  // 0101 0100
    { 1, 1, 0 },  // 0101 0101
    { 1, 1, 1 },  // 0101 0110
    { 1, 1, 0 },  // 0101 0111
    { 1, 3, 3 },  // 0101 1000
    { 1, 2, 0 },  // 0101 1001
    { 1, 1, 1 },  // 0101 1010
    { 1, 1, 0 },  // 0101 1011
    { 1, 2, 2 },  // 0101 1100
    { 1, 1, 0 },  // 0101 1101
    { 1, 1, 1 },  // 0101 1110
    { 1, 1, 0 },  // 0101 1111
    { 1, 5, 5 },  // 0110 0000
    { 1, 4, 0 },  // 0110 0001
    { 1, 3, 1 },  // 0110 0010
    { 1, 3, 0 },  // 0110 0011
    { 1, 2, 2 },  // 0110 0100
    { 1, 2, 0 },  // 0110 0101
    { 1, 2, 1 },  // 0110 0110
    { 1, 2, 0 },  // 0110 0111
    { 1, 3, 3 },  // 0110 1000
    { 1, 2, 0 },  // 0110 1001
    { 1, 1, 1 },  // 0110 1010
    { 1, 1, 0 },  // 0110 1011
    { 1, 2, 2 },  // 0110 1100
    { 1, 1, 0 },  // 0110 1101
    { 1, 1, 1 },  // 0110 1110
    { 1, 1, 0 },  // 0110 1111
    { 1, 4, 4 },  // 0111 0000
    { 1, 3, 0 },  // 0111 0001
    { 1, 2, 1 },  // 0111 0010
    { 1, 2, 0 },  // 0111 0011
    { 1, 2, 2 },  // 0111 0100
    { 1, 1, 0 },  // 0111 0101
    { 1, 1, 1 },  // 0111 0110
    { 1, 1, 0 },  // 0111 0111
    { 1, 3, 3 },  // 0111 1000
    { 1, 2, 0 },  // 0111 1001
    { 1, 1, 1 },  // 0111 1010
    { 1, 1, 0 },  // 0111 1011
    { 1, 2, 2 },  // 0111 1100
    { 1, 1, 0 },  // 0111 1101
    { 1, 1, 1 },  // 0111 1110
    { 1, 1, 0 },  // 0111 1111
    { 0, 7, 7 },  // 1000 0000
    { 0, 6, 0 },  // 1000 0001
    { 0, 5, 1 },  // 1000 0010
    { 0, 5, 0 },  // 1000 0011
    { 0, 4, 2 },  // 1000 0100
    { 0, 4, 0 },  // 1000 0101
    { 0, 4, 1 },  // 1000 0110
    { 0, 4, 0 },  // 1000 0111
    { 0, 3, 3 },  // 1000 1000
    { 0, 3, 0 },  // 1000 1001
    { 0, 3, 1 },  // 1000 1010
    { 0, 3, 0 },  // 1000 1011
    { 0, 3, 2 },  // 1000 1100
    { 0, 3, 0 },  // 1000 1101
    { 0, 3, 1 },  // 1000 1110
    { 0, 3, 0 },  // 1000 1111
    { 0, 4, 4 },  // 1001 0000
    { 0, 3, 0 },  // 1001 0001
    { 0, 2, 1 },  // 1001 0010
    { 0, 2, 0 },  // 1001 0011
    { 0, 2, 2 },  // 1001 0100
    { 0, 2, 0 },  // 1001 0101
    { 0, 2, 1 },  // 1001 0110
    { 0, 2, 0 },  // 1001 0111
    { 0, 3, 3 },  // 1001 1000
    { 0, 2, 0 },  // 1001 1001
    { 0, 2, 1 },  // 1001 1010
    { 0, 2, 0 },  // 1001 1011
    { 0, 2, 2 },  // 1001 1100
    { 0, 2, 0 },  // 1001 1101
    { 0, 2, 1 },  // 1001 1110
    { 0, 2, 0 },  // 1001 1111
    { 0, 5, 5 },  // 1010 0000
    { 0, 4, 0 },  // 1010 0001
    { 0, 3, 1 },  // 1010 0010
    { 0, 3, 0 },  // 1010 0011
    { 0, 2, 2 },  // 1010 0100
    { 0, 2, 0 },  // 1010 0101
    { 0, 2, 1 },  // 1010 0110
    { 0, 2, 0 },  // 1010 0111
    { 0, 3, 3 },  // 1010 1000
    { 0, 2, 0 },  // 1010 1001
    { 0, 1, 1 },  // 1010 1010
    { 0, 1, 0 },  // 1010 1011
    { 0, 2, 2 },  // 1010 1100
    { 0, 1, 0 },  // 1010 1101
    { 0, 1, 1 },  // 1010 1110
    { 0, 1, 0 },  // 1010 1111
    { 0, 4, 4 },  // 1011 0000
    { 0, 3, 0 },  // 1011 0001
    { 0, 2, 1 },  // 1011 0010
    { 0, 2, 0 },  // 1011 0011
    { 0, 2, 2 },  // 1011 0100
    { 0, 1, 0 },  // 1011 0101
    { 0, 1, 1 },  // 1011 0110
    { 0, 1, 0 },  // 1011 0111
    { 0, 3, 3 },  // 1011 1000
    { 0, 2, 0 },  // 1011 1001
    { 0, 1, 1 },  // 1011 1010
    { 0, 1, 0 },  // 1011 1011
    { 0, 2, 2 },  // 1011 1100
    { 0, 1, 0 },  // 1011 1101
    { 0, 1, 1 },  // 1011 1110
    { 0, 1, 0 },  // 1011 1111
    { 0, 6, 6 },  // 1100 0000
    { 0, 5, 0 },  // 1100 0001
    { 0, 4, 1 },  // 1100 0010
    { 0, 4, 0 },  // 1100 0011
    { 0, 3, 2 },  // 1100 0100
    { 0, 3, 0 },  // 1100 0101
    { 0, 3, 1 },  // 1100 0110
    { 0, 3, 0 },  // 1100 0111
    { 0, 3, 3 },  // 1100 1000
    { 0, 2, 0 },  // 1100 1001
    { 0, 2, 1 },  // 1100 1010
    { 0, 2, 0 },  // 1100 1011
    { 0, 2, 2 },  // 1100 1100
    { 0, 2, 0 },  // 1100 1101
    { 0, 2, 1 },  // 1100 1110
    { 0, 2, 0 },  // 1100 1111
    { 0, 4, 4 },  // 1101 0000
    { 0, 3, 0 },  // 1101 0001
    { 0, 2, 1 },  // 1101 0010
    { 0, 2, 0 },  // 1101 0011
    { 0, 2, 2 },  // 1101 0100
    { 0, 1, 0 },  // 1101 0101
    { 0, 1, 1 },  // 1101 0110
    { 0, 1, 0 },  // 1101 0111
    { 0, 3, 3 },  // 1101 1000
    { 0, 2, 0 },  // 1101 1001
    { 0, 1, 1 },  // 1101 1010
    { 0, 1, 0 },  // 1101 1011
    { 0, 2, 2 },  // 1101 1100
    { 0, 1, 0 },  // 1101 1101
    { 0, 1, 1 },  // 1101 1110
    { 0, 1, 0 },  // 1101 1111
    { 0, 5, 5 },  // 1110 0000
    { 0, 4, 0 },  // 1110 0001
    { 0, 3, 1 },  // 1110 0010
    { 0, 3, 0 },  // 1110 0011
    { 0, 2, 2 },  // 1110 0100
    { 0, 2, 0 },  // 1110 0101
    { 0, 2, 1 },  // 1110 0110
    { 0, 2, 0 },  // 1110 0111
    { 0, 3, 3 },  // 1110 1000
    { 0, 2, 0 },  // 1110 1001
    { 0, 1, 1 },  // 1110 1010
    { 0, 1, 0 },  // 1110 1011
    { 0, 2, 2 },  // 1110 1100
    { 0, 1, 0 },  // 1110 1101
    { 0, 1, 1 },  // 1110 1110
    { 0, 1, 0 },  // 1110 1111
    { 0, 4, 4 },  // 1111 0000
    { 0, 3, 0 },  // 1111 0001
    { 0, 2, 1 },  // 1111 0010
    { 0, 2, 0 },  // 1111 0011
    { 0, 2, 2 },  // 1111 0100
    { 0, 1, 0 },  // 1111 0101
    { 0, 1, 1 },  // 1111 0110
    { 0, 1, 0 },  // 1111 0111
    { 0, 3, 3 },  // 1111 1000
    { 0, 2, 0 },  // 1111 1001
    { 0, 1, 1 },  // 1111 1010
    { 0, 1, 0 },  // 1111 1011
    { 0, 2, 2 },  // 1111 1100
    { 0, 1, 0 },  // 1111 1101
    { 0, 1, 1 },  // 1111 1110
    { 0, 0, 0 },  // 1111 1111
};

#3


2  

Iterate through the array word by word (32-bit or 64-bit depend on your arch). Use __builtin_clz and __builtin_ctz to calculate the leading and trailing zeros for each word.

逐字迭代数组(32位或64位依赖于您的arch)。使用__builtin_clz和__builtin_ctz来计算每个单词的前导和尾随零。

There are two cases of consecutive zeros:

有两个连续的零的情况:

  • Within a word
  • 在一个词
  • Across adjective words.
  • 在形容词的单词。

The first case is easy to check. The second case requires to check if leading zeros of this item + trailing zeros of previous item is >= 6.

第一个病例很容易检查。第二种情况需要检查这个项的前导零和前一项的尾随零是>= 6。

#4


1  

Note these arithmetic tricks:

注意这些算术技巧:

// remove the trailing one bits
y = x & (x + 1);

x       11100011
        +      1
        --------
x+1     11100100
x&(x+1) 11100000

// remove the trailing zero bits
z = y | (y - 1);

y       11100000
        -      1
        --------
y-1     11011111
y|(y-1) 11111111

// isolate the hole
hole = z - x;
hole = z ^ x;

z       11111111
x       11100011
        --------
z^x     00011100

// Now you can count the set bits of the hole.
length = bitcount(hole);

// Or you can make it by computing highbit only (log2).
length = highbit(z^y) - highbit(x^y);

So a possible algorithm would be to use these tricks with large integer arithmetic and loop until length==0 (no more hole) or length>=n (you start next loop with x=z;).

因此,一种可能的算法是使用大整数算术和循环的这些技巧,直到长度=0(不再有洞)或长度>=n(你从x=z开始下一个循环)。

You can emulate the large integer yourself, acting on the collection byte after byte, and stopping when there is no more carry.

您可以自己模拟大整数,在字节后的收集字节上执行,并且在没有更多进位时停止。

  • x+1 has a carry only if byte==0xFF
  • 当字节==0xFF时,x+1有一个进位。
  • y-1 has a carry only if byte==0x00
  • y-1只有在字节==0x00时才有携带。
  • highbit is easy to program on a single byte
  • highbit可以在单个字节上编程。

This would give something like this:

它会给出这样的东西:

// return 1-based position of highest bit set in a byte
int highbit(unsigned char c)
{
    unsigned char n;
    int position = 0;
    n = c >> 4;
    if( n > 0 ) { c=n; position+=4; };
    n = c >> 2;
    if( n > 0 ) { c=n; position+=2; };
    n = c >> 1;
    if( n > 0 ) { c=n; position+=1; };
    position += c;
    return position;
}

int find_consecutive_zeros( unsigned char *bits , int nbytes , int nzero )
{
    int i,nTrailingOnes,nTrailingZero,sizeOfNextHole;
    unsigned char x,y;
    for(i=0 , x=bits[0]; 1; )
    {
        // perform y=x&(x+1) to count and remove trailing ones
        for(;x==0xFF && i<nbytes-1;x=bits[++i]);
        y = x&(x+1);
        nTrailingOnes = 8*i + highbit( x^y );
        // perform x=y|(y-1); to count and remove trailing zeros
        for(;y==0 && i<nbytes-1;y=bits[++i]);
        x = y|(y-1);
        nTrailingZero = 8*i + highbit( x^y );
        sizeOfNextHole = nTrailingZero - nTrailingOnes;
        // if size of hole is long enough, return it's low bit rank (0-based)
        if( sizeOfNextHole >= nzero ) return nTrailingOnes;
        // or return -1 if no more hole
        if( sizeOfNextHole == 0 ) return -1;
    }
}

You can make it more efficient by using longer than a byte for the base collection...

您可以使用比一个字节更长的字节来提高它的效率。

EDIT: speed up when you have a fixed size for nzero like 6
The algorithm above iterates on all the holes, and may loose time on small holes.
You can avoid that with a precomputed table with small holes filled.

编辑:当你有一个n0的固定大小的时候,你可以加快速度,上面的算法会在所有的孔上迭代,并且可以在小的洞上浪费时间。您可以避免使用预先计算的表,其中填充了小洞。

For example 10010101 has 3 holes that are too short, so you can replace it with 11111111.
But you must keep leading and trailing zeros unchanged.

例如10010101有3个孔太短,所以你可以用11111111替换它。但是你必须保持领先和尾随零不变。

To initialize such a table, you simply take 0xFF and xor with a mask containing 1 bits in place of trailing zeros (x|(x-1))^x and a mask containing 1 bits in place of leading zeros ((1<<highbit(x))-1)^0xFF.
Add an exception for 10000001, the sole byte containing 6 zeros between ones.

初始化这样一个表,你只需用面具把0 xff xor包含1位代替落后0(x |(x - 1))^ x和一个面具包含1位代替前导零((1 < < highbit 1)^ 0 xff(x))。在10000001中添加一个异常,这个唯一字节包含6个0。

EDIT2 : I have treated the bit sequence with least significant bit of firt byte first which fits well the arithmetic approach. The question explicitely use most significant bit of first byte first. So I have to reverse the bits to fit the question, which is done while initializing the table...

EDIT2:首先,我已经对这个位序列进行了最少的处理,这与算法的方法非常吻合。这个问题明确地使用了最重要的第一个字节。因此,我必须倒转位以适应问题,这是在初始化表时完成的…

int reversebits(unsigned char c)
{
    c = ((c & 0x0F) << 4) | ((c & 0xF0) >> 4);
    c = ((c & 0x33) << 2) | ((c & 0xCC) >> 2);
    c = ((c & 0x55) << 1) | ((c & 0xAA) >> 1);
    return c;
}
void initializeFillShortHoles(unsigned char fillShortHoles[256])
    for(unsigned int x=0;x<256;x++) {
        fillShortHoles[reversebits(x)] = ((1<<highbit(x))-1) ^ (x|(x-1)) ^ x;
    }
    fillShortHoles[0]=0;     // no use to reverse bits for those two
    fillShortHoles[129]=129;
}

Then you just replace occurrences of x=bits[ i ] with x=fillShortHoles[ bits[ i ] ], and you're done:

然后将x=bits (i)替换成x=fillShortHoles[bits[i]],你就完成了:

int find_6_consecutive_zeros( unsigned char *bits , int nbytes )
{
    static unsigned char fillShortHoles[256];
    static unsigned char highbitTable[256];
    static first=1;
    int i,nTrailingOnes,nTrailingZero,sizeOfNextHole;
    unsigned char x,y;

    if (first)
    {
        first = 0;
        initializeFillShortHoles( fillShortHoles );
        for(i=0;i<256;i++) highbitTable[i]=highbit(i);
    }
    for(x=fillShortHoles[bits[i=0]]; 1; )
    {
        // perform y=x&(x+1) to count trailing ones
        for(;x==0xFF && i<nbytes-1;x=fillShortHoles[bits[++i]]);
        y = x&(x+1);
        nTrailingOnes = 8*i + highbitTable[ x^y ];
        // perform z=y|(y-1); to count trailing zeros
        for(;y==0 && i<nbytes-1;y=fillShortHoles[bits[++i]]);
        x = y|(y-1);
        nTrailingZero = 8*i + highbitTable[ x^y ];
        sizeOfNextHole = nTrailingZero - nTrailingOnes;
        // if size of hole is long enough, return it's low bit rank (0-based)
        if( sizeOfNextHole >= 6 ) return nTrailingOnes;
        // or return -1 if no more hole
        if( sizeOfNextHole == 0 ) return -1;
    }
}

EDIT3: Finally, for nzero<=9, a faster way would be to cache the position for each pair of bytes.

EDIT3:最后,对于nzero<=9,更快的方法是缓存每一对字节的位置。

int find_n_consecutive_zeros_bypair( unsigned char *bits , int nbytes , int nzero)
{
    static int first=1;
    static signed char holepositionbypair[8][65536];
    signed char position;
    int i;
    unsigned short x;
    if (first)
    {
        first = 0;
        for(i=0;i<8;i++) {
            initializeHolePositionByPair( holepositionbypair[i] , i+1 );
        }
    }
    for (i=0 , x=0xFF; i<nbytes; i++) {
        x = (x << 8) + bits[i];
        if( (position = holepositionbypair[nzero-1][x]) >= 0) return (i-1)*8 + position;
    }
    return -1;
}

Note the initialization x=0xFF will handle the cases of nbytes<2.

注意,初始化x=0xFF将处理nbytes<2的情况。

No matter how you fill the cache holepositionbypair, it will be called only at initialization. I propose the arithmetic way of course:

无论您如何填充缓存holepositionbypair,它将只在初始化时调用。我提出的算法当然是:

int highbit16(unsigned short c)
{
    unsigned short n;
    int position = 0;
    n = c >> 8;
    if( n ) { c=n; position+=8; };
    n = c >> 4;
    if( n ) { c=n; position+=4; };
    n = c >> 2;
    if( n ) { c=n; position+=2; };
    n = c >> 1;
    if( n ) { c=n; position+=1; };
    position += c;
    return position;
}
unsigned short reversebits16(unsigned short c)
{
    c = ((c & 0x00FF) << 8) | ((c & 0xFF00) >> 8);
    c = ((c & 0x0F0F) << 4) | ((c & 0xF0F0) >> 4);
    c = ((c & 0x3333) << 2) | ((c & 0xCCCC) >> 2);
    c = ((c & 0x5555) << 1) | ((c & 0xAAAA) >> 1);
    return c;
}
initializeHolePositionByPair(signed char holepositionbypair[65536],int n)
{
    int i,n1,n0;
    unsigned short x,y;
    signed char position;
    for(i=0;i<65536;i++) {
        position = -1;
        x = i;
        while(x != 0xFFFF) {
            /* remove trailing ones */
            y = x&(x+1);
            n1 = highbit16(x^y);
            /* remove trailing zeros */
            x = y|(y-1);
            n0 = highbit16(x^y);
            if(n0-n1>=n) {
                position = n1; break;
            }
        }
        holepositionbypair[reversebits16(i)] = position;
    }
}

#5


0  

Here, try this code.

在这里,试试这个代码。

int dataLen = 500;
char data[dataLen];
//Get the data in place with whatever is your algorithm.
int i,j;
unsigned int dataSample;
unsigned int mask;
for(i=0;i<dataLen-1;i++){
    dataSample=((unsigned int)(data[i+1]) << 8) | (unsigned int) (data[i]);
    mask=(1<<6) - 1 ; //6 consecutive 1's
    for(j=0;j<8;j++){
        if((dataSample & (mask << j)) == 0){
            printf("Found consecutive 6 zeros at byte %d offset %d\n",i,j);
            j+=5; // Followed by j++ makes it j+=6.
        }
    }
}

#6


0  

For a byte, you only need to do 3 tests:

对于一个字节,您只需要做3个测试:

if( (byte & 0x3F) == 0) { /* Found */ }
if( (byte & 0x7E) == 0) { /* Found */ }
if( (byte & 0xFC) == 0) { /* Found */ }

It should be relatively easy to expand this to wider values. For example, for uint32_t:

将其扩展到更广泛的值应该相对容易些。例如,对于uint32_t:

tempValue1 = value & 0x3F3F3F3F;
tempValue2 = value & 0x7E7E7E7E;
tempValue3 = value & 0xFCFCFCFC;
if( ( (uint8_t)tempValue1 == 0) { /* Found */ }
if( ( (uint8_t)tempValue2 == 0) { /* Found */ }
if( ( (uint8_t)tempValue3 == 0) { /* Found */ }
tempValue1 >>= 8;
tempValue2 >>= 8;
tempValue3 >>= 8;
if( ( (uint8_t)tempValue1 == 0) { /* Found */ }
if( ( (uint8_t)tempValue2 == 0) { /* Found */ }
if( ( (uint8_t)tempValue3 == 0) { /* Found */ }
tempValue1 >>= 8;
tempValue2 >>= 8;
tempValue3 >>= 8;
if( ( (uint8_t)tempValue1 == 0) { /* Found */ }
if( ( (uint8_t)tempValue2 == 0) { /* Found */ }
if( ( (uint8_t)tempValue3 == 0) { /* Found */ }
tempValue1 >>= 8;
tempValue2 >>= 8;
tempValue3 >>= 8;
if( ( (uint8_t)tempValue1 == 0) { /* Found */ }
if( ( (uint8_t)tempValue2 == 0) { /* Found */ }
if( ( (uint8_t)tempValue3 == 0) { /* Found */ }

The above code doesn't look like the best way of doing it (because it probably isn't). I deliberate wrote it like this to make it easier to see how it can be done with SSE. For SSE, it'd be similar to above (but wider).

上面的代码看起来并不是最好的方法(因为它可能不是)。我故意这样写,使它更容易理解如何使用SSE。对于SSE来说,它与上面的相似(但更宽)。

However, for SSE you can do all the comparisons in parallel and get rid of lots of branches. For example, you could AND with each of the 3 masks and use PCMPEQB 3 times and then OR these results together, then do one PMOVMSKB; which would give you a 16 bit value (that represents 16 bytes - one bit per source byte) that can be tested with a single if(result_flags == 0) { /* None of the 16 bytes matched */ }, where this final test could be at the end of a "do/while" loop.

但是,对于SSE,您可以并行地进行所有的比较,并去掉许多分支。例如,您可以使用PCMPEQB 3次,并使用PCMPEQB 3次,然后将这些结果组合在一起,然后执行一个PMOVMSKB;这将给你一个16位值(代表16字节——一位字节),可以用一个测试如果(result_flags = = 0){ / *匹配没有一个16字节* / },这最终的测试可能是最后一个“/“循环。

#7


0  

Suppose you're looking for exactly 6 consecutive zeros. You could use a code snippet like this:

假设你正在寻找6个连续的零。您可以使用如下的代码片段:

unsigned d1 = 0xff;
for (int i = 0; i < dataLen; i += 3) {
  unsigned d1 = (d1 << 24) | (data[i] << 16) | (data [i+1] << 8) | (data[i+2]);
  unsigned d2 = ~d1; // ones
  unsigned d3 = d2 & (d2 << 2) & (d2 << 4);
  unsigned d4 = d3 & (d3 << 1);
  if (!d4) continue;
  doSomethingWithTheSequence(i, d4);
}

d1 will combine the last byte of the previous run with three new bytes. So you can iterate over your data in multiples of 3. You could try multiples of 2, which might be more efficient as you can treat your data as 16 bit atomic quantities. You could also try multiples of 4 and do the subsequent computation on 64 bit numbers, particularly if you're on 64 bit platform. Or you introduce a special case for zero sequences spanning bytes.

d1将前一次运行的最后一个字节与三个新字节结合起来。你可以用3的倍数来迭代你的数据。您可以尝试使用2的倍数,这可能更有效,因为您可以将您的数据视为16位原子数量。您还可以尝试4的倍数,并在64位数字上进行后续计算,特别是在64位平台上。或者你为零序列的字节引入一个特殊的例子。

d2 reverses the bit pattern, which is useful because shifts will introduce artificial zeros but never artificial ones. d3 looks for three matching ones, at offsets 0, 2 and 4. d4 then adds another bit of offset, thus combining all offsets from 0 through 5. So d4 will be non-zero if and only if there is a run of 6 consecutive zeros in d1. You can then use __builtin_clz to identify the location of the most significant one in d4, which is also the position of the first of these 6 bits in d1. From that you can get the position in data.

d2颠倒了位模式,这是有用的,因为转换将引入人工的零,而不是人工的零。d3寻找3个匹配的值,在偏移值0、2和4。然后,d4添加了另一个偏移量,从而将所有偏移量从0到5进行组合。所以d4是非零的如果且仅当d1中有连续6个0。然后可以使用__builtin_clz来确定d4中最重要的位置,也就是d1中第一个6位的位置。这样你就能得到数据的位置。

You can adapt the code to other run lengths, either by adding a loop and hoping that the compiler will optimize it away, or by providing an inline template function which computes d4 from d2 in a way suitable for the desired run length.

您可以通过添加一个循环并希望编译器将其优化,或者通过提供一个内联模板函数(从d2中计算出适合于所需的运行长度的方式)来将代码调整为其他运行长度。

#8


0  

Let me try - how about:

让我试试,怎么样:

class bitIterator{
    unsigned char* farray;
    int foffset;
public:
    bitIterator(unsigned char* array, int offset)
        :farray(array), foffset(offset)
    {}

    bool operator*() const {
        return (farray[foffset/8] >> (7 - foffset%8)) & 1;
    }

    bitIterator& operator++(){
        foffset++;
        return *this;
    }

    int offset() const {
        return foffset; 
    }

};

// Just to demonstrate how to call it
int find_first_n(unsigned char* buffer, int length, int n){
    return std::search_n(bitIterator(buffer, 0), bitIterator(buffer, length*8), n, 0).offset();
}

That was just for fun...

那只是为了好玩……

Now if you really want to squeeze some performance out of it, I will suggset

现在,如果你真的想要挤出一些性能,我会建议你。

int find_first_n(unsigned char* buffer, int length, int n){
    int prev_trail = 0;
    unsigned int* buf = reinterpret_cast<unsigned int*>(buffer);
    int len = length/sizeof(int); 
    // Last few bytes will need special treatment if your array is not multple of int-size length.
    // However last bytes should not influence overall performance of code - assuming it is used on rather long arrays.
    // If you plan using rather short arrays (20-50 bytes?) you will probably be better off just using plain char.
    for (int i=0; i<len; i++){
        if (!buf[i]) return i*sizeof(int)*8-prev_trail; // __builtin_clz and __builtin_ctz are undefined on zero ;
        // EDIT:
        if (!~buf[i]){
            prev_trail = 0;
            continue;
        }
        // END EDIT!


        int shft = __builtin_clz(buf[i]);
        if (shft + prev_trail >= n) return i*sizeof(int)*8-prev_trail; // Leading + previous trailing <= n
                                                           // Not enough leading zeros, continue search.
        prev_trail = __builtin_ctz(buf[i]); // Store it to use for next int in array
        unsigned int tmp =0;               
        while(shft < sizeof(int)*8-prev_trail){   // While we haven't got to trailing zeros in this uint
            tmp = buf[i] << shft;               // Shift-out leading zeros;
            shft += (__builtin_clz(~tmp));
            tmp = buf[i] << shft;               // Shift-out leading ones;

            lead_zeros = __builtin_clz(tmp);
            if (lead_zeros >= n)                // and see if have enough leading zeros.
                return i*sizeof(int)*8+shft;

            shft += lead_zeros;
        }
        return length*8; // Not found :(
    }

This is rather difficult to read, but concept is simple - just iterate over every int-sized block, and for each see if leading zeros + trailing zeros from previous one are >= n. If not just repeatedly shift-out leading zeros and following ones (set bits), and check again for trailing zeroes >= n, as long as we didn't got to trailing bytes.

这是非常难以阅读,但概念很简单——只是遍历每个int-sized块,每个看到如果前导零+来自前一个落后0 > = n。如果不是不断移出前导零后的(比特),并检查再次落后于0 > = n,只要我们不落后于字节。

And just one more idea:

还有一个想法:

int find_first_n(unsigned char* buffer, int length, int n){
    union {
        unsigned char chr[2];
        unsigned int uit;
    };

    unsigned int onemask = 0x8000;
    for (int i = 1 ; i<n; i++) onemask = onemask | (onemask >> 1);

    int* masks = new int[8];
    for (int i = 0; i<8; i++){
        masks[i]=onemask >> i;
    }

    // generating masks - eg. for n == 3 would be:
    // [ 1110 0000 0000 0000 ]
    // [ 0111 0000 0000 0000 ]
    // [ 0011 1000 0000 0000 ]
    // [ 0001 1100 0000 0000 ]
    // [ ... ]
    // [ 0000 0001 1100 0000 ]


    uit = 0;
    for (int i=0; i<length-1; i++){
        chr[0] = buffer[i];
        chr[1] = buffer[i+1];
        for (int j=0; j<8; j++){
            if (!(uit & masks[j])) return i*8+j;
        }
    }
    chr[0] = buffer[length-1];
    chr[1] = 0xff; // Fill with ones at the end.
    for (int j=0; j<8; j++){
        if (!(uit & masks[j])) return (length-1)*8+j;
    }
    return length*8; // Not found :(
}

You might be tempted to cast pointer to variables still in buffer directly to word (reinterpret_cast<unsigned short int*>(buffer[i])) and apply masks without using union. Note however that half of such operations would use unaligned variables, which can degrade performance, and on some platforms even generate exceptions.

您可能会倾向于将指针转换为仍然在缓冲区中的变量(re解释t_cast (buffer[i])),并在不使用union的情况下使用掩码。但是注意,这类操作的一半将使用未对齐的变量,这会降低性能,在某些平台上甚至会产生异常。