Tuesday, October 9, 2007

FFS: Revisited (How to find the first bit set)

This seems to be a reoccurring problem that's looked at within this blog. Moreover, it has many features that are good to learn from. This article will focus on the binary-search with the respect of the number of bits.

This is a classic binary search problem, with the bits as the search space, and the conditional expression will be if the mask & (bitwise-AND) bits_val == 0 or not.

Analysis:
If we separate an uint32_t into 2 16-bit segments, then a branch would be if the lower end or higher end held the first bit (0). And then branch with 2 8-bit segments, etc.
This is quite easy to see; the only stipulation is that we need to keep a running counter when we need to skip "bits". Also, in this case, we will always consider the low-end order since it will be more intuitive.

Therefore, if the bits_val & mask == 0, we need to count all those bits + shift the original bits_val by that number of bits, since we are accounting for it to be on the low-end order.

The algorithm therefore looks like so:

#define BITS 32
uint32_t ffs(uint32_t bits_val)
{
uint32_t offset = 0;
uint32_t mask = (1<<(BITS / 2)) - 1, nbits = BITS / 2;
do {
if ( (bits_val&mask) == 0) {
// update offset and low-end order of bits_val
offset += nbits;
bits_val >>= nbits;
}
// update the bits used as well as the new mask.
nbits /= 2;
mask >>= nbits;
} while(nbits);
return (offset);
}

uint32_t fls(uint32_t bits_val)
{
uint32_t offset = BITS-1;
uint32_t nbits = BITS / 2,
mask = (((1<<(BITS/2))-1)<<(BITS/2);

do {
if ((bits_val&mask) == 0) {
offset -= nbits;
bits_val <<= nbits;
}
nbits /= 2;
mask <<= nbits;
} while(nbits);
return (offset);
}



Also, as a side note, there are many problems that deal with byte-alignment.
Another proposed solution, is the following:
(uses a math trick to get the dividend within range and multiply with the alignment)

#define ALIGN(x,align) ( (((x)+(align)-1)/(align)) * (align) )

Since it uses integer division, (x + align - 1)/align will give you the value that is within range, since x could be a multiple of "align," it will give you the low(ceiling) order of x + align - 1. When multiplying this by align, it will yield the exact (next) multiple of align >= x.

No comments: