Wednesday, January 16, 2008

Generating all subsets

Generating all subsets

I am not really going to dive into much detail here.
But assume you are representing a set with a bitmask within a uint32_t (or the like).
To generate all the subsets is quite easy if you desire to do it within a couple of loops, however, doing it in one loop is conceivably hard, and even harder to prove that it works.

The following code is using recursion to identify and prints out all subsets of the set (bitmask).

void f(int bitmask) {
cout << bitmask << endl;
for (int j = 0; j < n; ++j)
if (bitmask & (1 << j))
f(bitmask ^ (1 << j));
}

However, a better solution and one that avoids recursion is the following:

for (int subset = bitmask;; subset = (subset - 1) & bitmask)
cout << subset << endl;

To prove this, we first need to know what exactly happens when you perform
(subset - 1) & bitmask
Basically, the & bitmask identifies that it is a subset of bitmask.
But does this guarantee all of them?
(subset - 1) will remove the lowest one-bit that is set, and replace it with a zero, and set all the lower bits to one.

The reasoning that this proves that all subsets are conceivably generating with this single for loop is certain. Every time you update subset, you are un-setting the lowest bit within bitmask. Once you have exhausted the lower bit. It goes to the next lowest bit that is set, since the lowest one has been exhausted already.

For example,
X = 11010111
Y = X (this is valid)
Y = (Y - 1) & X = 11010110 (removes lowest bit)
Y = (Y - 1) & X = 11010101 (removes 2nd lowest bit)
Y = (Y - 1) & X = 11010100 (removes 2 lowest bits)
Y = (Y - 1) & X = 11010011 (removes 3rd lowest bit)
Y = (Y - 1) & X = 11010010 (removes 3rd and 1st lowest bit)
Y = (Y - 1) & X = 11010001 (removes 3rd and 2nd lowest bit)
Y = (Y - 1) & X = 11010000 (removes 3 lowest bits)
This follows for all Y, such that Y > 0.
The bitwise-AND with X, guarantees it's always a subset of X.
q.e.d.

Therefore, the invariant of this process, is that the lowest bit is unset (within X), then proceeding with higher bits, it sets all remaining bits to being set, and then it proceeds to unset that bits along with the current bit that is being proceeded to be unset. This will guarantee all subsets.

No comments: