Friday, January 25, 2008

Find the Missing Number

Suppose you have a list of N-1 integers, and they are from 1-N.
Each number is used at most once.
Therefore, there is a missing number.

To add more complexity, assume you can only look at one bit at any given time.
Let's call this function:

boolean isSet(int index, int bitPosition) {
return array[index] & pow(2, bitPosition);
}


The answer is to view the 0th bit, counting up how many are supposed to be odd
and how many are supposed to be even, we can analyze that the missing number is
in one of these sets, and therefore, one of the sets can be removed completely.
Therefore, the next bit check is reduced from N to N/2, and doing so, will be
something similar to N + N/2 + N/4 + ... smaller than N*lg(N).
The trick is when the number is in the set that is "odd" or that has the bit "set",
then that bit will be set, and you increment and look at the next bit, else, look
at the next bit + 1.


for (i = 0; i < 8; ++i) { // log(N)
memset(bitSetList, 0, N);
memset(bitNotSetList, 0, N);
bitSet = bitNotSet = 0;
if (N <= 1) {
printf("Printing the value ...\n");
printf("%d\n", set);
break;
}
for (j = 0; j < N; ++j) // N + N/2 + N/4 + ...
// Roughly N * lg(N) since 1 + 1/2 + 1/3 + ... + 1/N -> lg(N+1)
// and N * (1 + 1/2 + ... + 1/N) > N + N/2 + N/4 + ...
if (f(array[j], i))
bitSetList[bitSet++] = array[j];
else
bitNotSetList[bitNotSet++] = array[j];

// Update the index counter.
if (N%2) {
if (N/2 == bitSet)
set |= pow(2, i),
updateBitSet(&i, &N, bitSet, bitSetList, BIT_SET);
else
updateBitSet(&i, &N, bitNotSet, bitNotSetList, BIT_NOT_SET);
} else { // even
if (N/2 == bitSet)
set |= pow(2, i),
updateBitSet(&i, &N, bitSet, bitSetList, BIT_SET);
else
updateBitSet(&i, &N, bitNotSet, bitNotSetList, BIT_NOT_SET);
}
}

Saturday, January 19, 2008

Factorial n!

Computing n!

There are many ways to compute n!, but I will introduce a divide and conquer method.
Using these rules below:
f(n,m) = 1 if n == m,
f(n,m) = 0 if n > m,
f(n,m) = f(n,x) * f(x+1,m) where x = (n + m)/2, otherwise

int f(int n, int m)
{
if ( !(n < m) ) return n == m;
int x = (n + m) / 2
return f(n,x) * f(x+1,m);
}

Thursday, January 17, 2008

Project Euler - Prime Generation

Project Euler - Mathematics/CS competition site (http://www.projecteuler.net)
I want to show an alternate way than S of E, currently, that I know of.
The space required is just the array of primes.
I used to perform the Seives (of Erasothenes) which is N * lg lg N, which is quite fast.
However, a better technique (in terms of space) is used in the following example.
This code will be python, and I will show the implementation below.
I can find the 10,001st prime number in under one second.

#! /usr/bin/python
# This program computes
# 10,001st prime.
# count of how many primes
nprimes = 1
# current value
n = 1
# current primes list with
# its value squared
primes = [ (2,4) ]
while nprimes < 10001:
# Gen all primes
n += 2
for prime, primeSquared in primes:
if n < sq:
primes.append((n, n**2))
nprimes += 1
if nprimes == 10001:
print n
break;
if n % prime == 0:
break;

The Sieve Of Erasothenes (S of E) follows (which is much faster for large inputs):

#!/usr/bin/python
# O(N lg lg N)
import Numeric
primes = [2]
N = 2000000
p = Numeric.ones(N,Numeric.Int)
n = 3
c = 1
while n < N:
if p[n] == 1:
c += 1
primes.append(n);
j = n + n;
while j < N:
p[j] = 0
j += n
n += 2

Wednesday, January 16, 2008

Generating all subsets

Generating all subsets

I am not really going to dive into much detail here.
But assume you are representing a set with a bitmask within a uint32_t (or the like).
To generate all the subsets is quite easy if you desire to do it within a couple of loops, however, doing it in one loop is conceivably hard, and even harder to prove that it works.

The following code is using recursion to identify and prints out all subsets of the set (bitmask).

void f(int bitmask) {
cout << bitmask << endl;
for (int j = 0; j < n; ++j)
if (bitmask & (1 << j))
f(bitmask ^ (1 << j));
}

However, a better solution and one that avoids recursion is the following:

for (int subset = bitmask;; subset = (subset - 1) & bitmask)
cout << subset << endl;

To prove this, we first need to know what exactly happens when you perform
(subset - 1) & bitmask
Basically, the & bitmask identifies that it is a subset of bitmask.
But does this guarantee all of them?
(subset - 1) will remove the lowest one-bit that is set, and replace it with a zero, and set all the lower bits to one.

The reasoning that this proves that all subsets are conceivably generating with this single for loop is certain. Every time you update subset, you are un-setting the lowest bit within bitmask. Once you have exhausted the lower bit. It goes to the next lowest bit that is set, since the lowest one has been exhausted already.

For example,
X = 11010111
Y = X (this is valid)
Y = (Y - 1) & X = 11010110 (removes lowest bit)
Y = (Y - 1) & X = 11010101 (removes 2nd lowest bit)
Y = (Y - 1) & X = 11010100 (removes 2 lowest bits)
Y = (Y - 1) & X = 11010011 (removes 3rd lowest bit)
Y = (Y - 1) & X = 11010010 (removes 3rd and 1st lowest bit)
Y = (Y - 1) & X = 11010001 (removes 3rd and 2nd lowest bit)
Y = (Y - 1) & X = 11010000 (removes 3 lowest bits)
This follows for all Y, such that Y > 0.
The bitwise-AND with X, guarantees it's always a subset of X.
q.e.d.

Therefore, the invariant of this process, is that the lowest bit is unset (within X), then proceeding with higher bits, it sets all remaining bits to being set, and then it proceeds to unset that bits along with the current bit that is being proceeded to be unset. This will guarantee all subsets.