programmer's blog: F-Question

Given a range [a,b], find out how many numbers within the range, have a prime number of bits set.
The brute force solution is too slow, especially if [a,b] are 64-bit unsigned integers, which in this case we'll assume.

Therefore, we need a way to count the bits. If perhaps we had a perfect range, meaning it was between 0 and a power of 2, then the number of bits that are prime, are the number of bits choose (prime number <= number of bits to choose from). The reason, being is that if we have n bits to choose from, and we want to pick p bits to be set to 1, then there are C(n,p) ways to choose the numbers., all of which are unique.

0000 - 0111 (where the power of 2 would be 1000 in this case)
So all we have to do is look at the bits that we can choose from, 3.
And we have two different prime number values to be chosen - 2 and 3.

Therefore, C(3,2) + C(3,3) is the total number of ways to do this.
This is the easy case. However, numbers aren't exactly given to us in such a perfect world. Multiple bits could be set for a given number x:
101011101

In this case we split the ranges into powers of 2 decreasing within MSB. Why am I doing this? Because it will be easier to calculate the number of bits within the given range. Because it basically reduces the problem to the initial power-of-2 solution we came up with earlier, except now we have a notion of fixed bits and free bits. The fixed bits have to be counted, however, the free bits have a choice. Therefore, the choose function will have to account that the free bits to choose from + fixed bits = prime number of bits. Therefore, we need to have the choose function to be C(n,k-fixed) such that k is prime, therefore, you are only going to choose k-fixed bits from the remaining n left. (It would be nice if k-fixed was >= 0)

The reason that these range values would be a power-of-2 solution is because the right most end of the range would be a "type" of power-of-2 - 1 value, meaning the difference between start and end is that you only have to look at the right most 1-bits that are set, and keep into consideration the remaining bits to the left that are "fixed" when computing the choose function. So, therein lies one last problem, generating all the powers-of-2 numbers starting from MSB and appending the new power-of-2 number. Also, there should be a function to count the number of bits that are set.

The following code shows the entire procedure: (C++)



#include <string>
#include <cstdlib>
#include <iostream>
#include <vector>
#include <map>
using namespace std;

typedef unsigned long long uint64_t;

#define MAX_BITS 64

// decently quick way to count how many bits are set:
uint64_t MIT( uint64_t x ) {
    x = x - ((x>>1ULL)&0x7777777777777777ull)
          - ((x>>2ULL)&0x3333333333333333ull)
          - ((x>>3ULL)&0x1111111111111111ull);
    x = ((x+(x>>4ULL))&0x0f0f0f0f0f0f0f0full) % 255;
}

// memoiz the choose function
map<pair <uint64_t,uint64_t>, uint64_t> nck;

// keeps the list of ranges: 
vector<uint64_t> start, end;

uint64_t next( uint64_t z, uint64_t x, int &y) {
    if (y == -1) return 0;

    for (int i = y; i >= 0; --i) {
        if (x & (1ULL<<i)) {
            // next time y will start
            //  at next (lower) index
            y = i-1;
            return z | (1ULL<<i);
        }
    }
    y = -1;
    return 0;
}
void init(uint64_t x) {
    // always append zero (0) first
    start.push_back(0);

    uint64_t z = 0;

    // which bit do we set? 
    int y = MAX_BITS-1;
    while ((z=next(z,x,y)) != 0) {
        start.push_back(z);
    }
}
uint64_t nchoosek( uint64_t n, uint64_t k)
{
    if (k > n) return 0;
    if (n == k) return 1;
    if (n == 0) return 0;
    if (k == 0) return 1;

    // memoized solution? 
    if (nck.count(make_pair(n,k))) {
        return nck[make_pair(n,k)];
    }
    // memoiz the solution here:
    return nck[make_pair(n,k)] = 
            nchoosek(n-1,k-1) + nchoosek(n-1,k);
}

int isPrime( int x ){
    if (x < 2) {
        return 0;
    }

    // could optimize it for just odd's and 2
    for (int i = 2; i*i <= x; ++i) {
        if ((x % i) == 0) {
            return 0;
        }
    }
    return 1;
}

uint64_t R( uint64_t x1, uint64_t x2) {
    // get the digits that are free
    uint64_t n = MIT(x2 - x1); 

    // find the number of fixed digits
    int fixed = MIT(x2 - (x2-x1)); //MIT(x1 & ~(x2-x1));

    uint64_t sum = 0;    // sum up the diff prime(s)

    // go through all primes, starting with fixed+const
    for (int x = fixed + n; x - fixed >= 0; --x) {
        if (isPrime(x)) {
            sum += nchoosek(n,x-fixed);
        }
    }
    return sum;
}
uint64_t f( uint64_t x ) {
    // find all the ranges
    init(x);

    for (int i = 1; i < start.size(); ++i) {
        end.push_back(start[i]-1);
    }
    end.push_back(start[end.size()]);

    // go through each range, and calculate the sums
    uint64_t sum = 0;

    for (int i = 0; i < start.size(); ++i) {
        sum += R(start[i],end[i]);
    }

    return sum;
}

uint64_t bits_prime( uint64_t a, uint64_t b )
{
        uint64_t fb = f(b);

        // clear the range lists:
        start.clear(); end.clear();

        uint64_t fa = f(a);

        // inclusion exclusion rule:                                                                                                                                                                                                
        return (fb - fa + isPrime(MIT(a)));
}

programmer's blog

Friday, August 17, 2007

F-Question

No comments:

Blog Archive

About Me