Friday, December 21, 2007

Word Numbers

Question:
"If the integers from 1 to 999,999,999 are written as words, sorted alphabetically, and concatenated, what is the 51 billionth letter?"

To be precise: if the integers from 1 to 999,999,999 are expressed in words (omitting spaces, 'and', and punctuation[1]), and sorted alphabetically so that the first six integers are

* eight
* eighteen
* eighteenmillion
* eighteenmillioneight
* eighteenmillioneighteen
* eighteenmillioneighteenthousand

and the last is

* twothousandtwohundredtwo

then reading top to bottom, left to right, the 28th letter completes the spelling of the integer "eighteenmillion".

The 51 billionth letter also completes the spelling of an integer. Which one, and what is the sum of all the integers to that point?

Solution:
The solution to this problem involves noticing that you can skip many words if the order is not important. Notice the 999,999 numbers after eightmillion. They include 8,000,001 - 8,999,999 in some sorted order. Thus, it's easy to count the numbers (sum) as well as the sum.

The hard part is when you have to drill down on the problem. However, it's the same problem over again. Except there are fewer number that are involved. Let's look at eightthousand. What are the next 999 numbers involved? 8,001 - 8,999 in some sorted order.

So if we have 3 lists of numbers, and place them in orders, relative to 10^3 then we only have roughly 3000 numbers to sort and go through at any time before drilling down. The trick is to count the sum quickly as well as the number of letters while skipping over many letters and numbers in the process.

However, it boils down to only needing to keep track of the lower order categories.

We need to know the sum from 1 - 999, and the sum from 1 - 999,999.
(n*(n+1)/2)
We need to know the sum of letters from 1 - 999, and 1 - 999,999.
This is easy, but not that straight-forward. I approached it differently, so I kept a list of number (spelled out) from 1 - 999. So that part was just looping through and summing all entries in that list.

However, to get the other sum from 1 - 999,999. I looped the higher order list that contained the thousand list of numbers 1,000 - 999,000. And added that sum of characters plus the sum of 1 - 999, and then added it with the sum of characters multiplied by 999.

Therefore, all that's left is to sort all categories and walk through the list, updating the current sum of numbers and the current sum of letters.