3. The 10,000,000,000th Prime Number
By Eugene McDonnell. First published in Vector, 10, 4, (April 1994), 110-113.
What is the 10,000,000,000th prime number?
This column tells a story, and it has a moral. It does not concern itself directly with J, the ostensible reason for these columns, but it can be justified because one of the direct antecedents of J is the language A, developed by Arthur Whitney while he worked at the investment banking firm of Morgan Stanley. It was one page of C code for an A-like interpreter, written one afternoon by Whitney at Ken Iverson's Kiln Farm in Ontario that gave Roger Hui the direction he needed to start work on what was to become J. Roger exhibits this code in Appendix A of "An Implementation of J", published by Iverson Software Inc. Whitney no longer works at Morgan Stanley: he has set out as a free lance and is developing a language called K, which has some affinity with A and J. Hui now works at Morgan Stanley, and it is his adventure hunting down a large prime that the story is about, and Hui used A as the weapon with which he targeted the large prime.
The story begins when Hui's boss at Morgan Stanley challenged him by saying, "You think you're smart, but you don't even know what the 10-to-the-10-th prime is." Hui's immediate response was, "Do you start counting from 0 or 1?" The boss was so taken aback that for a minute or so he didn't understand Roger's question.
The boss's challenge seems to have been meant as an example of a theoretically attainable but practically impossible computational task. This article tells how Roger went about achieving the impossible. I look on it as a triumph of the client-server technology. This column is not so much an article about programming as it is about computer logistics. The programming aspects, while important, are secondary to the story of how Roger went about organizing the solution.
The germ of Hui's solution was to envision a Boolean vector p of length k such that the ith element of p is 1 if i is prime, and 0 otherwise. Just sum-scan this very long vector and look for the index of 1e10 in it. How long should such a vector be? The Prime Number Theorem says that the number of primes less than k is roughly k%(log k) . Solving for k in the equation 1e10=k%(log k) gives a value for k about 2.63e11 . Roger, out of prudence, used the value 2.7e11 . A vector of 2.7e11 elements is unrealizable in the present state of computer memories, especially since A doesn't have a Boolean type: Boolean vectors require the same space as integer vectors. A vector of 4*2.7e11 , or 1e12 bytes long is simply not in the cards. Even a Boolean vector taking just 1 bit per element would have to be more than 3e10 bytes long, so it was clear that the problem had to be partitioned.
Hui's central program computes the primes between m and n , using the sieve method, eliminating multiples of 2, 3, 5, 7, 11, 13, 17, etc. This can be done independently in parallel on many small intervals that make up the larger interval of interest, and, if portioned out to computers that can communicate with a common central file, will permit the problem to be solved in shorter time than if only one computer was to tackle it.
The method of partitioning was suggested by the presence of more than 150 workstations on the floor in Hui's part of Morgan Stanley. They are all interconnected Unix machines, and any machine can be used from any other machine. With relatively small effort a multi-processor solution could be set up, using these machines in parallel. The machines are not heavily used on the weekends, and it was on a weekend that the experiment took place.
The strategy was to let the machines tackle the problem a billion integers at a time. Hui's colleague at Morgan Stanley, Seth Breitbart, suggested creating 271 files, named 0, 1, 2, ..., 270, each denoting the named interval of 1e9 consecutive numbers, and each one empty.
What a machine would do once it was set going was to look at the list of files, pick one at random (named m), erase it, work on the interval (1e9*m)+i.1e9 , a million numbers at a time, and after finishing, write a file containing a record giving the number of primes in each of the million-number intervals within that 1e9 (there are a thousand of them). After finishing, it repeats that process, stopping only when the list of files/intervals are empty. The machines were set up to process a million numbers at a time since the smallest machine available had enough memory to handle that many numbers at once. Roger notes that there's no great harm if two machines accidentally happen to pick the same file/interval. In the flexible Unix universe additional machines could be brought on stream at any time.
If one is a Unix system superuser it is possible to take all sorts of liberties with these machines, like finding the ids of all other machines, but Hui prefers (wisely, I think) not to be tainted by such capabilities, so to get the machine's names he went about the floor reading the names of the machines from strips of paper affixed to each one, then sat down at his machine and made inquiries about the state of each machine on his list. If it was idle, he set it going on the problem.
As he was doing this, there was a nice dilemma to resolve. Should his time be spent improving the algorithm before launching more machines, or should he spend time looking for additional machines? He favored the latter approach for the novelty of it, and ended up using about 15 IBM RS/6000s and 60 Sun Sparc 2s and Sparc 10s.
After 20 hours, he had 271 files, each with 1,000 records. From these he made a 271,000 element vector of the number of primes in the intervals 1e6*i.271000 . By sum-scanning this he knew the interval containing the 10^10-th prime. His function psieve returns a Boolean list selecting the primes between m and n. Applying this to the magic interval gets the actual 10^10-th prime.
Some details about his program "sieve": If a number q is not divisible by any number less than or equal to sqrt(q) , then q is prime. Therefore, to test a number less than 2.7e11 for primality one need only use trial divisors less than sqrt(2.7e11) or roughly 6e5 . In practice, Hui precomputed a list of all the 78,498 primes less than 1e6 recursively, bootstrapping up from 2 3 5 7. (This only takes a few seconds.) It was then a routine matter to determine for any number less than 1e12 whether or not it was prime: just see whether its residues with respect to each of the primes less than a million was nonzero; if so, it was a prime.
For the curious, here is a condensed list of the first 10,000,000,002 primes, with their 0-origin ordinal numbers.
0 2 1 3 2 5 3 7 4 11 ... 1e10-1 252,097,800,623 1e10 252,097,800,629 1e10+1 252,097,800,637
After doing this, Hui found that there is a table in William Judson LeVeque's "Fundamentals of Number Theory", section 1.1, giving the number of primes less than 10^3+i.8 . Hui's table agrees with LeVeque's for 10^3+i.7 . For 10^10 , however, LeVeque says 455,052,512 and Hui says 455,052,511. It turns out that LeVeque is wrong, Hui having checked his results with some help from Lee Dickey at Waterloo University. Dickey tells Hui that his colleagues speculate that LeVeque may have gotten his numbers from lists that D.N. Lehmer compiled, which included 1 as a prime, and LeVeque may have slipped in not subtracting 1 from that particular count. (1 isn't a prime since it doesn't satisfy the definition of a prime: a positive integer n with exactly two distinct factors, 1 and n .)
Now for the moral of the story: Hui tells me he has also since found some work that would have made it much easier to discover the nth prime, for any n. E.D.F. Meissel, a German astronomer, found in the 1870s a method for computing individual values of pi(x) , the counting function for the number of primes <:x . His method was based on recurrences for partial sieving functions, and he used it to compute pi(1e9) , where pi is a function that computes the number of primes less than or equal to its argument. D.H. Lehmer simplified and extended Meissel's method. Recently, further refinements of the Meissel-Lehmer method which incorporate some new sieving techniques have been reported by Lagaria, Miller, and Odlyzko, in "Computing pi(x): Meissel- Lehmer Method", Mathematics of Computation, Volume 44, Number 170, 1985 4, pages 537 to 560. In this article the authors give an asymptotic running time analysis of the resulting algorithm, showing that for every e>0 it computes pi(x) using at most O(x^(2r3)+e) arithmetic operations and using at most O(x^(1r3)+e) storage locations on a computer using words of length 1+<.2^.x bits. The algorithm can be further speeded up using parallel processors. They show that there is an algorithm which, when given M parallel processors, computes pi(x) in time at most O((%M)*x^(2r3)+e) using at most O(x^(1r3)+e) storage locations on each parallel processor, provided M <: x^1r3 . A variant of the algorithm was implemented and used to compute pi(4e16) .
They report that pi(4e16) took them 1730 minutes on an IBM 3081K; pi(2e12) took 3 minutes; pi(3e12) took 4 minutes. Had he known about this method, Hui could have used a binary search technique to find the 1e10th prime, using an O(2 log n) technique, after narrowing the interval to be searched in by a reasonably generous use of the Prime Number Theorem. The value Hui computed was pi(2.52e11) and it took him 20 hours on 60-70 workstations. Brains win again over brawn: a well-designed, mathematically knowledgeable algorithm beats brute force!