A History of APL in 50 Functions

46. 50847534

I was re-reading A Mathematician’s Apology [1] before recommending it to Suki Tekverk, our summer intern, and came across a statement that the number of primes less than 1e9 is 50847478 (§14, page 23). The function pco [89] from the dfns workspace does computations on primes; ¯1 pco n is the number of primes less than n :

   )copy dfns pco

   ¯1 pco 1e9
50847534

The two numbers 50847478 and 50847534 can not both be correct. A search on the internet reveals that 50847534 is correct. 50847478 is erroneously cited in various textbooks and even has a name, Bertelsen’s number, memorably described by MathWorld [126] as “an erroneous name erroneously given the erroneous value of π(10⁹) = 50847478”.

Although several internet sources give 50847534 as the number of primes less than 1e9 , they don’t provide a proof. Relying on them would be doing the same thing — arguing from authority — that led other authors astray, even if it is the authority of the mighty internet. Besides, I’d already given Suki, an aspiring mathematician, the standard spiel about the importance of proof.

How do you prove the 50847534 number? One way is to prove correct a program that produces it. Proving pco correct seems daunting — it has 103 lines and two large tables. Therefore, I wrote a function from scratch to compute the number of primes less than ⍵ , a function written in a way that facilitates proof. The function was improved in 2019 on suggestions by Jay Foad [138].

sieve←{
  4≥⍵:⍵⍴0 0 1 1
  r←⌊0.5*⍨n←⍵
  p←2 3 5 7 11 13 17 19 23 29 31 37 41 43
  p←(1+(n≤×⍀p)⍳1)↑p
  b← 0@1 ⊃ {(m⍴⍵)>m⍴⍺↑1 ⊣ m←n⌊⍺×≢⍵}⌿ ⊖1,p
  {r<q←b⍳1:b⊣b[⍵]←1 ⋄ b[q,q×⍸b↑⍨⌈n÷q]←0 ⋄ ∇ ⍵,q}p
}

   +/ sieve 1e9
50847534

sieve ⍵ produces a boolean vector b with length ⍵ such that ⍸b are all the primes less than ⍵ . Two different methods are used to mark composites with 0s, both effected using local anonymous dfns: The first uses the sieve of Eratosthenes on an initial mask of 1 and a prefix of the primes 2 3…43. (The length of the prefix obtains by comparison with the primorial function ×⍀p .) The second finds the smallest new prime q remaining in b ( q←b⍳1 ), and sets to 0 bit q itself and bits at q times the numbers at remaining 1 bits in a prefix of b ( ⍸b↑⍨⌈n÷q ).

Further examples:

   +/∘sieve¨ ⍳10
0 0 0 1 2 2 3 3 4 4

   +/∘sieve¨ 10*⍳10
0 4 25 168 1229 9592 78498 664579 5761455 50847534

There are other functions which are much easier to prove correct, for example, sieve1← {2=+⌿0=(⍳3⌈⍵)∘.|⍳⍵} . However, sieve1 requires O(⍵*2) space and sieve1 1e9 can not produce a result with current technology. (Hmm, a provably correct program that can not produce the desired result …)

We can test that sieve is consistent with pco and that pco is self-consistent. pco is a model of the p: primitive function in J [51g]. Its cases are:

`pco n`	the `n`-th prime
`¯1 pco n`	the number of primes less than `n`
`1 pco n`	1 iff `n` is prime
`2 pco n`	the prime factors and exponents of `n`
`3 pco n`	the prime factorization of `n`
`¯4 pco n`	the next prime `< n`
`4 pco n`	the next prime `> n`
`10 pco r,s`	boolean vector `b` such that `r+⍸b` are the primes in the half-open interval `[r,s)`

   ¯1 pco 10*⍳10      ⍝ the number of primes < 1e0 1e1 ... 1e9
0 4 25 168 1229 9592 78498 664579 5761455 50847534

   +/ 10 pco 0 1e9    ⍝ sum of the sieve between 0 and 1e9
50847534
                      ⍝ sum of sums of 10 sieves 
                      ⍝ each of size 1e8 from 0 to 1e9
   +/ t← {+/10 pco ⍵+0 1e8}¨ 1e8×⍳10
50847534
                      ⍝ sum of sums of 1000 sieves 
                      ⍝ each of size 1e6 from 0 to 1e9
   +/ s← {+/10 pco ⍵+0 1e6}¨ 1e6×⍳1000
50847534

   t ≡ +/ 10 100 ⍴ s
1
                      ⍝ sum of sums of sieves with 1000 randomly
                      ⍝ chosen end-points, from 0 to 1e9
   +/ 2 {+/10 pco ⍺,⍵}/ 0,({⍵[⍋⍵]}1000?1e9),1e9
50847534

   ¯1 pco 1e9         ⍝ the number of primes < 1e9
50847534
   pco 50847534       ⍝ the 50847534-th prime
1000000007
   ¯4 pco 1000000007  ⍝ the next prime < 1000000007
999999937
   4 pco 999999937    ⍝ the next prime >  999999937
1000000007
   4 pco 1e9          ⍝ the next prime > 1e9
1000000007

                      ⍝ are 999999937 1000000007 primes?
   1 pco 999999937 1000000007
1 1
                      ⍝ which of 999999930 ... 1000000009 
                      ⍝ are prime?
   1 pco 999999930+4 20⍴⍳80
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1

   +/∘sieve¨ 999999937 1000000007 ∘.+ ¯1 0 1 
50847533 50847533 50847534
50847534 50847534 50847535

Appeared in [127].