A History of APL in 50 Functions

32. Fast Multiplication

x ×⍢(FFT⍫IFT) y

Let x and y be vectors of the digits of extended precision numbers. The “school” method for extended precision multiply is +/ OB x∘.×y where OB is an operator which applies its operand to the oblique lines (anti-diagonals). For example, if m←4 6⍴⍳24 then its oblique lines are:

   m
 0  1  2  3  4  5
 6  7  8  9 10 11
12 13 14 15 16 17
18 19 20 21 22 23

   OB←{(,⊃∘.+/⍳¨⍴⍵)(⍺⍺⊢)⌸,⍵}

   ⊂OB m
┌─┬───┬──────┬─────────┬─────────┬──────────┬────────┬─────┬──┐
│0│1 6│2 7 12│3 8 13 18│4 9 14 19│5 10 15 20│11 16 21│17 22│23│
└─┴───┴──────┴─────────┴─────────┴──────────┴────────┴─────┴──┘

   x← 3 2 3 9 0 4 2
   y← 1 9 9 1 4

   +/ OB x ∘.× y
3 29 48 57 122 96 59 90 22 18 8

   school← +/OB ∘.×
   x school y
3 29 48 57 122 96 59 90 22 18 8

The killer in the school method is the outer product ∘.× , whose deleterious O(n*2) effects start to be felt at around 1000 digits, and is completely “impossible” when you get up to a million or more digits.

Digits of the product can be put into standard form with a final carry step, modelled here as:

   carry←{(⊢ ↓⍨ 0=⊃){1↓+⌿0 ¯1⌽0,⍨0 10⊤⍵}⍣≡0,⍵}

   carry +/ OB x ∘.× y
6 4 5 0 2 2 8 2 3 8 8

   0 ⍕ 3239042 × 19914
 64502282388

Since the process of computing the carry is O(n) , we will omit it for purposes of comparison.

The following fast multiply was first defined in J by Henry Rich [97] (with an improvement to cube suggested by Marshall Lochbaum) and translated into APL in [98]. For ×⍢FFT , both FFT and its inverse are O(n×⍟n) , and of course × is O(n) , so the whole extended precision multiply is O(n×⍟n) , much better than the O(n*2) of the school method.

multiply←{
  cube     ← {⍵⍴⍨2⍴⍨2⍟≢⍵}
  roots    ← {¯1*(⍳⍵)÷⍵}
  butterfly← {(⊣/⍺)∇⍣(×r)⊢(+⌿⍵),[r-0.5]⍺×[⍳r←≢⍴⍺]-⌿⍵}
  FFT      ← {      ,(cube  roots 2÷⍨≢⍵) butterfly cube ⍵}
  IFT      ← {(≢⍵)÷⍨,(cube +roots 2÷⍨≢⍵) butterfly cube ⍵}
  m←2*⌈2⍟¯1+(≢⍺)+≢⍵
  IFT (FFT m↑⍺) × (FFT m↑⍵)   ⍝ ←→ (m↑⍺) ×⍢FFT (m↑⍵)
}

      x y ← ↓ ?2 2048⍴10
      ⍴x multiply y
4096
      ⌈/ | (x multiply y) - 4096↑x school y
1.73909E¯11

      cmpx 'x multiply y' 'x school y'
  x multiply y → 3.83E¯3 |    0% ⎕⎕⎕⎕⎕⎕
* x school y   → 2.50E¯2 | +554% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

You can see that it’d be pretty hard for the system to “figure out” the inverse of FFT, the IFT or Inverse Fourier Transform. To avoid this problem, we can introduce an obverse operator ⍫ (say); f⍫g ←→ f except that f⍫g⍣¯1 ←→ g⍫f . So to make it explicit,

   ⍺ multiply ⍵ ←→ ⍺ ×⍢FFT ⍵ ←→ ⍺ ×⍢(FFT⍫IFT) ⍵