A History of APL in 50 Functions

23. Combination Index

The Problem

comb is from Chapter 22; m comb n computes all size m combinations of ⍳n in sorted order.

In this note, we discuss the problem of determining the index given m and n and a particular combination, and the combination given m and n and the index. That is, devise verbs ic and ci such that:

((m,n) ic c) ≡ c ⍳⍨ m comb n
((m,n) ci i) ≡ i ⌷  m comb n

Index from Combination

ic can be computed as follows:

   ic←{m n←⍺ ⋄ -/(m-⍳m)+.!n-(¯1↓0,1+⍵),⍪⍵}

   4 6 ic 1 2 3 5
11
   4 6 ic⍤1 ⊢4 comb 6
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14

   i ← ?1000⍴7!18
   i ≡ 7 18 ic⍤1 ⊢(⊂i)⌷7 comb 18
1

ic derives from a recursive solution written years ago:

ic1←{m n←⍺ ⋄ 1≥m:⊃⍵,0 ⋄ (+/(m-1)!(n-k)+⍳k)+(⍺-1,1+k)∇ 1↓⍵-1+k←⊃⍵}

k is the leading term of the combination in question. The sum +/(m-1)!(n-k)+⍳k accounts for the combinations whose leading terms are less than k . Using the sum directly as in ic1 is not satisfactory for an enormous combination such as 23 10000 ic1 9999-⌽⍳23 (with k=9977).

What is this sum? To use a particular example, the terms of the sum are

   m←3 ⋄ n←9 ⋄ k←5
   (m-1)!(n-k)+⍳k
6 10 15 21 28

and are adjacent entries (shown in red) in Pascal’s triangle:

									1
								1		1
							1		2		1
						1		3		3		1
					1		4		6		4		1
				1		5		10		10		5		1
			1		6		15		20		15		6		1
		1		7		21		35		35		21		7		1
	1		8		28		56		70		56		28		8		1
1		9		36		84		126		126		84		36		9		1

If we add the extra term 4 (in green) in Pascal’s triangle to the right of 6 , the sum collapses into a single binomial coefficient (84=3!9 , in blue):

        6 4
      10         10 10
    15         15         15 20
  21         21         21         21 35
28         28         28         28        28 56
                                             84         84

   +/6 10 15 21 28
80
   3!9
84
   (3!9)-3!9-5
80

which is m!n , from which the original extra term m!n-k must be subtracted. “Pascal’s ladder” or “Pascal’s telescoping sum” would be good descriptions of this. Of course, to make the derivation rigorous and respectable, one would have to employ a proof by induction (say).

The following benchmark illustrates the improvement from ic1 to ic .

   mn← 19 8000
   c ← ¯1 + 8000-⌽⍳19
   c
7981 7982 7983 7984 7985 7986 7987 7988 7989 7990 7991 7992 7993 
      7994 7995 7996 7997 7998 7999

   cmpx 'mn ic c' 'mn ic1 c'
  mn ic c  → 2.93E¯5 |     0% ⎕
  mn ic1 c → 1.48E¯3 | +4940% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕

The algorithms in ic and ic1 would produce exact answers on an APL interpreter which uses extended precision arithmetic. In J:

   mn ic c
1159644635052753390832350318382787605969728523299909431999

   19 ! 8000x
1159644635052753390832350318382787605969728523299909432000

Combination from Index

(m,n) ci i produces the i-th size m combination of ⍳n .

ci←{
  m n←⍺
  0=m:⍬
  v←+\(m-1)!(1-m)↓⌽⍳n
  k←(v>⍵)⍳1
  k,(1+k)+(⍺-1,1+k)∇(⍵-k⌷0,v)
}

   (10 comb 15) ≡ 10 15 ci⍤1 0 ⍳10!15
1
   (10 comb 15)[i;] ≡ 10 15 ci⍤1 0 ⊢i←?⍨10!15
1

For large m , n , and i there is a faster computation of ci similar to the “Pascal’s Ladder” derivation for ic . See [79].

Appeared in J in [79].