A Letter from Dijkstra on APL

A Letter from Dijkstra on APL

Roger K.W. Hui

Acknowledgments. I would like to thank Bob Bernecky, Nicolas Delcros, Jay Foad, and Eric Iverson for comments on the manuscript.

Nick Nickolov brought to my attention comments by Dijkstra on APL [0] that I had not seen before. I contacted the author of the website and obtained a copy of Dijkstra’s letter, transcribed below:

Burroughs

PROF DR EDSGER W DIJKSTRA		PLATAANSTRAAT 5
RESEARCH FELLOW		5671 AL NUENEN THE NETHERLANDS

Dr A.Caplin
[street address]
CROYDON, Surrey
United Kingdom

Tuesday 12 January 1982

Dear Dr Caplin,

thank [sic] you for your letter dated 31 May (?) 1981. You were right in your reference to an APL “cult”: some adore it and others abhor it with very few in between. Allow me to offer you another explanation for that phenomenon.

I think that most people (be it subconsciously) realize that “ease of use” is not the most significant aspect. Experience has show that, provided people are sufficiently thrilled by a gadget, they are willing to put up with the most terrible interfaces. Much more important is that the tool shapes the one who trains himself in its usage, just as the words we use shape our thoughts and the instrument forms the violinist. I think that a major reason for shunning APL is that many people are repelled by the influence APL has on its devotees. They implement the prayer “Dear Lord, don’t let me become like them” by ignoring it.

A typical characteristic of the APL devotee is, for instance, his closeness to an implementation of it. I know of a visiting professor at an American University [sic] who, trying to teach APL, bitterly complained about the absence of APL terminals. He was clearly unable to teach it without them. And you, too, write to me that you would like to meet me in your part of the world, so that you can “demonstrate APL” to me. This is in sharp contrast to people who prefer programming languages that can be adadequately [sic] “demonstrated”—i.e. shown, taught and discussed—with pencil and paper.

The fact that the printed or written word is apparently not the proper medium for the propagation of APL may offer a further explanation for its relative isolation; at the same time that fact may be viewed as one of its major shortcomings.

Your writings made me wonder in which discipline you got your doctor’s degree.

With my greetings and best wishes,
	yours ever, (signed) Edsger W. Dijkstra

PS. I apologize for the quality of my signature; having broken my right arm I have to sign with my left hand.
	EWD

I find Dijkstra’s comments deeply ironic, because Ken Iverson invented his notation as a means of communications among people [1], and it was only years later that the notation was implemented on a computer at which time it became APL. Moreover, Dijkstra encountered “Iverson notation” no later than August 1963 before there was an implementation [2]. Even with APL, perhaps especially with APL, one can reasonably do non-trivial things without ever executing it on a computer.

I have read at least one of Dijkstra’s EWDs in which he wrote programs using formal methods, at the end of which is derived a provably correct program. As I read it/them, I thought to myself, “APL should have been natural for Dijkstra”. One can argue what “provably correct program” means. To me, it means what a typical mathematician means when he/she says a theorem has been proven. I know it is far from saying that the program will produce a correct result in all circumstances (compiler/interpreter has a bug, somebody pulls the plug, cosmic ray strikes a transistor, etc.), but I believe I am using “prove” in the same sense that Dijkstra did.

Like Dijkstra’s “visiting professor at an American university”, I would be distressed if I had to teach a course on APL without an APL machine. Were it a course on formal methods, one can get by without a machine; but even in a course on formal methods executability would be an asset, because executability keeps you honest, a faithful servant that can be used to check the steps of a proof. Were it a general programming course, it seems extreme to eschew the use of a machine in showing, teaching, and discussing. It would be like trying to learn a natural language without ever conversing with a speaker of that language.

Herewith, two examples of using APL in formal manipulations. Further such examples can be found in Iverson’s Turing Award Lecture [3]. A proof is here presented as in [3], a sequence of expressions each identical to its predecessor, annotated with the reasoning.

A Summary of Notation is provided at the end.

Example 0: Ackermann’s Function

The derivation first appeared in 1992 [4] in J and is transcribed here in Dyalog APL.

Ackermann’s function is defined on non-negative integers as follows:

   ack←{             
     0=⍺: 1+⍵       
     0=⍵: (⍺-1) ∇ 1
     (⍺-1) ∇ ⍺ ∇ ⍵-1
   }                 

   2 ack 3
9
   3 ack 2
29

Lemma: If ⍺ ack ⍵ ←→ f⍢(3∘+) ⍵ , then (⍺+1)ack ⍵ ←→ f⍣(1+⍵)⍢(3∘+) 1 .

Proof: By induction on ⍵ .

(⍺+1) ack 0 basis

⍺ ack 1 definition of ack

f⍢(3∘+) 1 antecedent of lemma

f⍣(1+0)⍢(3∘+) 1 ⍣

(⍺+1) ack ⍵ induction

⍺ ack (⍺+1) ack ⍵-1 definition of ack

f⍢(3∘+) (⍺+1) ack ⍵-1 antecedent of lemma

f⍢(3∘+) f⍣(1+⍵-1)⍢(3∘+) 1 inductive hypothesis

¯3∘+ f 3∘+ ¯3∘+ f⍣(1+⍵-1) ⊢3∘+ 1 ⍢ and ⊢

¯3∘+ f f⍣(1+⍵-1) ⊢3∘+ 1 +

¯3∘+ f⍣(1+⍵) ⊢3∘+ 1 ⍣

f⍣(1+⍵)⍢(3∘+) 1 ⍢

QED

Using the lemma (or otherwise), it can be shown that:

0∘ack = 1∘+⍢(3∘+)
1∘ack = 2∘+⍢(3∘+)
2∘ack = 2∘×⍢(3∘+)
3∘ack = 2∘*⍢(3∘+)
4∘ack = */∘(⍴∘2)⍢(3∘+)
5∘ack = {*/∘(⍴∘2)⍣(1+⍵)⍢(3∘+) 1}

Example 1: Inverted Table Index-Of

Presented at the 2013 Dyalog Conference [5].

A table is a set of values organized into rows and columns. The rows are records. Values in a column have the same type and shape. A table has a specified number of columns but can have any number of rows. The extended index-of on tables finds record indices.

   tx                   ty                    tx ⍳ ty
┌──────┬─┬───┬──┐    ┌──────┬─┬───┬──┐     3 1 5 2 5 5
│John  │M│USA│26│    │Min   │F│CN │17│
├──────┼─┼───┼──┤    ├──────┼─┼───┼──┤        tx ⍳ tx
│Mary  │F│UK │24│    │Mary  │F│UK │24│     0 1 2 3 4
├──────┼─┼───┼──┤    ├──────┼─┼───┼──┤
│Monika│F│DE │31│    │John  │M│UK │26│        ty ⍳ ty
├──────┼─┼───┼──┤    ├──────┼─┼───┼──┤     0 1 2 3 4 4
│Min   │F│CN │17│    │Monika│F│DE │31│
├──────┼─┼───┼──┤    ├──────┼─┼───┼──┤
│Max   │M│IT │29│    │Mesut │M│DE │24│
└──────┴─┴───┴──┘    ├──────┼─┼───┼──┤
                     │Mesut │M│DE │24│
                     └──────┴─┴───┴──┘

An inverted table is a table with the values of a column collected together. Comma-bar each (⍪¨) applied to an inverted table makes it “look” more like a table. And of course the columns have the same tally (≢). A table can be readily inverted and vice versa.

   x
┌──────┬─────┬───┬──────────────┐
│John  │MFFFM│USA│26 24 31 17 29│
│Mary  │     │UK │              │
│Monika│     │DE │              │
│Min   │     │CN │              │
│Max   │     │IT │              │
└──────┴─────┴───┴──────────────┘
   ⍪¨x                     ≢¨x      
┌──────┬─┬───┬──┐       5 5 5 5
│John  │M│USA│26│
│Mary  │F│UK │24│
│Monika│F│DE │31│
│Min   │F│CN │17│
│Max   │M│IT │29│
└──────┴─┴───┴──┘

   invert ← {↑¨↓⍉⍵}  
   vert   ← {⍉↑⊂⍤¯1¨⍵}
            
   x  ≡ invert tx
1
   tx ≡ vert x
1

A table has array overhead per element. An inverted table has array overhead per column. The difference that this makes becomes apparent when you have a sufficiently large number of rows. The other advantage of an inverted table is that column access is much faster.

An important computation is x index-of y where x and y are compatible inverted tables. Obviously, it can not be just x⍳y . The computation obtains by first “verting” the arguments (un-inverting the tables) and then applying ⍳ , but often there is not enough space for that.

   ⍪¨x                   ⍪¨y
┌──────┬─┬───┬──┐     ┌──────┬─┬───┬──┐
│John  │M│USA│26│     │Min   │F│CN │17│
│Mary  │F│UK │24│     │Mary  │F│UK │24│
│Monika│F│DE │31│     │John  │M│UK │26│
│Min   │F│CN │17│     │Monika│F│DE │31│
│Max   │M│IT │29│     │Mesut │M│DE │24│
└──────┴─┴───┴──┘     │Mesut │M│DE │24│
                      └──────┴─┴───┴──┘
   x ⍳ y
4 4 4 4

   (vert x) ⍳ (vert y)
3 1 5 2 5 5

We derive a more efficient computation of index-of on inverted tables:

`(vert x) ⍳ (vert y)`	(a)
`({⍉↑⊂⍤¯1¨⍵}x) ⍳ ({⍉↑⊂⍤¯1¨⍵}y)`	(b)
`(⍉↑⊂⍤¯1¨x) ⍳ (⍉↑⊂⍤¯1¨y)`	(c)
`(⍉↑x⍳¨x) ⍳ (⍉↑x⍳¨y)`	(d)

(a)	The indices obtain by first uninverting the tables, that is, by first applying `vert` .
(b)	Replace `vert` by its definition.
(c)	Replace the D-fn by its definition. We see that `⊂⍤¯1` plays a major role. `⊂⍤¯1` encloses, or alternatively computes a scalar representation.
(d)	For purposes of index-of `x⍳¨x` and `x⍳¨y` have the same information as `⊂⍤¯1¨x` and `⊂⍤¯1¨y` , but are much more efficient representations (small integers v the data itself).

Point (d) illustrated on column 0:

   ⊂⍤¯1⊢x0←0⊃x
┌──────┬──────┬──────┬──────┬──────┐
│John  │Mary  │Monika│Min   │Max   │
└──────┴──────┴──────┴──────┴──────┘
   x0 ⍳ x0
0 1 2 3 4

   ⊂⍤¯1⊢y0←0⊃y
┌──────┬──────┬──────┬──────┬──────┬──────┐
│Min   │Mary  │John  │Monika│Mesut │Mesut │
└──────┴──────┴──────┴──────┴──────┴──────┘
   x0 ⍳ y0
3 1 0 2 5 5

That is, the function {(⍉↑⍺⍳¨⍺)⍳(⍉↑⍺⍳¨⍵)} computes index-of on inverted tables.

I believe that in another language a derivation such as the one above would be very long (in part because the program would be very long), possibly impractically long.

Summary of Notation

The following table lists the APL notation used in the paper. A complete language reference can be found in [6]. D-fns are described in [6, pp. 112-127] and [7].

`←→`	equivalent (extralingual)
`←`	assignment
`⍺`	left argument
`⍵`	right argument
`×`	times
`*`	exponentiation
`⍴`	reshape; `n⍴s` makes `n` copies of `s`
`⍉`	transpose
`⍳`	index-of
`⊂`	enclose
`⊃`	pick
`↑`	mix (disclose)
`↓`	split (enclose rows)
`⍪`	table, ravel the major cells
`≡`	match
`≢`	tally, the length of the leading dimension
`⊢`	right (identity function)
`f∘g`	function composition
`a∘f`	currying (fix left argument)
`f∘a`	currying (fix right argument)
`f⍤r`	rank operator; `f` on rank `r` subarrays
`f⍣n`	power operator; `n` applications of `f` ; the `n`-th iterate of `f`
	`f⍣¯1` is the inverse of `f`
`f⍢g`	dual operator; `g⍣¯1∘f∘g` (not yet implemented in Dyalog APL)
`f/`	reduce (fold)
`f¨`	each (map)
`{⍺ … ⍵}`	D-function
`∇`	D-function: recursion
`:`	D-function: guard

References

[0]	Daylight, Edgar Graham, A Letter about APL, 2012-04-05.
[1]	Iverson, Kenneth E., A Personal View of APL, IBM Systems Journal, Volume 30, Number 4, 1991-12.
[2]	Iverson, Kenneth E., Formalism in Programming Languages, Communications of the ACM, Volume 7, Number 2, 1964-02. See the last question in the discussion.
[3]	Iverson, Kenneth E., Notation as a Tool of Thought, Communications of the ACM, Volume 23, Number 8, 1980-08.
[4]	Hui, Roger K.W., Three Combinatoric Puzzles, Vector, Volume 9, Number 2, 1992-10; also in Ackermann’s Function, J Wiki Essay, 2005-10-14.
[5]	Hui, Roger K.W., Rank & Friends, 2013 Dyalog Conference, 2013-10-22.
[6]	Dyalog Limited, Dyalog APL Programmer’s Guide & Language Reference, Version 13.1, 2012.
[7]	Scholes, John, D: A Functional Subset of Dyalog APL, Vector, Volume 17, Number 4, 2001-04.

Written in honor of Ken Iverson’s 93^rd birthday.

created:	2013-12-15 08:30
updated:	2019-11-10 10:55