APL\? This paper describes a version of APL based upon the dictionary [1], but significantly simplified and enhanced, and directly usable on any machine that provides ASCII characters. It also describes salient features of a C implementation that has been tested on several machines, and is available as freeware. There have been four primary motivations for this work:
Examples of the use of the language in a variety of topics are provided in an appendix. We are indebted to a number of colleagues for advice and help: Anthony Howe, David Steinbrook, Bob Bernecky, Mark Czerwinski, L.J. Dickey, Jiri Dvorak, James Hui, Eric Iverson, Paul Jackson, and Roland Pesch. A. Orthography At the time of the first implementation of APL, the then-new IBM Selectric typewriter with its changeable type element offered a welcome escape from the limitations of the existing printers, which provided only a few symbols beyond a one-case alphabet, punctuation, and the decimal digits. The Selectric was exploited by designing an alphabet that provided single-character spelling of all words in the language (except for the literal names used for variables). This spelling scheme offered several advantages, due to the fact that the words were:
However, special alphabets pose serious display problems, and it is desirable to have a spelling scheme based on a widely available computer alphabet. We have here attempted to design a spelling scheme based on the ASCII alphabet that retains the advantages cited above for the older spelling scheme. Words are spelled with one character or with two, the last of which is a period or a colon; words are formed by scanning from right to left, each colon or period (not in a number) combining with the character to its left to form a word. Any number of spaces may be used between words, but spaces are not required, except that in a number, a space or zero must precede a decimal point that is not preceded by a digit or negative sign. The spelling scheme is shown in the language summary of Table 1, a study of which should clarify the application of the following guides used in its design:
Anyone who is familiar with earlier spelling of APL words, or who is using earlier APL literature, may find it helpful to pronounce them in the traditionat way, as in iota for i. . The function /. cuts its list argument into words according to rules appropriate to an APL sentence, Thus, /. '+/3 4 5*i.3' yields the boxed list + and / and 3 4 5 and * and i. and 3 .
B. Major Cells, Replicate, Reshape, and Outer Product Because of the importance of major cells, we will adopt the terms item and atom for the major cells and the scalars. We will also adopt the symbol # for the item count, or tally; #b is 1 if b is an atom, and is otherwise equal to 0{$ b . The dyadic case n#b is similar to the replicate function previously provided (for historical reasons) by the derived function n/ ; the successive atoms of n specify the number of repetitions of successive items of b to be selected. The reshape ($) is also redefined to apply to items rather than atoms; the old behaviour is obtained by ravelling the right argument. Catenation of the items of A and B by the expression A comma-bar B is more useful than the catenation of 1-cells provided by the comma; in particular, the catenation of 1-cells can be provided by comma-bar of rank 1 . Consequently, we will use the comma for catenation of items (that is, catenation along the leading axis), and drop the symbol comma-bar. For similar reasons, the / and \ will be adopted for the meanings that were assigned to /-bar and \-bar, and the latter pair of symbols will be dropped. The table function (previously provided by the monadic case of the comma-bar) will be provided by the semi-colon, its dyadic use being assigned to the link function. Thus, a;b is defined by (<a),b , with the right argument b automatically boxed if it is open. The expression jot.f for outer product uses (for historical reasons) a conjunction where an adverb would serve. We will adopt the dyadic case of f\ for this purpose, and the jot and the notation jot.f will be dropped. C. User-Defined Verbs, Adverbs, and Conjunctions The conjunction denoted in the dictionary by the inverted Greek Delta will be denoted by the double colon, and the right-arrow and $ used to denote the sequence control and self-reference will be replaced by $. and $: . The forms m::d and 1::a and 2::a will be otherwise adopted. As in the dictionary, assignment provides dynamic localization; for example, the first execution of a=.g a in a function f applies g to the global value of a , but produces a local copy. Unlike the dictionary definition, the Iocalization is strict, so that a local copy is not available to user-defined functions that are invoked in f . Global assignment is provided by the copula =: . Strict localization provides significant advantages over the heritable localization of earlier APL, and is now practicable because of the ease of passing parameters in boxed arguments. Direct detinitions are easily provided by a simple cover function employing the forms m::'' and ''::d . D. From, Iota, and Base The monadic case of i. is defined like monadic iota, but extended to list arguments as follows: i.s is (|s)$+\0,(*/|s)$1 , but reversed along each axis for which the corresponding element of s is negative; the result for an empty argument is the scalar 0 . For example: i. 2 3 i. 2 _3 i. ''1 i. -4 0 1 2 2 1 0 0 3 2 1 0 3 4 5 5 4 3 A new monadic case of base-value is defined as the base-2 value; that is, #.v is equivalent to 2#.v . An infinite rank monadic case of anti-base is defined as (n$2)#:a , where n is the maximum of the minimum lengths required to represent the (integer) atoms of a . E. Permutations The words \. and -. will be used for transposition and for leading-axis reverse and rotale, the lines in the spelling indicating the axes involved, as they did in the old symbols for these functions. Other permutations (modelled upon, and replacing, Those in the dictionary called cycle, mix, and mix index will be represented by @. and @: . Standard Direct and Cycle Representations. If p is a permutation of the atoms of i.n then p is said to be a permutation vector of order n , and if n=#b , then p{b is a permutation of the items of b . The expression @.p yields a list of boxed lists of the atoms of i.#p called the srandard cycle represenlation of p . If (as in the example in the dictionary) p=. 4 5 2 1 0 3 , then @.p yields 2;4 0; 5 3 1 because the permutation p moves to position 2 the item 2 , to 4 the item 0 , to 0 the item 4 , to 5 the item 3 , to 3 the item 1 , and to 1 the item 5 . The monad @. is self-inverse; when applied to a standard cycle representation it produces the corresponding direct representation. A given permutation could be represented by cycles in a variety of ways, and the standard form is made unique by the following restrictions: The cycles are disjoint and exhaustive (that is, the atoms of the boxed elements together form a permutation vector); each boxed cycle begins with its largest element (possible because any rotation of a single cycle represents the same permutation); and the boxed cycles are arranged in ascending order on their leading elements (possible because the cycles are disjoint). Non-Standard Representations. If d and c are direct and cycle representations of order #b , then d@.b and c@.b produce the corresponding permutations of the items of b . More generally, since the item count of b determines the order of the permutation, the arguments d and c may be non-standard in ways to be defined. In particular, elements belonging to (i.2*#b)-#b are permitted, and are treated as their residues modulo #b . If q is not boxed, and if the elements of (#b)|q are distinct, then q@.b is equivalent to d@. b , where d is the standard form of q given by a=.((i.n)~.n|q),n|q , where n is #b . In other words, positions occurring in q are moved to the tail end. If q is boxed, then the elements of (#b)|>j{q must be distinct for each j , and the boxes are applied in succession. For example, (2 1;3 0 1)@.i.5 is equivalent to (<2 1)@.(<3 0 1)@.i.5 , and the result of either is the standard direct permutation 1 2 3 0 4 . Atomic Representation. If T is the table of all !n permutations of order n arranged in lexical order (that is, /:T is i.!#T ), then k is said to be the atomic representation of the permutation k{T . Moreover, k@:b permutes items of b by the permutation of order #b whose atomic representation is (!#b)|k . For example, 1@:b transposes the last two items of b , and _1@:b reverses the items, and 3@:b and 4@:b rotate the last three items of b . Finally, (i.!n)@:i.n produces the ordered table of all permutations of order n , as does the fork [3] used in the expression (i.&!@:i.) n. The transformation between direct and cycle representations provided by the monad @. is extended to non-negative non-standard cases by treating any argument q as a representation of a permutation of order l+>./}.q . Similarly, the monad @: applied to any cycle or direct permutation yields its atomic representation. For example, @:0 3 2 1 is 5 , as are @:3 2 1 and @:0;2;3 1 and @:<3 1 . F. Transpositions and Sections The symbol @ will replace the hoof, with the noun cases of the conjunction (Defer and Prefer) modified so that v@n defers axes n of the right argument before applying v , and n@v defers axes of the left. Consequently, the expression a nO@v@n1 b defers axes of both arguments before applying v . The monadic cases of v@n and n@v are identical. If the number of elements of n equals the rank of v , then v@n applies v to the cells selected by the axes specified by the atoms of n , and v@n can therefore be said to apply v at n, as suggested by the name of the symbol @ . Because {: is an identity function, transposition alone can be obtained by using {:@n . A boxed argument n provides sectioning, grouping the axes specified by a single box into a single result axis. For example, if b has the shape i. 6 and n=.2;4 1;0 , then the shape of {:@n b is 3 5 2 1 0 . G. Format The dyadic case of format (":) is defined with both ranks 1 , and with each element e of the left argument controlling the representation of the corresponding element of the right argument as follows:
The monadic rank of ": is infinite, and the result is equivalent to the application of the dyadic definition with a left argument chosen to provide a minimum of one space between columns. Default output is equivalent to the use of the monadic case. H. External Communication Communication with the keyboard, screen, and operating system files is provided by the conjunction X. , whose many arguments provide considerable flexibility. I. Some Implications for Teaching The mere introduction of lists, scan, and outer product allows a wealth of interesting explorations, as in +\a=.0 1 2 3 4 5 for the triangular numbers, in +\1+a+a to see that the odd numbers sum to squares, and in various outer products such as a+\a and a*\a to see addition, multiplication, remainder, divisibility and other tables, including the binomial coefficients (Pascal’s Triangle) provided by a!\a . Lists are easily explained as the use of collective nouns, and the scan is easily explained as an adverb. Unfortunately, the simple and important notion of a function table required, in traditional APL, not just a further use of an adverb, but the use of a conjunction whose first argument could only be explained as an historical anomaly. The present use of an adverb for outer product avoids this difficulty. Expressions such as pr=.+% provide a simple introduction of the notion of function definition (and of the hook [3]), and expressions such as pr\1 2 2 2 2 2 2 and pr\3 7 15 1 show interesting uses of such a defined function in producing successive approximations to interesting quantitites. Expressions such as sum=.+/ and sqrt=.^&0.5 and log=.10&^. and neq=.-.@= provide simple and interesting uses of adverbs and conjunctions. Moreover, the general form of definition provided by the :: conjunction permits a simple introduction to the use of iteration and recursion. The generally useful notions of classification can be introduced by using the outer product a<:\b in expressions for producing barcharts and graphs, and can be explored further using the expression #:i.2^n to produce the complete classification table of order n . Thus if CCT=.#:i.2^#v=.2 3 5, then v+..*CCT and v*..^CCT produce the sums and products over all subsets of v . In a more specialized area, the functions @. and @: provide powerful facilities for the discussion of permutations. Thus, (i.!4)@:i.4 displays a complete table of permutations, and an expression such as @. 4 3 0 1 2 can provide an introduction to cycles and to the use of the LCM (*.) of their lengths to determine the power of a permutation. For examples in further topics, see the appendix. J. The C Implementation The system is implemented in C, because it is an adequate language available on a wide variety of machines. The implementation is guided by two principles: clarity, and exploitation of underlying facilities. Efficiency is not a main objective. Clarity does not mean the micro (and relatively insignificant) clarity of individual C statements, but the macro clarity of being close to the APL or mathematical definitions. The C code is written to be understandable by an APL-knowledgeable reader. Facilities already available in the environment are exploited: for memory management, the C library functions malloc() and free() are used, the underlying virtual memory facilities being presumed to be adequate; for session management, the system reads from standard input and writes to standard output. This, together with the ASCIl spelling, makes it possible to use any of several widely-available session managers, such as EMACS or SunView/OpenLook. Organization. The system is organized along the lines suggested by the dictionary, in particular, by the parser [1, p. 381]. The parsing rules are expressed in C as follows: #define RHS (NOUN+VERB+ADV+CONJ) #define EDGE (MARK+ASGN+LPAR) static struct {I c[4];AF f;I b,e;)cases[] = { EDGE+ADV+VERB, VERB, NOUN, ANY, verb,1, 2, CONJ, NOUN, VERB, NOUN, verb,2, 3, EDGEtADV+VERB+NOUN,NOUN, VERB, NOUN, verb,1, 3, EDGEtADV+VERB+NOUN,NOUN+VERB,ADV, ANY, adv, 1, 2, EDGEtADV+VERB+NOUN,NOUN+VERB,CONJ, NOUN+VERB,conj,1, 3, EDGEtADV+VERB+NOUN,VERB, VERB, VERB, form,1, 3, EDGE, VERB, VERB, ANY, form,1, 2, NAME, ASGN, RHS, ANY, is, 0, 2, LPAR, RHS, RPAR, ANY, punc,0, 2, ANY, ANY, ANY, ANY, move,0,-1, }; A sentence to be parsed is placed on a left stack, and as execution proceeds words are moved from the tail of the left stack to the front of a right stack. When the first four words of the right stack match a pattern (columns 0 to 3 of the table), the corresponding action (4) is triggered and applied to the indicated words (5, 6), with the result replacing these words. Data Structures. The fundamental data structure is the APL array, that is, the C structure: typedef long I; typedef struct (I t,c,n,r,s[1];)*A;
All objects, whether numeric, literal, or boxed, whether noun, verb, adverb, conjunction, or punctuation, are represented by this structure. Most C functions in the system accept APL arrays as arguments and return them as results. Definitions and macros. Extensive use is made of C preprocessor definitions and macros; to augment the expressive power of C. to enforce uniformity, and to increase readability. Example: An “APL function” is a function which accepts one or two APL array arguments, and returns an APL array result. The macros F1 and F2 encapsulate this convention: #define F1(f) A f (w,self)A w,self: #define F2(f) A f (a,w,self)A a,w,self; (self is a pointer to function parts—rank, inverse, etc.) A compact but readable programming style results from using such definitions. The implementation of ,:y (itemize) and x,:y (laminate) are cases in point: Itemize: ,:y adds a single unit axis to y , making the shape 1,$y . Fl(lamin1){R reshape(over(one,shape(w)),ravel(w));} Laminate: If the shapes of x and y are equal, then x,:y is defined by (,:x),(,:y) . If one is an atom a , it is first replaced by s$a , where s is the shape of the other. F2(lamin2){R over(a,reshape(over(one,shape(AR(w)?w:a)), ravel(w)));} Statistics. Analysis of the C implementation as it stands on 1990 2 22 yields the following statistics. (Header files and variables without functions are excluded.)
181 of the 240 functions are APL functions. Therefore, the implementation consists of a large number of short functions, having short lines, with a well-defined uniform interface. These are characteristic of an APL programming style.References
Appendix The forty-five frames in the following appendix show examples of use of the system in a variety of topics. All were actually executed on the system in March 1990. |
First appeared in the APL90 Conference Proceedings, APL Quote-Quad, Volume 20, Number 4, July 1990.
|