The Role of Operators in APL

The Role of Operators in APL

Kenneth E. Iverson

IBM/T.J. Watson Research Center
P.O. Box 218
Yorktown Heights, NY 10598

Abstract

Operators, which apply to functions to produce functions, are an important component of APL. Despite their importance, their role is not well understood, and they are often lumped with functions in expositions of the language. This paper attempts to clarify the role of operators in APL by tracing their development, outlining possible future directions, and commenting briefly on their roles in other languages, both natural and programming.

Introduction

According to the APL Language [1, p39] manual, “An operator may be applied to a function to derive a different function”. This is essentially the definition given in the International Dictionary of Applied Mathematics [2, p653], although it is there qualified by the comment that “In its widest sense the word operator is synonymous with function …”. The long list of subsequent dictionary entries (such as OPERATORS, DIFFERENCE, and OPERATORS, VECTOR) testifies to the utility of the concept in applied mathematics.

The lack of appreciation of operators in APL may be due in part to lack of familiarity with the use of operators in mathematics, but is more probably due to two other factors: a conflicting use of the term in elementary mathematics, and the manner of presenting APL adopted in most textbooks.

At an elementary level, functions such as + and × are commonly called operators and one might hesitate to conflict with this common practice. However, this use of the term operator should perhaps be discouraged in any case because it tends to impede the students’ grasp of the important notion of function.

It may be that writers of APL textbooks avoid the precise usage of the terms “function” and “operator” because they do not wish to conflict with existing usage in elementary mathematics, or because they feel that the concept of an operator is too difficult for beginners. Either reason seems ill-founded; the former because the issue is too important to dismiss lightly, the second because approaches can be found in which introduction of the operator concept clarifies rather than complicates matters. Bach [3], for example, has developed a simple introduction of operators in material at high-school level.

The definition of operators quoted above from Reference 1 does not cover the case of the axis operator, since one of its arguments is a number rather than a function. Moreover, it appears useful to extend the idea of operator to a single numeric argument (as in the case of a boolean operator to produce the 16 boolean functions of two arguments [4, p15]), to a character-string argument which represents the function to be produced [4, p9], or even to two numeric arguments, as may be convenient for a “hypergeometric” operator. The essential characteristic of an operator therefore concerns its range only — it produces a function.

The Development of Operators

Although reduction, inner product, and outer product appeared early in the development of APL, several years elapsed before they were recognized as operators. This was due in part to the typography used: for the dyadic cases the configurations and used to denote outer and inner product involved no explicit symbol for the operator, and for the monadic operator reduction it was easy to misconstrue the slash in an expression such as +/x as a case of a dyadic function.

The linearization of the typography forced by computer implementation led to the introduction of an explicit symbol for the dyadic operator, and to the recognition of the existence of operators in the language. However, when this was first documented in a paper on the design of APL [5, p328], it was felt that “… the operators had not been given a consistent syntax and that the notation should eventually be regularized to give operators the same syntax as functions, i.e., an operator taking two arguments occurs between the (function) arguments (as in +.×) and an operator taking one argument appears in front of it.”

Somewhat later, in working on the development of an APL standard [6], it was recognized that (except for the anomalous use of brackets for the axis operator) existing syntax was consistent under the rule that a monadic operator applied to the argument on its left, making the operator syntax a “mirror image” of function syntax. The “mirror-image” notion was used [4, p2] to define an order of execution for a sequence of operators which makes an operator apply to the result of the entire operator sequence to its left.

Another characteristic of the operator which should be noted is its precedence over functions; in any expression, the operators are executed before functions. For example, in the expression a+.×b , parentheses are not required around the operator expression +.× . Moreover, in present computer realizations of APL, parentheses used as in a(+.×)b are not only not required, they are not permitted. This must be rectified to permit the use of parentheses to specify the order of execution of operators, as in f(g∆) (the function which is the sum of f and the derivative of g), which differs from (fg)∆ or fg∆ (the derivative of the function which is the sum of functions f and g).

Experimentation with writing according to these rules has shown rather pleasing results, with the establishment of a function (by a sequence of operators) nicely set off from the argument to which it applies, and with a relatively low incidence of required parentheses in operator expressions.

For example, the application along axis 2 of an array a of the derivative of the composition of the sum of functions f and g with the function h would be written (using the operators defined in Operators and Functions [4]) as:

fg¨h∆⍤2 a

and the sum rule and chain rule for the derivatives of scalar functions would be written as:

	`fg∆ ←→ f∆(g∆)`
	`f¨g∆ ←→ f∆¨g(g∆)`

Interest in rationalizing the syntax of operators led to the recognition that derived functions produced by operators could, like primitive functions, be ambivalent, the particular valence (monadic or dyadic) being determined by the context in which the function is used. It also led to the recognition that an operator could apply to both numeric and character arguments as well as to functions, producing different classes of derived functions in the different cases. These observations lead to potentially important extensions to the language.

The first published use of ambivalence of derived functions (allowing a monadic use of the result of the inner-product operator to denote a generalization of the determinant, as in -.×m for the determinant of m , and +.×m for the permanent) appears to have been in my “Two Combinatoric Operators” [7]. However, I have since been informed by E.E. McDonnell that a similar suggestion was made in 1968 by Seth Breidbart, at the time a summer student employee with the APL group in IBM Research, a suggestion which, not being pursued at the time, was subsequently forgotten.

Other published proposals for exploiting the ambivalence of derived functions include the “controlled reduction” (more fully defined in Operators and Functions [4, p12]) in which the left argument in the expression w f/ a specifies the “width of the window” over which reduction by the function f is applied, allowing expressions such as 12 +/ v (for sums over all one-year periods of monthly data), and 2 -/ v and 2 ÷/ v and ∧/2 </ v (for pairwise difference, pairwise ratios, and strict monotonicity). They also include a “function-table operator”, for which the notation ∘.f may be employed, although the published paper [7] improperly suggests the notation f followed by a period.

Exploitation of non-function arguments for operators can lead to economical and mnemonically effective use of symbols. For example, as defined in [4], the composition operator ¨ can be applied to two functions (as in f¨g), but can also be applied to a dyadic function and a numeric argument (as in *¨.5 for the square-root function and x¨* for the “powers-of-x” function) to produce monadic cases of dyadic functions, and to two character arguments to “compose” a function from a “Lambda-Calculus” definition in an expression such as 'p+÷l'¨'l p'.

It can also lead to a rationalization of existing constructs in APL. For example, the slash is commonly construed as a monadic operator in the expression +/x , but as a dyadic function in the expression u/x , where u is a boolean vector. Construed as a monadic operator which takes either function or numeric arguments, the slash not only becomes regular in its definition and syntax (producing a derived function u/ which is a particular member, specified by u , of a family of selection functions), but also permits the definition of a dyadic derived function for use in expressions of the form x u/ y . This could usefully be defined to be the mask function [8, p19], and, together with a similar definition for x u\ y , could bring back the mesh and mask functions included in early APL but as yet excluded from all implementations.

The mesh and mask were of course excluded because they took three arguments, and an operator which takes a non-function argument or arguments can be construed as a device for introducing further parameters needed in the definition of a function. The introduction of system variables such as the index origin (⎕io) and comparison tolerance (⎕ct) also introduced extra parameters into certain functions. The two ideas can be related by the variant operator discussed in Operators and Functions [4].

The fact that the slash in the compression u/x was long considered as a dyadic function and can now be considered as a monadic operator (yielding a monadic function) without change to the meaning of the expression, suggests that other existing dyadic functions could perhaps be changed to monadic operators. The answer is that any dyadic function with no defined monadic case could be so treated.

Candidates therefore include the relations and other logical functions, the membership function ∊ , and take and drop (↑ and ↓). Certain symmetries in the first class, such as the commutativity of ∧ , ∨ , ⍲ , ⍱ , = , and ≠ , suggest that they should remain as dyadic functions. Any potential advantage in treating the others (∊ , ↑ , and ↓) as operators rests on defining a useful dyadic case of the derived function produced.

Although the circle function has a useful monadic case (which yields pi times its argument), the unusual nature of the left argument, serving as it does to select one of a limited family of functions, suggests that ○ might be considered as a monadic operator. However, the convenience of the present dyadic function in conjunction with the inner- and outer-product operators, as well as the convenience of the monadic case, weights heavily against a change. The definition of a very useful dyadic case for the derived function k○ could conceivably change this situation.

The inner-product, outer-product, and reduction operators in current implementations of APL apply only to primitive scalar functions. There is no syntactic or semantic difficulty in extending their domain to include defined scalar functions. However, to make reduction apply to empty vectors as it does for primitive functions, it will be necessary to provide a mechanism for specifying, in its definition, the identity element of the defined function. The identity element is a simple example of the need to specify, in addition to its “basic” definition, certain characteristics of the function defined. Other such characteristics are discussed in “The Derivative Operator” [9].

Now that the syntax of operators is understood, it may be appropriate to consider the introduction of a formal mechanism for operator definition, so that a user might define his own operators as well as his own functions. Since the readability of an expression is greatly enhanced by the immediate recognition of the valence of each object, the introduction of defined operators would impose on the careful APL user the burden of choosing distinctive names for them, just as good writers now make it possible to easily distinguish defined functions from variables.

Valence and Rank

The term valence was introduced in 1968 [10] to describe the three types of APL functions (dyadic, monadic, and niladic), and was used essentially as it is in chemistry to describe the “binding power” of an object, describing in this case the number of arguments to which the object applies.

We commonly say that most APL primitive functions are ambiguous in valence, being either monadic or dyadic according to context, and that certain functions, such as ~ and ≤ , have fixed valence. Strictly speaking, it is the symbol or operator expression representing the two potential functions which has ambiguous valence, and the function itself, as selected by the context, has a fixed valence. Moreover, it would seem to be more consistent to speak as if all function symbols and expressions are ambiguous, and that certain of the functions (such as dyadic ~ and monadic ≤) have empty domains. The practical consequence of this is that expressions such as x~y and ≤y should perhaps return domain errors rather than syntax errors.

Although ambiguous valence occurs in chemistry, there is no completely satisfactory chemical term for it. Thus bivalence, divalence and polyvalence are all used to describe a particular valence (2, 2, and ≥3) as well as the ambiguous character of a valence. The term ambivalent therefore seems most suitable to describe ambiguous valence, even though it has an accepted meaning in psychology, and even though the term divalence has been used [4].

APL arrays, both variables and constants, have valence 0, the same valence as a niladic function. It therefore appears desirable to eliminate the notion of niladic functions or at least to make them syntactically equivalent to arrays. This would, in particular, eliminate all ambiguity from the use of expressions of the form f←+.× , or m(f←+.×)n (essentially as suggested in [4]) to assign a name to a function.

We have in the past used the terms “variable” and “argument” to refer to an object of valence 0, although most such usage was not meant to exclude constants, which also have valence 0. Now, however, we may wish to speak of the arguments of operators, distinguishing between arguments of various valences, but also distinguishing between the occurrence of constant arguments, as in +.× , and variable arguments, as in f.g . Consequently, it seems best to characterize objects by their valence. To make such characterization complete, one could assign to monadic and dyadic operators the valences ¯1 and ¯2 , giving a symmetric set of valences running from ¯2 to 2 .

The utility of ambivalence in functions suggests a similar use of ambivalence in operators. However, such ambivalence appears to lead to an excessive use of parentheses, since they would be required to enclose every monadic use of an operator. Moreover, it would alter the present meaning of expressions, since the possibility of a dyadic use of the operator / would require parentheses around +/ in the expression +/x . It therefore appears best to retain the convention of present APL, each operator having a fixed valence, ¯1 or ¯2 .

Finally, the use of the jot symbol ∘ in the outer product ∘.× suggests a way in which most of the advantages of ambivalence in operators may be obtained even with fixed valence. The expression ∘.× involves a dyadic operator, but only one significant argument. Consequently, the jot can be viewed as a general device for denoting a “monadic case” of a dyadic operator. This device is exploited for a number of new operators in Reference 4. In fact, two distinct monadic cases could be obtained (as in ∘.f and f.∘) as well as a niladic case (as in ∘.∘).

The operators in present APL apply to arguments of valence 0, 1, or 2, and produce results of valence 1 or 2. In generalizing the reduction operator to apply to non-scalar functions, it is useful to distinguish between the expression f/⍤v , which applies reduction by f to each of the elements along the axis or axes specified by the vector or scalar v , and the expression f(/⍤s) which “partitions” the argument along the single axis s and applies a reduction by f which is equivalent to placing the dyadic function f between the resulting partitions. In the latter case the result of (/⍤s) is clearly an operator, and we have an example of an operator (the axis operator ⍤) applying to an operator (that is, /) to produce a derived operator which in turn applies to the function f.

The utility of operators which apply to functions to produce operators is also fairly obvious. For example, the special symbols such as and introduced in Reference 4 for scalar operators (and defined to produce operators analogous to the corresponding functions, as in fg x←→(f x)+(g x)) can be considered as especially important cases of the general notion of an operator which applied to a dyadic function f produces the corresponding operator. A possible notation for such an operator is a monadic case of composition, as in f¨∘.

It therefore appears useful to assume that operators may apply to arguments of any valence, including zero, and produce results of any valence except zero.

The inner product, outer product, and reduction operators now apply only to scalar functions, which are defined on scalar arguments and produce scalar results. The problem of extending these and other operators to non-scalar functions is closely related to the problem of extending a function defined on an argument of a certain rank to arguments of higher rank.

If a function is uniform in the sense that the shape of its result depends only on the shape of its argument, then it can be extended to arguments of higher rank in a systematic manner, since each of the results produced is of the same shape and they can form sub-arrays of the overall result.

Non-uniform functions can be handled by the introduction of an enclose function < , which “encodes” its argument as a scalar. Consequently, the composition <¨f is a uniform function (producing scalar results) which can be used to effectively extend the application of any non-uniform function f to arrays of higher rank. Moreover, if we introduce a disclose function > inverse to enclose, then the composition <¨f¨> produces a scalar function which applies to the arguments “enclosed” or “encoded” in the scalar elements of its argument. These matters are discussed in further detail in Reference 4.

Operators in Other Languages

It would be astonishing if a notion so fruitful as the operator did not occur in some form in other languages, both natural and formal. An operator in APL can be conceived as modifying the behavior of a function to which it is applied. If we make the obvious association between a function in APL (which performs some action on its argument) and a verb in a natural language, then an operator corresponds closely to an adverb.

An adverb may modify an adverb (as in “run very quickly”) or an adjective (as in “a very black night”) as well as a verb. The first of these is interesting in that it corresponds to an operator applying to an operator to produce a new operator. The second is interesting in that an adjective (and, consequently, a modified adjective) can be thought to modify a verb indirectly, as the adjective “green” indirectly modifies the action of “pointing” in the sentence “point out all the green houses”. In this sense declarations in programming languages are operators because they behave as adjectives modifying arguments rather than functions, but have the substantive effect of modifying the behavior of functions applied to these arguments.

Mathematical notation has provided the inspiration for most of the existing operators in APL, the inner product being a generalization of matrix product, the outer product a generalization of the outer product of tensor analysis, and reduction a generalization of the sigma notation for summation. The axis operator seems to have no counterpart in mathematical notation (except perhaps for the tensor notation for contraction which groups axes by use of common symbols rather than by reference to indices), nor, surprisingly, does the very useful scan operator. The needs filled by the axis and scan operators in APL seem to have been filled in mathematics by variations of the sigma notation, using, for example; expressions such as i<n_j under the sigma to denote scan.

Mathematical notation has also inspired most of the new proposals for operators in References 4 and 9, but again the APL operators are made more general. For example:

1.		The dyadic “controlled reduction” `w f/ x` provides running sums, running maxima, and pairwise ratios as well as the difference operator (`¯2 -/ x`) found in mathematics.
2.		The derivative operator `∆` produces the entire array of partial derivatives (whose determinant, in the case of a vector function, is called the Jacobian), and therefore provides not only the derivative for scalar functions, but also provides the basis for other operators of the vector calculus.
3.		The scalar operators provide obviously useful facilities in the case of and , but less obviously provide for the treatment of set theory in terms of the propositional functions which define the sets, as in `pq` for intersection, `pq` for union, and `pq` for symmetric difference.
4.		The dual operator provides the duals expressed in De Morgan’s laws but, because it permits the specification of the monadic function involved, it also covers duality with respect to functions other than logical negation, as in `⌊-←→⌈` . Moreover, because it is defined to accommodate mapping functions which are not self-inverse, it applies to cases such as `+⍟←→×` and `×*←→+` .

There remain many operators in mathematics which should be explored for possible generalization and inclusion in APL. A limit operator would be extremely useful although perhaps difficult to implement well. Other possible candidates include fractional derivatives [11], convolution, and correlation, as well as moment-generating functions and certain transforms such as Fourier and Laplace.

Expressions of the form a (+,×,-,÷) b , used in the Formal Description of SYSTEM/360 [12] to express the application to arguments a and b of a function selected by an index i , suggest the need for a catenation operator and indexing or other selection operators. Collections of functions produced by catenation could be used in a variety of ways.

Control structures such as the Fortran DO statement occurring in programming languages may be construed as operators in the sense that they apply some function defined by certain expressions in the body of the DO statement over certain sets of arguments. In simple cases this corresponds exactly to some APL function on arrays (as in v×w) or APL operator (as in +/v or +\v), but in others it may involve some test for completion which depends upon the function values computed, and therefore cannot be mimicked directly by existing APL operators. Many cases of DO statements not directly expressible in terms of present APL operators could be covered by the application of operators to defined functions, and by the introduction of a limit operator.

References

1.		APL Language, Form No. GC26-3847, IBM Corporation.
2.		The International Dictionary of Applied Mathematics, Van Nostrand, Princeton, N.J.
3.		Bach, Gottfried, Unpublished class notes.
4.		Iverson, K.E., Operators and Functions, Research Report No. 7091, IBM Corporation.
5.		Falkoff, A.D., and K.E. Iverson, The Design of APL, IBM Journal of Research and Development, Vol. 16, No. 4, July, 1973.
6.		Falkoff, A.D., and D.L. Orth, APL Standard, APL79 Conference Proceedings, APL Quote-Quad, Volume 9, Number 4, June, 1979.
7.		Iverson, K.E., Two Combinatoric Operators, Proceedings of APL76, Ottawa, Ontario.
8.		Iverson, K.E., A Programming Language, Wiley and Sons, New York, N.Y., 1962.
9.		Iverson, K.E., The Derivative Operator, APL79 Conference Proceedings, APL Quote-Quad, Volume 9, Number 4, June, 1979.
10.		Falkoff, A.D. and K.E. Iverson, The APL\360 Terminal System, Symposium on Interactive Systems for Experimental and Applied Mathematics, eds. M. Klerer and J. Reinfelds, Academic Press, New York, 1968.
11.		Oldham, K.B., and Jerome Spanier, The Fractional Calculus, Academic Press, 1974.
12.		Falkoff, A.D., K.E. Iverson, and E.H. Sussenguth, A Formal Description of SYSTEM/360, IBM Systems Journal, October, 1964.

Originally appeared in the APL79 Conference Proceedings, APL Quote-Quad, Volume 9, Number 4, 1979-06.

original writing:	2010-01-21 10:55
last updated:	2013-05-03 06:45