The Role of Operators in APL

Kenneth E. Iverson

IBM/T.J. Watson Research Center
P.O. Box 218
Yorktown Heights, NY 10598


Operators, which apply to functions to produce functions, are an important component of APL. Despite their importance, their role is not well understood, and they are often lumped with functions in expositions of the language. This paper attempts to clarify the role of operators in APL by tracing their development, outlining possible future directions, and commenting briefly on their roles in other languages, both natural and programming.


According to the APL Language [1, p39] manual, “An operator may be applied to a function to derive a different function”. This is essentially the definition given in the International Dictionary of Applied Mathematics [2, p653], although it is there qualified by the comment that “In its widest sense the word operator is synonymous with function …”. The long list of subsequent dictionary entries (such as OPERATORS, DIFFERENCE, and OPERATORS, VECTOR) testifies to the utility of the concept in applied mathematics.

The lack of appreciation of operators in APL may be due in part to lack of familiarity with the use of operators in mathematics, but is more probably due to two other factors: a conflicting use of the term in elementary mathematics, and the manner of presenting APL adopted in most textbooks.

At an elementary level, functions such as + and × are commonly called operators and one might hesitate to conflict with this common practice. However, this use of the term operator should perhaps be discouraged in any case because it tends to impede the students’ grasp of the important notion of function.

It may be that writers of APL textbooks avoid the precise usage of the terms “function” and “operator” because they do not wish to conflict with existing usage in elementary mathematics, or because they feel that the concept of an operator is too difficult for beginners. Either reason seems ill-founded: the former because the issue is too important to dismiss lightly, the latter because approaches can be found in which introduction of the operator concept clarifies rather than complicates matters. Bach [3], for example, has developed a simple introduction of operators in material at high-school level.

The definition of operators quoted above from Reference 1 does not cover the case of the axis operator, since one of its arguments is a number rather than a function. Moreover, it appears useful to extend the idea of operator to a single numeric argument (as in the case of a boolean operator to produce the 16 boolean functions of two arguments [4, p15]), to a character-string argument which represents the function to be produced [4, p9], or even to two numeric arguments, as may be convenient for a “hypergeometric” operator. The essential characteristic of an operator therefore concerns its range only — it produces a function.
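This characterization can be modeled outside APL. In the Python sketch below (the names `reduction` and `boolean` are mine, not APL's), an operator is simply a higher-order function whose result is a function, whether its own argument is a function or a number:

```python
from functools import reduce

# An operator takes a function -- or a number -- and produces a function.

def reduction(f):
    """Model of the reduction operator f/ : derive a monadic function."""
    def derived(v):
        return reduce(f, v)
    return derived

def boolean(n):
    """Operator with a numeric argument: select one of the 16 boolean
    functions of two arguments by the 4-bit truth table n."""
    def derived(a, b):
        return (n >> (2 * a + b)) & 1
    return derived

plus_reduce = reduction(lambda x, y: x + y)   # +/ : summation
logical_and = boolean(8)                      # truth table 1000 : and
```

In both cases only the range matters: whatever the argument, the result is a function.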

The Development of Operators

Although reduction, inner product, and outer product appeared early in the development of APL, several years elapsed before they were recognized as operators. This was due in part to the typography used: for the dyadic cases, the two-dimensional configurations used to denote outer and inner product involved no explicit symbol for the operator, and for the monadic operator reduction it was easy to misconstrue the slash in an expression such as +/x as a case of a dyadic function.

The linearization of the typography forced by computer implementation led to the introduction of an explicit symbol for the dyadic operator, and to the recognition of the existence of operators in the language. However, when this was first documented in a paper on the design of APL [5, p328], it was felt that “… the operators had not been given a consistent syntax and that the notation should eventually be regularized to give operators the same syntax as functions, i.e., an operator taking two arguments occurs between the (function) arguments (as in +.×) and an operator taking one argument appears in front of it.”

Somewhat later, in working on the development of an APL standard [6], it was recognized that (except for the anomalous use of brackets for the axis operator) existing syntax was consistent under the rule that a monadic operator applied to the argument on its left, making the operator syntax a “mirror image” of function syntax. The “mirror-image” notion was used [4, p2] to define an order of execution for a sequence of operators which makes an operator apply to the result of the entire operator sequence to its left.

Another characteristic of the operator which should be noted is its precedence over functions; in any expression, the operators are executed before functions. For example, in the expression a+.×b , parentheses are not required around the operator expression +.× . Moreover, in present computer realizations of APL, parentheses used as in a(+.×)b are not only not required, they are not permitted. This must be rectified to permit the use of parentheses to specify the order of execution of operators, as in f(g∆) (the function which is the sum of f and the derivative of g), which differs from (fg)∆ or fg∆ (the derivative of the function which is the sum of functions f and g).

Experimentation with writing according to these rules has shown rather pleasing results, with the establishment of a function (by a sequence of operators) nicely set off from the argument to which it applies, and with a relatively low incidence of required parentheses in operator expressions.

For example, the application along axis 2 of an array a of the derivative of the composition of the sum of functions f and g with the function h would be written (using the operators defined in Operators and Functions [4]) as:

  fg¨h∆⍤2 a

and the sum rule and chain rule for the derivatives of scalar functions would be written as:

  fg∆ ←→ f∆(g∆)
f¨g∆ ←→ f∆¨g(g∆)

Interest in rationalizing the syntax of operators led to the recognition that derived functions produced by operators could, like primitive functions, be ambivalent, the particular valence (monadic or dyadic) being determined by the context in which the function is used. It also led to the recognition that an operator could apply to both numeric and character arguments as well as to functions, producing different classes of derived functions in the different cases. These observations lead to potentially important extensions to the language.

The first published use of ambivalence of derived functions (allowing a monadic use of the result of the inner-product operator to denote a generalization of the determinant, as in -.×m for the determinant of m , and +.×m for the permanent) appears to have been in my “Two Combinatoric Operators” [7]. However, I have since been informed by E.E. McDonnell that a similar suggestion was made in 1968 by Seth Breidbart, at the time a summer student employee with the APL group in IBM Research, a suggestion which, not being pursued at the time, was subsequently forgotten.

Other published proposals for exploiting the ambivalence of derived functions include the “controlled reduction” (more fully defined in Operators and Functions [4, p12]) in which the left argument in the expression w f/ a specifies the “width of the window” over which reduction by the function f is applied, allowing expressions such as 12 +/ v (for sums over all one-year periods of monthly data), and 2 -/ v and 2 ÷/ v and ∧/2 </ v (for pairwise difference, pairwise ratios, and strict monotonicity). They also include a “function-table operator”, for which the notation ∘.f may be employed, although the published paper [7] improperly suggests the notation f followed by a period.
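A windowed reduction of this kind is straightforward to model. The following Python sketch (the function name is mine, and it covers positive window widths only, ignoring the negative-width variant) reproduces the expressions above:

```python
from functools import reduce

def controlled_reduction(w, f, v):
    """w f/ v : reduce by f over each window of w consecutive elements."""
    return [reduce(f, v[i:i + w]) for i in range(len(v) - w + 1)]

v = [3, 1, 4, 1, 5]
yearly = controlled_reduction(3, lambda x, y: x + y, v)       # running sums
diffs = controlled_reduction(2, lambda x, y: x - y, v)        # pairwise differences
rising = all(controlled_reduction(2, lambda x, y: x < y, v))  # strict monotonicity
```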

Exploitation of non-function arguments for operators can lead to economical and mnemonically effective use of symbols. For example, as defined in [4], the composition operator ¨ can be applied to two functions (as in f¨g), but can also be applied to a dyadic function and a numeric argument (as in *¨.5 for the square-root function and x¨* for the “powers-of-x” function) to produce monadic cases of dyadic functions, and to two character arguments to “compose” a function from a “Lambda-Calculus” definition in an expression such as 'p+÷l'¨'l p'.
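In more familiar terms, composing a dyadic function with a constant is partial application. A Python sketch (helper names mine) of the two cases *¨.5 and x¨* :

```python
def fix_right(f, y):
    """f¨y : bind the right argument of a dyadic function."""
    return lambda x: f(x, y)

def fix_left(x, f):
    """x¨f : bind the left argument of a dyadic function."""
    return lambda y: f(x, y)

square_root = fix_right(pow, 0.5)   # *¨.5 : the square-root function
powers_of_2 = fix_left(2, pow)      # 2¨* : the powers-of-two function
```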

It can also lead to a rationalization of existing constructs in APL. For example, the slash is commonly construed as a monadic operator in the expression +/x , but as a dyadic function in the expression u/x , where u is a boolean vector. Construed as a monadic operator which takes either function or numeric arguments, the slash not only becomes regular in its definition and syntax (producing a derived function u/ which is a particular member, specified by u , of a family of selection functions), but also permits the definition of a dyadic derived function for use in expressions of the form x u/ y . This could usefully be defined to be the mask function [8, p19], and, together with a similar definition for x u\ y , could bring back the mesh and mask functions included in early APL but as yet excluded from all implementations.
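The slash so construed can be sketched in Python: applied to a boolean vector it yields an ambivalent derived function, monadically a compression and dyadically a mask (the mask definition below is one plausible reading of [8], not an implemented one):

```python
def slash(u):
    """u/ : derive an ambivalent selection function from boolean vector u."""
    def derived(*args):
        if len(args) == 1:          # monadic: u/y -- compression
            (y,) = args
            return [b for a, b in zip(u, y) if a]
        x, y = args                 # dyadic: x u/ y -- mask: from y where u
        return [b if a else c for a, b, c in zip(u, y, x)]
    return derived

select = slash([1, 0, 1, 0])
```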

The mesh and mask were of course excluded because they took three arguments, and an operator which takes a non-function argument or arguments can be construed as a device for introducing further parameters needed in the definition of a function. The introduction of system variables such as the index origin (⎕io) and comparison tolerance (⎕ct) also introduced extra parameters into certain functions. The two ideas can be related by the variant operator discussed in Operators and Functions [4].

The fact that the slash in the compression u/x was long considered as a dyadic function and can now be considered as a monadic operator (yielding a monadic function) without change to the meaning of the expression, suggests that other existing dyadic functions could perhaps be changed to monadic operators. The answer is that any dyadic function with no defined monadic case could be so treated.

Candidates therefore include the relations and other logical functions, the membership function ∊ , and take and drop (↑ and ↓). Certain symmetries in the first class, such as the commutativity of ∧ , ∨ , ⍲ , ⍱ , = , and ≠ , suggest that they should remain as dyadic functions. Any potential advantage in treating the others (∊ , ↑ , and ↓) as operators rests on defining a useful dyadic case of the derived function produced.

Although the circle function has a useful monadic case (which yields pi times its argument), the unusual nature of the left argument, serving as it does to select one of a limited family of functions, suggests that ○ might be considered as a monadic operator. However, the convenience of the present dyadic function in conjunction with the inner- and outer-product operators, as well as the convenience of the monadic case, weighs heavily against a change. The definition of a very useful dyadic case for the derived function k○ could conceivably change this situation.

The inner-product, outer-product, and reduction operators in current implementations of APL apply only to primitive scalar functions. There is no syntactic or semantic difficulty in extending their domain to include defined scalar functions. However, to make reduction apply to empty vectors as it does for primitive functions, it will be necessary to provide a mechanism for specifying, in its definition, the identity element of the defined function. The identity element is a simple example of the need to specify, in addition to its “basic” definition, certain characteristics of the function defined. Other such characteristics are discussed in “The Derivative Operator” [9].
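The mechanism can be sketched as a reduction that carries the identity element of its function with it (Python, names mine):

```python
from functools import reduce

def reduction(f, identity):
    """f/ for a defined function f whose identity element is supplied,
    so that reduction of an empty vector is well defined."""
    def derived(v):
        return reduce(f, v, identity)
    return derived

total = reduction(lambda x, y: x + y, 0)      # identity of + is 0
largest = reduction(max, float('-inf'))       # identity of maximum
```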

Now that the syntax of operators is understood, it may be appropriate to consider the introduction of a formal mechanism for operator definition, so that a user might define his own operators as well as his own functions. Since the readability of an expression is greatly enhanced by the immediate recognition of the valence of each object, the introduction of defined operators would impose on the careful APL user the burden of choosing distinctive names for them, just as good writers now make it possible to easily distinguish defined functions from variables.

Valence and Rank

The term valence was introduced in 1968 [10] to describe the three types of APL functions (dyadic, monadic, and niladic), and was used essentially as it is in chemistry to describe the “binding power” of an object, describing in this case the number of arguments to which the object applies.

We commonly say that most APL primitive functions are ambiguous in valence, being either monadic or dyadic according to context, and that certain functions, such as ~ and ≤ , have fixed valence. Strictly speaking, it is the symbol or operator expression representing the two potential functions which has ambiguous valence, and the function itself, as selected by the context, has a fixed valence. Moreover, it would seem more consistent to speak as if all function symbols and expressions are ambiguous, and as if certain of the functions (such as dyadic ~ and monadic ≤) have empty domains. The practical consequence of this is that expressions such as x~y and ≤y should perhaps return domain errors rather than syntax errors.

Although ambiguous valence occurs in chemistry, there is no completely satisfactory chemical term for it. Thus bivalence, divalence and polyvalence are all used to describe a particular valence (2, 2, and ≥3) as well as the ambiguous character of a valence. The term ambivalent therefore seems most suitable to describe ambiguous valence, even though it has an accepted meaning in psychology, and even though the term divalence has been used [4].

APL arrays, both variables and constants, have valence 0, the same valence as a niladic function. It therefore appears desirable to eliminate the notion of niladic functions or at least to make them syntactically equivalent to arrays. This would, in particular, eliminate all ambiguity from the use of expressions of the form f←+.× , or m(f←+.×)n (essentially as suggested in [4]) to assign a name to a function.

We have in the past used the terms “variable” and “argument” to refer to an object of valence 0, although most such usage was not meant to exclude constants, which also have valence 0. Now, however, we may wish to speak of the arguments of operators, distinguishing between arguments of various valences, but also distinguishing between the occurrence of constant arguments, as in +.× , and variable arguments, as in f.g . Consequently, it seems best to characterize objects by their valence. To make such characterization complete, one could assign to monadic and dyadic operators the valences ¯1 and ¯2 , giving a symmetric set of valences running from ¯2 to 2 .

The utility of ambivalence in functions suggests a similar use of ambivalence in operators. However, such ambivalence appears to lead to an excessive use of parentheses, since they would be required to enclose every monadic use of an operator. Moreover, it would alter the present meaning of expressions, since the possibility of a dyadic use of the operator / would require parentheses around +/ in the expression +/x . It therefore appears best to retain the convention of present APL, each operator having a fixed valence, ¯1 or ¯2 .

Finally, the use of the jot symbol in the outer product ∘.× suggests a way in which most of the advantages of ambivalence in operators may be obtained even with fixed valence. The expression ∘.× involves a dyadic operator, but only one significant argument. Consequently, the jot can be viewed as a general device for denoting a “monadic case” of a dyadic operator. This device is exploited for a number of new operators in Reference 4. In fact, two distinct monadic cases could be obtained (as in ∘.f and f.∘) as well as a niladic case (as in ∘.∘).
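The two cases can be modeled as one dyadic operator with the jot standing for an absent function argument (Python sketch, names mine):

```python
from functools import reduce

def inner(f, g):
    """f.g : reduction by f of the elementwise g of the two arguments."""
    return lambda a, b: reduce(f, [g(x, y) for x, y in zip(a, b)])

def outer(g):
    """The 'monadic case' with jot on the left: g applied to every pair."""
    return lambda a, b: [[g(x, y) for y in b] for x in a]

dot = inner(lambda x, y: x + y, lambda x, y: x * y)   # +.x : dot product
table = outer(lambda x, y: x * y)                     # outer product of x
```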

The operators in present APL apply to arguments of valence 0, 1, or 2, and produce results of valence 1 or 2. In generalizing the reduction operator to apply to non-scalar functions, it is useful to distinguish between the expression f/⍤v , which applies reduction by f to each of the elements along the axis or axes specified by the vector or scalar v , and the expression f(/⍤s) which “partitions” the argument along the single axis s and applies a reduction by f which is equivalent to placing the dyadic function f between the resulting partitions. In the latter case the result of (/⍤s) is clearly an operator, and we have an example of an operator (the axis operator) applying to an operator (that is, /) to produce a derived operator which in turn applies to the function f.
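The partitioned case can be seen in a small Python model of reduction along the leading axis of a matrix: the dyadic function is placed between whole rows, applied elementwise (names mine):

```python
from functools import reduce

def reduce_leading_axis(f, matrix):
    """Place dyadic f between the sub-arrays (rows) along the first axis."""
    between = lambda r1, r2: [f(a, b) for a, b in zip(r1, r2)]
    return reduce(between, matrix)

col_sums = reduce_leading_axis(lambda a, b: a + b, [[1, 2], [3, 4], [5, 6]])
```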

The utility of operators which apply to functions to produce operators is also fairly obvious. For example, the special symbols introduced in Reference 4 for scalar operators (defined to produce operators analogous to the corresponding functions, as in fg x ←→ (f x)+(g x)) can be considered as especially important cases of the general notion of an operator which, applied to a dyadic function f , produces the corresponding operator. A possible notation for such an operator is a monadic case of composition, as in f¨∘ .
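The general notion — an operator derived from a dyadic function that combines two functions pointwise — has a direct Python analogue (the name `lift` is mine):

```python
def lift(f):
    """Derive from the dyadic function f the scalar operator that
    combines two monadic functions pointwise:
    (g OP h) x  <->  f(g x, h x)."""
    return lambda g, h: (lambda x: f(g(x), h(x)))

fsum = lift(lambda a, b: a + b)              # the 'sum of functions' operator
h = fsum(lambda x: x * x, lambda x: 2 * x)   # the function x^2 + 2x
```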

It therefore appears useful to assume that operators may apply to arguments of any valence, including zero, and produce results of any valence except zero.

The inner product, outer product, and reduction operators now apply only to scalar functions, which are defined on scalar arguments and produce scalar results. The problem of extending these and other operators to non-scalar functions is closely related to the problem of extending a function defined on an argument of a certain rank to arguments of higher rank.

If a function is uniform in the sense that the shape of its result depends only on the shape of its argument, then it can be extended to arguments of higher rank in a systematic manner, since each of the results produced is of the same shape and they can form sub-arrays of the overall result.

Non-uniform functions can be handled by the introduction of an enclose function < , which “encodes” its argument as a scalar. Consequently, the composition <¨f is a uniform function (producing scalar results) which can be used to effectively extend the application of any non-uniform function f to arrays of higher rank. Moreover, if we introduce a disclose function > inverse to enclose, then the composition <¨f¨> produces a scalar function which applies to the arguments “enclosed” or “encoded” in the scalar elements of its argument. These matters are discussed in further detail in Reference 4.
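A rough Python model of the scheme, with a Box class standing in for the enclosed scalar (all names mine):

```python
class Box:
    """A value enclosed as a 'scalar': the result of < ."""
    def __init__(self, value):
        self.value = value

def enclose(x):           # < : encode any result as a scalar
    return Box(x)

def disclose(b):          # > : inverse of enclose
    return b.value

def compose(f, g):        # f¨g : function composition
    return lambda x: f(g(x))

uniform = lambda f: compose(enclose, f)             # <¨f : scalar results
inside = lambda f: compose(uniform(f), disclose)    # <¨f¨> : apply within boxes
```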

Operators in Other Languages

It would be astonishing if a notion so fruitful as the operator did not occur in some form in other languages, both natural and formal. An operator in APL can be conceived as modifying the behavior of a function to which it is applied. If we make the obvious association between a function in APL (which performs some action on its argument) and a verb in a natural language, then an operator corresponds closely to an adverb.

An adverb may modify an adverb (as in “run very quickly”) or an adjective (as in “a very black night”) as well as a verb. The first of these is interesting in that it corresponds to an operator applying to an operator to produce a new operator. The second is interesting in that an adjective (and, consequently, a modified adjective) can be thought to modify a verb indirectly, as the adjective “green” indirectly modifies the action of “pointing” in the sentence “point out all the green houses”. In this sense declarations in programming languages are operators because they behave as adjectives modifying arguments rather than functions, but have the substantive effect of modifying the behavior of functions applied to these arguments.

Mathematical notation has provided the inspiration for most of the existing operators in APL, the inner product being a generalization of matrix product, the outer product a generalization of the outer product of tensor analysis, and reduction a generalization of the sigma notation for summation. The axis operator seems to have no counterpart in mathematical notation (except perhaps for the tensor notation for contraction, which groups axes by use of common symbols rather than by reference to indices), nor, surprisingly, does the very useful scan operator. The needs filled by the axis and scan operators in APL seem to have been filled in mathematics by variations of the sigma notation, using, for example, expressions such as i≤j under the sigma to denote scan.

Mathematical notation has also inspired most of the new proposals for operators in References 4 and 9, but again the APL operators are made more general. For example:

1.  The dyadic “controlled reduction” w f/ x provides running sums, running maxima, and pairwise ratios as well as the difference operator (¯2 -/ x) found in mathematics.
2.  The derivative operator produces the entire array of partial derivatives (whose determinant, in the case of a vector function, is called the Jacobian), and therefore provides not only the derivative for scalar functions, but also provides the basis for other operators of the vector calculus.
3.  The scalar operators provide obviously useful facilities in the case of sums and products of functions, but less obviously provide for the treatment of set theory in terms of the propositional functions which define the sets, as in p∧q for intersection, p∨q for union, and p≠q for symmetric difference.
4.  The dual operator provides the duals expressed in De Morgan’s laws but, because it permits the specification of the monadic function involved, it also covers duality with respect to functions other than logical negation, as in the dual of ⌊ with respect to negation, which is ⌈ . Moreover, because it is defined to accommodate mapping functions which are not self-inverse, it applies to cases such as the dual of + with respect to ⍟ (which is ×) and the dual of × with respect to * (which is +).
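The dual in item 4 can be sketched as a higher-order function: apply g to the arguments, then f, then the inverse of g (Python; the names and the explicit inverse argument are mine):

```python
import math

def dual(f, g, g_inverse):
    """Dual of dyadic f with respect to g: g_inverse(f(g(x), g(y)))."""
    return lambda x, y: g_inverse(f(g(x), g(y)))

# De Morgan: the dual of 'and' with respect to negation is 'or'
logical_or = dual(lambda a, b: a and b, lambda a: not a, lambda a: not a)

# A mapping that is not self-inverse: + dual log is times, as noted above
times = dual(lambda a, b: a + b, math.log, math.exp)
```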

There remain many operators in mathematics which should be explored for possible generalization and inclusion in APL. A limit operator would be extremely useful although perhaps difficult to implement well. Other possible candidates include fractional derivatives [11], convolution, and correlation, as well as moment-generating functions and certain transforms such as Fourier and Laplace.

Expressions of the form a (+,×,-,÷) b , used in the Formal Description of SYSTEM/360 [12] to express the application to arguments a and b of a function selected by an index i , suggest the need for a catenation operator and indexing or other selection operators. Collections of functions produced by catenation could be used in a variety of ways.

Control structures such as the Fortran DO statement occurring in programming languages may be construed as operators in the sense that they apply some function defined by certain expressions in the body of the DO statement over certain sets of arguments. In simple cases this corresponds exactly to some APL function on arrays (as in v×w) or APL operator (as in +/v or +\v), but in others it may involve some test for completion which depends upon the function values computed, and therefore cannot be mimicked directly by existing APL operators. Many cases of DO statements not directly expressible in terms of present APL operators could be covered by the application of operators to defined functions, and by the introduction of a limit operator.
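The data-dependent case can be sketched as a higher-order function that iterates to convergence — roughly the facility a limit operator would provide (Python; names mine):

```python
import math

def iterate_until(f, converged, x):
    """Apply f repeatedly until the convergence test is satisfied --
    the DO loop with a data-dependent exit that existing APL
    operators cannot express directly."""
    while True:
        y = f(x)
        if converged(x, y):
            return y
        x = y

# fixed point of cosine, reached by simple iteration from 1.0
root = iterate_until(math.cos, lambda a, b: abs(a - b) < 1e-12, 1.0)
```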


1.  APL Language, Form No. GC26-3847, IBM Corporation.
2.  The International Dictionary of Applied Mathematics, Van Nostrand, Princeton, N.J.
3.  Bach, Gottfried, Unpublished class notes.
4.  Iverson, K.E., Operators and Functions, Research Report No. 7091, IBM Corporation.
5.  Falkoff, A.D., and K.E. Iverson, The Design of APL, IBM Journal of Research and Development, Vol. 16, No. 4, July, 1973.
6.  Falkoff, A.D., and D.L. Orth, APL Standard, APL79 Conference Proceedings, APL Quote-Quad, Volume 9, Number 4, June, 1979.
7.  Iverson, K.E., Two Combinatoric Operators, Proceedings of APL76, Ottawa, Ontario.
8.  Iverson, K.E., A Programming Language, Wiley and Sons, New York, N.Y., 1962.
9.  Iverson, K.E., The Derivative Operator, APL79 Conference Proceedings, APL Quote-Quad, Volume 9, Number 4, June, 1979.
10.  Falkoff, A.D. and K.E. Iverson, The APL\360 Terminal System, Symposium on Interactive Systems for Experimental and Applied Mathematics, eds. M. Klerer and J. Reinfelds, Academic Press, New York, 1968.
11.  Oldham, K.B., and Jerome Spanier, The Fractional Calculus, Academic Press, 1974.
12.  Falkoff, A.D., K.E. Iverson, and E.H. Sussenguth, A Formal Description of SYSTEM/360, IBM Systems Journal, October, 1964.

Originally appeared in the APL79 Conference Proceedings, APL Quote-Quad, Volume 9, Number 4, 1979-06.
