An Operator Calculus
Abstract

This paper extends a line of APL development presented in a sequence of papers. Brief treatments are also given to a number of smaller matters: a transliteration or token substitution facility, the treatment of niladic functions, a custom (variant) operator, the obsolescence of certain system variables, and some changes in the function definition operator and in the treatment of supernumerary axes.

The Structure of Functions

The term “attribute”, used in earlier papers, led to some misunderstanding because it improperly suggested two conclusions:
We will therefore adopt the more neutral term “part” instead of “attribute”. A function comprises one or more parts: a core, and zero or more ancillary parts that extend the domain of, or otherwise modify, the function defined by the core. For example, if the core of f
is the function ⌹
as defined in APL\360,
and if f possesses no other
parts, then the right domain of f includes
arguments of rank 2, but none of higher rank.
However, if f also contains a rank part that
specifies how an argument of higher rank is to be
split into cells of rank 2 for application
of the core function,
then the domain of f
is extended to right arguments of higher rank.
Finally, the domain of the derived
function f⌿ will include an array of
shape

The core itself has five parts: two kernels (monadic and dyadic), and three parameters, referred to in the APL expressions that comprise the kernels by the distinguished names ⎕A , ⎕B , and ⎕C . The first two of these were formerly referred to as underscored F and G [5 6], and are specified by the arguments of most operators. For example, if c is the composition
[3]
of functions a and b
(that is, c←a⍤b),
then the kernels of c are the
vectors

The third parameter (⎕C) is the custom or variant parameter that can be used to provide variants of a function in the sense introduced in [1]. It is respecified by the right argument of the variant operator (assumed here to be the function-variable case of the dot, as in ⍋.0 for 0-origin grade, and =.tol for equality comparison with a specific tolerance tol). For example, if the function sin has a monadic kernel '1○⍵×○÷2×⎕C' and ⎕C set to ○.5 , then sin ⍵ yields the sine of an argument expressed in radians, sin.90 ⍵ yields the sine of an argument in degrees, and sin.(○.5) is equivalent to sin .

In principle, a function may incorporate any number of ancillary parts, but the present treatment is limited to seven: rank, coherence, shape surrogate, inverse, dyad, derivative, and identity. If a part referred to by some operator is not present, the operator does not produce a domain error, but does produce a derived function with a restricted (perhaps empty) domain. For example, the “dual” operator ¨ in the expression h←⌽¨| does not produce a domain error, but any subsequent application of h does, since the operator requires the inverse of its right argument [3], and the magnitude function does not possess an inverse part. Absence of the rank part is equivalent to the
presence of an infinite rank,
and the same is true of coherence.
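The variant mechanism can be illustrated with a short sketch. Assuming only the kernel given above ('1○⍵×○÷2×⎕C'), a hypothetical Python model of sin and its degree variant sin.90 might read as follows; all Python names here are invented for illustration:

```python
from math import sin, pi

# A sketch of the variant parameter ⎕C.  The kernel '1○⍵×○÷2×⎕C'
# computes sin(⍵ × π ÷ (2×⎕C)); the variant operator respecifies ⎕C.
def make_sin(c=pi / 2):            # ⎕C defaults to ○.5 , i.e. π÷2
    return lambda w: sin(w * pi / (2 * c))

sin_rad = make_sin()               # sin     : argument in radians
sin_deg = make_sin(90)             # sin.90  : argument in degrees

print(round(sin_deg(90), 10))      # 1.0
print(round(sin_rad(pi / 2), 10))  # 1.0
```

With ⎕C set to 90 the kernel rescales the argument by π÷180, so sin.(○.5) restores the radian behaviour, as stated in the text.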
Rank

The rank part is a three-element vector whose elements limit the ranks of the argument cells presented to the function defined by the core; the successive elements apply to the monadic, left dyadic, and right dyadic arguments. The rank operator is the function-variable case
of ⍤ ,
and f⍤k is equivalent to f
with its rank part specified by ⌽3⍴⌽k .
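The frame-and-cell behaviour of the rank operator can be sketched in Python on arrays represented as (shape, flat-data) pairs. This is an illustrative model only, not Sharp APL's implementation, and all names are invented:

```python
from math import prod

# A sketch of the rank operator ⍤ on (shape, flat-data) arrays.
def cells(shape, data, k):
    """Split an array into frame shape, cell shape, and the list of cells."""
    k = min(k, len(shape))
    frame, cell_shape = shape[:len(shape) - k], shape[len(shape) - k:]
    csize = prod(cell_shape)
    return frame, cell_shape, [data[i * csize:(i + 1) * csize]
                               for i in range(prod(frame))]

def rank(f, k):
    """f⍤k : apply monadic f to each rank-k cell; the frame shape is
    prefixed to the common individual-result shape."""
    def derived(shape, data):
        frame, cell_shape, cs = cells(shape, data, k)
        results = [f(cell_shape, c) for c in cs]
        sir = results[0][0]   # undefined for an empty frame (hence surrogates)
        return frame + sir, [x for _, d in results for x in d]
    return derived

rev = lambda shape, data: (shape, data[::-1])   # reverse, applied to vectors
rev1 = rank(rev, 1)                             # ⌽⍤1
print(rev1((2, 3), [1, 2, 3, 4, 5, 6]))         # ((2, 3), [3, 2, 1, 6, 5, 4])
```

Note that the sketch fails when the frame is empty, since no individual result exists from which to read the result shape; that is precisely the problem the surrogate-argument mechanism addresses later in the paper.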
Coherence

The following definition is adapted from [7], with the arguments of the corresponding operator reversed and with the term “coherence” used instead of “conformance”. In the application of a dyadic function f , the outer shapes ol and or are each split into two sets of axes (called bound and free) such that ol≡bl,fl and or≡br,fr ; the shape of the overall result is b,fl,fr,sir , where b is one of bl and br , and sir is the shape of the individual results of applying the function to its cells. A shape s is said to be single if 1=×/s ; if one of bl and br is single, then b equals the other; if both are, then b equals the one of greater length; if neither is single, then bl and br must agree, and b is chosen as either one.

The lengths of bl and br (which determine the number of bound axes) are each limited by the coherence part of a function; all primitive functions have infinite coherence. The coherence operator (denoted by k.f , where k is a non-negative integer) produces a derived function equivalent to f , but having coherence k . For example:

      ⍴(a←2 3 4⍴⍳24) 0 .× (b←2 3 5⍴⍳30)
2 3 4 2 3 5
      ⍴a 1 .× b
2 3 4 3 5
      ⍴a 2 .× b
2 3 4 5
      ⍴a 3 .× b
length error
      ⍴a 1 .× (1 4⍴9)
2 3 4 4
      ⍴(1 1 1⍴8)+(1 1⍴9)
1 1 1
      ⍴2 3+1 1 1 1⍴4 2
2

The case of zero coherence (0 .f) is
equivalent to outer product (∘.f).
The reason for
reversing the arguments of the earlier conformance operator
[7]
is to leave the case f.k free for the
variant operator, with no inhibition on the use
of k←<'' or k←∘ that would have been required in
using the form k.f .
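The shape rule for coherence can be modelled directly. The following Python sketch (names invented) computes the overall result shape b,fl,fr,sir from the two outer shapes and a coherence k , using the singleness rules stated above; it reproduces the session results shown earlier:

```python
from math import inf, prod

# A sketch of the coherence rule: split each outer shape into bound
# and free axes (at most k bound), choose b by the singleness rules,
# and return b,fl,fr,sir.
def coherence_shape(ol, orr, k, sir=()):
    nl, nr = int(min(k, len(ol))), int(min(k, len(orr)))
    bl, fl = ol[:nl], ol[nl:]
    br, fr = orr[:nr], orr[nr:]
    single = lambda s: prod(s) == 1
    if single(bl) and single(br):
        b = bl if len(bl) >= len(br) else br   # the one of greater length
    elif single(bl):
        b = br
    elif single(br):
        b = bl
    elif bl == br:
        b = bl
    else:
        raise ValueError('length error')
    return b + fl + fr + sir

print(coherence_shape((2, 3, 4), (2, 3, 5), 1))   # (2, 3, 4, 3, 5)
print(coherence_shape((1, 1, 1), (1, 1), inf))    # (1, 1, 1)
```

Zero coherence binds no axes, so the result is simply ol,or,sir — the outer-product case noted in the text.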
Surrogate Arguments

In applying a monadic function f of (non-negative) rank k to an argument a , the shape of the overall result is os←(-k)↓⍴a suffixed by sir , the (necessarily common) shape of the individual results obtained in applying f to each of the ×/os cells of shape cs←(-k⌊⍴⍴a)↑⍴a . If there are no cells (i.e., 0=×/os), the value of sir cannot be determined by applying f , and must be determined from the cell shape cs alone. It will be defined as the shape of the value of lowest rank and smallest shape that could be produced by applying f to any argument of shape cs . For example:

  f:    ⍴⍤k    ,⍤k    <⍤k    ⍉⍤k    ⍕⍤k
  sir:  ⍴cs    ×/cs   ⍳0     ⌽cs    cs

The dyadic case is treated similarly (in terms of the left cell shape ls and the right cell shape rs), but is complicated by the cases of “scalar extension”: a left cell value lv will be present if the left outer shape is single (i.e., the product over it is unity), and a right value rv will be present if the right outer shape is single. For example, if f is the dyadic function ⍴⍤(1,k) and a left value lv is present, then sir is lv ; if only ls is available, the value of sir is ls⍴0 , since the value that serves as lv must be ls⍴0 , that is, something of shape ls that produces a result of minimum shape.

The entire situation can be handled by providing surrogate argument functions that, in each case of an empty frame, apply to the cell shape and function argument to produce a surrogate argument (whose shape equals the cell shape). This surrogate is submitted to the original function to produce a result whose shape properly determines the “individual result shape” required. Table 1 specifies surrogate argument functions for existing primitives.
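The monadic rules tabulated above can be sketched directly: when the frame is empty, sir is computed from the cell shape cs alone, and the overall shape is the frame followed by sir . The dictionary keys and the Python names below are invented for illustration:

```python
from math import prod

# sir as a function of the cell shape cs, for the monadic examples
# given in the text.
SIR = {
    'rho':       lambda cs: (len(cs),),   # ⍴⍤k : sir is ⍴cs
    'ravel':     lambda cs: (prod(cs),),  # ,⍤k : sir is ×/cs
    'enclose':   lambda cs: (),           # <⍤k : sir is ⍳0 (scalar result)
    'transpose': lambda cs: cs[::-1],     # ⍉⍤k : sir is ⌽cs
    'format':    lambda cs: cs,           # ⍕⍤k : sir is cs (simplified)
}

def empty_frame_shape(fname, frame, cell_shape):
    """Overall result shape os = frame , sir when there are no cells."""
    assert prod(frame) == 0               # only needed for empty frames
    return frame + SIR[fname](cell_shape)

print(empty_frame_shape('transpose', (0, 1), (2, 3, 4)))  # (0, 1, 4, 3, 2)
```

The transpose case here matches the first example from Table 1 given below: ⍉⍤3 applied to an array of shape 0 1 2 3 4 yields shape 0 1 4 3 2 .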
As examples of the use of Table 1, consider the following cases:

      ⍴ ⍉⍤3(0 1 2 3 4⍴0)
0 1 4 3 2
      ⍴ 1 0 2⍉⍤1 3(0 1 2 3 4⍴0)
0 1 3 2 4
      ⍴ (1 0 3⍴0)⍉⍤1 3(2 3 4⍴0)
0 1
      ⍴ 2↑⍤1(0 3⍴0)
0 2
      ⍴ (0 2⍴0)⍴0
0 0 0
      ⍴ 1 2⍴⍤1(0 2⍴0)
0 1 2
      ⍴ 1 2⍴⍤1(0 0⍴0)
length error

Inverse

The result of the inverse operator ⊂ applied to f is the inverse part of f , except that its inverse part is respecified as f .
Derivative

Consider a dyadic operator DOP such that

More generally, we will assume that df is defined to apply to a vector index v that specifies successive derivatives. For example,

The derivative part of f , and the action of the derivative operator, may now be defined as follows. The derivative part of f has the monadic kernel '⎕B ⎕A ⍵' , a null dyadic kernel, and the parameters df and v . Both the core and the derivative part of

As pointed out in [2], the derivative of a function f having a “result rank” of r and an argument rank a must have a result rank of r+a . This behaviour must be incorporated in the definition of the function df referred to in the preceding paragraphs. Moreover, integration must produce an indefinite integral, and

Identity Function

In extending the notion of an identity element,
first introduced to give meaning to expressions
such as +/⍳0 and ∧/⍳0 ,
it is clear that the result
for a non-scalar function must depend upon the
shape of the cell to which it is applied. For
example, in

The notion of an identity element must
therefore be replaced by adding a new part that is
an identity function (that applies to the cell
shape), and by adding an operator that assigns a
value to the part.
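The proposal can be sketched as follows: instead of a fixed identity element, each function carries an identity function that is applied to the cell shape when reduction meets an empty axis. The Python names below are invented, and the two identity rules shown (zeros for +/ , ones for ∧/) follow the classical identity elements extended cell-wise:

```python
from math import prod

# Identity parts modelled as functions of the cell shape.
IDENTITY = {
    'plus': lambda cs: (cs, [0] * prod(cs)),   # +/ over empty: zeros of shape cs
    'and':  lambda cs: (cs, [1] * prod(cs)),   # ∧/ over empty: ones of shape cs
}

def reduce_first(fname, f, shape, data):
    """f/ along the first axis of a (shape, flat-data) array; an empty
    axis invokes the identity function on the cell shape."""
    n, cs = shape[0], shape[1:]
    if n == 0:
        return IDENTITY[fname](cs)
    size = prod(cs)
    acc = data[:size]
    for i in range(1, n):
        acc = [f(x, y) for x, y in zip(acc, data[i * size:(i + 1) * size])]
    return cs, acc

print(reduce_first('plus', lambda a, b: a + b, (0, 3), []))
# ((3,), [0, 0, 0])
print(reduce_first('plus', lambda a, b: a + b, (2, 2), [1, 2, 3, 4]))
# ((2,), [4, 6])
```

The point of the sketch is that the result for the empty case has the shape of a cell, which a shape-independent identity element could not supply.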
Dyads

Monadic functions such as 10¨⍟ (the base-10 logarithm) and *¨2 (the square function) might appear to be fully defined by the arguments of the operator ¨ , and therefore to require no special part in the function to which the operator is applied. However, each such function may have parts such as derivative and inverse, and even a dyadic case. For example, the inverse and derivative of a¨+ are (-a)¨+ and the constant function 1 , respectively, and if f is a selection function, then i¨f can have a dyadic case that provides a merge of its arguments, as described in [5 6] for the case where f is the indexing function from ({). We define the monadic cases of the derived functions a¨f and f¨b as follows:

      f¨b a ←→ a 0 .f b
      a¨f b ←→ b 0 .f⊂ a

where ⊂ denotes the commute operator [7]. Moreover, the dyads each inherit the appropriate dyadic rank of f , as well as the appropriate dyadic surrogate. Any inverse, derivative, or dyadic case of the derived functions a¨f and f¨b is determined
from the information provided in the dyad parts of f .
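The two definitions amount to binding one argument of a dyadic function, the left-bound case going through commute. A hedged Python sketch (names invented; the coherence-0 detail is omitted) of the two examples named in the text:

```python
from math import log

# The dyad definitions sketched as argument binding:
#   (f¨b) a ←→ a 0 .f b        (a¨f) b ←→ b 0 .f⊂ a
def bind_right(f, b):                 # f¨b
    return lambda a: f(a, b)

def bind_left(a, f):                  # a¨f
    commute = lambda x, y: f(y, x)    # f⊂ , the commute operator
    return lambda b: commute(b, a)

log10  = bind_left(10, lambda a, w: log(w, a))   # 10¨⍟ , base-10 logarithm
square = bind_right(lambda w, a: w ** a, 2)      # *¨2  , the square function

print(square(3))            # 9
print(round(log10(1000)))   # 3
```

The sketch also makes plain why a dyad can still carry an inverse part: 10¨⍟ has the obvious inverse 10¨* , information that lives with ⍟ rather than with the operator.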
Calculus of Operators

In the case of an operator such as inverse, the entire derived function is determined by the inverse part of the original function, and there is no question of parts of the derived function being determined from any other parts of the original. However, in the case of an operator such as composition, it is clear that certain parts of the derived function should be inherited from (or at least derived from) various parts of the original functions. For example, if h←f⍤g , then the inverse part of h should have the monadic kernel '⎕B⊂⍤(⎕A⊂) ⍵' , and parameters f and g .

The cases of the rank and coherence operators are the most interesting, since a number of parts of the function argument might be usefully passed on to the derived function with little or no change. The effect that the coherence operator should have on the parts of the derived function is rather straightforward, but that of the rank operator is more problematical. For example, the proper inverse of ⍉⍤r is clearly ⍉⍤r , but since the inverse of the enclose (<), of infinite rank, is the disclose (>) of rank 0, what rank should be assigned to the inverse <⍤3⊂ ? Moreover, in the dual f¨g , the inverse of g is applied to an argument of whatever rank is produced by applying f to the result produced by g on one of its cells. What then should be the rank of the inverse h←g⍤r⊂ to apply properly in f¨h ?

Composition. The rank part of the function f⍤g is
the monadic rank of g ,
giving “close composition”
as defined in
[3];
the kernels are

The reason for using ⎕B of infinite rank (that
is, ⎕B⍤¯) rather than ⎕B
in the kernel is
illustrated by the following example. If f←⌽
and g←⍉⍤¯1 , and

Operators related to composition. The kernels and ranks of the derived functions of three other operators show marked similarity to those produced by composition (⍤).
Intrinsic rank. Although ⊢ is the identity function and is of unbounded rank, the composed function g←f⍤⊢ may differ from f , the difference becoming apparent only when the rank operator is applied to the functions. For example, if f←⍉⍤2 (where ⍉ itself is of infinite rank), then f⍤3 is equivalent to ⍉⍤3 , but g⍤3 is equivalent to ⍉⍤2 . We will therefore say that h←g⍤3 has extrinsic rank 3 , but intrinsic rank 2 . More generally, if p←q⍤j⍤⊢⍤k , then p is said to have intrinsic rank j and extrinsic rank k ; the extrinsic rank is immediately respecified by application of the rank operator, but the intrinsic rank is unaffected. Every primitive function will be defined to have an intrinsic rank that is equal to its extrinsic rank, and is non-negative.

The results of rank. Except for the derivative, dyads, and inverse parts, all parts of f (including, in particular, its coherence) are inherited by f⍤r . Since a derivative of a function f must apply to the same cells as f , the rank must be inherited by the derivative. The dyad parts are not inherited by f⍤r , but the monadic cases of a¨(f⍤r) and f⍤r¨b are, of course, defined as stated earlier.

As remarked earlier, there is no relation between the rank of a function and the rank of its inverse that applies for all functions. However, in the case of a rank-preserving function f , the inverse function f⊂ would be expected to have the same rank, and we propose to choose the treatment of f⍤r⊂ to provide behaviour appropriate to such a function. Moreover, appropriate behaviour of the dual g¨(f⍤r) can be expected only in the case where g is also rank-preserving.

A necessary condition that f⊂ be a proper
inverse is that f⊂⍤f be the identity function; it is
also desirable that f⍤(f⊂) be an identity. The
problems of defining the inverse appropriate to a
function f⍤r will first be illustrated by the
function t⍤r ,
where t is a self-inverse transpose
of intrinsic rank 3 (i.e., t ←→ t⊂ ←→ ⍉⍤3⍤⊢),
and the argument a
has shape

We will examine two main cases, the direct inheritance of t⊂ by t⍤r , and the modified inheritance of t⊂⍤r . Within each of these we will examine the cases of rank restriction and expansion.

From the foregoing it is clear that only case 1 (direct inheritance) gives correct behaviour for all sub-cases of the more important left-inverse (t⍤r⊂⍤(t⍤r)), and that neither case can give correct behaviour for all sub-cases of the right-inverse. We therefore propose adoption of the rule of direct inheritance of the inverse f⊂ as the inverse of the derived function f⍤r .

The results of coherence. Since coherence
determines only which pairs of cells of the two
arguments are submitted to the function, all parts
of f save the coherence are passed on to the
derived function k.f .
Supernumerary Axes

As remarked in [7], certain operators introduce one or more supernumerary axes in addition to the axes produced by the particular function to which the operator is applied. Although such supernumerary axes should precede the normal axes, the cited paper proposed an exception for the case of f\ (scan along the last axis) as a means of maintaining compatibility with the present behaviour of scan for primitive scalar functions. A more palatable way of retaining compatibility
is to assign rank 1 to the derived function f\ (and
also to f/).
Formally, f\←→f⍀⍤1 ,
and f/←→f⌿⍤1 .
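The identity f\←→f⍀⍤1 can be sketched on a nested-list model: scan along the first axis, applied at rank 1, is scan along the last axis. The Python names are invented and the model handles only rank ≤ 2 arrays, but it shows the equivalence:

```python
# A sketch of f\ ←→ f⍀⍤1 (nested-list model).
def scan_first(f, v):                 # f⍀ applied to a vector
    out, acc = [], None
    for x in v:
        acc = x if acc is None else f(acc, x)
        out.append(acc)
    return out

def scan_last(f, a):                  # f\ : scan_first on each rank-1 cell
    if a and isinstance(a[0], list):
        return [scan_last(f, row) for row in a]
    return scan_first(f, a)

print(scan_last(lambda x, y: x + y, [[1, 2, 3], [4, 5, 6]]))
# [[1, 3, 6], [4, 9, 15]]
```

Because the derived function has rank 1, each row is a cell and no supernumerary axis appears ahead of the normal ones, which is the compatibility being preserved.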
Transliteration

The ability to represent letters or words by the corresponding characters in another alphabet, known in natural languages as transliteration, can also be very useful in formal languages. For example, in APL one might substitute for the word RHO (entered by someone using a deficient terminal, or frightened of symbols other than the Roman alphabet) the symbol ⍴ , or conversely substitute for ⍴ (entered by someone who wishes to exploit the brevity and connotations of that symbol to refer to a related, but different, defined function) the word RHO . We will consider only substitutions for individual tokens (that is, those elements of APL such as abc2 , 2.34e6 , and +) that serve as words in APL, and will exclude substitution at a character level (such as TCH for CH in the word CHEBYCHEV) as well as substitution for phrases (such as 1 2 3 for ⍳3). Substitution for phrases will be avoided
because it would necessarily concern syntax
analysis of the sentence (to avoid, for example,
substituting 1 2 3 for ⍳3
in the expression

We propose the introduction of a transliteration
system variable (to be referred to here as ⎕tr)
such that ⎕tr is a two-row matrix whose rows
consist of enclosed strings of tokens. Just before
evaluating any token in an APL sentence, a
substitution is made if the token occurs in the first
row of ⎕tr .
Moreover, substituted elements are
treated exactly as if they occurred in the original
sentence and, as a consequence, substitutions may
be chained.
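The proposed facility is a token-by-token table lookup with chaining. A minimal Python sketch (the table contents and all names are invented; a real implementation would also need a guard against cyclic chains, suggested here by a substitution limit):

```python
# A sketch of the proposed ⎕tr facility: a two-row table of tokens,
# with substitutions chained until no further match.
TR = [['RHO', 'TALLY'],    # row 0: tokens to be replaced
      ['⍴',   'RHO']]      # row 1: their replacements

def transliterate(tokens, tr, limit=100):
    out = []
    for t in tokens:
        n = 0
        while t in tr[0] and n < limit:   # chained substitution
            t = tr[1][tr[0].index(t)]
            n += 1
        out.append(t)
    return out

print(transliterate(['TALLY', '3', 'RHO', 'X'], TR))
# ['⍴', '3', '⍴', 'X']
```

Note the chaining: TALLY is first replaced by RHO , which in turn occurs in the first row and is replaced by ⍴ , exactly as if RHO had occurred in the original sentence.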
Function Definition

Because an ambivalent function is evaluated only in the context of arguments, the three expressions mpf←+.× and mp←m+.×n and per←+.×n can be used to assign names to three distinct entities: a matrix product function (mpf), a matrix product of two arguments (mp), and the permanent of a matrix (per). Because a niladic function nf requires no argument, a similar distinction between an evaluation of the function and the function itself cannot be made. Consequently, f←nf must be used for one of the possible meanings. If we choose it to mean that f becomes the niladic function nf , then there is no mechanism for indicating evaluation of a niladic function. However, if we choose it to mean that f becomes the result of executing nf , then niladic functions will continue to behave as they always have; moreover, canonical definition provides a means for associating any desired name with a niladic function. We therefore propose that niladic functions continue to be used and defined in the established manner.

The most recent statement of the evolving “direct” definition operator occurs in [7]. We now introduce a slightly modified statement that 1) makes explicit the use of a system variable ⎕s for the sequence control vector (making branching and the re-starting of a halted function possible through expressions of the form ⎕s← rather than through the introduction of the branch arrow), and 2) makes indexing of the segments 0-origin:
A function produced by the definition operator ∇
has unbounded ranks and coherence, and the
custom parameter ⎕C set to ⍳0 .
System Variables

As remarked earlier, the variant operator could be employed to make less cumbersome the use of functions now dependent upon system variables. Nevertheless, efforts to remove dependence on system variables should be continued, especially in cases where the dependence was inessential, and therefore ill-considered, and in cases where the need has been obviated by other developments in the language.

Index origin is an example of the former, introduced in [8] (not only for indexing, but for other functions such as residue) because of awareness of the convenience of 0-origin in treating computer hardware, and of the familiarity of 1-origin to people not acquainted with computers. The situation has changed radically since then: the convenience of 0-origin in all areas has become more apparent; familiarity with 0-origin and its convenience has grown; and the bane of forever specifying index origin has become apparent to most APL programmers. We therefore re-iterate the proposal made in [7] that index origin be considered obsolescent, that is, maintained unchanged in existing primitive functions, but used in no new functions or operators.

Printing width (first controlled by a system command) is an example of a parameter which, though essential when introduced (before the existence of the format function, when there was no way within the language of controlling the width of output), is no longer essential, and may, in fact, impede the full exploitation of scrolling facilities now available on video terminals. For example, in the implementation of Sharp
APL on the IBM PC, a long row of a matrix may be
shown “extended” rather than “folded” to fit the
screen width, and the “window” may be scrolled
over the row to view all parts of it.
However, a narrow setting
of print width (⎕PW) will cause each
row to be emitted as a sequence of independent
segments. An infinite setting of ⎕PW could
overcome this difficulty, but may be impossible due
to limitations in an APL system, or to implicit
assumptions about ⎕PW made in applications
designed for the system.
Catenation and Reshape Operators

Consider a catenation operator COP and a reshape operator ROP such that the functions f←+COP-COP×COP÷ and g←2 2 ROP f would each have rank 0 , and would produce (for each cell) results of shape 4 and 2 2 respectively. More generally, we define

Finally, we define s ROP f as equivalent
to s¨⍴⍥f .
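The intended behaviour can be sketched in Python under the assumption (from the example above) that COP catenates the per-cell results of its function arguments and ROP reshapes each cell's result. All names are invented, the monadic readings of + - × ÷ (identity, negation, signum, reciprocal) are assumed, and the model handles only rank-2 reshapes:

```python
# A sketch of the catenation and reshape operators COP and ROP.
def cop(*fs):
    """f COP g : apply each function to the cell, catenate the results."""
    return lambda w: [f(w) for f in fs]

def rop(shape, f):
    """s ROP f : reshape each cell's result to shape s (rank 2 only here)."""
    def derived(w):
        flat, (r, c) = f(w), shape
        return [flat[i * c:(i + 1) * c] for i in range(r)]
    return derived

f = cop(lambda w: w,                      # monadic + : identity
        lambda w: -w,                     # monadic - : negation
        lambda w: (w > 0) - (w < 0),      # monadic × : signum
        lambda w: 1 / w)                  # monadic ÷ : reciprocal
g = rop((2, 2), f)                        # g←2 2 ROP f

print(f(2))   # [2, -2, 1, 0.5]
print(g(2))   # [[2, -2], [1, 0.5]]
```

As in the text, both derived functions have rank 0: each scalar cell yields a result of shape 4 from f and of shape 2 2 from g .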
References
Table 1: Function Ranks and Surrogates

Note: ⍺⍴id ¯2↑1 1,⍺ where id:(⍳1↑⍵)∘.=⍳1↓⍵

Originally appeared in the APL84 Conference Proceedings, APL Quote Quad, Volume 14, Number 4, 1984-06.