Recreational APL: Sauce for the Gander

People just learning APL are happy to find that adding a scalar to each element of a higher-rank array is trivial: the scalar is "extended" to have the shape of the higher-rank array, and then element-by-element addition takes place. This rule is applied universally for all the dyadic scalar functions, and it explains the behavior, which initially may appear mysterious, of adding a scalar to an empty array. The result is empty because the scalar is "extended" by reshaping it by the shape of the higher-rank array. If the higher-rank array were an empty vector, the scalar s would be reshaped by 0⍴s , giving an empty vector, and thus the two arguments would have conforming lengths and ranks. So adding a scalar to an empty array gives a result that is also empty and has the same shape as the empty array.

This much can be explained. But there is no easy answer when the question is asked: Why doesn't APL permit me to add (or multiply, or compare for equality) a vector to each row of a matrix? Indeed, the question has been discussed in APL Quote Quad and elsewhere [1, 2, 3, 4, 5, 6].

Larry Breed's proposal for handling this [3] went like this: If ⍴q is 10 10 and ⍴w is 10 and ⍴u is 15 , then either q+w or q+1 10⍴w adds w to each row of q , and q+10 1⍴w adds w to each column of q . Lastly, (10 1⍴w)+1 15⍴u would give the outer-product sum of w and u . Breed's proposal has a defect: it lacks symmetry, favoring dimensions of length one at the expense of other dimension lengths. The proposal did include a resolution of the problem of what the shape of the result should be when some function combines two one-element arrays of differing ranks: it ensured that the rank of the result would always be that of the higher-rank array.
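Breed's length-one rule can be illustrated with a small sketch. The following is Python rather than APL, and the function name is my own; it simply repeats any axis of length one to match the corresponding axis of the other argument before element-by-element addition:

```python
def plus_breed(q, w):
    """Add two matrices under Breed's rule: an axis of length one in w
    is repeated to match the corresponding axis of q."""
    rows, cols = len(q), len(q[0])
    wr, wc = len(w), len(w[0])
    assert wr in (1, rows) and wc in (1, cols), "axes must match or have length one"
    return [[q[i][j] + w[i if wr > 1 else 0][j if wc > 1 else 0]
             for j in range(cols)]
            for i in range(rows)]

q = [[1, 2], [3, 4]]
print(plus_breed(q, [[10, 20]]))    # [[11, 22], [13, 24]]  (w added to each row)
print(plus_breed(q, [[10], [20]]))  # [[11, 12], [23, 24]]  (w added to each column)
```

Note how the caller must first reshape the vector into a one-row or one-column matrix; that requirement is exactly the asymmetry complained of next.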
Essentially what is wrong with the proposal is that it introduces the requirement to reshape a vector argument in order to add it to the columns of a matrix. Forkes' proposal [6] also gives special properties to axes of length one, and involves a new primitive function whose purpose is to add axes of length one to an array, so that the special conformability rules applicable to axes of length one may be employed. I think that the rules for axes of length one represent an early and aberrational state of APL, and ought not to be exploited or extended to more primitives.

Jizba proposes [5] an extension to the inner product operator, which he calls "distributed product", to achieve reshaping of a vector argument to a matrix argument in the proper way to permit element-by-element combination of the two arguments on either the rows or the columns of the matrix argument. He gives no specific notation, nor does his proposal generalize to combining arrays of arbitrary differing ranks. It also requires an asymmetry in the expression of the functional arguments to his new inner product (one reverses the two functional arguments according to the relative positions of the vector and the matrix arguments).

On two different occasions it has been noted in these pages [2, 4] that there is a way to achieve the desired effect. Philip Abrams noted [2] that to combine a vector v with the rows of a matrix m , you could write

      1 2 2⍉m∘.+v        or        2 1 2⍉v∘.+m

and to combine a vector with the columns of a matrix you could write

      1 2 1⍉m∘.+v        or        1 1 2⍉v∘.+m

Per Gjerlov noted the same [4], and he pointed out that these expressions do not check for non-conforming arrays, but they do permit arguments of arbitrary rank with axes of any length. Ideally, you would want the lengths of the indicated axes to agree. The notes by Abrams and Gjerlov show how a collapsing transpose can be used to select from the result of an outer product those subplanes corresponding to the desired result.
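The Abrams/Gjerlov device can be mimicked in Python (an illustration only; the function names are mine): build the full outer product, then keep just the elements where the duplicated axes coincide, which is what the collapsing transpose 1 2 2⍉ does.

```python
def outer_plus(m, v):
    """m∘.+v : a rank-3 result with o[i][j][k] = m[i][j] + v[k]."""
    return [[[x + y for y in v] for x in row] for row in m]

def rows_plus(m, v):
    """1 2 2⍉m∘.+v : select the elements where the last two axes agree,
    which amounts to adding v to each row of m."""
    o = outer_plus(m, v)
    return [[o[i][j][j] for j in range(len(row))] for i, row in enumerate(m)]

m = [[1, 2], [3, 4], [5, 6]]
print(rows_plus(m, [10, 20]))   # [[11, 22], [13, 24], [15, 26]]
```

Notice that outer_plus materializes every pairwise sum, and rows_plus then discards all but a diagonal slice of them.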
These formulas do indeed work, but at the expense of potentially many redundant calculations, developing unneeded terms which are subsequently thrown away. They do, however, contain a useful implication, which shows how APL can be augmented to give the desired function in a congenial and economical way.

If you look at the solutions given by Abrams and Gjerlov, you find that the left argument to transpose exhibits a definite pattern. First, it always contains all the indices of the axes of the matrix. Second, it indicates which axis of the matrix the vector is to be applied along. For example, when the vector is to be added to the columns (that is, down along the first dimension), the argument contains 1 2 and also 1 . The axis indication is on the left side of 1 2 if the vector is on the left of the outer product, and on the right side of 1 2 if the vector is on the right side of the outer product.

This pattern is maintained when you want to add arrays of any differing ranks. The set of indices of the axes of the higher-rank array, in order, forms the foundation of the left argument to transpose; adjoined to this, on the left or on the right, depending on whether the lower-rank array is on the left or the right of the outer product, is the list of axis numbers of the higher-rank array along which the lower-rank array is to be applied. It should be clear that the axes of the lower-rank array must have the same lengths as the indicated axes of the higher-rank array.

Let's work through a few examples, so you can get the flavor of this construction. Suppose we have two arrays l and s , where s has the lower rank, and we want to add s to the subarrays of l along axes i of l . For example, suppose l is a matrix and s is a vector, and we want to add s to the columns of l ; then i will have the value 1 , since the columns go down the first dimension, through the rows.
If we have the outer product l∘.+s as the right argument to transpose, the proper left argument for transpose is (⍳⍴⍴l),i ; if we have s∘.+l , the proper left argument is i,⍳⍴⍴l . Thus, the left argument always has the term ⍳⍴⍴l , which can be derived from l , and what needs to be specified, as a third item, is the list of axes i , which tells along which axes of the larger array the smaller is to be composed. If ⍴l is 19 7 3 13 5 and ⍴s is 7 13 5 , then the only axes against which the smaller array can be applied are those where its lengths match, namely axes 2 4 5 . If we want to add s to the subarrays of l along these axes, we could write either

      1 2 3 4 5 2 4 5⍉l∘.+s        or        2 4 5 1 2 3 4 5⍉s∘.+l

If we wanted to write a defined function to perform composition using any scalar dyadic function of any two arrays of differing rank along a set of axes of the larger array, the arguments required are just these four items. We would need a tesseradic function, and APL doesn't provide for such an adicity. What to do?

Most people I have shown this much to begin to get edgy at this point, and want to butt in with their own solution to the problem; the solution is always the same and is, in my opinion, the right one. They suggest that the domain of the axis operator (which now includes reversal, rotation, catenation, compression, expansion, reduction, and scan) should be extended to include the dyadic scalar functions. If the axis operator domain were so extended, we would be able to write:

      ⎕←a←3 4⍴⍳12
 1  2  3  4
 5  6  7  8
 9 10 11 12
      ⎕←b←10×⍳3
10 20 30
      ⎕←c←100×⍳4
100 200 300 400
      ⎕←a+[1]b
11 12 13 14
25 26 27 28
39 40 41 42
      ⎕←b+[1]a
11 12 13 14
25 26 27 28
39 40 41 42
      ⎕←a+[2]c
101 202 303 404
105 206 307 408
109 210 311 412
      ⎕←c+[2]a
101 202 303 404
105 206 307 408
109 210 311 412

This seems to me to be the right way to extend the notion of scalar extension. It also seems to me one of the more pleasant ways of extending APL, in that no new symbols are added to the language; simply, new meanings are found for existing symbols. In this case, it is the axis operator that is extended, to include in its domain the dyadic scalar functions. The definition of the extension of the axis operator
is not original with me, but the formulation in terms of the specific collapsing transpose of an outer product is, although the possibility was suggested by Alex Morrow of IBM.
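The extended axis operator shown in the session above can be modelled for the matrix-and-vector case with a short Python sketch (illustrative only; the function name is mine):

```python
def plus_axis(mat, vec, axis):
    """Model of the proposed a+[k]b for a matrix and a vector: the
    vector lies along axis k of the matrix (1-origin, as in APL)."""
    rows, cols = len(mat), len(mat[0])
    if axis == 1:   # vec runs down the first axis: one element per row
        assert len(vec) == rows, "length of vec must match axis 1"
        return [[mat[i][j] + vec[i] for j in range(cols)] for i in range(rows)]
    else:           # vec runs along the second axis: one element per column
        assert len(vec) == cols, "length of vec must match axis 2"
        return [[mat[i][j] + vec[j] for j in range(cols)] for i in range(rows)]

a = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
print(plus_axis(a, [10, 20, 30], 1))
# [[11, 12, 13, 14], [25, 26, 27, 28], [39, 40, 41, 42]]
print(plus_axis(a, [100, 200, 300, 400], 2))
# [[101, 202, 303, 404], [105, 206, 307, 408], [109, 210, 311, 412]]
```

The two calls reproduce the a+[1]b and a+[2]c results from the session, and, as with the proposed extension, the order of the matrix and vector arguments does not matter for a commutative function like + .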
References
Interesting Prime Numbers
The Reshape of Things

The reshape function in APL, dyadic ⍴ , has several interesting properties and also an interesting history. This article will tell you all about it, and you will also come to understand how the week got its length and the days got their names.

Herb Hellerman, now a computer architect for Amdahl Corporation, was in the early 1960s manager of a group at IBM's Thomas J. Watson Research Center in Yorktown Heights, New York, that included Ken Iverson and Adin Falkoff. Hellerman had started a programmer from that computer center on implementing Iverson's notation, but this project foundered when the programmer was recalled by the computer center management. Herb then decided to undertake the job himself, and thus became the first implementer (in 1963) of an Iverson language system. He called the result the Personalized Array Translator, or PAT [1].

Several features of the PAT system relate to our topic. First, it included declarations, unlike all subsequent APL implementations. In writing functions, programmers had to specify the lengths of the dimensions of the arrays to be worked with, and also had to indicate the type of data. Second, assignment had a special flavor. If the object on the right of the assignment had more or fewer elements than the object on the left, the size of the object on the left was the controlling factor. The elements from the object on the right were used only up to the point that all the elements of the object on the left were filled, or, if the object on the right had too few elements, they were reused cyclically to fill up the object on the left. These two features foreshadowed, somewhat dimly, the behavior of the reshape function.

Before the reshape function could come into existence, it was necessary to rid the notation of the nu and mu functions. The nu function gave the number of elements in a vector, or the number of columns in a matrix. The mu function gave the number of rows in a matrix.
These were displaced in 1965 by the monadic rho function, which, of course, simply gives the length of each dimension of an object. Since the thrust at that time was to use to the fullest all of the symbols on an 88-character typing element, a dyadic use was sought for rho; it was then that the influence of some of the features of PAT must have been felt, and we got reshape. It is interesting to note that before nu and mu disappeared, Iverson notation was pretty much constrained to be a two-dimensional language. With them gone, and rho available, the sky was the limit.

Now, how does this bear on the length of the week and the names of the days? I'll tell you. The ancient Egyptian astronomers were no slouches. They recognized seven planets and their relative distances from the Earth. From farthest to nearest, these were Saturn, Jupiter, Mars, the Sun, Venus, Mercury, and the Moon.
They also subdivided the day into 24 periods, and began reckoning their day from midnight, just as we do. Furthermore, to each hour of the day, beginning with the first at midnight, they consecrated a planet, beginning with the most remote. When they ran out of planets after the 7th hour, they began again with Saturn in the 8th hour, and so on. When the new day began, at the 25th hour, they simply continued the cycle unaltered.

To see which planet is associated with each hour of every day for, say, 21 days, let the numbers 1 through 7 stand for the planets as listed above, and compute (keeping your eye on reshape as it stretches its argument out to fill the ordained size, just as the Greek giant Procrustes stretched his victims to fit one of his beds):

      ⍉21 24⍴⍳7

(the transpose is to permit display of the result in our narrow column):

1 4 7 3 6 2 5 1 4 7 3 6 2 5 1 4 7 3 6 2 5
2 5 1 4 7 3 6 2 5 1 4 7 3 6 2 5 1 4 7 3 6
3 6 2 5 1 4 7 3 6 2 5 1 4 7 3 6 2 5 1 4 7
4 7 3 6 2 5 1 4 7 3 6 2 5 1 4 7 3 6 2 5 1
5 1 4 7 3 6 2 5 1 4 7 3 6 2 5 1 4 7 3 6 2
6 2 5 1 4 7 3 6 2 5 1 4 7 3 6 2 5 1 4 7 3
7 3 6 2 5 1 4 7 3 6 2 5 1 4 7 3 6 2 5 1 4
1 4 7 3 6 2 5 1 4 7 3 6 2 5 1 4 7 3 6 2 5
2 5 1 4 7 3 6 2 5 1 4 7 3 6 2 5 1 4 7 3 6
3 6 2 5 1 4 7 3 6 2 5 1 4 7 3 6 2 5 1 4 7
4 7 3 6 2 5 1 4 7 3 6 2 5 1 4 7 3 6 2 5 1
5 1 4 7 3 6 2 5 1 4 7 3 6 2 5 1 4 7 3 6 2
6 2 5 1 4 7 3 6 2 5 1 4 7 3 6 2 5 1 4 7 3
7 3 6 2 5 1 4 7 3 6 2 5 1 4 7 3 6 2 5 1 4
1 4 7 3 6 2 5 1 4 7 3 6 2 5 1 4 7 3 6 2 5
2 5 1 4 7 3 6 2 5 1 4 7 3 6 2 5 1 4 7 3 6
3 6 2 5 1 4 7 3 6 2 5 1 4 7 3 6 2 5 1 4 7
4 7 3 6 2 5 1 4 7 3 6 2 5 1 4 7 3 6 2 5 1
5 1 4 7 3 6 2 5 1 4 7 3 6 2 5 1 4 7 3 6 2
6 2 5 1 4 7 3 6 2 5 1 4 7 3 6 2 5 1 4 7 3
7 3 6 2 5 1 4 7 3 6 2 5 1 4 7 3 6 2 5 1 4
1 4 7 3 6 2 5 1 4 7 3 6 2 5 1 4 7 3 6 2 5
2 5 1 4 7 3 6 2 5 1 4 7 3 6 2 5 1 4 7 3 6
3 6 2 5 1 4 7 3 6 2 5 1 4 7 3 6 2 5 1 4 7

This chart shows how the planets were assigned to the hours of the day, over a 21-day period, in an unending cycle.
More important, it shows that the first hour of each day is different for seven days, and then it begins again with the outermost planet. The week acquired its length of seven days from this. Furthermore, let us list the planets associated with the first hours of each day together with the names of the days in Latin and English:
Saturn     dies Saturni    Saturday
Sun        dies Solis      Sunday
Moon       dies Lunae      Monday
Mars       dies Martis     Tuesday
Mercury    dies Mercurii   Wednesday
Jupiter    dies Iovis      Thursday
Venus      dies Veneris    Friday

The correspondence with the Latin names for the days is perfect (Luna = Moon). In other words, the planet associated with the first hour of each day gave its name to the day. But how does it happen that the first (most remote) planet (Saturn) is associated with the last day of our present week? The reason for this is that when the Jews fled the Egyptian captivity they made the Egyptians' first day their last day, from hatred of their oppressors.

Next, what is the correspondence between the Latin and the English names? Tuesday and Friday are easily explained: Tiu is the god of sky and war in Saxon mythology, corresponding exactly to Mars in Roman mythology. Similarly, Frigg, Wotan's wife, is the goddess of love, corresponding exactly to Venus of Roman myth. On the other hand, Wednesday and Thursday are troublesome, since we would expect an equation between Jupiter and Wotan, not between Jupiter and Thor.

Perhaps the answer lies in the following speculation: the Saxons, like other Northern Europeans, customarily worked themselves into a frenzy when going to battle. They prayed to Wotan for this divine madness, in order to give themselves more than human strength. The metal mercury, on the other hand, has the well-known property that working with it can drive a person insane. This was typically the case, for example, with the hatters of England and Europe, who used mercury to prepare felt from rabbit skins. The phrase "mad as a hatter" comes from the sad effects of their use of mercury on their brains. The Mad Hatter in Alice in Wonderland illustrates this more than legendary madness associated with the use of mercury by hatters. We see a perhaps far-fetched connection between the madness aspect of Mercury and that of Wotan. Assuming there is some justice in the speculation,
we then find it not too unnatural to see an association between the Jove of the thunderbolt and the Thor of thunder, in order to complete our little story.
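As a closing check, the planetary-hour arithmetic behind the chart can be verified outside APL. Here is a minimal Python sketch (the function name is mine); it relies only on the fact that each day advances the cycle by 24 hours, i.e. by 24 mod 7 = 3 planets:

```python
# Planets 1..7, from farthest to nearest (1 = Saturn, ..., 7 = Moon),
# consecrated to the 24 hours of each day in an unending cycle --
# the same recycling that reshape performs on its right argument.
def first_hour_planets(days):
    """Planet ruling the first hour of each of `days` successive days."""
    return [1 + (24 * d) % 7 for d in range(days)]

print(first_hour_planets(8))   # [1, 4, 7, 3, 6, 2, 5, 1]: repeats after seven days
```

The sequence matches the first row of the chart, and its period of seven is exactly the length of the week.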
Reference
First appeared in
APL Quote Quad, Volume 9, Number 3, 1979-03.
The text requires the APL385 Unicode font, which can be
downloaded from
http://www.vector.org.uk/resource/apl385.ttf .
To resolve (or at least explain) problems with displaying
APL characters see
http://www.vector.org.uk/archive/display.htm .
