Algebra as a Language
Kenneth E. Iverson
Table A.1. Column headings: Monadic form f b | f | Dyadic form a f b
A little experimentation with the notation of Table A.1 will show that it can be used to express clearly a number of matters which are awkward or impossible to express in conventional notation. For example, X÷Y is the quotient of X divided by Y; either ⌊(X÷Y) or (X-(Y|X))÷Y yields the integer part of the quotient of X divided by Y; and X⌈(-X) is equivalent to |X.

In conventional notation the symbols <, ≤, =, ≥, >, and ≠ are used to state relations among quantities; for example, the expression 3<4 asserts that 3 is less than 4. It is more useful to employ them as symbols for dyadic functions defined to yield the value 1 if the indicated relation actually holds, and the value zero if it does not. Thus 3≤4 yields the value 1, and 5+(3≤4) yields the value 6.

Arrays. The ability to refer to collections or arrays of items is an important element in any natural language and is equally important in mathematics. The notation of vector algebra embodies the use of arrays (vectors, matrices, 3-dimensional arrays, etc.) but in a manner which is difficult to learn and limited primarily to the treatment of linear functions. Arrays are not normally included in elementary algebra, probably because they are thought to be difficult to learn and not relevant to elementary topics.

A vector (that is, a 1-dimensional array) can be represented by a list of its elements (e.g., 1 3 5 7) and all functions can be assumed to be applied element-by-element. For example:

      1 2 3 4 × 4 3 2 1

produces

4 6 6 4

Similarly:

      1 2 3 4 + 4 3 2 1
5 5 5 5
      ! 1 2 3 4
1 2 6 24
      1 2 3 4 * 2
1 4 9 16
      2 * 1 2 3 4
2 4 8 16

In addition to applying a function to each element of an array, it is also necessary to be able to apply some specified function to the collection itself. For example, “Take the sum of all elements”, or “Take the product of all elements”, or “Take the maximum of all elements”.
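The relational functions, the element-by-element rule, and the operations on a whole collection described above can be imitated in Python. The following is only a rough sketch under stated assumptions: the helper names less_or_equal, each, and reduce_with are invented for this illustration and are not part of the proposed notation, and at least one argument of each is assumed to be a list.

```python
import math
import operator
from functools import reduce


def less_or_equal(a, b):
    """Relation treated as a dyadic function: 1 if a<=b holds, 0 if not."""
    return int(a <= b)


def each(func, left, right):
    """Apply a dyadic function element by element.

    A single number on either side is paired with every element of the
    list on the other side (at least one argument must be a list).
    """
    if not isinstance(left, list):
        left = [left] * len(right)
    if not isinstance(right, list):
        right = [right] * len(left)
    return [func(a, b) for a, b in zip(left, right)]


def reduce_with(func, vector):
    """Apply a dyadic function over a whole vector (sum, product, maximum).

    This folds from the left; for the associative functions used here
    the direction of the fold does not change the result.
    """
    return reduce(func, vector)


print(5 + less_or_equal(3, 4))                         # 6
print(each(operator.mul, [1, 2, 3, 4], [4, 3, 2, 1]))  # [4, 6, 6, 4]
print(each(operator.pow, [1, 2, 3, 4], 2))             # [1, 4, 9, 16]
print(each(operator.pow, 2, [1, 2, 3, 4]))             # [2, 4, 8, 16]
print([math.factorial(n) for n in [1, 2, 3, 4]])       # [1, 2, 6, 24]
print(reduce_with(operator.add, [2, 5, 3, 2]))         # 12
print(reduce_with(operator.mul, [2, 5, 3, 2]))         # 60
print(reduce_with(max, [2, 5, 3, 2]))                  # 5
```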
These operations on a whole collection can be denoted as follows:

      +/2 5 3 2
12
      ×/2 5 3 2
60
      ⌈/2 5 3 2
5

The rules for using such vectors are simple and obvious from the foregoing examples. Vectors are relevant to elementary mathematics in a variety of ways. For example:
Constants. Conventional notation provides means for writing any positive constant (e.g., 17 or 3.14) but there is no distinct notation for negative constants, since the symbol - occurring in a number like -35 is indistinguishable from the symbol for the negation function. Thus negative thirty-five is written as an expression, which is much as if we neglected to have symbols for five and zero because expressions for them could be written in a variety of ways such as 8-3 and 8-8.

It seems advisable to follow Beberman [1] in using a raised minus sign to denote negative numbers. For example:

      3-5 4 3 2 1
¯2 ¯1 0 1 2

Conventional notation also provides no convenient way to represent numbers which are easily expressed in expressions of the form 2.14×10^{8} or 3.265×10^{¯9}. A useful practice widely used in computer languages is to replace the symbols ×10 by the symbol E (for exponent) as follows: 2.14E8 and 3.265E¯9.

Order of execution. The order of execution in an algebraic expression is commonly specified by parentheses. The rules for parentheses are very simple, but the rules which apply in the absence of parentheses are complex and chaotic. They are based primarily on a hierarchy of functions (e.g., the power function is executed before multiplication, which is executed before addition) which has apparently arisen because of its convenience in writing polynomials.

Viewed as a matter of language, the only purpose of such rules is the potential economy in the use of parentheses and the consequent gain in readability of complex expressions. Economy and simplicity can be achieved by the following rule: parentheses are obeyed as usual and otherwise expressions are evaluated from right to left with all functions being treated equally. The advantages of this rule and the complexity and ambiguity of conventional rules are discussed in Berry [2], page 27 and in Iverson [3], Appendix A.

Even polynomials can be conveniently written without parentheses if use is made of vectors.
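The rule just stated (parentheses aside, evaluate from right to left with all functions treated equally) can be sketched in Python. This is a minimal illustration rather than part of the proposed notation: the name evaluate_right_to_left, the token-list input format, and the two sample expressions are assumptions made for the example, and parentheses and vectors are deliberately left out.

```python
import operator

# Dyadic functions, keyed by the symbols used in this appendix
# (* denotes the power function, as in the text).
FUNCTIONS = {
    "+": operator.add,
    "-": operator.sub,
    "×": operator.mul,
    "÷": operator.truediv,
    "*": operator.pow,
}


def evaluate_right_to_left(tokens):
    """Evaluate an unparenthesized expression from right to left.

    `tokens` alternates numbers and function symbols, for example
    [3, "×", 4, "+", 5]. No function has priority over another: each
    function takes as its right argument the value of everything to
    its right.
    """
    result = tokens[-1]
    # Step leftwards over (left argument, function) pairs.
    for i in range(len(tokens) - 2, 0, -2):
        func = FUNCTIONS[tokens[i]]
        left = tokens[i - 1]
        result = func(left, result)
    return result


print(evaluate_right_to_left([3, "×", 4, "+", 5]))  # 27, i.e. 3×(4+5)
print(evaluate_right_to_left([8, "-", 3, "-", 2]))  # 7, i.e. 8-(3-2)
```

Under the conventional hierarchy the first expression would give 17 and the second 3; the right-to-left rule needs no hierarchy at all, only the position of each function in the sentence.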
The polynomial in X with coefficients 3 1 2 4, for example, can be written without parentheses as +/3 1 2 4 × X * 0 1 2 3. Moreover, Horner’s expression for the efficient evaluation of this same polynomial can also be written without parentheses as follows:

      3+X×1+X×2+X×4

Analogies with natural language. The arithmetic expression 3×4 can be viewed as an order to do something, that is, multiply the arguments 3 and 4. Similarly, a more complex expression can be viewed as an order to perform a number of operations in a specified order. In this sense, an arithmetic expression is an imperative sentence, and a function corresponds to an imperative verb in natural language. Indeed, the word “function” derives from the Latin verb “fungi” meaning “to perform”.

This view of a function does not conflict with the usual mathematical definition as a specified correspondence between the elements of domain and range, but rather supplements this static view with a dynamic view of a function as that which produces the corresponding value for any specified element of the domain.

If functions correspond to imperative verbs, then their arguments (the things upon which they act) correspond to nouns. In fact, the word “argument” has (or at least had) the meaning topic, theme, or subject. Moreover, the positive integers, being the most concrete of arithmetical objects, may be said to correspond to proper nouns.

What are the roles of negative numbers, rational numbers, irrational numbers, and complex numbers? The subtraction function, introduced as an inverse to addition, yields positive integers in some cases but not in others, and negative numbers are introduced to refer to the results in these cases. In other words, a negative number refers to a process or the result of a process, and is therefore analogous to an abstract noun. For example, the abstract noun “justice” refers not to some concrete object (examples of which one may point to) but to a process or result of a process.
Similarly, rational and complex numbers refer to the results of processes: division and finding the zeros of polynomials, respectively.

A.3 Algebraic Notation

Names. An expression such as 3×X can be evaluated only if the variable X has been assigned an actual value. In one sense, therefore, a variable corresponds to a pronoun whose referent must be made clear before any sentence including it can be fully understood. In English the referent may be made clear by an explicit statement, but is more often made clear by indirection (e.g., “See the door. Close it.”), or by context.

In conventional algebra, the value assigned to a variable name is usually made clear informally by some statement such as “Let X have the value 6” or “Let X=6”. Since the equal symbol (that is, '=') is also used in other ways, it is better to avoid its use for this purpose and to use a distinct symbol as follows:

      X←6
      Y←3×4
      X+Y
18
      (X-3)×(X-5)
3

Assigning names to expressions. In the foregoing example, the expression (X-3)×(X-5) was written as an instruction to evaluate the expression for a particular value already assigned to X. One also writes the same expression for the quite different notion “Consider the expression (X-3)×(X-5) for any value which might later be assigned to the argument X.” This is a distinct notion which should be represented by distinct notation. The idea is to be able to refer to the expression and this can be done by assigning a name to it. The following notation serves:

      ∇ Z←G X
        Z←(X-3)×(X-5) ∇

The ∇’s indicate that the symbols between them define a function; the first line shows that the name of the function is G. The names X and Z are dummy names standing for the argument and result, and the second line shows how they are related. Following this definition, the name G may be used as a function. For example:

      G 6
3
      G 1 2 3 4 5 6 7
8 3 0 ¯1 0 3 8

Iterative functions can be defined with equal ease as shown in Chapter 12.

Form of names. If the variables occurring in algebraic sentences are viewed simply as names, it seems reasonable to employ names with some mnemonic significance as illustrated by the following sequence:

      LENGTH←6
      WIDTH←5
      AREA←LENGTH×WIDTH
      HEIGHT←4
      VOLUME←AREA×HEIGHT

This is not done in conventional notation, apparently because it is ruled out by the convention that the multiplication sign may be elided; that is, AREA cannot be used as a name because it would be interpreted as A×R×E×A. This same convention leads to other anomalies as well, some of which were discussed in the section on arithmetic notation. The proposal made there (i.e., that the multiplication sign cannot be elided) will permit variable names of any length.

A.4 Analogies with the Teaching of Natural Language

If one views the teaching of algebra as the teaching of a language, it appears remarkable how little attention is given to the reading and writing of algebraic sentences, and how much attention is given to identities, that is, to the analysis of sentences with a view to determining other equivalent sentences; e.g., “Simplify the expression (X-4)×(X+4).” It is possible that this emphasis accounts for much of the difficulty in teaching algebra, and that the teaching and learning processes in natural languages may suggest a more effective approach.

In the learning of a native language one can distinguish the following major phases:
The same phases can be distinguished in the teaching of algebraic notation:
In learning a native language, a child spends many years in the informal and formal phases (both in and out of school) before facing the analytic phase. By this time she has easy familiarity with the purpose of a language and the meanings of sentences which might be analyzed and transformed. The situation is quite different in most conventional courses in algebra: very little time is spent in the formal phase (reading, writing and “understanding” formal algebraic sentences) before attacking identities (such as commutativity, associativity, distributivity, etc.). Indeed, students often do not realize that they might quickly check their work in “simplification” by substituting certain values for the variables occurring in the original and derived expressions and comparing the evaluated results to see if the expressions have the same “meaning”, at least for the chosen values of the variables.

It is interesting to speculate on what would happen if a native language were taught in an analogous way, that is, if children were forced to analyze sentences at a stage in their development when their grasp of the purpose and meaning of sentences was as shaky as the algebra student’s grasp of the purpose and meaning of algebraic sentences. Perhaps they would fail to learn to converse, just as many students fail to learn the much simpler task of reading.

Another interesting aspect of learning the nonanalytic aspects of a native language is that much (if not most) of the motivation comes not from an interest in language, but from the intrinsic interest of the material (in children’s stories, everyday dialogue, etc.) for which it is used. It is doubtful that the same is true in algebra; ruling out statements of an analytic nature (identities, etc.), how many “interesting” algebraic sentences does a student encounter?

The use of arrays can open up the possibility of much more interesting algebraic sentences.
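The quick check of a “simplification” by substitution, mentioned earlier in this section, becomes especially easy when a whole vector of trial values is used at once. The following Python sketch illustrates the idea for the expression (X-4)×(X+4); the function names, the candidate simplifications, and the choice of trial values are assumptions made for this example only. Agreement on the trial values shows that two sentences have the same “meaning” for those values; it does not by itself establish an identity.

```python
def original(x):
    """The expression to be simplified: (X-4)×(X+4)."""
    return (x - 4) * (x + 4)


def simplified(x):
    """A candidate simplification: X squared, less 16."""
    return x ** 2 - 16


def mistaken(x):
    """A deliberately wrong candidate (sign error), to show the check failing."""
    return x ** 2 + 16


def agree(f, g, values):
    """Evaluate both expressions at every trial value and compare the results."""
    return all(f(x) == g(x) for x in values)


TRIAL_VALUES = [1, 2, 3, 4, 5, 6, 7]

print(agree(original, simplified, TRIAL_VALUES))  # True
print(agree(original, mistaken, TRIAL_VALUES))    # False
```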
Sentences that use arrays can be both read (that is, evaluated) and written by students. For example, the statements:

      2*1 2 3 4 5
      2×1 2 3 4 5
      2÷1 2 3 4 5
      1 2 3 4 5÷2
      1 2 3 4 5*2
      1 2 3 4 5×5 4 3 2 1

produce interesting patterns and therefore have more intrinsic interest than similar expressions involving only single quantities. For example, the last expression can be construed as yielding a set of possible areas for a rectangle having a fixed perimeter of 12.

More interesting possibilities are opened up by certain simple extensions of the use of arrays. One example of such extensions will be treated here. This extension allows one to apply any dyadic function to two vectors A and B so as to obtain not simply the element-by-element product produced by the expression A×B but a table of all products produced by pairing each element of A with each element of B. For example:

      A←1 2 3
      B←2 3 5 7
      A∘.×B            A∘.+B            A∘.*B
 2  3  5  7       3  4  6  8         1    1    1    1
 4  6 10 14       4  5  7  9         4    8   32  128
 6  9 15 21       5  6  8 10         9   27  243 2187

If S←1 2 3 4 5 6 7, then the following expressions yield an addition table, a multiplication table, a subtraction table, a maximum table, an “equal” table, and a “greater than or equal” table:

      S∘.+S                     S∘.⌈S
 2  3  4  5  6  7  8      1 2 3 4 5 6 7
 3  4  5  6  7  8  9      2 2 3 4 5 6 7
 4  5  6  7  8  9 10      3 3 3 4 5 6 7
 5  6  7  8  9 10 11      4 4 4 4 5 6 7
 6  7  8  9 10 11 12      5 5 5 5 5 6 7
 7  8  9 10 11 12 13      6 6 6 6 6 6 7
 8  9 10 11 12 13 14      7 7 7 7 7 7 7

      S∘.×S                     S∘.=S
 1  2  3  4  5  6  7      1 0 0 0 0 0 0
 2  4  6  8 10 12 14      0 1 0 0 0 0 0
 3  6  9 12 15 18 21      0 0 1 0 0 0 0
 4  8 12 16 20 24 28      0 0 0 1 0 0 0
 5 10 15 20 25 30 35      0 0 0 0 1 0 0
 6 12 18 24 30 36 42      0 0 0 0 0 1 0
 7 14 21 28 35 42 49      0 0 0 0 0 0 1

      S∘.-S                     S∘.≥S
 0 ¯1 ¯2 ¯3 ¯4 ¯5 ¯6      1 0 0 0 0 0 0
 1  0 ¯1 ¯2 ¯3 ¯4 ¯5      1 1 0 0 0 0 0
 2  1  0 ¯1 ¯2 ¯3 ¯4      1 1 1 0 0 0 0
 3  2  1  0 ¯1 ¯2 ¯3      1 1 1 1 0 0 0
 4  3  2  1  0 ¯1 ¯2      1 1 1 1 1 0 0
 5  4  3  2  1  0 ¯1      1 1 1 1 1 1 0
 6  5  4  3  2  1  0      1 1 1 1 1 1 1

Moreover, the graph of a function can be produced as an “equal” table as follows. First recall the function G defined earlier:

      ∇ Z←G X
        Z←(X-3)×(X-5) ∇
      G S
8 3 0 ¯1 0 3 8

The range of the function for this set of arguments is from 8 down to ¯1, and the elements of this range are all contained in the following vector:

      R←8 7 6 5 4 3 2 1 0 ¯1

Consequently, the “equal” table R∘.=G S produces a rough graph of the function (represented by 1’s) as follows:

      R∘.=G S
1 0 0 0 0 0 1
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 1 0 0 0 1 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 1 0 1 0 0
0 0 0 1 0 0 0

A.5 A Program for Elementary Algebra

The foregoing analysis suggests the development of an algebra curriculum with the following characteristics:
Such an approach has been adopted in the present text, where it has been carried through as far as the treatment of polynomials and of linear functions and linear equations. The extension to further work in polynomials, to slopes and derivatives, and to the circular and hyperbolic functions is carried forward in Chapters 4-8 of Iverson [3].

It must be emphasized that the proposed notation, though simple, is not limited in application to elementary algebra. A glance at the bibliography of Rault and Demars [4] will give some idea of the wide range of applicability.

The role of the computer. Because the proposed notation is simple and systematic it can be executed by automatic computers and has been made available on a number of time-shared computer terminal systems. The most widely used of these is described in Falkoff and Iverson [5]. It is important to note that the notation is executed directly, and the user need learn nothing about the computer itself. In fact, each of the examples in this appendix is shown exactly as it would be typed on a computer terminal keyboard.

The computer can obviously be useful in cases where a good deal of tedious computation is required, but it can be useful in other ways as well. For example, it can be used by a student to explore the behaviour of functions and discover their properties. To do this a student will simply enter expressions which apply the functions to various arguments. If the terminal is equipped with a display device, then such exploration can even be done collectively by an entire class. This and other ways of using the computer are discussed by Berry et al. [6] and in Appendix C.

References
First published as Appendix A, Algebra: An Algorithmic Treatment, Addison-Wesley, 1972; preliminary edition entitled Elementary Algebra, IBM, 1971.
