Algebra as a Language

Kenneth E. Iverson

A.1 Introduction

Although few mathematicians would quarrel with the proposition that the algebraic notation taught in high school is a language (and indeed the primary language of mathematics), little attention has been paid to the possible implications of such a view of algebra. This paper adopts this point of view to illuminate the inconsistencies and deficiencies of conventional notation and to explore the implications of analogies between the teaching of natural languages and the teaching of algebra. Based on this analysis it presents a simple and consistent algebraic notation, illustrates its power in the exposition of some familiar topics in algebra, and proposes a basis for an introductory course in algebra. Moreover, it shows how a computer can, if desired, be used in the teaching process, since the language proposed is directly usable on a computer terminal.

A.2 Arithmetic Notation

We will first discuss the notation of arithmetic, i.e., that part of algebraic notation which does not involve the use of variables. For example, the expressions 3-4 and (3+4)-(5+6) are arithmetic expressions, but the expressions 3-X and (X+4)-(Y+6) are not. We will now explore the anomalies of arithmetic notation and the modifications needed to remove them.

Functions and symbols for functions. The importance of introducing the concept of “function” rather early in the mathematical curriculum is now widely recognized. Nevertheless, those functions which the student encounters first are usually referred to not as “functions” but as “operators”. For example, absolute value (|-3|) and arithmetic negation (-3) are usually referred to as operators. In fact, most of the functions which are so fundamental and so widely used that they have been assigned some graphic symbol are commonly called operators (particularly those functions such as plus and times which apply to two arguments), whereas the less common functions which are usually referred to by writing out their names (e.g. Sin, Cos, Factorial) are called functions.

This practice of referring to the most common and most elementary functions as operators is surely an unnecessary obstacle to the understanding of functions when that term is first applied to the more complex functions encountered. For this reason the term “function” will be used here for all functions regardless of the choice of symbols used to represent them.

The functions of elementary algebra are of two types, taking either one argument or two. Thus addition is a function of two arguments (denoted by X+Y) and negation is a function of one argument (denoted by -Y). It would seem both easy and reasonable to adopt one form for each type of function as suggested by the foregoing examples, that is, the symbol for a function of two arguments occurs between its arguments, and the symbol for a function of one argument occurs before its argument. Conventional notation displays considerable anarchy on this point:

	1.		Certain functions are denoted by any one of several symbols which are supposed to be synonymous but which are, however, used in subtly different ways. For example, in conventional algebra `X×Y` and `XY` both denote the product of `X` and `Y` . However, one would write either `3×Y` or `3X` or `X×3` or `3×4` , but would not likely accept `X3` as an expression for `X×3` , nor `3 4` as an expression for `3×4` . Similarly, `X÷Y` and `X/Y` are supposed to be synonymous, but in the sentence “Reduce `8/6` to lowest terms”, the symbol `/` does not stand for division.
	2.		The power function has no symbol, and is denoted by position only (the exponent is written as a superscript), as in `Xᴺ` . The same notation is often used to denote the `N`th element of a family or array `X` .
	3.		The remainder function (that is, the integer remainder of dividing `X` into `Y`) is used very early in arithmetic (e.g., in factoring) but is commonly not recognized as a function on par with addition, division, etc., nor assigned a symbol. Because the remainder function has no symbol and is commonly evaluated by the method of long division, there is a tendency to confuse it with division. This confusion is compounded by the fact that the term “quotient” itself is ambiguous, sometimes meaning the quotient and sometimes the integer part of the quotient.
	4.		The symbol for a function of one argument sometimes occurs before the argument (as in `-4`) but may also occur after it (as in `4!` for factorial `4`) or on both sides (as in `|X|` for absolute value of `X`).

Table A.1 shows a set of symbols which can be used in a simple consistent manner to denote the functions mentioned thus far, as well as a few other very useful basic functions such as maximum, minimum, integer part, reciprocal, and exponential. The table shows two uses for each symbol, one to denote a monadic function (i.e. a function of one argument), and one to denote a dyadic function (i.e. a function of two arguments). This is simply a systematic exploitation of the example set by the familiar use of the minus sign, either as a dyadic function (i.e., subtraction as in 4-3) or as a monadic function (i.e., negation as in -3). No function symbol is permitted to be elided; for example, X×Y may not be written as XY .

   Monadic form f b                            f   Dyadic form a f b
   Definition or example   Name                    Name        Definition or example

   +3 ↔ 0+3                Plus                +   Plus        2+3.2 ↔ 5.2
   -3 ↔ 0-3                Negative            -   Minus       2-3.2 ↔ ¯1.2
   ×3 ↔ (3>0)-(3<0)        Signum              ×   Times       2×3.2 ↔ 6.4
   ÷3 ↔ 1÷3                Reciprocal          ÷   Divide      2÷3.2 ↔ 0.625
   ⌈3.14 ↔ 4               Ceiling             ⌈   Maximum     3⌈7 ↔ 7
   ⌈¯3.14 ↔ ¯3
   ⌊3.14 ↔ 3               Floor               ⌊   Minimum     3⌊7 ↔ 3
   ⌊¯3.14 ↔ ¯4
   *3 ↔ (2.71828...)*3     Exponential         *   Power       2*3 ↔ 8
   ⍟*5 ↔ 5 ↔ *⍟5           Natural logarithm   ⍟   Logarithm   10⍟3 ↔ Log 3 base 10
                                                               10⍟3 ↔ (⍟3)÷⍟10
   |¯3.14 ↔ 3.14           Magnitude           |   Remainder   3|8 ↔ 2

Table A.1

A little experimentation with the notation of Table A.1 will show that it can be used to express clearly a number of matters which are awkward or impossible to express in conventional notation. For example, X÷Y is the quotient of X divided by Y ; either ⌊(X÷Y) or (X-(Y|X))÷Y yields the integer part of the quotient of X divided by Y ; and X⌈(-X) is equivalent to |X .
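
A short terminal session illustrates these three claims. It is only a sketch: the values 17 , 5 , and ¯7 are chosen here for illustration, and the raised minus sign used for the negative constant is the one introduced later in this section under Constants.

```apl
      17÷5
3.4
      ⌊(17÷5)
3
      (17-(5|17))÷5
3
      ¯7⌈(-¯7)
7
      |¯7
7
```

The second and third lines agree because 5|17 is the remainder 2 , so that 17-2 is an exact multiple of 5 .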

In conventional notation the symbols < , ≤ , = , ≥ , > , and ≠ are used to state relations among quantities; for example, the expression 3<4 asserts that 3 is less than 4 . It is more useful to employ them as symbols for dyadic functions defined to yield the value 1 if the indicated relation actually holds, and the value zero if it does not. Thus 3≤4 yields the value 1 , and 5+(3≤4) yields the value 6 .
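
The following lines, a sketch with arbitrarily chosen numbers, show relations used as ordinary dyadic functions whose results take part in further arithmetic:

```apl
      3≤4
1
      5+(3≤4)
6
      4<3
0
      10×(4=4)
10
```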

Arrays. The ability to refer to collections or arrays of items is an important element in any natural language and is equally important in mathematics. The notation of vector algebra embodies the use of arrays (vectors, matrices, 3-dimensional arrays, etc.) but in a manner which is difficult to learn and limited primarily to the treatment of linear functions. Arrays are not normally included in elementary algebra, probably because they are thought to be difficult to learn and not relevant to elementary topics.

A vector (that is, a 1-dimensional array) can be represented by a list of its elements (e.g., 1 3 5 7) and all functions can be assumed to be applied element-by-element. For example:

      1 2 3 4 × 4 3 2 1
4 6 6 4

Similarly:

      1 2 3 4 + 4 3 2 1
5 5 5 5
      ! 1 2 3 4
1 2 6 24
      1 2 3 4 * 2
1 4 9 16
      2 * 1 2 3 4
2 4 8 16

In addition to applying a function to each element of an array, it is also necessary to be able to apply some specified function to the collection itself. For example, “Take the sum of all elements”, or “Take the product of all elements”, or “Take the maximum of all elements”. This can be denoted as follows:
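
A minimal sketch of such reductions, written in the f/ form that appears in the later examples of this section (the vector 2 5 3 2 is chosen here only for illustration), gives the sum, the product, and the maximum in turn:

```apl
      +/2 5 3 2
12
      ×/2 5 3 2
60
      ⌈/2 5 3 2
5
```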

The rules for using such vectors are simple and obvious from the foregoing examples. Vectors are relevant to elementary mathematics in a variety of ways. For example:

1.		They can be used (as in the foregoing examples) to display the patterns produced by various functions when applied to certain patterns of arguments.
2.		They can be used to represent points in coordinate geometry. Thus `5 7 19` and `2 3 7` represent two points, `5 7 19 - 2 3 7` yields `3 4 12` , the displacement between them, and `(+/(5 7 19 - 2 3 7)*2)*.5` yields `13` , the distance between them.
3.		They can be used to represent rational numbers. Thus if `3 4` represents the fraction three-fourths, then `3 4×5 6` yields `15 24` , the product of the fractions represented by `3 4` and `5 6` . Moreover, `÷/3 4` and `÷/5 6` and `÷/15 24` yield the actual numbers represented.
4.		A polynomial can be represented by its vector of coefficients and vector of exponents. For example, the polynomial with coefficients `3 1 2 4` and exponents `0 1 2 3` can be evaluated for the argument `5` by the expression `+/3 1 2 4 × 5 * 0 1 2 3` , which yields `558` .
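
The coordinate and fraction examples in this list can be followed step by step at a terminal. The session below is a sketch using the same numbers; the intermediate lines and the final check that the product of the two fractions equals the fraction represented by `15 24` are added here for illustration, and the exact display of results may differ between systems.

```apl
      5 7 19-2 3 7
3 4 12
      (5 7 19-2 3 7)*2
9 16 144
      (+/(5 7 19-2 3 7)*2)*.5
13
      3 4×5 6
15 24
      ÷/15 24
0.625
      (÷/3 4)×÷/5 6
0.625
```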

Constants. Conventional notation provides means for writing any positive constant (e.g., 17 or 3.14) but there is no distinct notation for negative constants, since the symbol - occurring in a number like -35 is indistinguishable from the symbol for the negation function. Thus negative thirty-five is written as an expression, which is much as if we neglected to have symbols for five and zero because expressions for them could be written in a variety of ways such as 8-3 and 8-8 .

It seems advisable to follow Beberman [1] in using a raised minus sign to denote negative numbers. For example:

      3 - 5 4 3 2 1
¯2 ¯1 0 1 2
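
The distinction matters as soon as vectors are in use. In the sketch below (the numbers are illustrative), the negation function applies to the whole vector on its right, whereas the raised minus belongs to a single constant:

```apl
      -3 4
¯3 ¯4
      ¯3 4
¯3 4
      2-3 4
¯1 ¯2
      2 ¯3 4
2 ¯3 4
```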

Conventional notation also provides no convenient way to represent numbers which are easily expressed in expressions of the form 2.14×10⁸ or 3.265×10⁻⁹ . A useful practice widely used in computer languages is to replace the symbols ×10 by the symbol E (for exponent) as follows: 2.14E8 and 3.265E¯9 .

Order of execution. The order of execution in an algebraic expression is commonly specified by parentheses. The rules for parentheses are very simple, but the rules which apply in the absence of parentheses are complex and chaotic. They are based primarily on a hierarchy of functions (e.g., the power function is executed before multiplication, which is executed before addition) which has apparently arisen because of its convenience in writing polynomials.

Viewed as a matter of language, the only purpose of such rules is the potential economy in the use of parentheses and the consequent gain in readability of complex expressions. Economy and simplicity can be achieved by the following rule: parentheses are obeyed as usual and otherwise expressions are evaluated from right to left with all functions being treated equally. The advantages of this rule and the complexity and ambiguity of conventional rules are discussed in Berry [2], page 27 and in Iverson [3], Appendix A. Even polynomials can be conveniently written without parentheses if use is made of vectors. For example, the polynomial in X with coefficients 3 1 2 4 can be written without parentheses as +/3 1 2 4 × X * 0 1 2 3 . Moreover, Horner’s expression for the efficient evaluation of this same polynomial can also be written without parentheses as follows:

      3+X×1+X×2+X×4
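
To see the rule at work, the lines below (a sketch, with the value 5 substituted for X and the first two expressions chosen only for illustration) show that the addition on the right is executed before the multiplication unless parentheses say otherwise, and that the vector form and Horner's form of the polynomial agree:

```apl
      3×4+5
27
      (3×4)+5
17
      +/3 1 2 4×5*0 1 2 3
558
      3+5×1+5×2+5×4
558
```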

Analogies with natural language. The arithmetic expression 3×4 can be viewed as an order to do something, that is, multiply the arguments 3 and 4 . Similarly, a more complex expression can be viewed as an order to perform a number of operations in a specified order. In this sense, an arithmetic expression is an imperative sentence, and a function corresponds to an imperative verb in natural language. Indeed, the word “function” derives from the Latin verb “fungi” meaning “to perform”.

This view of a function does not conflict with the usual mathematical definition as a specified correspondence between the elements of domain and range, but rather supplements this static view with a dynamic view of a function as that which produces the corresponding value for any specified element of the domain.

If functions correspond to imperative verbs, then their arguments (the things upon which they act) correspond to nouns. In fact, the word “argument” has (or at least had) the meaning topic, theme, or subject. Moreover, the positive integers, being the most concrete of arithmetical objects, may be said to correspond to proper nouns.

What are the roles of negative numbers, rational numbers, irrational numbers, and complex numbers? The subtraction function, introduced as an inverse to addition, yields positive integers in some cases but not in others, and negative numbers are introduced to refer to the results in these cases. In other words, a negative number refers to a process or the result of a process, and is therefore analogous to an abstract noun. For example, the abstract noun “justice” refers not to some concrete object (examples of which one may point to) but to a process or result of a process. Similarly, rational and complex numbers refer to the results of processes: division and finding the zeros of polynomials, respectively.

A.3 Algebraic Notation

Names. An expression such as 3×X can be evaluated only if the variable X has been assigned an actual value. In one sense, therefore, a variable corresponds to a pronoun whose referent must be made clear before any sentence including it can be fully understood. In English the referent may be made clear by an explicit statement, but is more often made clear by indirection (e.g., “See the door. Close it.”), or by context.

In conventional algebra, the value assigned to a variable name is usually made clear informally by some statement such as “Let X have the value 6 ” or “Let X=6 ”. Since the equal symbol (that is, '=') is also used in other ways, it is better to avoid its use for this purpose and to use a distinct symbol as follows:

      X←6
      Y←3×4
      X+Y
18
      (X-3)×(X-5)
3

Assigning names to expressions. In the foregoing example, the expression (X-3)×(X-5) was written as an instruction to evaluate the expression for a particular value already assigned to X . One also writes the same expression for the quite different notion “Consider the expression (X-3)×(X-5) for any value which might later be assigned to the argument X .” This is a distinct notion which should be represented by distinct notation. The idea is to be able to refer to the expression and this can be done by assigning a name to it. The following notation serves:

      ∇ Z ← G X
      Z←(X-3)×(X-5)∇

The ∇’s indicate that the symbols between them define a function; the first line shows that the name of the function is G . The names X and Z are dummy names standing for the argument and result, and the second line shows how they are related.

Following this definition, the name G may be used as a function. For example:

      G 6
3
      G 1 2 3 4 5 6 7
8 3 0 ¯1 0 3 8
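
Once defined, G combines with the other functions like any primitive. As a sketch (these two expressions are added here for illustration, and ⌊/ is the minimum taken over all elements), the first line finds the least value of G over the arguments 1 through 7 , and the second marks with a 1 the arguments at which G is zero:

```apl
      ⌊/G 1 2 3 4 5 6 7
¯1
      0=G 1 2 3 4 5 6 7
0 0 1 0 1 0 0
```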

Iterative functions can be defined with equal ease as shown in Chapter 12.

Form of names. If the variables occurring in algebraic sentences are viewed simply as names, it seems reasonable to employ names with some mnemonic significance as illustrated by the following sequence:

      LENGTH←6
      WIDTH←5
      AREA←LENGTH×WIDTH
      HEIGHT←4
      VOLUME←AREA×HEIGHT

This is not done in conventional notation, apparently because it is ruled out by the convention that the multiplication sign may be elided; that is, AREA cannot be used as a name because it would be interpreted as A×R×E×A .

This same convention leads to other anomalies as well, some of which were discussed in the section on arithmetic notation. The proposal made there (i.e., that the multiplication sign cannot be elided) will permit variable names of any length.

A.4 Analogies with the Teaching of Natural Language

If one views the teaching of algebra as the teaching of a language, it appears remarkable how little attention is given to the reading and writing of algebraic sentences, and how much attention is given to identities, that is, to the analysis of sentences with a view to determining other equivalent sentences; e.g., “Simplify the expression (X-4) × (X+4) .” It is possible that this emphasis accounts for much of the difficulty in teaching algebra, and that the teaching and learning processes in natural languages may suggest a more effective approach.

In the learning of a native language one can distinguish the following major phases:

1.		An informal phase, in which the child learns to communicate in a combination of gestures, single words, etc., but with no attempt to form grammatical sentences.
2.		A formal phase, in which the child learns to communicate in formal sentences. This phase is essential because it is difficult or impossible to communicate complex matters with precision without imposing some formal structure on the language.
3.		An analytic phase, in which one learns to analyze sentences with a view to determining equivalent (and perhaps “simpler” or “more effective”) sentences. The extreme case of such analysis is Aristotelian Logic, which attempts a formal analysis of certain classes of sentences. More practical everyday cases occur every time one carefully reads a composition and suggests alternative sentences which convey the same meaning in a briefer or simpler form.

The same phases can be distinguished in the teaching of algebraic notation:

1. An informal phase in which one issues an instruction to add 2 and 3 in any way which will be understood. For example:

      2 + 3               Add 2 and 3

          2                   2
          3                  +3
        ---                 ---

      Add two and three

      Add // and ///

The form of the expression is unimportant, provided that the instruction is understood.

2. A formal phase in which one emphasizes proper sentence structure and would not accept expressions such as

                 2
           6 ×   3          or    6 × (add two and three)
               ---

in lieu of 6×(2+3) . Again, adherence to certain structural rules is necessary to permit the precise communication of complex matters.

3. An analytic phase in which one learns to analyze sentences with a view to establishing certain relations (usually identity) among them. Thus one learns not only that 3+4 is equal to 4+3 but that the sentences X+Y and Y+X are equivalent, that is, yield the same result whatever meanings are assigned to the pronouns X and Y .

In learning a native language, a child spends many years in the informal and formal phases (both in and out of school) before facing the analytic phase. By this time she has easy familiarity with the purpose of a language and the meanings of sentences which might be analyzed and transformed. The situation is quite different in most conventional courses in algebra: very little time is spent in the formal phase (reading, writing and “understanding” formal algebraic sentences) before attacking identities (such as commutativity, associativity, distributivity, etc.). Indeed, students often do not realize that they might quickly check their work in “simplification” by substituting certain values for the variables occurring in the original and derived expressions and comparing the evaluated results to see if the expressions have the same “meaning”, at least for the chosen values of the variables.
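
Such a check takes only a few lines at a terminal. The sketch below assumes that a student has proposed (X*2)-16 as the simplified form of (X-4)×(X+4) ; the trial values assigned to X are arbitrary, and agreement on them suggests, but does not prove, that the two expressions are equivalent.

```apl
      X←1 2 3 4 5
      (X-4)×(X+4)
¯15 ¯12 ¯7 0 9
      (X*2)-16
¯15 ¯12 ¯7 0 9
      ((X-4)×(X+4))=(X*2)-16
1 1 1 1 1
```

A single 0 in the last result would show at once that the proposed simplification is wrong for the corresponding value of X .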

It is interesting to speculate on what would happen if a native language were taught in an analogous way, that is, if children were forced to analyze sentences at a stage in their development when their grasp of the purpose and meaning of sentences were as shaky as the algebra student’s grasp of the purpose and meaning of algebraic sentences. Perhaps they would fail to learn to converse, just as many students fail to learn the much simpler task of reading.

Another interesting aspect of learning the non-analytic aspects of a native language is that much (if not most) of the motivation comes not from an interest in language, but from the intrinsic interest of the material (in children’s stories, everyday dialogue, etc.) for which it is used. It is doubtful that the same is true in algebra; ruling out statements of an analytic nature (identities, etc.), how many “interesting” algebraic sentences does a student encounter?

The use of arrays can open up the possibility of much more interesting algebraic sentences. This can apply both to sentences to be read (that is, evaluated) and written by students. For example, the statements:

      2*1 2 3 4 5
      2×1 2 3 4 5
      2÷1 2 3 4 5
      1 2 3 4 5÷2
      1 2 3 4 5*2
      1 2 3 4 5×5 4 3 2 1

produce interesting patterns and therefore have more intrinsic interest than similar expressions involving only single quantities. For example, the last expression can be construed as yielding a set of possible areas for a rectangle having a fixed perimeter of 12 .
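
The rectangle reading of the last expression can be checked directly. In the sketch below, the second and third lines are added here for illustration: the second confirms that each pair of sides gives a perimeter of 12 , and the third picks out the largest of the areas.

```apl
      1 2 3 4 5×5 4 3 2 1
5 8 9 8 5
      2×1 2 3 4 5+5 4 3 2 1
12 12 12 12 12
      ⌈/1 2 3 4 5×5 4 3 2 1
9
```

The largest area, 9 , belongs to the square with sides 3 and 3 .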

More interesting possibilities are opened up by certain simple extensions of the use of arrays. One example of such extensions will be treated here. This extension allows one to apply any dyadic function to two vectors A and B so as to obtain not simply the element-by-element product produced by the expression A×B but a table of all products produced by pairing each element of A with each element of B . For example:

      A←1 2 3
      B←2 3 5 7

      A∘.×B                A∘.+B              A∘.*B
   2 3  5  7            3 4 6  8           1  1   1    1 
   4 6 10 14            4 5 7  9           4  8  32  128 
   6 9 15 21            5 6 8 10           9 27 243 2187

If S←1 2 3 4 5 6 7 , then the following expressions yield an addition table, a multiplication table, a subtraction table, a maximum table, an “equal” table, and a “greater than or equal” table:

      S∘.+S                         S∘.⌈S
   2 3  4  5  6  7  8            1 2 3 4 5 6 7 
   3 4  5  6  7  8  9            2 2 3 4 5 6 7 
   4 5  6  7  8  9 10            3 3 3 4 5 6 7 
   5 6  7  8  9 10 11            4 4 4 4 5 6 7 
   6 7  8  9 10 11 12            5 5 5 5 5 6 7 
   7 8  9 10 11 12 13            6 6 6 6 6 6 7 
   8 9 10 11 12 13 14            7 7 7 7 7 7 7 

      S∘.×S                         S∘.=S
   1  2  3  4  5  6  7           1 0 0 0 0 0 0 
   2  4  6  8 10 12 14           0 1 0 0 0 0 0 
   3  6  9 12 15 18 21           0 0 1 0 0 0 0 
   4  8 12 16 20 24 28           0 0 0 1 0 0 0 
   5 10 15 20 25 30 35           0 0 0 0 1 0 0 
   6 12 18 24 30 36 42           0 0 0 0 0 1 0 
   7 14 21 28 35 42 49           0 0 0 0 0 0 1 

      S∘.-S                         S∘.≥S
   0 ¯1 ¯2 ¯3 ¯4 ¯5 ¯6           1 0 0 0 0 0 0 
   1  0 ¯1 ¯2 ¯3 ¯4 ¯5           1 1 0 0 0 0 0 
   2  1  0 ¯1 ¯2 ¯3 ¯4           1 1 1 0 0 0 0 
   3  2  1  0 ¯1 ¯2 ¯3           1 1 1 1 0 0 0 
   4  3  2  1  0 ¯1 ¯2           1 1 1 1 1 0 0 
   5  4  3  2  1  0 ¯1           1 1 1 1 1 1 0 
   6  5  4  3  2  1  0           1 1 1 1 1 1 1

Moreover, the graph of a function can be produced as an “equal” table as follows. First recall the function G defined earlier:

      ∇Z←G X
      Z←(X-3)×(X-5)∇

      G S
8 3 0 ¯1 0 3 8

The range of the function for this set of arguments is from 8 down to ¯1 , and the elements of this range are all contained in the following vector:

      R←8 7 6 5 4 3 2 1 0 ¯1

Consequently, the “equal” table R∘.=G S produces a rough graph of the function (represented by 1’s) as follows:

      R∘.=G S
1 0 0 0 0 0 1
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 1 0 0 0 1 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 1 0 1 0 0
0 0 0 1 0 0 0
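
The same construction works with other relations. As a sketch that goes one step beyond the example above, replacing = by ≤ marks every position at or below the graph, so that each column becomes a bar whose height shows the value of G for that argument ( R , S , and G are as already defined):

```apl
      R∘.≤G S
1 0 0 0 0 0 1
1 0 0 0 0 0 1
1 0 0 0 0 0 1
1 0 0 0 0 0 1
1 0 0 0 0 0 1
1 1 0 0 0 1 1
1 1 0 0 0 1 1
1 1 0 0 0 1 1
1 1 1 0 1 1 1
1 1 1 1 1 1 1
```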

A.5 A Program for Elementary Algebra

The foregoing analysis suggests the development of an algebra curriculum with the following characteristics:

	1.		The notation used is unambiguous, with simple and consistent rules of syntax, and with provision for the simple and direct use of arrays. Moreover, the notation is not taught as a separate matter, but is introduced as needed in conjunction with the concepts represented.
	2.		Heavy use is made of arrays to display mathematical properties of functions in terms of patterns observed in vectors and matrices (tables), and to make possible the reading, writing, and evaluation of a host of interesting algebraic sentences before approaching the analysis of sentences and the concomitant development of identities.

Such an approach has been adopted in the present text, where it has been carried through as far as the treatment of polynomials and of linear functions and linear equations. The extension to further work in polynomials, to slopes and derivatives, and to the circular and hyperbolic functions is carried forward in Chapters 4-8 of Iverson [3].

It must be emphasized that the proposed notation, though simple, is not limited in application to elementary algebra. A glance at the bibliography of Rault and Demars [4] will give some idea of the wide range of applicability.

The role of the computer. Because the proposed notation is simple and systematic it can be executed by automatic computers and has been made available on a number of time-shared computer terminal systems. The most widely used of these is described in Falkoff and Iverson [5]. It is important to note that the notation is executed directly, and the user need learn nothing about the computer itself. In fact, each of the examples in this appendix is shown exactly as it would be typed on a computer terminal keyboard.

The computer can obviously be useful in cases where a good deal of tedious computation is required, but it can be useful in other ways as well. For example, it can be used by a student to explore the behaviour of functions and discover their properties. To do this a student will simply enter expressions which apply the functions to various arguments. If the terminal is equipped with a display device, then such exploration can even be done collectively by an entire class. This and other ways of using the computer are discussed by Berry et al. [6] and in Appendix C.

References

1.		Beberman, M., and H. E. Vaughan, High School Mathematics Course 1, Heath, 1964.
2.		Berry, P. C., APL\360 Primer, IBM Corp., 1969.
3.		Iverson, K. E., Elementary Functions: An Algorithmic Treatment, Science Research Associates, 1966.
4.		Rault, J. C., and G. Demars, “Is APL Epidemic? Or a study of its growth through an extended bibliography”, Fourth International APL User’s Conference, Board of Education of the City of Atlanta, Georgia, 1972.
5.		Falkoff, A. D., and K. E. Iverson, APL Language, Form Number GC26-3847, IBM Corp.
6.		Berry, P. C., A. D. Falkoff, and K. E. Iverson, “Using the Computer to Compute: A Direct but Neglected Approach to Teaching Mathematics”, IFIP World Conference on Computer Education, Amsterdam, August 24-28, 1970.
7.		Iverson, K. E., Introducing APL to Teachers, IBM Philadelphia Scientific Center Technical Report No. 320-3014, IBM Corp., 1972.
8.		Berry, P. C., G. Bartoli, C. Dell’Aquila and V. Spadavecchia, APL and Insight: Using Functions to Represent Concepts in Teaching, IBM Philadelphia Scientific Center Technical Report No. 320-3009, IBM Corp., December 1971.
9.		Iverson, K. E., The Use of APL in Teaching, IBM Publication No. G320-0996, IBM Corp., 1969.

First published as Appendix A, Algebra: An Algorithmic Treatment, Addison-Wesley, 1972; preliminary edition entitled Elementary Algebra, IBM, 1971.

created:	2009-05-27 10:05
updated:	2014-05-28 14:15