﻿ TAO Axioms

TAO Axioms

Adám Brudzewsky
Jay Foad
Roger Hui

I. Introduction

This paper describes the extensions to monadic and in Dyalog APL version 17.0, under the rubric “total array ordering” or TAO. (Sinceis based onit too is similarly extended.)

The goal of TAO is to be able to sort any array. An “array” is an array whose items are numbers, characters, ⎕null , or other arrays. For now, the ordering is not total because ⎕or and namespaces and … are excluded, because for now ordering those other things would be contentious but of little incremental benefit.

TAO posits a dyadic function which indicates how two arrays compare to each other.

 ⍺ cmp ⍵  = ¯1 if ⍺ strictly precedes ⍵ 0 if ⍺ exactly matches ⍵ 1 if ⍺ strictly follows ⍵

An alternative is a boolean function le which indicates whetherprecedes or matches (is “less than or equal to” in the ordering). The “signum” result of cmp is handier in practice, easier to use in quicksort, for instance. In any case, cmp and le can be defined in terms of the other:

```   ⍺ cmp ⍵  ←→  (⍵ le ⍺) - (⍺ le ⍵)
⍺ le  ⍵  ←→  1 > ⍺ cmp ⍵

```

II. Axioms

0. Backward Compatible

If ⍋⍵ has a result in current Dyalog APL, then it has the same result under the TAO. This means, among other things, that:

 • TAO comparisons are exact, that is, as if ⎕ct←⎕dct←0 and ⎕fr←1287 . • If ⍺ and ⍵ are (scalar) characters, then they are compared according to their internal bit patterns as unsigned integers, ⎕ucs x in Unicode or ⎕av⍳x in Classic.

1. Itemwise in Ravel Order

Ifandhave the same shape and are non-empty, then

```   ⍺ cmp ⍵  ←→  ⊃ 0 ~⍨ , ⍺ cmp¨ ⍵
```

That is, the result of ⍺ cmp ⍵ is the first non-zero value in comparing corresponding items in ravel order. If the comparisons are all zeros, then the result is 0 .

 Notes: • The existing ⍋ is consistent with this axiom. • In practice, proceeding in ravel order, you’d stop comparing items as soon as you found a pair which do not match.

2. Complex Numbers

Ifandare (scalar) complex numbers, then

```   ⍺ cmp ⍵  ←→  (9 11○⍺) cmp (9 11○⍵)
```

That is, compare the real and imaginary parts.

 Notes: • Because of the itemwise axiom, the imaginary parts matter only if the real parts match. • In school it is taught that “the complex numbers can not be ordered”. What about that? The complete statement is “the complex numbers can not be ordered if field (arithmetic) properties are preserved”. A TAO does not (and can not) claim to order the complex numbers and preserve arithmetic properties.

Interlude

We have come to the end of the easy parts. The axioms listed so far are likely non-controversal. Additional axioms are required. We can expect them to be somewhat arbitrary, even more so than the axioms already listed.

 • ⎕null v numbers v characters • arrays with the same rank but different shapes • arrays with different ranks

3. Type: ⎕null v Number v Character

How do a ⎕null , a number, and a character compare? There are six simple choices; any of the six axioms, together with the other axioms, derive a TAO which is consistent and complete.

Dyalog APL implements the following choice: ⎕null precedes numbers, and numbers precede characters.

4. Shape: Dictionary Order

Ifand have the same rank and are not both empty, then

```   ⍺ cmp ⍵  ←→  (s pad ⍺) cmp (s pad ⍵) ⊣ s←(⍴⍺)⌈(⍴⍵)
```

Padandas in s↑x to the common maximal shape s but the fill is a hypothetical array β which precedes all other arrays, then compare.

T0.  Ifandhave the same rank but different shapes, and are not both empty, then

```   ⍺ cmp ⍵  ←→  ⊃ 0 ~⍨ ((m⍴⍺)cmp¨(m⍴⍵)), (⌽⍴⍺)cmp(⌽⍴⍵)
```

where
m ← ×/ (∨\ ⌽<\⌽ (⍴⍺)≠(⍴⍵)) / (⍴⍺)⌊(⍴⍵)

This can be seen as follows. In axiom 4, s pad ⍺ and s pad ⍵ are two arrays with the same shape. Comparing corresponding items in ravel order (axiom 1), we can stop no later than the first point where one item is β and the other is not. The number of items before that point is the value m calculated above, the product of the minimal shape elements including and after the rightmost different element in the shape.

If (m⍴⍺)cmp¨(m⍴⍵) has a non-zero element, the first such element is the result of ⍺ cmp ⍵ ; if it is all zeros, then the argument shorter in the rightmost different axis precedes, because that argument would have a β item and the other one not; that is, (⌽⍴⍺)cmp(⌽⍴⍵) . QED.

Example a:

```   ⍴⍺
1 2 3 5 7 11
⍴⍵
2 1 3 4 7 11
```

then at most 308=×/4 7 11 items need to be compared.

Example b:

```   ⍴⍺                 ⍴⍵
3 5                4 3
⍺                  ⍵
3 2 7 3 4          1 8 9
5 3 5 7 0          7 7 2
2 3 9 1 6          3 9 7
7 2 8
```

andcan be compared per axiom 4, with both arguments padded by β to the maximal shape s . In such comparison, at most m items need be compared, per T0.

```   ⊢ s←(⍴⍺)⌈(⍴⍵)
4 5

s pad ⍺            s pad ⍵
3 2 7 3 4          1 8 9 β β
5 3 5 7 0          7 7 2 β β
2 3 9 1 6          3 9 7 β β
β β β β β          7 2 8 β β

⊢ m ← ×/ (∨\ ⌽<\⌽ (⍴⍺)≠(⍴⍵)) / (⍴⍺)⌊(⍴⍵)
3
×/ ⊃⌽ (⍴⍺) (≠⊂⌊) (⍴⍵)
3
```

T1. If andhave the same rank and exactly one oforis empty, then

```   ⍺ cmp ⍵  ←→  (0∊⍴⍵)-(0∊⍴⍺)
```

Ifis empty and is not, s pad ⍺ is an array of β and s pad ⍵ is not, and soprecedes . Similarlyfollows ifis not empty butis.

 Notes: • The hypothetical array β is not necessarily a scalar, nor numeric, nor character. It is used as scaffolding or bookkeeping device to derive more convenient formulations which do not explicitly use β. Neither T0 nor T1 explicitly use β. • Why the “and are not both empty” caveat in the antecedent of axiom 4? If ⍺ and ⍵ have the same rank but are both empty, padding to the common maximal shape leads to two possibilities: a. The maximal shape s contains a zero. Padding produces two empty arrays with the same shape. The comparison remains unresolved: still comparing two empty arrays, but with possibly less information than before (about the shapes). b. The maximal shape s does not contain a zero. Padding produces two non-empty identical arrays, and comparing them gives a 0 result. For example: ``` (0 4⍴'a')cmp(4 0⍴0) ←→ (4 4⍴β)cmp(4 4⍴β) ←→ 0 ``` This is incorrect if the original arrays do not match. • In T0, it’d be incorrect to use (m⍴⍺)cmp(m⍴⍵) instead of (m⍴⍺)cmp¨(m⍴⍵) , because in the event of a tie the former would lead to the prototypes being compared. • In T0, the value m can also be computed as m←×/⊃⌽(⍴⍺)(≠⊂⌊)(⍴⍵) . • This axiom, together with the previous ones, dictate the following results: ``` 'short' cmp 'sesquipedalian' 1 1 2 3 cmp 1 2 3 ¯4 ¯5 ¯1 (x⍪8) cmp (x,8) ⊣ x←2 2⍴1 2 3 4 ¯1 ```

5. Empty v Empty

Ifandhave the same rank and are both empty, then

```   ⍺ cmp ⍵  ←→  (⍺⍴⍨1+⍴⍺) cmp (⍵⍴⍨1+⍴⍵)
```
 Notes: • This axiom dictates that if ⍺ and ⍵ are both empty, comparisons are first decided by prototypes, then by shapes. • If ⍺ and ⍵ are both empty and have the same shape, both axiom 1 and axiom 5 are applicable, and lead to the same results.

6. Rank: Visual Fidelity

It remains to resolve the case where andhave different ranks.

```   ⍺ cmp ⍵  ←→  x cmp ⍵
←→  ⍺ cmp y
```
if (,⍺)≡(,x) and the display of matches that of x (if the elements ofand x match and“looks like” x); similarly if (,⍵)≡(,y) and the display of matches that of y .

 Notes: • This axiom “justifies” how to compare arrays with different ranks. Alternatively, you can dispense with the visual fidelity mumbo jumbo and just say by fiat that ``` ⍺ cmp ⍵ ←→ ((⊂r⍴⍺),⍴⍴⍺) cmp ((⊂s⍴⍵),⍴⍴⍵) ``` where r←(⍴⍺),⍨(m-⍴⍴⍺)⍴1 and s←(⍴⍵),⍨(m-⍴⍴⍵)⍴1 with m being the larger of the ranks of ⍺ and ⍵ . That is, a smaller-rank array is extended with leading 1-axes, and a tie in the resultant comparison is resolved in favor of the smaller-rank array. • Using axiom 6, T1 can be simplified to say: If exactly one of ⍺ or ⍵ is empty, then ``` ⍺ cmp ⍵ ←→ (0∊⍴⍵)-(0∊⍴⍺) ``` • This axiom, together with the previous ones, dictate that ``` 'aardvark' cmp 'z' ¯1 1 2 3 cmp 999 ¯1 (2 4⍴⍳8) cmp 9 10 11 ¯1 ``` These would not be the results if a smaller-rank array always precedes a larger-rank one.

III. Model

The present model does not make the simplest use of the axioms, but instead exploits properties readily derivable from them. It arbitrarily chooses to make ⎕null precede numbers precede characters, and only models Unicode (non-Classic).

```cmp←{  ⍝ compare basic arrays, results in ¯1, 0, or 1
⎕ct←⎕dct←0
tp ←{(⍵≡⎕null)+(326≠t)×3-2|t←⎕dr ⍵}  ⍝ 1:⎕null 2:num 3:char 0:other
rk ←{⍵⍴⍨((⍺-⍴⍴⍵)⍴1),⍴⍵}
fnz←{⊃0~⍨,⍵}                         ⍝ first nonzero value
em ←{×/(∨\⌽<\⌽⍺≠⍵)/⍺⌊⍵}
nvn←{⎕fr←1287 ⋄ (9 11○⍤1 0⊢⍺)(>-<)(9 11○⍤1 0⊢⍵)}

⍺{11::0 ⋄ ⍺≡⍵}⍵:0                    ⍝ match (trap DECF v complex)
|c←(0∊⍴⍵)-(0∊⍴⍺):c                   ⍝ one is empty & the other not
(0∊⍴⍺)∧(0∊⍴⍵):(⍺⍴⍨1+⍴⍺)∇(⍵⍴⍨1+⍴⍵)    ⍝ both are empty
(⍴⍴⍺)≢(⍴⍴⍵): c+((⍴⍴⍺)∇(⍴⍴⍵))×0=c←(r rk ⍺)∇(r rk ⍵) ⊣ r←(⍴⍴⍺)⌈(⍴⍴⍵)
( ⍴⍺)≢( ⍴⍵): c+((⌽⍴⍺)∇(⌽⍴⍵))×0=c←fnz(m⍴⍺)∇¨(m⍴⍵)   ⊣ m←(⍴⍺)em(⍴⍵)

s←tp ⍺                               ⍝ simple arrays
t←tp ⍵
(s≠t)∧×s×t:×s-t                      ⍝ different simple types
1=s×t: 0                             ⍝ both ⎕null
4=s×t: fnz ⍺ nvn ⍵                   ⍝ both nums
9=s×t: fnz ×(⎕ucs ⍺)-(⎕ucs ⍵)        ⍝ both chars

|c←(⊃⍺)∇(⊃⍵):c ⋄ (1↓,⍺)∇(1↓,⍵)       ⍝ non-simple arrays
}

```

IV. Tests

Each assertion is listed with the key axioms used in its resolution, which may not be the ones used by the cmp model for that assertion.

```assert←{⍺←'assertion failure' ⋄ 0∊⍵:⍺ ⎕signal 8 ⋄ shy←0}

z←cmpQA
⍝ testing cmp

assert ¯1='a'cmp'b'                        ⍝ compatible
assert  0='abc'cmp'abc'                    ⍝ compatible
assert ¯1='ABC'cmp'abc'                    ⍝ compatible
assert ¯1='abc 'cmp'xyz'                   ⍝ shape
assert  1='abc 'cmp'abc'                   ⍝ shape
assert  1=('abc',⎕ucs 0)cmp'abc'           ⍝ shape
assert ¯1='abc'cmp'z'                      ⍝ rank; shape
assert ¯1=(1 3⍴'abc')cmp'xyz'              ⍝ rank

assert ¯1=3 cmp 4                          ⍝ compatible
assert  0=3 cmp 3                          ⍝ compatible
assert ¯1=3 cmp 3+⎕ct÷2                    ⍝ compatible
assert  1=1e308 cmp ¯1e308                 ⍝ compatible
assert ¯1=3j¯4 cmp 3j5                     ⍝ complex
assert ¯1=3 cmp 3j5                        ⍝ complex
assert  1=3 cmp 3j¯5                       ⍝ complex
assert  1=1e1000 cmp 1j1                   ⍝ complex

assert ¯1=(⊂'abc')cmp⊂'abx'                ⍝ itemwise
assert ¯1=(⊂'chthonic')cmp⊂'syzygy'        ⍝ itemwise; shape
assert ¯1=(⊂1 2 3 4)cmp⊂3 5 7 11           ⍝ itemwise
assert ¯1=(⊂1 2 3 4)cmp⊂3 5 7              ⍝ itemwise; shape

assert ¯1=3 cmp,3                          ⍝ rank
assert ¯1='abc'cmp 1 3⍴'abc'               ⍝ rank
assert ¯1=(⊂'ab')cmp 1 1 1⍴⊂'ab'           ⍝ rank; itemwise

assert ¯1=0 cmp'0'                         ⍝ type
assert ¯1=0 cmp(⎕ucs 0)                    ⍝ type
assert ¯1=3j4 cmp'a'                       ⍝ type
assert  1='xyz'cmp⊂'pqr'                   ⍝ rank; itemwise; shape
assert ¯1='abc'cmp⊂'pqr'                   ⍝ rank; itemwise; shape
assert ¯1='pqr'cmp⊂'pqr'                   ⍝ rank; itemwise; shape
assert  1='pqr'cmp⊂3 4⍴⍳12                 ⍝ rank; itemwise; type
assert ¯1=2 3 4 cmp⊂2 3 4⍴⎕d               ⍝ rank; itemwise; type

assert  0=(1 2,⎕null)cmp 1 2,⎕null         ⍝ itemwise; type
assert ¯1=(1 2,⎕null)cmp 1 2,¯2            ⍝ itemwise; type
assert ¯1=(1 2,⎕null)cmp 1 2,'a'           ⍝ itemwise; type
assert ¯1=1 2j3 cmp 1 2j3,⎕null            ⍝ shape
assert ¯1='hart'cmp'hart',⎕null            ⍝ shape
assert ¯1=(3⍴⎕null)cmp 4⍴⎕null             ⍝ shape
assert ¯1=(0⍴⎕null)cmp ⍬                   ⍝ empty; type
assert ¯1=(0⍴⎕null)cmp''                   ⍝ empty; type

assert ¯1=(,3)cmp,⊂,3                      ⍝ itemwise; rank
assert  1=(,4)cmp,⊂,3                      ⍝ itemwise; rank
assert ¯1=(,'a')cmp,⊂,'a'                  ⍝ itemwise; rank
assert  1=(,'b')cmp,⊂,'a'                  ⍝ itemwise; rank
assert ¯1=(,3)cmp,⊂,'3'                    ⍝ itemwise; type
assert  1=(,'z')cmp,⊂,0                    ⍝ itemwise; type

assert  1=((2 2⍴⍳4),¯1)cmp 3 2⍴⍳6          ⍝ shape
assert  1=((2 2⍴⍳4),99)cmp 3 2⍴⍳6          ⍝ shape

assert ¯1=⍬ cmp⌈/⍬                         ⍝ shape
assert ¯1=''cmp ⎕ucs 0                     ⍝ shape
assert ¯1=⍬ cmp,⊂⍬                         ⍝ shape
assert ¯1=''cmp⊂''                         ⍝ shape
assert ¯1=(0 4 5⍴0)cmp'a'                  ⍝ shape
assert ¯1=(4 0 5⍴0)cmp'a'                  ⍝ shape

assert ¯1=⍬ cmp''                          ⍝ empty
assert ¯1=⍬ cmp 0⍴⊂'abc'                   ⍝ empty
assert ¯1=(2 0⍴0)cmp 0 2⍴0                 ⍝ empty
assert ¯1=(2 0⍴0)cmp 0 2⍴'a'               ⍝ empty
assert  1=(2 0⍴'a')cmp 0 2⍴0               ⍝ empty
assert ¯1=(2 0⍴'a')cmp 0 2⍴'a'             ⍝ empty
assert ¯1=(2 0 0⍴0)cmp 0 0 2⍴0             ⍝ empty
assert ¯1=(2 0 0⍴0)cmp 0 0 2⍴'a'           ⍝ empty
assert  1=(2 0 0⍴'a')cmp 0 0 2⍴0           ⍝ empty
assert ¯1=(2 0 0⍴'a')cmp 0 0 2⍴'a'         ⍝ empty

assert  1=(0⍴⊂2 3 4⍴5)cmp 0⍴⊂2 3 2⍴5       ⍝ empty; shape
assert ¯1=(0⍴⊂2 3 4⍴5)cmp 0⍴⊂2 3 5⍴5       ⍝ empty; shape
assert  1=(0⍴⊂1 3⍴'a')cmp 0⍴⊂3⍴'a'         ⍝ empty; rank
assert ¯1=(0⍴⊂1 3⍴'a')cmp 0⍴⊂1 1 1 3⍴'a'   ⍝ empty; shape

z←1

```

V. Discussion

The best that one can expect for a set of axioms is that they are consistent, complete, and minimal. The present set is minimal because none of them can be derived from the others. They are complete because any two arrays (the arrays we care about) can then be compared. Are they consistent? We believe they are, but we don’t have a proof, and all it takes for non-consistency is one example which shows that the axioms give contradictory results or which contradicts deeply held intuitions about ordering.

Reasoning about the TAO is easier as axioms than as an algorithmic description of how to produce the result of cmp . Among other things, the axioms can be considered in any order, and they can be considered independently of each other.

As usual with axioms, properties which can be readily derived from them have been omitted. For example ⍺ cmp ⍵ ←→ (⊂⍺) cmp (⊂⍵) is seen to be true by the itemwise axiom. There is one non property which should be mentioned: It is tempting to think that ifandare vectors and x is a scalar, then

```   ⍺ cmp ⍵  ←→  (x,⍺) cmp (x,⍵)
```

However, prepending loses information: ⍬≢'' but (0,⍬)≡0,'' . This is a property of Dyalog arrays and a TAO can not compensate for that.

A TAO depends on the exact details of arrays. Therefore, the TAO derived from these axioms likely differs from the TAO in other dialects and languages.

Appendix. Theorems

T0.  Ifandhave the same rank but different shapes, and are not both empty, then

```   ⍺ cmp ⍵  ←→  ⊃ 0 ~⍨ ((m⍴⍺)cmp¨(m⍴⍵)), (⌽⍴⍺)cmp(⌽⍴⍵)
```

where
m ← ×/ (∨\ ⌽<\⌽ (⍴⍺)≠(⍴⍵)) / (⍴⍺)⌊(⍴⍵)

T1.  If exactly one oforis empty, then

```   ⍺ cmp ⍵  ←→  (0∊⍴⍵)-(0∊⍴⍺)

```

A “leaf” is a subarray which is simple (1≥|≡⍵) or empty. The function firstleaf←{⊃⍣((1=≡⍵)⍱0∊⍴⍵)⊢⍵}⍣≡ computes the first (leftmost) leaf.

T2.  If exactly one of oris simple, then

 ⍺ cmp ⍵ ←→  ⍺ cmp (firstleaf¨ ⍵) ⍺ is simple; ⍵ is not ←→  (firstleaf¨ ⍺) cmp ⍵ ⍵ is simple; ⍺ is not

Corollary. If exactly one oforis simple, then ⍺ cmp ⍵ obtains without recursion.

T3.  There is no largest array.

Corollary. There is no largest scalar.

Corollary. There is no largest array for a given shape.

T4.  There is no smallest array.

T5.   cmp and le in terms of :

```   ⍺ cmp ⍵  ←→  1-1⊃⍋⍺ ⍵ ⍺ ⊣ ⎕io←0
⍺ le  ⍵  ←→  (⍳2)≡⍋⍺ ⍵
```

Except: Ifandare scalars and one is a DECF and the other a complex number, then

```   ⍺ cmp ⍵  ←→  1-1⊃⍋(,⍺)(,⍵)(,⍺) ⊣ ⎕io←0
⍺ le  ⍵  ←→  (⍳2)≡⍋(,⍺)(,⍵)
```

 created: 2018-01-04 10:50 updated: 2018-06-22 13:25