Chapter 27: Representations and Conversions
In this chapter we look at various transformations of functions and data.
27.1 Classes and Types
Given an assignment, name =: something, then something is an expression denoting a noun or a verb or an adverb or a conjunction. That is, there are 4 classes to which something may belong.
There is a built-in verb 4!:0 which here we can call class.
class =: 4!:0
We can discover the class of something by applying class to the argument <'name'. For example,
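As a minimal sketch, suppose n is assigned a number (the value 6 is our choice for illustration) and v is a hypothetical name for a verb:

```j
   class =: 4!:0
   n =: 6
   class <'n'
0
   v =: +
   class <'v'
3
```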
The result of 0 for the class of n means that n is a noun. The cases are:
    0  noun
    1  adverb
    2  conjunction
    3  verb
and two more cases: the string 'n' is not a valid name, or n is valid as a name but no value is assigned to n.
   _2  invalid
   _1  unassigned
The argument of class identifies the object of interest by quoting its name to make a string, such as 'C'.
Why is the argument not simply the object? Because, by the very purpose of the class function, the object may be a verb, noun, adverb or conjunction, and an adverb or conjunction cannot be supplied as argument to any other function.
Why not? Suppose the object of interest is the conjunction C. No matter how class is defined, whether verb or adverb, any expression of the form (class C) or (C class) is a bident or a syntax error. In no case is function class applied to argument C. Hence the need to identify C by quoting its name.
A noun may be an array of integers, or of floating-point numbers, or of characters, and so on. The type of any array may be discovered by applying the built-in verb 3!:0, which here we can call type.
type =: 3!:0
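For instance, a small sketch with two arguments of our own choosing, a floating-point number and a character:

```j
   type =: 3!:0
   type 0.1
8
   type 'a'
2
```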
The result of 8 means floating-point and the result 2 means character. Possible cases for the result are (amongst others):
        1  boolean
        2  character
        4  integer
        8  floating point
       16  complex
       32  boxed
       64  extended integer
      128  rational
    65536  symbol
27.2 Execute
There is a built-in verb ". (doublequote dot, called "Execute"). Its argument is a character-string representing a valid J expression, and the result is the value of that expression.
   ". '1+2'
3
The string can represent an assignment, and the assignment is executed:
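A minimal sketch, where the name w is chosen only for illustration:

```j
   ". 'w =: 1 + 2'
3
   w
3
```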
If the string represents a verb or adverb or conjunction, the result is null, because Execute is itself a verb and therefore its results must be nouns. However we can successfully Execute assignments to get functions.
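As a sketch of this, we can Execute a string that assigns a verb to a name (v here is a hypothetical name) and then use the verb:

```j
   ". 'v =: +'      NB. the result of Execute is an empty noun
   2 v 3
5
```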
27.3 On-Screen Representations
When an expression is entered at the keyboard, a value is computed and displayed on-screen. Here we look at how values are represented in on-screen displays. For example, if we define a function foo:
foo =: +/ % #
and then view the definition of foo:
we see on the screen some representation of foo. What we see depends on which of several options is currently in effect for representing functions on-screen.
By default the current option is the "boxed representation", so we see above foo depicted graphically as a structure of boxes. Other options are available, described below. To select and make current an option for representing functions on-screen, enter one of the following expressions:
   (9!:3) 2  NB. boxed (default)
   (9!:3) 5  NB. linear
   (9!:3) 6  NB. parenthesized
   (9!:3) 4  NB. tree
   (9!:3) 1  NB. atomic
27.3.1 Linear Representation
If we choose the linear representation, and look at foo again:
   (9!:3) 5  NB. linear
   foo
+/ % #
we see foo in a form in which it could be typed in at the keyboard, that is, as an expression.
Notice that the linear form is equivalent to the original definition, but not necessarily textually identical: it tends to minimize parentheses.
   bar =: (+/) % #
   bar
+/ % #
Functions, that is, verbs, adverbs and conjunctions, are shown in the current representation. By contrast, nouns are always shown in the boxed representation, regardless of the current option. Even though linear is current, we see:
27.3.2 Parenthesized Representation
The parenthesized representation is like linear in showing a function as an expression. Unlike linear, the parenthesized form helpfully adds parentheses to make the logical structure of the expression more evident.
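A sketch of how this might be tried out, assuming zot is defined as a composition of three verbs f, g and h (our assumption, chosen for illustration; the three names need not have values for the display to work):

```j
   (9!:3) 6             NB. parenthesized
   zot =: f @: g @: h   NB. conjunctions group to the left: (f @: g) @: h
   zot                  NB. shown with parentheses around the f @: g part
```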
27.3.3 Tree Representation
Tree representation is another way of displaying structure graphically:
   (9!:3) 4  NB. tree
   zot
              +- f
       +- @: -+- g
-- @: -+- h
27.3.4 Atomic Representation
Before continuing, we return the current representation option to linear.
27.4 Representation Functions
Regardless of the current option for showing representations on-screen, any desired representation may be generated as a noun by applying a suitable built-in verb.
If y is a name with an assigned value, then a representation of y is a noun produced by applying one of the following verbs to the argument <'y'
   br =: 5!:2  NB. boxed
   lr =: 5!:5  NB. linear
   pr =: 5!:6  NB. parenthesized
   tr =: 5!:4  NB. tree
   ar =: 5!:1  NB. atomic
For example, the boxed and parenthesized forms of zot are shown by:
We can get various representations of a noun, for example the boxed and the linear:
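A small sketch for the linear case, assuming a noun N with the value 6 (our choice). The result is a character-string, not a number:

```j
   lr =: 5!:5
   N  =: 6
   lr <'N'
6
   3!:0 lr <'N'      NB. 2 means character
2
```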
Representations produced by 5!:n are themselves nouns. The linear form of verb foo is a character-string of length 6.
The 6 characters of s represent an expression denoting a verb. To capture the verb expressed by string s, we could prefix the string with characters to make an assignment, and Execute the assignment.
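The steps just described can be sketched as follows. The name avg is hypothetical, chosen so as not to disturb foo:

```j
   lr  =: 5!:5
   foo =: +/ % #
   s   =: lr <'foo'
   s
+/ % #
   # s
6
   ". 'avg =: ' , s     NB. Execute the string 'avg =: +/ % #'
   avg 1 2 3 4
2.5
```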
27.4.1 Atomic Representation
We saw in Chapter 10 and Chapter 14 that it is useful to be able to form sequences of functions. By this we mean, not trains of verbs, but gerunds. A gerund, regarded as a sequence of verbs, can for example be indexed to find a verb applicable in a particular case of the argument.
To be indexable, a sequence must be an array, a noun. Thus we are interested in transforming a verb into a noun representing that verb, and vice versa. A gerund is a list of such nouns, containing atomic representations. The atomic representation is suitable for this purpose because it has an inverse. None of the other representation functions have true inverses.
The atomic representation of anything is a single box with inner structure. For an example, suppose that h is a verb defined as a hook. (A hook is about the simplest example of a verb with non-trivial structure.)
h =: + %
Compare the boxed and the atomic representations of h:
The inner structure is an encoding which allows the verb to be recovered from the noun efficiently without reparsing the original definition. It mirrors the internal form in which a definition is stored. It is NOT meant as yet another graphic display of structure.
The encoding is described in the Dictionary. We will not go into much detail here. Very briefly, in this example we see that h is a hook (because 2 is an encoding of "hook") where the first verb is + and the second is %.
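Without relying on the on-screen display, we can probe the atomic representation of h directly. This is only a sketch; the name r is hypothetical:

```j
   ar   =: 5!:1
   type =: 3!:0
   h    =: + %
   r    =: ar <'h'
   type r          NB. 32 means boxed
32
   # $ r           NB. rank 0, that is, a single box
0
   0 {:: > r       NB. first item inside the box: the encoding of "hook"
2
```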
The next example shows that we can generate atomic representations of a noun, a verb, an adverb or a conjunction.
   N =: 6
   V =: h
   A =: /
   C =: &
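As a sketch, we can confirm the class of each of the four names and check that ar accepts each of them, each result being a single box:

```j
   class =: 4!:0
   ar    =: 5!:1
   h =: + %
   N =: 6
   V =: h
   A =: /
   C =: &
   class ;: 'N V A C'
0 3 1 2
   # (ar <'N') , (ar <'V') , (ar <'A') , (ar <'C')
4
```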
27.4.2 Inverse of Atomic Representation
The inverse of representation is sometimes called "abstraction" (in the sense that, for example, a number is an abstract mathematical object represented by a numeral). The inverse of atomic representation is 5!:0, which we can call ab.
ab =: 5!:0
ab is an adverb, because it must be able to generate any of noun, verb, adverb or conjunction. For example, the abstraction of the atomic representation of h is equal to h,
and similarly for an argument of any class, for example noun N or conjunction C.
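A minimal sketch for the noun and verb cases. The name k is hypothetical, standing for the verb recovered from the representation:

```j
   ar =: 5!:1
   ab =: 5!:0
   h  =: + %
   N  =: 6
   (ar <'N') ab
6
   k =: (ar <'h') ab
   k 4
4.25
   h 4
4.25
```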
27.4.3 Execute Revisited
Here is another example of the use of atomic representations. Recall that Execute evaluates strings expressing nouns but not verbs. Since Execute is itself a verb it cannot deliver verbs as its result.
To evaluate strings expressing values of any class we can define an adverb, say eval, which delivers its result by abstracting an atomic representation of it.
   eval =: 1 : 0
". 'w =. ' , u
(ar < 'w') ab
)
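A sketch of eval in use, assuming it behaves as in the J version used for this chapter. The name mean is hypothetical:

```j
   ar =: 5!:1
   ab =: 5!:0
   eval =: 1 : 0
". 'w =. ' , u
(ar < 'w') ab
)
   '1 + 2' eval
3
   mean =: '+/ % #' eval
   mean 1 2
1.5
```

Here the string argument is assigned to the local name w, so that the result can be of any class, not only a noun.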
27.4.4 The Tie Conjunction Revisited
Recall from Chapter 14 that we form gerunds with the Tie conjunction `. Its arguments can be two verbs.
G =: (+ %) ` h
Its result is a list of atomic representations. To demonstrate, we choose one, say the first in the list, and abstract the verb.
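This can be sketched as follows, where r and f are hypothetical names for the chosen representation and the abstracted verb:

```j
   ab =: 5!:0
   h  =: + %
   G  =: (+ %) ` h
   # G
2
   r =: 0 { G      NB. first atomic representation in the list
   f =: r ab       NB. abstract it to recover the verb
   f 4
4.25
```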
The example shows that Tie can take arguments of expressions denoting verbs. By contrast, the atomic representation function (ar or 5!:1) must take a boxed name to identify its argument.
Here is a conjunction T which, like Tie, can take verbs (not names) as arguments and produces atomic representations.
T =: 2 : '(ar <''u.'') , (ar <''v.'')'
27.5 Conversions of Data
Consider a graphics file holding an image in the "bitmap" format. Published descriptions of the bitmap format are something like this:
   Offset  Size  ...  Description
   0       2          The characters BM for bitmap
   2       4          The total size of the file
   :       :
   28      2          Color bits per pixel: 1, 4, 8 or 24
We see here the layout of the first few bytes in the file, described as characters, 16-bit numbers or 32-bit numbers. Such descriptions ultimately specify, in terms independent of any particular programming language, how strings of bits are to be interpreted. Data described in this way is called "binary" data.
Now we look at functions for converting between values in J arrays and binary forms of such values, with a view to handling files with binary data. Data files will be covered in Chapter 28.
(In the following it is asserted that a character occupies one byte and a floating point number occupies 8. That is, we assume J version 4.05 or similar, running on a PC.)
A J array, of floating-point numbers for example, is stored in the memory of the computer. Storage is required to hold information about the type, rank and shape of the array, together with storage for each number in the array. Each floating-point number in the array needs 8 bytes of storage.
There are built-in functions to convert a floating-point number to a character-string of length 8, and vice versa.
   cf8 =:  2 & (3!:5)  NB. float to 8 chars
   c8f =: _2 & (3!:5)  NB. 8 chars to float
In the following example, we see that the number n is floating-point, n is converted to give the string s which is of length 8, and s is converted back to give a floating-point number equal to n.
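A sketch of such an example, in which we take n to be 0.1:

```j
   cf8 =:  2 & (3!:5)
   c8f =: _2 & (3!:5)
   n =: 0.1
   3!:0 n        NB. 8 means floating-point
8
   s =: cf8 n
   # s
8
   c8f s
0.1
```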
Characters in the result s are mostly non-printable. We can inspect the characters by locating them in the ASCII character-set:
   a. i. s
154 153 153 153 153 153 185 63
Now consider converting arrays of numbers. A list of numbers is converted to a single string, and vice versa:
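For instance, a sketch with a list of three floating-point numbers of our own choosing. The string has 8 characters per number:

```j
   cf8 =:  2 & (3!:5)
   c8f =: _2 & (3!:5)
   a =: 1.5 2.5 3.5
   s =: cf8 a
   # s
24
   c8f s
1.5 2.5 3.5
```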
The monadic rank of cf8 is infinite: cf8 applies just once to its whole argument.
   RANKS =: 1 : 'u b. 0'
   cf8 RANKS
_ 0 _
but the argument must be a scalar or list, or else an error results.
A floating-point number is convertible to 8 characters. There is an option to convert a float to and from a shorter 4-character string, sacrificing precision for economy of storage.
   cf4 =:  1 & (3!:5)  NB. float to 4 chars
   c4f =: _1 & (3!:5)  NB. 4 chars to float
As we might expect, converting a float to 4 characters and back again can introduce a small error.
p =: 3.14159265358979323
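A sketch of the round trip, where q is a hypothetical name for the value recovered from the 4-character string. The two values are not equal, but they differ by less than one part in a million:

```j
   cf4 =:  1 & (3!:5)
   c4f =: _1 & (3!:5)
   p =: 3.14159265358979323
   # cf4 p
4
   q =: c4f cf4 p
   p = q
0
   1e_6 > | p - q
1
```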
A J integer needs 4 bytes of storage. There are functions to convert between J integers and 4-character strings.
   ci4 =:  2 & (3!:4)  NB. integer to 4 char
   c4i =: _2 & (3!:4)  NB. 4 char to integer
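As a sketch, we convert a list i of two integers (values chosen for illustration) to a string s and back again:

```j
   ci4 =:  2 & (3!:4)
   c4i =: _2 & (3!:4)
   i =: 1 _100
   s =: ci4 i
   # s
8
   c4i s
1 _100
```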
We see that the length of s is 8 because s represents two integers.
Suppose k is an integer and c is the conversion of k to 4 characters.
Since characters in c are mostly non-printable, we inspect them by viewing their locations in the ASCII alphabet. We see that the characters are the base-256 digits in the value of k, stored in c in the order least-significant first (on a PC).
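A sketch with k chosen as 256 + 65, assuming a little-endian machine such as a PC. Reversing the digits and evaluating them in base 256 recovers k:

```j
   ci4 =: 2 & (3!:4)
   k =: 256 + 65
   c =: ci4 k
   a. i. c
65 1 0 0
   256 #. |. a. i. c     NB. most-significant digit first, then base 256
321
```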
Integers in the range _32768 to 32767 can be converted to 2-character strings and vice versa.
   ci2 =:  1 & (3!:4)  NB. integer to 2 char
   c2i =: _1 & (3!:4)  NB. 2 char to int
Integers in the range 0 to 65535 can be converted to 2-character strings and vice versa. Such strings are described as "16bit unsigned".
   ui2 =: ci2          NB. integer to 2-char, unsigned
   u2i =: 0 & (3!:4)   NB. 2 char to integer, unsigned
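A sketch of the difference between the signed and unsigned readings of the same 2-character string, using 40000 (our choice) as a value that fits in 16 bits unsigned but not signed:

```j
   ci2 =:  1 & (3!:4)
   c2i =: _1 & (3!:4)
   ui2 =: ci2
   u2i =:  0 & (3!:4)
   s =: ui2 40000
   # s
2
   u2i s             NB. read as unsigned
40000
   c2i s             NB. read as signed: 40000 - 65536
_25536
```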
This is the end of Chapter 27
The examples in this chapter were executed using J version 601-o-beta.
This chapter last updated 27 Jun 2006.
Copyright © Roger Stokes 2006. This material may be freely reproduced, provided that this copyright notice is also reproduced.