
Chapter 27: Representations and Conversions

In this chapter we look at various transformations of functions and data.

27.1 Classes and Types

If we are transforming things into other things, it is useful to begin with functions which tell us what sort of thing we are dealing with.

27.1.1 Classes

Given an assignment, name =: something, then something is an expression denoting a noun or a verb or an adverb or a conjunction. That is, there are 4 classes to which something may belong.

There is a built-in verb 4!:0 which here we can call class.

   class =: 4!:0
We can discover the class of something by applying class to the argument <'name'. For example,

   n =: 6
   class < 'n'
0

The result of 0 for the class of n means that n is a noun. The cases are:

          0  noun
          1  adverb
          2  conjunction
          3  verb
and two more cases: the string 'n' is not a valid name, or n is valid as a name but no value is assigned to n.
         _2  invalid
         _1  unassigned
For example:

   C =: &
   class <'C'
2
   class <'yup'
_1
   class <'1+2'
_2

The argument of class identifies the object of interest by quoting its name to make a string, such as 'C'.

Why is the argument not simply the object? Because, by the very purpose of the class function, the object may be a verb, noun, adverb or conjunction, and an adverb or conjunction cannot be supplied as argument to any other function.

Why not? Suppose the object of interest is the conjunction C. No matter how class is defined, whether verb or adverb, any expression of the form (class C) or (C class) is a bident or a syntax error. In no case is function class applied to argument C. Hence the need to identify C by quoting its name.
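As a small illustration of the result codes listed above, class can be given several boxed names at once, and it returns one code per name. This is only a sketch: the names sq and nosuchname are invented for the example, and nosuchname is assumed to have no value assigned in the current session.

```j
   class =: 4!:0                       NB. as defined in the text
   n  =: 6                             NB. a noun
   sq =: *:                            NB. a verb (hypothetical name)
   class 'n' ; 'sq' ; 'nosuchname'     NB. 0 3 _1  (noun, verb, unassigned)
```

Because the argument is a list of quoted names, none of the named objects is itself passed to class, which is the point made in the preceding paragraphs.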

27.1.2 Types

A noun may be an array of integers, or of floating-point numbers, or of characters, and so on. The type of any array may be discovered by applying the built-in verb 3!:0, which here we can call type.
   type =: 3!:0
For example

   type 0.1
8
   type 'abc'
2

The result of 8 means floating-point and the result 2 means character. Possible cases for the result are (amongst others):

               1  boolean
               2  character  (that is, 8-bit characters)
               4  integer
               8  floating point
              16  complex
              32  boxed
              64  extended integer
             128  rational
           65536  symbol
          131072  wide character (16-bit)
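To make the table concrete, here is a short sketch applying type to one noun of each of several kinds. The particular values are arbitrary choices for illustration; the expected codes are shown in the comments.

```j
   type =: 3!:0           NB. as defined in the text
   type 1 0 1             NB. 1    boolean
   type 3 4 5             NB. 4    integer
   type 3 4 5 % 2         NB. 8    floating point (1.5 2 2.5)
   type 3j4               NB. 16   complex
   type 'abc' ; 4         NB. 32   boxed
   type 10x               NB. 64   extended integer
   type 1r3               NB. 128  rational
```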

27.2 Execute

There is a built-in verb ". (doublequote dot, called "Execute"). Its argument is a character-string representing a valid J expression, and the result is the value of that expression.
   ". '1+2'
3
The string can represent an assignment, and the assignment is executed:

   ". 'w =: 1 + 2'
3
   w
3

If the string represents a verb or adverb or conjunction, the result is null, because Execute is itself a verb and therefore its results must be nouns. However we can successfully Execute assignments to get functions.

   ". '+'
   ". 'f =: +'
   f
+
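A common use of Execute is to turn text holding numerals into numbers. The following sketch uses arbitrary strings chosen for illustration, and the name k is hypothetical.

```j
   ". '1 2 3'             NB. 1 2 3   (a list of three integers, not characters)
   +/ ". '1 2 3'          NB. 6
   ". 'k =: ' , '3 * 4'   NB. the sentence is built by joining strings, then executed
   k                      NB. 12
```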

27.3 On-Screen Representations

When an expression is entered at the keyboard, a value is computed and displayed on-screen. Here we look at how values are represented in on-screen displays. For example, if we define a function foo:
   foo =: +/ % #
and then view the definition of foo:
   foo
+-----+-+-+
|+-+-+|%|#|
||+|/|| | |
|+-+-+| | |
+-----+-+-+
we see on the screen some representation of foo. What we see depends on which of several options is currently in effect for representing functions on-screen.

By default the current option is the "boxed representation", so we see above foo depicted graphically as a structure of boxes. Other options are available, described below. To select and make current an option for representing functions on-screen, enter one of the following expressions:

            (9!:3) 2  NB. boxed (default)
            (9!:3) 5  NB. linear
            (9!:3) 6  NB. parenthesized
            (9!:3) 4  NB. tree
            (9!:3) 1  NB. atomic
The current option remains in effect until we choose a different option.

27.3.1 Linear Representation

If we choose the linear representation, and look at foo again:
   (9!:3) 5  NB. linear 

   foo
+/ % #
we see foo in a form in which it could be typed in at the keyboard, that is, as an expression.

Notice that the linear form is equivalent to the original definition, but not necessarily textually identical: it tends to minimize parentheses.

   bar =: (+/) % #
   
   bar
+/ % #
Functions, that is, verbs, adverbs and conjunctions, are shown in the current representation. By contrast, nouns are always shown in the boxed representation, regardless of the current option. Even though linear is current, we see:
   noun =: 'abc';'pqr'
   
   noun
+---+---+
|abc|pqr|
+---+---+

27.3.2 Parenthesized

The parenthesized representation is like linear in showing a function as an expression. Unlike linear, the parenthesized form helpfully adds parentheses to make the logical structure of the expression more evident.
   (9!:3) 6  NB. parenthesized

   zot =: f @: g @: h
   
   zot
(f@:g)@:h

27.3.3 Tree Representation

Tree representation is another way of displaying structure graphically:
   (9!:3) 4  NB. tree

   zot
              +- f
       +- @: -+- g
-- @: -+- h       
   

27.3.4 Atomic Representation

See Section 27.4.1 below.

Before continuing, we return the current representation option to linear.

   (9!:3) 5
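When experimenting with display options it can be convenient to remember the option in effect and restore it afterwards. The sketch below assumes the companion foreign 9!:2, which queries the current setting (the text itself uses only 9!:3); the name saved is hypothetical.

```j
   saved =: 9!:2 ''       NB. query the option(s) currently in effect
   (9!:3) 6               NB. switch to parenthesized for a while
   (9!:3) saved           NB. restore whatever was in effect before
```

After the last line the display option is whatever it was before the sketch was run.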

27.4 Representation Functions

Regardless of the current option for showing representations on-screen, any desired representation may be generated as a noun by applying a suitable built-in verb.

If y is a name with an assigned value, then a representation of y is a noun produced by applying one of the following verbs to the argument <'y'

   br =:  5!:2    NB. boxed 
   lr =:  5!:5    NB. linear
   pr =:  5!:6    NB. parenthesized
   tr =:  5!:4    NB. tree
   ar =:  5!:1    NB. atomic
For example, the boxed and parenthesized forms of zot are shown by:

   br < 'zot'
+--------+--+-+
|+-+--+-+|@:|h|
||f|@:|g||  | |
|+-+--+-+|  | |
+--------+--+-+
   pr < 'zot'
(f@:g)@:h

We can get various representations of a noun, for example the boxed and the linear:

   br <'noun'
+---+---+
|abc|pqr|
+---+---+
   lr <'noun'
<;._1 ' abc pqr'

Representations produced by 5!:n are themselves nouns. The linear form of verb foo is a character-string of length 6.

   foo
+/ % #
   s =: lr <'foo'
   s
+/ % #
   $ s
6

The 6 characters of s represent an expression denoting a verb. To capture the verb expressed by string s, we could prefix the string with characters to make an assignment, and Execute the assignment.

   s
+/ % #
   $ s
6
   a =: 'f =: ' , s
   a
f =: +/ % #
   ". a
   f 1 2
1.5
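For nouns, the linear representation is a string that Execute can evaluate directly, so a noun can be taken to text and back. The sketch below checks the round trip with Match (-:). The table t is an arbitrary example value and not part of the text above.

```j
   lr =: 5!:5                       NB. as defined in the text
   noun =: 'abc';'pqr'
   noun -: ". lr <'noun'            NB. 1   the executed string matches the noun
   t =: 2 3 $ 10 20 30 40 50 60     NB. hypothetical example table
   t -: ". lr <'t'                  NB. 1
```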

27.4.1 Atomic Representation

We saw in Chapter 10 and Chapter 14 that it is useful to be able to form sequences of functions. By this we mean, not trains of verbs, but gerunds. A gerund, regarded as a sequence of verbs, can for example be indexed to find a verb applicable in a particular case of the argument.

To be indexable, a sequence must be an array, a noun. Thus we are interested in transforming a verb into a noun representing that verb, and vice versa. A gerund is a list of such nouns, containing atomic representations. The atomic representation is suitable for this purpose because it has an inverse. None of the other representation functions have true inverses.

The atomic representation of anything is a single box with inner structure. For an example, suppose that h is a verb defined as a hook. (A hook is about the simplest example of a verb with non-trivial structure.)

   h =: + %
compare the boxed and the atomic representations of h

   br <'h'
+-+-+
|+|%|
+-+-+
   ar < 'h'
+---------+
|+-+-----+|
||2|+-+-+||
|| ||+|%|||
|| |+-+-+||
|+-+-----+|
+---------+

The inner structure is an encoding which allows the verb to be recovered from the noun efficiently without reparsing the original definition. It mirrors the internal form in which a definition is stored. It is NOT meant as yet another graphic display of structure.

The encoding is described in the Dictionary. We will not go into much detail here. Very briefly, in this example we see that h is a hook (because 2 is an encoding of "hook") where the first verb is + and the second is %.

The next example shows that we can generate atomic representations of a noun, a verb, an adverb or a conjunction.

   N =: 6
   V =: h
   A =: /
   C =: &

   ar <'N'
+-----+
|+-+-+|
||0|6||
|+-+-+|
+-----+
   ar <'V'
+-+
|h|
+-+
   ar <'A'
+-+
|/|
+-+
   ar <'C'
+-+
|&|
+-+

27.4.2 Inverse of Atomic Representation

The inverse of representation is sometimes called "abstraction" (in the sense that, for example, a number is an abstract mathematical object represented by a numeral). The inverse of atomic representation is 5!:0, which we can call ab.
   ab =: 5!:0
ab is an adverb, because it must be able to generate any of noun, verb, adverb or conjunction. For example, we see that the abstraction of the atomic representation of h is equal to h

   h
+ %
   r =: ar < 'h'
   r
+---------+
|+-+-----+|
||2|+-+-+||
|| ||+|%|||
|| |+-+-+||
|+-+-----+|
+---------+
   r ab
+ %

and similarly for an argument of any class. For example, for noun N or conjunction C:

   N
6
   rN=: ar <'N'
   rN
+-----+
|+-+-+|
||0|6||
|+-+-+|
+-----+
   rN ab
6
   C
&
   (ar <'C') ab
&
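Since the abstraction of the atomic representation of a verb is again a verb, it can be applied to an argument straight away. A minimal sketch, in which the name mean and the argument are chosen only for illustration:

```j
   ar =: 5!:1                          NB. as defined in the text
   ab =: 5!:0
   mean =: +/ % #
   r =: ar <'mean'                     NB. a noun representing the verb
   r ab 1 2 3 4                        NB. 2.5   the recovered verb applied to a list
   (r ab 1 2 3 4) = mean 1 2 3 4       NB. 1
```

Here r ab is evaluated first, because an adverb binds to its left argument before the resulting verb is applied.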

27.4.3 Execute Revisited

Here is another example of the use of atomic representations. Recall that Execute evaluates strings expressing nouns but not verbs. Since Execute is itself a verb it cannot deliver verbs as its result.

   ". '1+2'
3
   ". '+'

To evaluate strings expressing values of any class we can define an adverb, eval say, which delivers its result by abstracting an atomic representation of it.

   eval =: 1 : 0
". 'w =. ' , u
(ar < 'w') ab
)
   

   '1+2' eval
3
   mean =: '+/ % #' eval
   mean
+/ % #
   mean 1 2
1.5

27.4.4 The Tie Conjunction Revisited

Recall from Chapter 14 that we form gerunds with the Tie conjunction `. Its arguments can be two verbs.
   G =: (+ %) ` h  
Its result is a list of atomic representations. To demonstrate, we choose one, say the first in the list, and abstract the verb.

   G
+---------+-+
|+-+-----+|h|
||2|+-+-+|| |
|| ||+|%||| |
|| |+-+-+|| |
|+-+-----+| |
+---------+-+
   r =: 0 { G
   r
+---------+
|+-+-----+|
||2|+-+-+||
|| ||+|%|||
|| |+-+-+||
|+-+-----+|
+---------+
   r ab
+ %

The example shows that Tie can take arguments of expressions denoting verbs. By contrast, the atomic representation function (ar or 5!:1) must take a boxed name to identify its argument.

Here is a conjunction T which, like Tie, can take verbs (not names) as arguments and produces atomic representations.

   T =: 2 : '(ar <''u.'') , (ar <''v.'')'
   

   (+ %) T h
+---------+-+
|+-+-----+|h|
||2|+-+-+|| |
|| ||+|%||| |
|| |+-+-+|| |
|+-+-----+| |
+---------+-+
   (+ %) ` h
+---------+-+
|+-+-----+|h|
||2|+-+-+|| |
|| ||+|%||| |
|| |+-+-+|| |
|+-+-----+| |
+---------+-+
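Because a gerund is an ordinary list of atomic representations, an item can be selected by index and abstracted back into a verb before being applied. The following sketch uses a hypothetical gerund of two primitive verbs, Double (+:) and Halve (-:), and redefines G only for this example.

```j
   ab =: 5!:0              NB. as defined in the text
   G  =: +: ` -:           NB. gerund: Double tied with Halve
   # G                     NB. 2    a list of two boxes
   (0 { G) ab 10           NB. 20   first item abstracted, then applied
   (1 { G) ab 10           NB. 5    second item abstracted, then applied
```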

27.5 Conversions of Data

Consider a graphics file holding an image in the "bitmap" format. Published descriptions of the bitmap format are something like this:
         Offset  Size  ...   Description
       
          0       2          The characters BM for bitmap
          2       4          The total size of the file
                                   :
                                   :
         28       2          Color bits per pixel 1 4 8 or 24
We see here the layout of the first few bytes in the file, described as characters, 16-bit numbers or 32-bit numbers. Such descriptions specify, in terms independent of any particular programming language, how strings of bits are to be interpreted. Data described in this way is called "binary" data.

Now we look at functions for converting between values in J arrays and binary forms of such values, with a view to handling files with binary data. Data files will be covered in Chapter 28.

(In the following it is asserted that a character occupies one byte and a floating point number occupies 8. That is, we assume J version 4.05 or similar, running on a PC.)

A J array, of floating-point numbers for example, is stored in the memory of the computer. Storage is required to hold information about the type, rank and shape of the array, together with storage for each number in the array. Each floating-point number in the array needs 8 bytes of storage.

There are built-in functions to convert a floating-point number to a character-string of length 8, and vice versa.

   cf8 =:   2 & (3!:5)   NB. float to 8 chars
   c8f =:  _2 & (3!:5)   NB. 8 chars to float 
In the following example, we see that the number n is floating-point, n is converted to give the string s which is of length 8, and s is converted back to give a floating-point number equal to n.

   n =: 0.1
   $ s =: cf8 n
8
   c8f s
0.1

Characters in the result s are mostly non-printable. We can inspect the characters by locating them in the ASCII character-set:

   a. i. s 
154 153 153 153 153 153 185 63
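Going the other way, a string can be built from byte values by indexing the alphabet a. and then converted to a number. This sketch assumes the same byte order as the PC used for the examples above (least-significant byte first); the name bytes is hypothetical.

```j
   c8f =: _2 & (3!:5)                            NB. as defined in the text
   bytes =: 154 153 153 153 153 153 185 63       NB. the values displayed above
   c8f bytes { a.                                NB. 0.1
```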
Now consider converting arrays of numbers. A list of numbers is converted to a single string, and vice versa:

   a =: 0.1 0.1
   $ s =: cf8 a
16
   c8f s
0.1 0.1

The monadic rank of cf8 is infinite: cf8 applies just once to its whole argument.

   RANKS =: 1 : 'u b. 0'
   cf8 RANKS
_ _ _
   
but the argument must be a scalar or list, or else an error results.

   b =: 2 2 $ a
   b
0.1 0.1
0.1 0.1
   $ w =: cf8 b
error
   $ w =: cf8"1 b
2 16

A floating-point number is convertible to 8 characters. There is an option to convert a float to and from a shorter 4-character string, sacrificing precision for economy of storage.

   cf4 =:  1 & (3!:5)   NB. float to 4 chars
   c4f =: _1 & (3!:5)   NB. 4 chars to float
   
As we might expect, converting a float to 4 characters and back again can introduce a small error.
   p =: 3.14159265358979323
   

   p
3.14159
   $ z =: cf4 p
4
   q =: c4f z
   q
3.14159
   p - q
_8.74228e_8
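Whether the round trip through 4 characters loses anything depends on the value. As a sketch, 0.5 is exactly representable in the shorter form and survives unchanged, while 0.1 is not and comes back slightly different. The two test values are arbitrary choices.

```j
   cf4 =:  1 & (3!:5)      NB. as defined in the text
   c4f =: _1 & (3!:5)
   0.5 = c4f cf4 0.5       NB. 1   no loss
   0.1 = c4f cf4 0.1       NB. 0   a small error, too large for = to tolerate
```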

A J integer needs 4 bytes of storage (on the 32-bit system assumed above). There are functions to convert between J integers and 4-character strings.

   ci4 =:  2 & (3!:4)  NB. integer to 4 char
   c4i =: _2 & (3!:4)  NB. 4 char  to integer
   

   i =: 1 _100
   $ s =: ci4 i
8
   c4i s
1 _100

We see that the length of s is 8 because s represents two integers.
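Returning to the bitmap layout at the start of this section, the first two fields (the characters BM at offset 0, then a 4-byte file size at offset 2) could be assembled and read back as sketched here. The size 1078 is a made-up value and hdr is a hypothetical name; byte order is assumed to be that of a PC.

```j
   ci4 =:  2 & (3!:4)            NB. as defined in the text
   c4i =: _2 & (3!:4)
   hdr =: 'BM' , ci4 1078        NB. 2 characters followed by a 4-byte integer
   $ hdr                         NB. 6
   2 {. hdr                      NB. BM      the field at offset 0
   c4i 2 }. hdr                  NB. 1078    the field at offset 2
```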

Suppose k is an integer and c is the conversion of k to 4 characters.

   k =: 256+65
   k
321
   $ c =: ci4 k
4

Since characters in c are mostly non-printable, we inspect them by viewing their locations in the ASCII alphabet. We see that the characters are the base-256 digits in the value of k, stored in c in the order least-significant first (on a PC).

   k
321
   a. i. c
65 1 0 0
   256 256 256 256 #: k
0 0 1 65
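The same observation can be run in reverse: reversing the byte values and evaluating them as base-256 digits recovers the integer. This is only a sketch for non-negative integers on a machine that stores the least-significant byte first; it is not a substitute for c4i.

```j
   ci4 =: 2 & (3!:4)         NB. as defined in the text
   c =: ci4 321
   a. i. c                   NB. 65 1 0 0   least-significant digit first
   256 #. |. a. i. c         NB. 321        digits reversed, then base-256 value
```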

Integers in the range _32768 to 32767 can be converted to 2-character strings and vice versa.

   ci2 =:  1 & (3!:4)  NB. integer to 2 char
   c2i =: _1 & (3!:4)  NB. 2 char  to int
   

   i
1 _100
   $ s =: ci2 i
4
   c2i s
1 _100

Integers in the range 0 to 65535 can be converted to 2-character strings and vice versa. Such strings are described as "16bit unsigned".

   ui2 =: ci2         NB. integer to 2-char,  unsigned  
   u2i =: 0 & (3!:4)  NB. 2 char  to integer, unsigned
   

   m =: 65535
   $ s =: ui2 m
2
   u2i s
65535
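The difference between the signed and unsigned readings lies only in how the two bytes are interpreted, not in the bytes themselves. As a sketch, the same 2-character string gives two different integers:

```j
   ci2 =:  1 & (3!:4)      NB. as defined in the text
   c2i =: _1 & (3!:4)
   u2i =:  0 & (3!:4)
   s =: ci2 65535
   a. i. s                 NB. 255 255   both bytes have all bits set
   c2i s                   NB. _1        read as 16-bit signed
   u2i s                   NB. 65535     read as 16-bit unsigned
```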

This is the end of Chapter 27




The examples in this chapter were executed using J version j701/beta/2010-11-24/22:45. This chapter last updated 08 Jan 2011
Copyright © Roger Stokes 2010. This material may be freely reproduced, provided that this copyright notice is also reproduced.

