11. Boxing (structures)

The nouns we have encountered so far have all had items with identical shapes, and with all atoms of the same type, either numeric or character. You may be afraid that such regular arrays are all that J supports and that you will have to forgo C structures; and you may wonder how J will fare in the rough-and-tumble of the real world where data is not so regular. This chapter will put those fears to rest.

I think we should get a formal understanding of boxing in J before we try to relate it to structures in C, because the ideas are just different enough to cause confusion. The box is an atomic data type in J, along with number and character. As with the other types, a single box is a scalar, with rank 0 and empty shape. Just as a numeric or character atom has a value, so a boxed atom has a value, called its contents. The box is special in that its contents can be an array while the box itself is an atom. The boxing protects the contents and allows them to be treated as an atom.

Arrays of boxes are allowed, and as always all the atoms in an array must have the same type: if any element is boxed, all must be boxed.

Various verbs create boxes. Monad < has infinite rank and exists for the sole purpose of boxing its operand: < y creates a box whose contents are y, for example:

   <1
+-+
|1|
+-+

   <1 2 3
+-----+
|1 2 3|
+-----+

   <'abc'
+---+
|abc|
+---+

When a box is displayed, the contents are surrounded by the boxing characters as seen in the examples.
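The claim that a box is an atom whatever its contents can be checked with monad $ (shape) and monad # (count). The following is a minimal sketch; the name b is hypothetical and chosen only for illustration.

```j
b =. < 1 2 3      NB. box a 3-item list
$ b               NB. empty list: an atom has no axes
# $ b             NB. 0, the rank of the box
# b               NB. 1, an atom counts as a single item
# $ < i. 2 3      NB. 0, boxing a 2-by-3 table still yields an atom
```

The shape of the contents plays no part in the shape of the box.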

Only certain primitives accept boxes as operands. Generally, you cannot perform arithmetic on boxes, but you can apply monad and dyad #, monad and dyad $, and other primitives that do not perform arithmetic. The significant exception to this rule is that you can use monad and dyad /: and \: to order arrays of boxes. Comparison for equality between two atoms is not strictly an arithmetic operation (you can compare two characters, or a character and a number, for equality, for example), and it is allowed on boxes, both explicitly using dyad = and dyad -: and implicitly using dyad i. and dyad e. . There is an arithmetic flavor to the operation, though: if the contents of corresponding components of the boxes are both numbers, tolerant comparison is used.
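As a rough illustration of the comparisons just described, the lines below apply dyads = , -: , e. and i. to boxes. The values are made up for the example, and the last line assumes the default comparison tolerance is in effect.

```j
(<1 2 3) = <1 2 3                  NB. 1: the contents are equal
(<1 2 3) = <1 2 4                  NB. 0: the contents differ
(<'abc') -: <'abc'                 NB. 1
(<5) e. (<1 2),(<5),(<'abc')       NB. 1: the box is found in the list
((<1 2),(<5),(<'abc')) i. <'abc'   NB. 2: index of the matching box
(<1) = <1+1e_15                    NB. 1: numeric contents are compared tolerantly
NB. By contrast, an arithmetic verb such as  1 + <2  signals a domain error.
```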

Most primitives that accept boxes as operands do not examine the contents of the boxes, but instead perform their operation on the box atoms themselves. Any deviation from this behavior will be noted in the definition of the verb (we have not encountered any yet). Examples:

   3 $ <'abc'
+---+---+---+
|abc|abc|abc|
+---+---+---+

The dyad $ was applied to the box, creating a list of identical boxes.

   (<1 2),(<5),(<'abc')
+---+-+---+
|1 2|5|abc|
+---+-+---+

The boxes were concatenated, resulting in a list of boxes. Note that the contents of the boxes do not have to have the same shape or type.

   1 0 1 # (<1 2),(<5),(<'abc')
+---+---+
|1 2|abc|
+---+---+

The selection performed by dyad # is performed on the boxes, not on their contents.

Since applying a verb may result in the addition of verb fills and framing fills, we need to meet the fill element used for boxed nouns. It is the noun a: , which is defined to be <0$0, that is, an atom which is a box containing an empty numeric list (note that this is not the same thing as a box containing an empty string or an empty list of boxes).

   3 {. <5
+-+++
|5|||
+-+++

a: was used for the fills added by overtaking from this boxed noun.
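The fill can also be checked directly rather than read off the display. This is a small sketch using only the definition of a: given above.

```j
a: -: < 0$0                  NB. 1: a: is a box holding an empty numeric list
$ a:                         NB. empty list: a: is an atom like any other box
(3 {. <5) -: (<5), a:, a:    NB. 1: overtaking padded the result with two a: atoms
```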

The contents of a box can be any noun, including a box or array of boxes:

   < < 2 3
+-----+
|+---+|
||2 3||
|+---+|
+-----+

   (<'abc'),(<<1 2)
+---+-----+
|abc|+---+|
|   ||1 2||
|   |+---+|
+---+-----+

The contents of a box can be recovered by opening the box with monad > . Monad > has rank 0 (since it operates only on boxes, which are atoms), and its result is the contents of the box:

   > < 'abc'
abc

The contents of the box is the character string.

   a =. (<1 2),(<<5),(<'abc')
   > 0 { a
1 2
   > 1 { a
+-+
|5|
+-+

Here we recover the contents of the selected box (which may be a box).
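When the contents are themselves boxed, each application of monad > removes one level. The sketch below repeats the definition of a so that it stands on its own.

```j
a =. (<1 2),(<<5),(<'abc')
(> 1 { a) -: <5    NB. 1: one open leaves the inner box
> > 1 { a          NB. 5: a second open reaches the number
> 2 { a            NB. abc
```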

   > (<1 2 3),(<4)
1 2 3
4 0 0

Remember that monad > has rank 0, so it is applied to each box and the results are collected using the frame. Here framing fills were added to the shorter result. 0 was used as the framing fill because the contents of the boxes were numeric.
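The framing fill follows the type of the contents: for character contents it is the space character. A brief sketch, with values invented for the example:

```j
$ > (<1 2 3),(<4)                        NB. 2 3: the shorter row was padded
$ > (<'abc'),(<'de')                     NB. 2 3
(> (<'abc'),(<'de')) -: 2 3 $ 'abcde '   NB. 1: 'de' was padded with a space
```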

   > a
|domain error
|   >a

Here the results on the different cells were of different types, so it was impossible to collect them into an array.

If y is unboxed, the result of > y is y .

Terminology

Before we go further with boxes, let's agree on some terminology. Because every atom in an array must have the same type, it is reasonable to speak of an array as being 'boxed' if its atoms are boxed, 'unboxed' otherwise (some writers use 'open' as a synonym for 'unboxed'). Trouble arises with a phrase like 'boxed list'. If I box a list (e.g. <1 2 3), does that give me a boxed list? If I have a list whose atoms are boxes (e.g. (<1),(<2)), is that also a boxed list? Unfortunately, writers on J have not agreed on terminology for these cases.

In this book, a boxed list will be a list that has been put into a box (it is therefore an atom), and a list of boxes is a list each atom of which is a box. Higher ranks are described similarly. If we say that a noun 'is boxed', that simply means that its atoms are boxes.
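The two terms can be told apart by shape. In this sketch the names bl and lb are hypothetical, standing for a boxed list and a list of boxes respectively.

```j
bl =. < 1 2 3      NB. a boxed list: one box holding a list
lb =. (<1),(<2)    NB. a list of boxes: two atoms, each a box
$ bl               NB. empty list: bl is an atom
$ lb               NB. 2
$ > bl             NB. 3: opening the box recovers the list
```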

Boxing As an Equivalent For Structures In C

A C struct corresponds to a list of boxes in J, where each box in the list corresponds to one structure element. Referencing a structure element in C corresponds to selecting and opening an item in J. For example,

struct {
    int f[3];
    char g[4];
    float h;
} s = { {1, 2, 3}, "abc", 1.0f };

is equivalent to

(<1 2 3) , (<'abc') , (<1.0)
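Selecting and opening an item might look like the following sketch. The name s is hypothetical, and the comments relate each line to the structure elements f, g and h only by position in the list.

```j
s =. (<1 2 3) , (<'abc') , (<1.0)
> 0 { s        NB. 1 2 3   (element f)
> 1 { s        NB. abc     (element g)
> 2 { s        NB. 1       (element h)
2 { > 0 { s    NB. 3       (item 2 of element f)
```

Unlike a C structure, the elements here are identified by index rather than by name.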

A C n-dimensional array of structures is equivalent to a J array of boxes of rank n+1. The last axis of the J array corresponds to the structure, and its length is the number of structure elements in the C structure.
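For n equal to 1, a two-element array of such structures could be sketched as a 2-by-3 array of boxes. The name recs and the values in it are invented for the example.

```j
recs =. 2 3 $ (<1 2 3),(<'abc'),(<1.0),(<4 5 6),(<'xyz'),(<2.5)
$ recs              NB. 2 3: two structures of three elements each
# $ recs            NB. 2: rank n+1 with n = 1
> 1 { 1 { recs      NB. xyz: element 1 of structure 1
> 2 { 0 { recs      NB. 1: element 2 of structure 0
```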

C has no exact equivalent for a single box.