20. Input And Output

Foreigns

In J, all file operations are handled by foreigns which are created by the foreign conjunction !: . The foreigns are a grab-bag of functions and you would do well to spend some time glancing over their descriptions in the Dictionary so you'll have an idea what is available. All the foreigns defined to date in J are of the form m!:n with numeric m and n, and they are organized with m identifying the class of the foreign. For example, the foreigns for operations on files have m=1.

File Operations 1!:n; Error Handling

The foreigns 1!:n perform file operations. For ease of use I have given several of them names in jforc.ijs . To see the details of what they do, read the Dictionary.

J's native file operations are file-oriented rather than stream-oriented; that is, they read an entire file at a time, and there is no notion of a 'current file pointer', or a newline character, or a readline verb that returns just one record. Such facilities are easy to write, but it is usually better to work with entire files at a time, just as for ordinary computation J works with whole arrays at a time. Read your file in, split it into records, and work on the list of records.

Monad 1!:1 (ReadFile) has rank 0 . 1!:1 y takes a filename as y and produces a result that is a list of characters containing the text of the file. In our examples, the filename is a boxed character string; y can be a file number but we won't get into that.

s =. 1!:1 <'system\packages\misc\jforc.ijs'

NB. File definitions for'J For C Programmers'

...

Dyad 1!:2 (WriteFile) has rank _ 0 . x 1!:2 y writes to the file y, using the character string x to provide the contents. Any existing file y is overwritten. y is a boxed character string or a file number.

('Test Data',CR,LF) 1!:2 <'c:\Temp\temp.dat'

Dyad 1!:3 (AppendFile) has rank _ 0 . x 1!:3 y appends the character string x to the end of file y (creating the file if it does not exist):

('Line 2',CR,LF) 1!:3 <'c:\Temp\temp.dat'

Monad 1!:55 (EraseFile) has rank 0 . 1!:55 y erases the file y with no prompting. Be careful: the power of J makes it possible for you to delete every file on your system with one sentence.
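
Putting these four foreigns together, a round trip might look like the following minimal sketch. The file name jforc_demo.tmp is a hypothetical name chosen for illustration (it is created in the current directory and erased at the end), and the cut <;._2 is just one way to split the text into records; it assumes every record ends with LF.

```j
NB. Sketch: write, append, read back, split into records, erase.
NB. The file name is hypothetical; pick one you don't mind losing.
fn =: <'jforc_demo.tmp'
('Line 1',CR,LF) 1!:2 fn       NB. create or overwrite the file
('Line 2',CR,LF) 1!:3 fn       NB. append a second record
s =: 1!:1 fn                   NB. the whole file as one character list
recs =: <;._2 s -. CR          NB. drop the CRs, then cut at each LF
1!:55 fn                       NB. erase the demo file
```

After this runs, recs is a list of two boxes, one per record, and the rest of the processing can work on the whole list at once.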

The special file number 2 sends output to the screen. The verb monad Display in jforc.ijs uses file number 2 to display its operand (which must be a character string) on the screen; you can put Display sentences in a long verb to see intermediate results. In most cases you will prefer to use printf, described below.

There are many other 1!:n foreigns to manage file locks, query directories, handle index read/write, and do other useful things. The Dictionary describes them.

Error Handling: u ::v, 13!:11, and 9!:8

When you deal with files you have to expect errors, which will look something like

s =. 1!:1 <'c:\xxx\yyy.dat'

|file name error

| s=. 1!:1<'c:\xxx\yyy.dat'

indicating that the file was not found.

You can set up your verbs to catch errors instead of interrupting with a message at the console. We will learn one way here and another later when we study control structures for J verbs. The compound u ::v has infinite rank, and can be applied either monadically or dyadically. u ::v executes the verb u first (monad or dyad, as appropriate); if u completes without error, its result becomes the result of u ::v; but if u encounters an error, v is then executed, and its result becomes the result of u ::v :

rerr =: 1!:1 :: (13!:11@(''"_))

rerr y will execute 1!:1 to read y; if that fails it will execute the foreign 13!:11 '' . 13!:11 '' produces as result the error number of the last error encountered. This means that if 1!:1 succeeds, the result of rerr will be a string, while if 1!:1 fails, the result will be a number:

rerr <'system\packages\misc\jforc.ijs'

NB. File definitions for'J For C Programmers'

...

rerr <'c:\xxx\yyy.dat'

You could use 3!:0 to see whether the result of rerr is a string or a number (2 means string, 1, 4, 8, or 16 means number). If you want to see the error message associated with an error number, there's a foreign 9!:8 to give you the list of errors:

25 { 9!:8 ''

+-----------------+
|file number error|
+-----------------+
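
Here is a minimal sketch of the type test with 3!:0 . The file name is hypothetical and is assumed not to exist, and the names r and ok are mine, introduced only for illustration:

```j
rerr =: 1!:1 :: (13!:11@(''"_))
NB. Hypothetical path, assumed to be missing so that the read fails
r =: rerr <'no_such_dir_for_demo/no_such_file.dat'
NB. ok is 1 if r is a character list (file contents), 0 if it is an error number
ok =: 2 = 3!:0 r
```

A verb that reads files can test ok and then either go on to process r or look r up in the list from 9!:8 .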

Treating a File as a Noun: Mapped Files

Rather than reading a file and assigning the data to a noun, you can leave the data in a file and create a noun that points to the data. The data will be read only when the noun is referred to. This is called mapping the file to a noun.

J's facilities for mapping files are described in the Lab 'Mapped Files'. A quick example of a mapped file is

require 'jmf'

JCHAR map_jmf_ 'text'; 'system\packages\misc\jforc.ijs'

after which the noun text contains the data in the file:

45 {. text

NB. File definitions for'J For C Programmers'

Moreover, if you assign a new value to text, the file will be modified.

If you are dealing with large files, especially read-only files or files that don't change much, mapping the files can give a huge performance improvement because you don't have to read the whole file. You must be very careful, though, if you map files to nouns, because there are unexpected side effects. If we executed

temp =: text

temp =: 'abcdefgh'

we would find that the value of text had changed too! (If you try this, use a file you don't mind losing.) The assignment of text to temp did not create a copy of the data of text, and when temp was modified, the change was applied to the shared data. If a file is mapped to a noun, you have to make sure that the noun, or any copy made of the noun in any verb you pass the noun to, is not changed unless you are prepared to have some or all of the other copies changed. This topic is examined in greater depth under 'Aliasing of Variables' in the chapter on DLLs.

If the faster execution is enticement enough for you to take that trouble, you can consult the Lab to get all the details.

Format Data For Printing: Monad And Dyad ":

Since J reads and writes all files as character strings, you will need to be able to convert numbers to their character representation. Dyad ": has rank 1 _ . y may be numeric or boxed, of any rank (we will not consider boxed y here). If x is a scalar it is replicated to the length of a 1-cell of y . If y is numeric, each 1-cell of y produces a character-list result, with each atom in the 1-cell of y being converted to a character string as described by the corresponding atom of x and with the results of adjacent 0-cells being run together; the results from all 1-cells are collected into an array using the frame with respect to 1-cells. That sounds like the description of a verb with rank 1; almost so but not quite, because ": looks at the entire y and adds extra spaces as necessary to make all the columns line up neatly.

A field descriptor in x needs two numbers, w and d . These are represented as the real and imaginary parts of a complex number, so that only a single scalar is needed to hold the field descriptor. The real part of the field descriptor is w, and it gives the width of the field in characters. If the result will not fit into the field, the entire field is filled with asterisks; to prevent this you may use a w of 0 which will cause the interpreter to make the field as wide as necessary. The imaginary part of the field descriptor is d, giving the number of digits following the decimal point. If either w or d is less than zero, the result is in exponential form (and the absolute value of w or d gives the corresponding width), otherwise the result is in standard form. Examples:

0 ": i. 4 4

  0  1  2  3
  4  5  6  7
  8  9 10 11
 12 13 14 15

Note that extra spaces were added to the one-digit values to make them line up. Note also that 'enough space' includes a leading space to keep adjacent results from running together.

1 ": i. 4 4

0123
4567
89**
****

When you specify a width, you are expected to mean it, and no extra space is added. Here two-digit results would not fit and were replaced with '*'.

0 0j3 0j_3 ": 100 %~ i. 3 3

 0 0.010 2.000e_2
 0 0.040 5.000e_2
 0 0.070 8.000e_2

A complex number has its parts separated by 'j' . Here the first field is an integer, the second has 3 decimal places, and the third is in exponential form.

When dyad ": is applied to an array, the result has the same rank as y . If you need to write that result to a file, you will need to use monad , to convert it to a list, possibly after adding CR or LF characters at the ends of lines.
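
As a sketch of that last step (the names t and flat are hypothetical), format a small table with an explicit field descriptor, append CR,LF to each row, and ravel the result into a list that is ready to be the x of 1!:2 :

```j
NB. 2-by-18 character table: three 6-character fields per row, 2 decimal places
t =: 6j2 ": 2 3 $ 1.5 2.25 3 4 5.5 6
NB. add CR,LF to the end of each row, then ravel into a 40-character list
flat =: , t ,"1 CR,LF
```

Because the width was given explicitly, every row has the same known length, which is handy if the file is to be read back as fixed-length records.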

Monad ":

Monad ": resembles dyad ": with a default x, except that x depends on the value of the corresponding atom of y . The simple description of monad ": is that it formats numbers the way you'd like to see them formatted. The full description is as follows: The precision d is given by a system variable that can be set by the foreign 9!:11 y (and queried by 9!:10 y; initially it is 6) ; as a quick override you may use the fit conjunction ":!.d to specify the value of d for a single use of ": . If an atom of y is an integer, a field descriptor of 0 is applied; if the atom has significance in the first 4 decimal places, a field descriptor of 0jd is applied; otherwise a field descriptor of 0j_3 is applied. Trailing zeros below the decimal point are omitted.

9!:10 ''

": 0 0.01 0.001 0.000015 0.12345678 2

0 0.01 0.001 1.5e_5 0.123457 2

In spite of all the detail I gave about the default formatting, in practice you just apply monad ": to any numeric operand and you get a good result. Monad ": accepts character arguments as well and leaves them unchanged:

; ":&.> 'Today ';'is ';2002 1 24

Today is 2002 1 24

We went inside the boxes with &.> and formatted each box's contents with monad ":; this made all the contents strings and we could run them together using monad ; .

Monad ": also converts boxed operands to the character arrays, including boxing characters, that we have seen in the display of boxed nouns.

Format an Array: 8!:n

To format an entire array, perhaps for display using the Grid Control, use the 8!:n family of foreigns. x 8!:0 y formats each atom of y into a boxed character string under control of the format phrases in x :

'3.0,m<$(>p<$>n<)>q< >c14.2' 8!:0 (10 500.32 ,: 22 _123456)

+---+--------------+
| 10|      $500.32 |
+---+--------------+
| 22| $(123,456.00)|
+---+--------------+

The format phrases are separated by commas; here there were two, the simple 3.0 and the more advanced m<$(>p<$>n<)>q< >c14.2 . The last thing in a format phrase is the field-width specifier w.d, which means what you expect. If w is 0, a computed width is used for the entire column, while if w. is omitted, the minimum width is used for each number individually.

Preceding the w.d field come the modifiers. They include the single letters c (put commas in the formatted number) and l (left justify), and a number of modifiers, called format strings, that have a character-string value. These are specified as m<string>, m(string), or m[string], where m is the one-character name of a format string--you pick a bracket character <, (, or [ that doesn't occur in your string.

The format-string modifiers replace the default format strings for the column, except for the s modifier described presently. The format strings have the names b, d, m, n, p, q, r, and s. The default values are s='e,.-*', b='0', d=infinity, m=(3{s, the minus sign, if n is omitted, '' if n is given), others='' . The s modifier is set differently: it is in the form s[abab...] where each a character is one of the default e,.-* and the following b character replaces that a character in the format string.

A number is classified according to its value; the value is converted to the value string v according to w.d; then the formatted result is created from a combination of the format strings and the value string, based on the value:

Infinite/indeterminate: d (default is _, __, or _. according to value)

Negative: mvn

Zero: b

Positive: pvq

The elements of s give characters to use for exponent, thousands separator, decimal point, minus sign, and insufficient field-width. If the formatted result calculated above is shorter than the field width, the field is first initialized with the r format string, repeated cyclically as necessary, and the formatted value is overlaid on that.

There are three verbs in the 8!:n family. All have infinite rank. 8!:0 produces a box for each atom of y, 8!:1 produces a box for each column of y, and 8!:2 produces an unboxed character array:

'3.0,14.2' 8!:0 (10 500.32 ,: 22 _123456)

+---+--------------+
| 10|        500.32|
+---+--------------+
| 22|    -123456.00|
+---+--------------+

'3.0,14.2' 8!:1 (10 500.32 ,: 22 _123456)

+---+--------------+
| 10|        500.32|
| 22|    -123456.00|
+---+--------------+

'3.0,14.2' 8!:2 (10 500.32 ,: 22 _123456)

 10        500.32
 22    -123456.00

Format Binary Data: 3!:n

If you need to write numbers to files in binary format, you need something that will coerce your numbers into character strings without changing the binary representation. J provides the foreigns 3!:4 and 3!:5 for this purpose. Each is a family of conversions, invoked as a dyad where the left operand selects the conversion to be performed.

2 (3!:4) y converts the numeric list y to a character string, representing each item of y by 4 characters whose values come from the binary values of the low-order 32 bits of y :

16b31424334

826426164

This is an integer whose value is 0x31424334. We can convert it to a character string:

2 (3!:4) 826426164

4CB1

The 4 characters (with values 0x34, 0x43, 0x42, 0x31) correspond to the bits of the number 826426164 in little-endian order.

We can use a. i. to look at the codes for each character in case they are not printable:

a. i. 2 (3!:4) 1000 100000

232 3 0 0 160 134 1 0

232 3 0 0 corresponds to 1000 and 160 134 1 0 to 100000 .

_2 (3!:4) y is the inverse of 2 (3!:4) y, converting a character string to integers:

_2 (3!:4) '4CB15CB1'

826426164 826426165

The other integer conversions are similar. 1 (3!:4) y converts the low-order 16 bits of each item of y to 2 characters, and _1 (3!:4) y converts back to integers, 2 characters per integer. 0 (3!:4) y is like _1 (3!:4) y but the integers are unsigned (i. e. in the range 0-65535).

The floating-point conversions are analogous. 2 (3!:5) y converts each item of y to 8 characters representing long floating-point form, and _2 (3!:5) y converts back; 1 (3!:5) y and _1 (3!:5) y use 4-character short floating-point form.
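
The following sketch runs each family through a round trip. All the names are hypothetical, chosen for illustration; the values are picked so that each one fits the width it is converted to.

```j
ints =: 1000 100000 _5
bytes =: 2 (3!:4) ints          NB. 4 characters per integer, 12 in all
back =: _2 (3!:4) bytes         NB. back to the original integers

w =: 1 (3!:4) _1                NB. 2 characters, both with code 255
signed =: _1 (3!:4) w           NB. the same 16 bits read as a signed value
unsigned =: 0 (3!:4) w          NB. the same 16 bits read as an unsigned value

fbytes =: 2 (3!:5) 1.5 _0.25    NB. 8 characters per number, 16 in all
fback =: _2 (3!:5) fbytes       NB. back to the original floating-point values
```

The signed and unsigned results differ only in how the high-order bit is interpreted, so the choice between _1 (3!:4) and 0 (3!:4) depends on what the program that wrote the file intended.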

printf, sprintf, and qprintf

When you need formatted lines for printing you may feel at home using printf and sprintf, which work like their C counterparts. printf displays a line, while sprintf produces a string result:

'The population of %s is %d\n' printf 'Raleigh';240000

The population of Raleigh is 240000

s =. 'The total of %j is %d.\n' sprintf 1 2 3;+/1 2 3

The total of 1 2 3 is 6.

You need to execute

load 'printf'

to get the printf verbs defined. J's printf contains a few features beyond C's such as the %j field type seen above. You should run the Lab 'Formatting with printf' for details.

One feature with no analogue in C is qprintf, which produces handy typeout for debugging:

a =. 3 4 5 [ b =. 'abc';'def';5 [ c =: i. 3 3

qprintf 'a b c '

a=3 4 5 b=
+---+---+-+
|abc|def|5|
+---+---+-+
0 1 2
3 4 5
6 7 8

qprintf is described in the 'Formatting with printf' lab.

Convert Character To Numeric: Dyad ".

Dyad ". has infinite rank. x ". y looks for numbers in the words of y, which must be a character array. Words representing valid numbers are converted to numbers; words not representing valid numbers are replaced by x . If 1-cells of y have differing numbers of numbers, x is used to pad short rows. 'Valid numbers' to dyad ". are anything you could type into J as a number, and more: the negative sign may be '-' rather than '_'; numbers may contain commas, which are ignored; and a number may start with a decimal point rather than requiring a digit before the decimal point. With the relaxed rules you can import numbers from a spreadsheet into J, using x&". to convert them to numbers:

999 ". '12 .5 3.6 -4 1,000 x25'

12 0.5 3.6 _4 1000 999

All legal numbers except for 'x25' .

If your character string is a valid J number, either monad ". or dyad ". can be used to convert it, but dyad ". is much faster.
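
To see the padding of short rows in action, here is a small sketch. The names rows and nums are hypothetical, and _1 is an arbitrary choice of x to mark anything that is missing or is not a number:

```j
NB. 3-row character table; > pads the shorter strings with spaces
rows =: > '1 2 3' ; '4 5' ; '6 oops 8'
NB. convert every row; _1 fills in for the short row and for the invalid word
nums =: _1 ". rows
```

The result is a 3-by-3 numeric table, so records split out of a file can be converted in one step once they have been opened into a character table.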