10. Continuing to Write in J

Now that we have a formidable battery of verbs at our command, let's continue writing programs in J. The data definitions are repeated here for convenience:

empno[nemp] (in J, just empno) - employee number for each member of the staff. The number of employees is nemp.

payrate[nemp] - number of dollars per hour the employee is paid

billrate[nemp] - number of dollars per hour the customer is billed for the services of this employee

clientlist[nclients] - every client has a number; this is the list of all of them. The number of clients is nclients.

emp_client[nemp] - number of the client this employee is billed to

hoursworked[nemp][31] - for each employee, and for each day of the month, the number of hours worked

Problem 4: Find the amount to bill a given client. C code:

// cno is number of client we are looking up

// result is amount to bill

int billclient(int cno)

{

int i, j, temp, total;

total = 0;

for(i = 0;i<nemp;++i) {

if(emp_client[i]==cno) {

for(j = 0, temp = 0;j<31;++j)temp += hoursworked[i][j];

total += billrate[i] * temp;

}

return(total);

}

The function is implemented in C by looping over the employees and picking the ones that are working for the specified client. In J we deal with entire arrays rather than with elements one at a time, and the general plan is:

get the mask of employees billed to the requested client;

select the hoursworked records for the applicable employees;

for(each employee) // 1

for(each day) accumulate total hours; // 2

for(each employee)multiply hours by billing rate;

for(each employee)get total billing;

This example is the first one in which an argument is passed into the J verb. Within the verb the right argument is referred to as y and the left argument (if the verb is a dyad) is referred to as x . As usual in C we might use fewer big loops, but in J we stick to small loops. The mask of employees billed to the client is given by emp_client = y which is a mask with 1 for the selected employees, 0 for the others (remember that = is a test for equality, not an assignment). We can select the hoursworked items for the specified client by (emp_client = y) # hoursworked; then the sum for each day will be a sum within each 1-cell, resulting in a list of hours for each selected employee. The line +/"1 (emp_client = y) # hoursworked performs the functions of loops 1 and 2 in the pseudocode: loop 1 within each cell and loop 2 in the implied loop over the cells. Then, it is a simple matter to select the billing rates for the employees, multiply each billing rate by the number of hours worked, and take the total over all employees billed to the customer. The solution (using a temporary variable to hold the mask) is

billclient =: monad define

mask =. emp_client = y

+/ (mask # billrate) * +/"1 mask # hoursworked

)

Problem 5: For each day, find the worker who billed the most hours (don't worry about a tie for first place--just pick any one worker from the top billers for each day). C code:

// result: drudges[] is empno of worker with most hours each day

void dailydrudge(int drudges[31])

{

int i, j, highhours;

for(i = 0;i<31;++i) {

highhours = -1;

for(j = 0;j<nemp;++j)

if(hoursworked[j][i]>highhours) {

drudges[i] = empno[j];

highhours = hoursworked[j][i];

}

We note that the inner loop, which records the employee number of any employee who worked more than the previous high, does not correspond to any of our J verbs. So, we break this loop into operations that do correspond to J verbs:

for(each day)find the maximum number of hours worked; // 1

for(each day)find the index of the employee who worked that much; // 2

for(each day)translate the index to an employee number; // 3

Loop 1 is simply >./ hoursworked . Loop 2 calls for dyad i. to perform the search, but there is a little problem: each search must go through a column of hoursworked, giving the hours worked by each employee on that day, but the column is not an item of hoursworked; each item is a row, giving hours worked on each day by one employee. The solution is to transpose hoursworked (by |: hoursworked), making the items correspond to days rather than employees. Then we match up the resulting1-cells with the 0-cells of the maximum found by loop 1 and find the index of each maximum, using i."1 0 or the equivalent i."_1 . Loop 3 is a simple application of dyad { . The final code is

dailydrudge =: monad define

((|: hoursworked) i."_1 >./ hoursworked) { empno

)

Problem 6: Order the employees by the amount of profit produced. This is a bagatelle for J and we won't even bother with C code. We have a verb that returns the profit for each employee, so we call it and use the result as the keys for sorting the employee numbers into descending order. Note that a verb must be given an argument when it is executed; we use 0 as a convenient value. The final J code is

producers=: monad : 'empno \: empprofit 0'

which makes use of the verb we defined earlier:

empprofit =: monad define

(billrate - payrate) * +/"1 hoursworked

)

Problem 7 is similar: Order the clients by the amount of profit produced. It requires more ingenuity, and the C code would be more than I want to show, so let's try to write in J directly. We start with the list of clients clientlist, the list that tells which client each employee worked for emp_client, and the profit per employee given by empprofit 0 . For each client, we need to find the employees that worked for the client and add up the profit they brought in. For this kind of problem we want an array with employees for one axis and clients for the other, with a 1 in the positions where the employee is assigned; then we can do array * empprofit 0 and do some suitable summing of the result. Let's work out what array must look like. Since dyad * has rank 0, array * empprofit 0 is going to replicate 0-cells of the shorter-frame argument (empprofit 0) which means that a single profit value is going to multiply an entire 1-cell of array . So we see that the leading axis of array must be clients and the second axis must be employees; each item will have the client mask for one employee. The way to create that is by clientlist ="1 0 emp_client which will compare the entire client list against each 0-cell of emp_client and form the results into an array. Then, (clientlist ="1 0 emp_client) * empprofit 0 will have one item per employee, each item having the amount billed to each client; we sum these items to produce a list with the profit from each client, and use that to order the client numbers. Solution:

custbyprofit =: monad define

clientlist \: +/ (clientlist="1 0 emp_client) * empprofit 0

)

For our final problem in the payroll department, consider how to calculate withholding tax on each employee's earnings. The tax rate within tax brackets will be assumed to be the same for all employees. C code for this would look like:

// result: withholding for each employee in shekels[]

int renderuntocaesar(float shekels[])

{

// tax-bracket table: start of each bracket, ending with high value

float bktmin[4] = {0,10000,20000,1e9};

// tax-bracket table: rate in each bracket

float bktrate[3] = {0.05,0.10,0.20};

int earns[nemp];

int i, j;

empearnings(earns); // get earnings for each employee

for(i = 0;i<nemp;++i) {

shekels[i]= 0.0;

for(j = 0;j<sizeof(bktrate)/sizeof(bktrate[0]);++j) {

if(earns[i] > bktmin[j]) {

float bktval = bktmin[j+1];

if(earns[i] < bktval)bktval = earns[i];

shekels[i] += bktval * bktrate[j];

}

In J, we will sum over the tax brackets and we must create a suitable array for the summation. The items in this array will be the amounts earned in each tax bracket. Corresponding to the two if statements we will use conditional verbs to discard amounts earned outside the tax bracket. The conditionals will operate on each row, so they will have rank 1, and they will look something like 0 >. (bracket_top <. earned) - bracket_bottom which will give the amount earned in the bracket, set to 0 if the earnings are below the bracket and to the size of the bracket if above. We will create bracket_top by shifting the brackets left and filling with infinity (this corresponds to the bktmin[j+1] reference in the C code). We could create earned by replicating the earnings for each employee to form a 1-cell for each bracket--the code for this would be (#bktmin)$"0 empearnings ''--but it's not necessary to create that array explicitly: we just use the implicit looping to cause each cell of empearnings '' to be replicated during the comparison with bracket_top. Making all these substitutions, noting that all the operations have rank 1, and summing at the end over the items within each 1-cell, we get the final code:

renderuntocaesar =: monad define

bktmin =. 0 10000 20000

bktrate =. 0.05 0.10 0.20

t=. ((1 |.!._ bktmin) <."1 0 empearnings '') -"1 bktmin

+/"1 bktrate *"1 (0) >. t

)

We used the temporary variable t so our sentences would fit on the page.

Example: Counting Words and Lines

Let's write a program to count the lines and words in a file. This is a simple task, and in C it might look like:

// f points to filename

// result is words,lines

int[2] wc(char *f)

{

FILE fid;

int ct[2]; /* # words, # lines */

char c;

fid = fopen(f);

while(EOF != (c = fgetc(fid)) {

if(c == ' ')++ct[0];

if(c == LF)++ct[1];

}

return (ct);

}

Rather than loop for each character, in J we process files by reading the whole file into a list of characters and then operating on the list. The monad ReadFile (defined in the script jforc.ijs) has infinite rank; it takes a filename y and yields as result the contents of the file, as a list of characters. Once the file is a list, it is trivial to compare each character against space and linefeed, yielding for each a list of 1s at each position filled by the character; then summing the list gives the number of each character. J code to do this is:

NB. y is string filename, result is (#words),(#lines)

wc =: monad define

+/"1 (' ',LF) ="0 1 ReadFile y

)

Suppose our user complains that wc needs improvement: specifically, it should also return the number of characters in the file, should not count multiple consecutive whitespace characters (including TAB) as separate words, and should treat the trailing LF as a word delimiter as well as a line delimiter. In C, we would respond by adding a few more tests and flags, but in J we realize that a major change to the program's function calls for a thorough rethinking.

To ignore multiple whitespace characters we need to define what an ignored whitespace is, without using flags and loops. This part of the problem often calls for creative thought; here we realize that a whitespace character is to be ignored if the previous character is whitespace. That's easy, then: we just calculate a Boolean vector, 1 if the character is whitespace, and then use shift and Boolean AND to ignore multiple whitespace. The code to do this is worth looking at; it is

sw =. (-. _1 |. w) *. w =. f e. ' ',TAB,LF

where f is the contents of the file. Note that we assign a value to w just before we right-shift w . This is legal in J: sentences are processed right-to-left, and the interpreter has not seen the reference to w at the time w is assigned. A similar statement in C, for example x = w + (w = 4); , would be undefined. Of course, even though it's legal in J, some would cavil--we will eventually learn ways to do this without defining a variable at all--but I leave it to you to decide how far you will detour to avoid jaywalking. Once sw has been calculated, the rest of the program is trivial. The final result is:

NB.Version 2. Discard multiple whitespace,

NB.and return (#chars),(#words),(#lines)

wc2 =: monad define

f =. ReadFile y

sw =. (-. _1 |. w) *. w =. f e. ' ',TAB,LF

(#f),(+/sw),(+/ LF = f)

)