KenLettow posed the following problem to the J Programming Forum on 2008-11-11:

Each line of a file contains strings of the form attribute=value separated by the & character (see below). Generate a table of the values for each line of the file for a specified list of attributes.

```y=: 0 : 0
att0=4010&att7=2457&att2=439
att3=902&att2=413&att5=4262&att4=4967
att5=4040&att1=465

att4=2733
att3=2397&att2=1104&att6=2625
)```

## A Solution

The problem can be solved by using cut (;.) as follows. The solution works on all the lines at once rather than a line at a time.

```NB. y: lines of  att=value&att=value&att=value& ...  terminated by LF
NB. x: required attributes
tab=: 4 : 0
av=. a: -.~ (y e. LF,'=&') <;._2 y  NB. attribute-value pairs
a=. av #~ (#av)$1 0                 NB. attributes
v=. av #~ (#av)$0 1                 NB. values
n=. (y=LF) +/;._2 y='='             NB. # attributes in each line
}:"1 v (<"1 (I.n),.x i. a)} ((#n),1+#x)$a:
)```

For example:

```   ] x=: <;._1 ' att0 att1 att2'
┌────┬────┬────┐
│att0│att1│att2│
└────┴────┴────┘
   x tab y
┌────┬───┬────┐
│4010│   │439 │
├────┼───┼────┤
│    │   │413 │
├────┼───┼────┤
│    │465│    │
├────┼───┼────┤
│    │   │    │
├────┼───┼────┤
│    │   │    │
├────┼───┼────┤
│    │   │1104│
└────┴───┴────┘```

## Program Logic

0. If y is cut at each LF, =, and & character, and the empty boxes are removed (an empty line produces one), the result is a boxed vector of the form attribute value attribute value ...

```   ] av=. a: -.~ (y e. LF,'=&') <;._2 y
┌────┬────┬────┬────┬────┬───┬────┬───┬────┬───┬────┬────┬────┬────┬────┬────┬────┬───┬────┬────┬
│att0│4010│att7│2457│att2│439│att3│902│att2│413│att5│4262│att4│4967│att5│4040│att1│465│att4│2733│...
└────┴────┴────┴────┴────┴───┴────┴───┴────┴───┴────┴────┴────┴────┴────┴────┴────┴───┴────┴────┴```
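The cut in step 0 can be tried on a smaller input. The following is a minimal sketch; the sample string s and the names m and raw are made up for illustration and do not appear in the program. The second line of s is empty, which is what makes the a: -.~ step necessary.

```j
NB. Hypothetical sample: one line with two pairs, then an empty line.
s=: 'a=1&b=22',LF,LF

m=: s e. LF,'=&'      NB. 1 marks each delimiter: 0 1 0 1 0 1 0 0 1 1
raw=: m <;._2 s       NB. one box per delimiter, delimiters dropped
av=: a: -.~ raw       NB. remove the empty box left by the empty line

echo # raw            NB. 5
echo # av             NB. 4
echo av               NB. boxes: a 1 b 22
```

With a boolean left argument, ;._2 ends a piece at every 1, so the last character of the text must be a delimiter; that is why the lines have to be terminated by LF.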

1. The even-numbered entries (counting from 0) are the attribute names.

```   ] a=. av #~ (#av)$1 0
┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐
│att0│att7│att2│att3│att2│att5│att4│att5│att1│att4│att3│att2│att6│
└────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘```

2. The odd-numbered entries are the corresponding values.

```   ] v=. av #~ (#av)$0 1
┌────┬────┬───┬───┬───┬────┬────┬────┬───┬────┬────┬────┬────┐
│4010│2457│439│902│413│4262│4967│4040│465│2733│2397│1104│2625│
└────┴────┴───┴───┴───┴────┴────┴────┴───┴────┴────┴────┴────┘```
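Steps 1 and 2 select alternate entries with a repeating boolean mask. As a small sketch on a hypothetical three-pair vector, the same split can also be viewed as the two columns of a 2-column table. The names mask and pairs are illustrative only, and the table form is an alternative view, not what the program does.

```j
NB. Hypothetical attribute-value vector for a single line.
av=: <;._1 ' att0 4010 att7 2457 att2 439'

mask=: (#av) $ 1 0        NB. 1 0 1 0 1 0
a=: mask # av             NB. att0 att7 att2
v=: (-. mask) # av        NB. 4010 2457 439

pairs=: _2 ]\ av          NB. 3-by-2 table, one attribute-value pair per row
echo pairs
echo a -: {."1 pairs      NB. 1: first column holds the attributes
echo v -: {:"1 pairs      NB. 1: last column holds the values
```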

3. The number of a-v pairs on each line is obtained by a partitioned sum of the = characters, with each partition ending at an LF.

```   ] n=. (y=LF) +/;._2 y='='
3 4 2 0 1 3```
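The partitioned sum in step 3 can be checked on a short input. This sketch uses a made-up three-line string s whose middle line is empty. It also shows how I. turns the per-line counts into one line number per pair, which step 4 relies on.

```j
NB. Hypothetical sample: two pairs, an empty line, one pair.
s=: 'a=1&b=22',LF,LF,'c=3',LF

n=: (s=LF) +/;._2 s='='   NB. count the = characters between successive LFs
echo n                    NB. 2 0 1
echo I. n                 NB. 0 0 2  (line number of each pair)
```

The empty line contributes a count of 0, so it keeps its place in n and the later lines keep their correct line numbers.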

4. The overall result has shape (# lines),(# attributes of interest). The value part of each a-v pair amends entry i,j, where i is the line number and j is the index of the attribute in x, that is, x i. a. For an attribute not in x, x i. a gives #x, so the program temporarily works with a table that has one extra column to receive those values; }:"1 drops that column at the end.

```   (#n),1+#x
6 4
   ] i=. (I.n) ,. x i. a
0 0
0 3
0 2
1 3
1 2
1 3
1 3
2 3
2 1
4 3
5 3
5 2
5 3
   v (<"1 i)} 6 4$a:
┌────┬───┬────┬────┐
│4010│   │439 │2457│
├────┼───┼────┼────┤
│    │   │413 │4967│
├────┼───┼────┼────┤
│    │465│    │4040│
├────┼───┼────┼────┤
│    │   │    │    │
├────┼───┼────┼────┤
│    │   │    │2733│
├────┼───┼────┼────┤
│    │   │1104│2625│
└────┴───┴────┴────┘```
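Step 4 can be followed end to end on a tiny case. In this sketch the attribute att9, the values, and the names j, idx, t and r are all hypothetical. Note that att9 is not in x, so its value lands in the extra column and disappears when that column is dropped.

```j
x=: <;._1 ' att0 att1'        NB. attributes of interest
a=: <;._1 ' att0 att9 att1'   NB. attributes found, in file order
v=: <;._1 ' 10 99 20'         NB. their values
n=: 2 1                       NB. pairs per line: two on line 0, one on line 1

j=: x i. a                    NB. 0 2 1 ; att9 is not in x, so it maps to #x
idx=: (I. n) ,. j             NB. rows: 0 0 , 0 2 , 1 1
t=: v (<"1 idx)} ((#n),1+#x) $ a:   NB. 2-by-3 table including the spill column
r=: }:"1 t                    NB. drop the spill column, leaving 2-by-2
echo r
```

The extra column means the amend never needs a special case for unwanted attributes; they are written like any other value and discarded afterwards.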

## Line-at-a-Time

```line=: 4 : 0
av=. a: -.~ (y e. LF,'=&') <;._2 y
a=. av #~ (#av)$1 0
v=. av #~ (#av)$0 1
(a i. x) { v,a:
)

tab2=: 4 : 'x&line;.2 y'

x (tab -: tab2) y
1```
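To see what line receives, it may help to apply it by hand. The sketch below repeats the definition of line so that it runs on its own; the two-line sample s and the names pieces and r are made up for illustration. The ;.2 cut keeps the trailing LF on each piece, so the inner ;._2 cut in line still finds a final delimiter.

```j
line=: 4 : 0
av=. a: -.~ (y e. LF,'=&') <;._2 y
a=. av #~ (#av)$1 0
v=. av #~ (#av)$0 1
(a i. x) { v,a:
)

x=: <;._1 ' att0 att1 att2'
s=: 'att5=4040&att1=465',LF,'att2=7',LF   NB. hypothetical two-line input

pieces=: <;.2 s          NB. each piece keeps its trailing LF
r=: x&line;.2 s          NB. one row of the table per line
echo r                   NB. row 0 holds 465 under att1, row 1 holds 7 under att2
```

Here an attribute in x that is missing from the line gets index #a from a i. x, which selects the a: appended to v.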

## Notes

• If lines of y are terminated by CRLF rather than by LF, the CR characters must first be removed: x tab y-.CR

• If lines of y are separated by LF rather than terminated by LF, append an LF to y: x tab y,LF

• The program does not handle a value that contains = or & (or even LF), nor a value enclosed in quotes.
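The first two notes can be combined into a small preprocessing step. The verb norm below is a hypothetical helper, not part of the essay's code. It is a sketch that assumes a missing final LF always means "separated rather than terminated"; it cannot tell whether a trailing LF ends the last line or separates it from a final empty line.

```j
NB. norm: hypothetical helper that prepares text for tab
norm=: 3 : 0
t=. y -. CR                  NB. remove CR so that CRLF becomes LF
t , (LF ~: {: t) # LF        NB. append LF only if the text does not end with one
)

echo norm 'a=1',CR,LF,'b=2'  NB. two lines, each terminated by LF
```

Under these assumptions the program would be called as x tab norm y.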

Contributed by RogerHui.

Essays/Attribute-Value Processing (last edited 2008-12-08 10:45:50 by anonymous)