Differences between revisions 20 and 21
 ⇤ ← Revision 20 as of 2008-03-21 03:35:43 → Size: 6868 Editor: RogerHui Comment: ← Revision 21 as of 2008-12-08 10:45:28 → ⇥ Size: 6870 Editor: anonymous Comment: converted to 1.6 markup Deletions are marked like this. Additions are marked like this. Line 4: Line 4: [[TableOfContents(2)]] <> Line 222: Line 222: [[BR]] <
> Line 225: Line 225: [http://www.jsoftware.com/pipermail/programming/2008-March/010037.html J programming forum] [[http://www.jsoftware.com/pipermail/programming/2008-March/010037.html|J programming forum]]

## Introduction

We represent a tree as a list p of indices where i{p  is the parent of node i . (i=i{p if node i is a root and is parentless.) For example:

```   p=: _1 p: i.15
(i.15) ,: p
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
0 0 0 1 2 2 3 3 4 4  4  4  5  5  6

┌─ 6 ─── 14
┌── 1 ─── 3 ─┴─ 7
│            ┌─ 8
─ 0 ─┤      ┌─ 4 ─┼─ 9
│      │     ├─ 10
└── 2 ─┤     └─ 11
│     ┌─ 12
└─ 5 ─┴─ 13```

List v are numeric values for each node. The tree sum of a node is the sum of the values of the subtree rooted at that node. (This is also known as summarized explosion or bill of materials.) Given p and v , find the tree sums of all the nodes.

```   v=: 15 ?.@\$ 20
(i.15) , p ,: v
0  1  2  3  4  5 6  7 8  9 10 11 12 13 14
0  0  0  1  2  2 3  3 4  4  4  4  5  5  6
6 15 19 12 14 19 0 17 0 14  6 18 13 18 11

┌─ 6  ( 0 11) ─── 14 (11 11)
┌── 1 (15  55) ─── 3 (12 40) ─┴─ 7  (17 17)
│                             ┌─ 8  ( 0  0)
─ 0 (6 182) ─┤               ┌─ 4 (14 52) ─┼─ 9  (14 14)
│               │             ├─ 10 ( 6  6)
└── 2 (19 121) ─┤             └─ 11 (18 18)
│             ┌─ 12 (13 13)
└─ 5 (19 50) ─┴─ 13 (18 18)

v tsum p
182 55 121 40 52 50 11 17 0 14 6 18 13 18 11```

In the diagram above, each node is labelled with its index, value, and tree sum.

The following verb can be used to generate test data. dataset k creates trees with _1+2^k nodes.

```testdata=: 3 : 0
m =: _1+2^y
v =: m ?@\$ 20
pa=: _1 p: i.m
pb=: 0,2#i.<.m%2                           NB. complete binary tree
pc=: 0>.<:i.m                              NB. tall skinny tree
pd=: m (\$,) (c*i.c) ([,.+/) i.<:c=. <.%:m  NB. forest
i.0 0
)```

The tree sum is the product of the reflexive-transitive closure of the adjacency matrix and the list of values. Thus:

```am   =: i.@,~@# e. # #. ] ,. i.@#
rc   =: +. =@i.@#
tc   =: +./ .*.~^:(2>.@^.#)
tsum_mat=: tc@rc@am@] +/ .* [```

```   {&'.x'&.> (am ; rc@am ; tc@rc@am) p
┌───────────────┬───────────────┬───────────────┐
│xxx............│xxx............│xxxxxxxxxxxxxxx│
│...x...........│.x.x...........│.x.x..xx......x│
│....xx.........│..x.xx.........│..x.xx..xxxxxx.│
│......xx.......│...x..xx.......│...x..xx......x│
│........xxxx...│....x...xxxx...│....x...xxxx...│
│............xx.│.....x......xx.│.....x......xx.│
│..............x│......x.......x│......x.......x│
│...............│.......x.......│.......x.......│
│...............│........x......│........x......│
│...............│.........x.....│.........x.....│
│...............│..........x....│..........x....│
│...............│...........x...│...........x...│
│...............│............x..│............x..│
│...............│.............x.│.............x.│
│...............│..............x│..............x│
└───────────────┴───────────────┴───────────────┘
(tc rc am p) +/ .* v
182 55 121 40 52 50 11 17 0 14 6 18 13 18 11
v tsum_mat p
182 55 121 40 52 50 11 17 0 14 6 18 13 18 11```

## Descendants

An adjacency matrix is impractical for large trees. For example, a tree with a million nodes has an adjacency matrix of size 1e6 by 1e6, requiring at least 1e12 bytes. A more efficient alternative is to work with lists of descendants, starting with lists of sons.

```sons=: (~. i. i.@#) { a: ,~ (</. i.@#)
gen =: 3 : '(*c) #^:_1 (I.c) <@;/. (;y){y [ c=. #&> y'
ad  =: 4 : '(*c) #^:_1 (I.c)  +//. (;y){x [ c=. #&> y'

tsum_des=: 4 : 0
x=. x + t=. x ad d=. (sons y) -.&.> i.#y
while. 0 +./@:~: t do. x=. x + t=. x ad d=. gen d end.
)
v tsum_des p
182 55 121 40 52 50 11 17 0 14 6 18 13 18 11```

gen d computes the descendants between generations k and 2*k , if d are the descendants between generations 0.5*k and k . Starting with the list of sons, the loop will terminate after at most 2^.#p iterations.

tsum_des is effective for trees with hundreds of thousands of nodes.

In scalar-oriented languages an efficient way to solve this problem can be described in 3 words: depth first search. That is:

treesum(x) is

• value(x) if x is a leaf

• value(x) + +/ treesum"0 (sons of x) if not

The time and space should be within a small factor of the time and space to do +/\values .

The following solution is based on the depth first search idea. In each iteration, tree sums for the leaves are added to the parents and then the edges with leaves at one end are removed. The iteration terminates when no edges remain.

```tsum_dfs=: 4 : 0
e=. y (~: #"1 ,:) i.#y
while. {:\$e do.
b=. e.~/e
'p q'=. (-.b)#"1 e
e=. b#"1 e
j=. ~.p
x=. ((j{x)+p+//.q{x) j}x
end.
x
)```

tsum_dfs is more efficient than the others for trees that are shallow relative to the number of nodes (say, logarithmic depth), and less so for tall skinny trees.

## Collected Definitions

```testdata=: 3 : 0
m =: _1+2^y
v =: m ?@\$ 20
pa=: _1 p: i.m
pb=: 0,2#i.<.m%2                           NB. complete binary tree
pc=: 0>.<:i.m                              NB. tall skinny tree
pd=: m (\$,) (c*i.c) ([,.+/) i.<:c=. <.%:m  NB. forest
i.0 0
)

am   =: i.@,~@# e. # #. ] ,. i.@#
rc   =: +. =@i.@#
tc   =: +./ .*.~^:(2>.@^.#)
tsum_mat=: tc@rc@am@] +/ .* [

sons =: (~. i. i.@#) { a: ,~ (</. i.@#)
gen  =: 3 : '(*c) #^:_1 (I.c) <@;/. (;y){y [ c=. #&> y'
ad   =: 4 : '(*c) #^:_1 (I.c)  +//. (;y){x [ c=. #&> y'

tsum_des=: 4 : 0
x=. x + t=. x ad d=. (sons y) -.&.> i.#y
while. 0 +./@:~: t do. x=. x + t=. x ad d=. gen d end.
)

tsum_dfs=: 4 : 0
e=. y (~: #"1 ,:) i.#y
while. {:\$e do.
b=. e.~/e
'p q'=. (-.b)#"1 e
j=. ~.p
x=. ((j{x)+p+//.q{x) j}x
e=. b#"1 e
end.
x
)```

Contributed by RogerHui. The tree sum problem was posed by Raul Miller to the J programming forum on 2008-03-11.

Essays/Tree Sum (last edited 2008-12-08 10:45:28 by anonymous)