Here's an Aggregation prototype based on notes above used in pre-release stages of JDB. By JDB release, aggregation is integrated into query syntax. This approach still can be used as a template for extending further features, such as expressions and functions.
Download: agg.ijs
First load jdb using a few utilities from strings.
require 'data/jdb strings' coinsert 'jdb'
We separate Select Clause parser as nested splitting over by, comma and alias.
NB. parse 'alias:func col,... by alias:col,...'
NB. return lists: ra rf rc ba bc
sel_parse=: 3 : 0
t=. 2{.' by ' splitstring y
c=. (<',') splitstring^:(0<#@]) each t
'r b'=. (<':') (_2 {. splitstring)^:(0<#@]) every L:1 c
'ra re'=. <"1 |:r
'rf rc'=. <"1 |:(<' ') (_2 {. splitstring)^:(0<#@]) every re
'ba bc'=. <"1 |:b
ra ; rf ; rc ; ba ,&< bc
)
Note 'Test sel_parse'
hd=. (;:'ra rf rc ba bc') ,: ]
hd sel_parse 'a,b,c'
hd sel_parse 'a,b,c by qq,zz'
hd sel_parse 'q:a,b,x:c'
hd sel_parse 'a,q:b,c by qq,z:zz'
hd sel_parse 'sum a,q:avg b,min c by qq,z:zz'
)Parsing test results gives the idea:
hd sel_parse 'a,q:b,c by qq,z:zz'
+-----+----+-------+----+-------+
|ra |rf |rc |ba |bc |
+-----+----+-------+----+-------+
|++-++|++++|+-+-+-+|++-+|+--+--+|
|||q|||||||||a|b|c||||z|||qq|zz||
|++-++|++++|+-+-+-+|++-+|+--+--+|
+-----+----+-------+----+-------+
hd sel_parse 'sum a,q:avg b,min c by qq,z:zz'
+-----+-------------+-------+----+-------+
|ra |rf |rc |ba |bc |
+-----+-------------+-------+----+-------+
|++-++|+---+---+---+|+-+-+-+|++-+|+--+--+|
|||q||||sum|avg|min|||a|b|c||||z|||qq|zz||
|++-++|+---+---+---+|+-+-+-+|++-+|+--+--+|
+-----+-------------+-------+----+-------+Next we define aggregator operations and a helper function.
The Reada verb follows the algorithm in Aggregation notes.
'`sum avg min max count first last'=: (+/)`(+/%#)`(<./)`(>./)`#`{.`{:
ovr=: [^:(0 < #@[) each NB. x if nonempty overrides y
Reada=: 4 : 0
if. 1=#y=. boxopen y do. Reads__x y return. end.
'tbl wh sl'=. 3{.y NB. where and select clauses
if. -.' by ' +./@E. sl do. Reads__x y return. end.
'ra rf rc ba bc'=. sel_parse sl NB. parse select clause
q=. (','joinstring bc,rc),' from ',tbl,(0<#wh)#' where ',wh
s=. {:Read__x q NB. fetch filtered base columns
c=. i.~|:i.~every (#bc){.s NB. classi-map from by-cols
ff=. rf,~(#bc)${.`{. NB. aggregs with {. for by-cols
r=. ''
for_i. i.#bc,rc do.
r=. r,< c ff@.i/. i {::s NB. apply aggregs over c
end.
((ba ovr bc),ra ovr rc),:tocolumn r NB. columnar format
)
Note 'sandp'
buildsandp_jdb_ F=. jpath '~temp'
d=. opensandp_jdb_ F
'j p s sp spj'=: GetTable__d 'j p s sp spj'
Reads__d 'from j'
Reads__d 'from p'
Reads__d'from s'
Reads__d '** from sp where s.city=p.city'
Reads__d '** from spj where color=Blue'
d Reada 'j';'';'job_cnt:count jname by city'
d Reada 'p';'city<>Paris';'sum weight by color'
d Reada 'p';'';'sum weight by color,city'
d Reada 'p';'';'avg weight by city,color'
d Reada 'p';'';'avg weight,pone:first pid by where:city,color'
d Reada 's';'';'min_st:min status by city'
d Reada 'sp';'';'sum weight,min status by supplier:s.city'
d Reada 'sp';'';'sum weight by supp:s.city,part:p.city'
d Reada 'spj';'';'avg_qty:avg qty by supp:sname,part:p.city'
d Reada 'spj';'';'avg_qty:avg qty by part_col:p.color'
resetall_jdb_''
)Example output:
j Reada '*';'job_cnt:count jname by city'
+------+-------+
|city |job_cnt|
+------+-------+
|Paris |1 |
|Rome |1 |
|Athens|2 |
|London|2 |
|Oslo |1 |
+------+-------+
p Reada 'city<>Paris';'sum weight by color'
+-----+------+
|color|weight|
+-----+------+
|Red |45 |
|Blue |17 |
+-----+------+
p Reada '*';'sum weight by color,city'
+-----+------+------+
|color|city |weight|
+-----+------+------+
|Red |London|45 |
|Green|Paris |17 |
|Blue |Oslo |17 |
|Blue |Paris |12 |
+-----+------+------+
p Reada '*';'avg weight by city,color'
+------+-----+------+
|city |color|weight|
+------+-----+------+
|London|Red |15 |
|Paris |Green|17 |
|Oslo |Blue |17 |
|Paris |Blue |12 |
+------+-----+------+
p Reada '*';'avg weight,pone:first pid by where:city,color'
+------+-----+------+----+
|where |color|weight|pone|
+------+-----+------+----+
|London|Red |15 |P1 |
|Paris |Green|17 |P2 |
|Oslo |Blue |17 |P3 |
|Paris |Blue |12 |P5 |
+------+-----+------+----+
s Reada '*';'min_st:min status by city'
+------+------+
|city |min_st|
+------+------+
|London|20 |
|Paris |10 |
|Athens|30 |
+------+------+
sp Reada '*';'sum weight,min status by supplier:s.city'
+--------+------+------+
|supplier|weight|status|
+--------+------+------+
|London |134 |20 |
|Paris | 46 |10 |
+--------+------+------+
sp Reada '*';'sum weight by supp:s.city,part:p.city'
+------+------+------+
|supp |part |weight|
+------+------+------+
|London|London|59 |
|London|Paris |58 |
|London|Oslo |17 |
|Paris |London|12 |
|Paris |Paris |34 |
+------+------+------+
spj Reada '*';'avg_qty:avg qty by supp:sname,part:p.city'
+-----+------+-------+
|supp |part |avg_qty|
+-----+------+-------+
|Smith|London| 450|
|Jones|Oslo |442.857|
|Jones|Paris | 100|
|Blake|Oslo | 200|
|Blake|London| 500|
|Clark|London| 300|
|Adams|Paris | 260|
|Adams|London| 400|
|Adams|Oslo | 200|
+-----+------+-------+
spj Reada '*';'avg_qty:avg qty by part_col:p.color'
+--------+-------+
|part_col|avg_qty|
+--------+-------+
|Red | 400|
|Blue |353.846|
|Green | 150|
+--------+-------+