Nearer Larger Neighbor: J Programs Compared

The Nearer Larger Neighbor problem is introduced. J programs for solving it are defined and compared by speed, space, and spread.

Introduction

The “Nearer Larger Neighbor” problem is:

Variants of this problem arise in several areas of mathematics and computer science, notably computational geometry for choosing Euclidean distances between points. See references below. Fundamentally, the problem entails bidirectional search yet generalizes naturally to higher dimensional arrays. It is instructional to develop programs with different approaches for a list first.

Asano et al [1] posed the “All Nearest Larger Neighbors” problem and presented several 1-D algorithms. Although their problem is slightly different, the following programs begin with my attempts to code theirs in J for comparisons here.

Asano’s Algorithms

Asano’s Algorithm 1 is a right-and-left stack scan where the output is a list of associated indexes (converted to origin 0), and ties are resolved to the leftmost index:

A1 =: 3 : 0
n =. #y
s =. i.0
nln =. n#0
lnln =. n#0
rnln =. n#0
for_i. i.n
   do.  while.  ((#s)~:0)*.(i{y)>:({.s){y
           do.  s =. }.s
          end.
        if. (#s)=0 
        do. lnln =.(-n) i} lnln 
      else. lnln =. ({.s) i} lnln
       end.
        s =. i,s
  end.
s =. i.0
for_i. |.i.n            
   do.  while.  ((#s)~:0)*.(i{y)>:({.s){y
           do.  s =. }.s
          end.
        if. (#s)=0 
        do. rnln =. (3*n) i} rnln 
      else. rnln =. ({.s) i} lnln 
       end.
        s =. i,s
  end.
for_i. i.n
   do.  if. (i-i{lnln)<:(i{rnln)-i 
        do. nln =. (i{lnln) i} nln 
      else. nln =. (i{rnln) i} nln 
       end.
end.
nln
)

Example from [1]:

   A1  87 32 12 54 28 35 14 61 18 53
_10 0 1 0 3 3 5 0 7 7           NB. indexes with left preference

Notice that _10 is the default index for the largest item. Another example:

   A1 4 5 3 8 6 3 2 4 5 1
1 _10 1 _10 3 4 5 4 4 8

A1 incorrectly returns -n for the second item (5) when there is no larger item to the left even though there is a larger to the right and returns the wrong index for the eighth item (4) which has a nearer larger item (5) on its right. This program can be corrected and significantly simplified in J style programming:

A1j =: 3 : 0
n =. #y
nln =. i.0
for_i.  i.n
do.     item =. i{y
        left =. i
        while.  left>:0
        do.     if. item<left{y do. break. end.
                left =. <:left
        end.
        left =. (left<0){left,-n
        right =. i
        while.  right<n
        do.     if. item<right{y do. break. end.
                right =. >:right
        end.
        right =. (right=n){right,3*n
        larger =. (i-left)<:right-i
        nln =. nln,larger{right,left
end.
)

  A1j 4 5 3 8 6 3 2 4 5 1
1 3 1 _10 3 4 5 8 4 8

This program is much shorter and increasingly faster than A1 and uses much less space.

It is desirable to change default index (-n) to i-n (and 3*n to i+n) so that the output can index actual items directly. So, use a negative default index for the largest item and eliminate the inner loops:

A1j =: 3 : 0
n =. #y
nln =. i.0
for_i.  i.n
do.     item =. i{y
        left =. |.item<i{.y
        left =. (+./left){n,>:left i. 1
        right =. item<}.i}.y
        right =. (+./right){n,>:right i. 1
        nln =. nln,(left<:right){i+right,-left
end.
)

Example:

   ,i  =.  A1j v =. 4 5 3 8 6 3 2 4 5 1
1 3 1 _7 3 4 5 8 4 8

Then, to index items, simply enter:

    i{v
5 8 5 8 8 6 3 5 6 5                     NB. default is largest item

Now change Asano’s problem to return items instead of indexes and to not prefer the left item in case of ties – that is, to solve the Nearer Larger Neighbor problem stated above.

A1a =: 3 : 0
n =. #y
all =. i.0
for_i.  i.n
do.     left =. right =. i
        while.  look=.left>:0
        do.     large =. left{y
                if. large>i{y do. break. end.
                left =. <:left
        end.
        left =. left%look
        large =. large*look
        while.  look=.right<n
        do.     larger =. right{y
                if. larger>i{y do. break. end.
                right =. >:right
        end.
        right =. right%look
        larger =. larger*look
        nearer =. (i-left)(>,>:)right-i
        all =. all,>./nearer{large,larger
end.
)

Example:

   A1a 4 5 3 8 6 3 2 4 5 1
5 8 8 0 8 6 4 5 6 5                     NB. items (with 0 default)

This program is much faster and much slimmer than A1 despite doing more work to find the larger of two items at equal distance.

Asano’s Algorithm 2 is a bidirectional scan by increasing distance, where the output is indexes with ties resolved to the leftmost index and -n is the default index (like A1).

A2 =: 3 : 0
n =. #y
nln =. n#0
for_i. 0 To n-1 do.
        nln =. (-n) i} nln
        for_k. 1 To Larger(i,n-i+1) do.
                if. (i-k)>:0 do. if. (i{y)<(i-k){y do.
                        nln =. (i-k) i} nln
                        break. end.
                end.
                if. (i+k)<n do. if. (i{y)<(i+k){y do.
                        nln =. (i+k) i} nln
                        break. end.
                end.
        end.
  end.
nln
)
To =: [ + i.@>:@-~
Larger =: >./

   A2 4 5 3 8 6 3 2 4 5 1
1 3 1 _10 3 4 5 8 4 8

Again, change Asano’s problem to return items instead of indexes and to return the larger item at equal distance for the Nearer Larger Neighbor problem:

A2a =: 3 : 0
n =. #y
nln =. i.0
for_i.  i.n
do.     for_d.  >:i.i>.n->:i
        do.     left =. right =. 0
                if. i>:d do. left=.(i-d){y end.
                if. i<n-d do. right=.(i+d){y end.
                larger =. left>.right
                if. larger>i{y do. break. end.
                larger =. 0
        end.
        nln =. nln,larger
end.
)

Same example:

   A2a 4 5 3 8 6 3 2 4 5 1
5 8 8 0 8 6 4 5 6 5             NB. items with 0 default

This program is about as inefficient as A2, mainly because of the algorithm. (Incidentally, using a while loop instead of a for loop is slower but saves a lot of space.)

Proposed A2 Algorithm

I propose an algorithm that searches in two stages -- a minimum distance (near) left and right, then the remaining distance left or right:

A2a =: 3 : 0
n =. #y
i =. 0
all =. i.0
while. i<n
   do.  d =. 1
        near =. i <. n-i+1
        while.  d<:near
           do.  left =. (i-d){y
                right =. (i+d){y
                larger =. left >. right
                if. larger>i{y do. goto_next. end.
                d =. d+1
          end.
        near =. n-1+2*near
        if. i<n%2 do. d =. n-near else. d =. near-n+1 end.
        while. near>0
           do.  larger =. d{y
                if. larger>i{y do. goto_next. end.
                d =. d+*d
                near =. near-1
          end.  
        larger =. 0                             NB. default
label_next.     i =. i+1
        all =. all,larger
  end.
)

This program is much more efficient than A2 even though it returns items instead of indexes and does more work by computing the larger of two items at the same distance instead of exiting after finding a larger item on the left. Indeed, it is 10+ times faster than A2 and uses only 37% space for n=1e5 and increasingly faster for higher n. It can be improved further by using a two-item index (d) for distances left and right:

A2a =: 3 : 0
n =. #y
i =. 0
all =. i.0
while. i<n
   do.  d =. i + _1 1   
        while. *./d~:_1,n
           do.  larger =. >./ d{y
                if. larger>i{y do. goto_next. end.
                d =. d+_1 1
          end.
        direction =. i<-:n 
        d =. direction{d        
        direction =. direction{_1 1     
        while. *./d~:_1,n       
           do.  larger =. d{y
                if. larger>i{y do. goto_next. end.
                d =. d+direction
          end.  
        larger =. 0     
label_next.     i =. >:i
                all =. all,larger
  end.
)

This provides a 20% speed-up in slightly less space. This is the slimmest program here.

Try combining the two inner while loops and removing goto, using a default of –infinity:

A2a =: 3 : 0
n =. #y
i =. 0
all =. i.0
while. i<n
   do.  d =. i
                whilst. (larger<:i{y)*.+./OK
                   do.  d =. d + _1 1 * OK=.d~:0,n-1
                                larger =. >./OK#d{y                     NB. default __
                  end.
                i =. i+1
                all =. all,larger
  end.
)

This is much shorter but slower, using slightly more space.

A better idea is to find the largest item first and exit the inner loop:

A2a =: 3 : 0
n =. #y
max =. >./y
i =. 0
all =. i.0
while. i<n
   do.  it =. i{y
        if. it=max do. larger=.0 goto_next. end.
        d =. i
        while.  it>:larger=.>./d{y
           do.  d =. d + _1 1 * d~:0,n-1
          end.
label_next.     i =. i+1
                all =. all,larger
  end.
)

This is 40% faster in the same space.

Lastly, try using a for loop and replace 0 defaults with each larger item found.

A2b =: 3 : 0
n =. #y
all =. y<>./y
for_i. all#i.n
   do.  d =. i
        while.  (i{y)>:larger=.>./d{y
           do.  d =. d+_1 1*d~:0,n-1
          end.
        all =. larger i} all
  end.
)

Nicely shorter and almost as speedy, although 40% fatter.

New Algorithms

Now consider my algorithms, presented in J. All programs allow any list of positive numbers input and return the same number of selected numbers output, with 0 as a default for no larger neighbor (unless noted otherwise).

Begin with a straightforward vector processing approach. Program NLN0 computes the nearer larger neighbor for each index position by creating a table of Neighbors from lists Lefts and Rights, finding the Largers in parallel, then selecting the Nearer item whose First indexed item is less than the input:

     EACH =: "0 _
NLN0 =: All NLN EACH ]
        All =: i.@#
        NLN =: { Nearer Larger@Neighbors
                Neighbors =: Lefts ,: Rights
                        Lefts =: |.@{.
                        Rights =: }.@}.
                Larger =: >./
                Nearer =: < First ]
                        First =: {.@#

For example:

   NLN0 4 5 3 8 6 3 2 4 5 1
5 8 8 0 8 6 4 5 6 5

This could be written as a one-liner: NLN0 =: i.@# ({ (< {.@# ]) |.@{. >./@:,: }.@}.)"0 _ ]

Obviously the shortest program here, this runs a little faster in a little more space. It could also be defined explicitly:

NLN0x =: 3 : 0
n =. #y
i =. 0
all =. i.0
while. i<n
do.     x =. i{y
        lefts =. |.i{.y
        rights =. }.i}.y
        i =. i+1
        all =. all , x Nearer Larger lefts,:rights
end.
)

3% slower in 10% less space for n=1e5.

A slightly different tacit definition is:

NLN0a =: All Nearer@Larger@Neighbors EACH ]
        All =: i.@#
        Neighbors =: Lefts,: Rights
                Lefts =: 0: , |.@{.
                Rights =: }.
                Larger =: >./
        Nearer =: (> {.) First ]
                First =: {.@#
                Larger =: >./
                Larger =: >./

This has about the same efficiency as NLN0.

Also, ^: can be used to drop the Largers iteratively until the first is larger:

NLN0b =: All {.@Neibors EACH ]
        All =: i.@#
        Neibors =: { Near ^: While ^: _ Lefts Larger@,: Rights
        Near =: }.@]
        While =: >: {.
        Lefts =: |.@{.
        Rights =: }.@}.
        Larger =: >./

Short, but slow and fat.

A different approach computes each distance in parallel instead of each position:

        EACH1 =: "0 1
        FLIP =: &. |.
NLN1 =: ] Nearer EACH1 |:@LN1
        Nearer =: < First ]
                First =: {.@#
        LN1 =: All1 Neighbors1 EACH ]
                All1 =: }.@i.@#                 NB. distances
                Neighbors1 =: Lefts1 >. Rights1
                        Rights1 =: #@] {. }.
                        Lefts1 =: Rights1 FLIP

This program is 3 times slower than NLN0 and runs out of memory at n=1e5. It is possible to use Transpose, but it also requires excessive time and even more space:

         T =: &. |:
NLN1 =: ] Nearer EACH1 T All1 Neighbors1 EACH ]

Try shifting in Lefts and Rights:

NLNs =: All NLN EACH ]
        All =: i.@#
        NLN =: { Nearer Neighbors
                Neighbors =: Lefts >. Rights
                        Lefts =: #@] {. |.@{.
                        Rights =: #@] {. }.@}.
                Nearer =: < First ]
                        First =: {.@#

This uses 30% less space than NLN0.

A different approach indexes each distance left and right. Explicitly:

NLN2 =: 3 : 0
all =. i.0
while. all <&# y
do. all =. all , (#all) NLNx y
end.
)
NLNx =: 3 : 0
:
item =. x{y
lefts =. (-#y){.x{.y
rights =. (#y){.x}.y
d =. 1
while.  d<#y
do.     larger =. (d{rights)>.(-d){lefts
        if. larger>item do. larger return. end. 
        d =. d+1
end. 0
)

This is much more efficient than NLN0 in time and space.

Another approach is to mask indexes progressively and store found largers for positions at distances d=1, 2, etc.

NLN3 =: 3 : 0
n =. #y
all =. n#0              
alli =. i.n
d =. 1                  
while. d<n
do.     y =. y,0        
        lefts =. (alli-d){y
        rights =. (alli+d){y
        largers =. lefts >. rights
        mask =. largers > alli{y
        i =. mask # alli
        largers =. mask # largers
        all =. largers i} all
        alli =. alli -. i
        d =. d+1
end.    all
)

Very fast but fat.

Better yet, remove the index of the largest item before searching.

NLN3 =: 3 : 0
n =. #y
all =. y < >./y 
alli =. all # i.n       
d =. 1  
while.  0<#alli
do.     y =. y,0
        largers =. (alli-d){y
        largers =. largers >. (alli+d){y
        all =. largers alli} all
        largers =. largers <: alli{y
        alli =. largers#alli
        d =. d+1
end.    all
)

Faster and much less space for high n.

A variant using a +/ table:

NLN3a =: 3 : 0
n =. #y
all =. y < >./y
alli =. all # i.n
d =. 0
while. (#alli)>0
   do.  y =. y,0
                d =. d + 1 _1
                largers =. >./ (d +/ alli){y
                all =. largers alli} all
                largers =. largers <: alli{y
                alli =. largers#alli
  end.  all
)

3% faster in 16% more space.

Try another masking definition:

NLN3c =: 3 : 0
n =. #x =. y
y =. __ * y = >./y              NB. __ default
d =. 1
whilst. 0 e. y
do.     lefts =. (-n){.(-d)}.x
        rights =. n {. d }. x
        largers =. lefts >. rights
        largers =. largers * largers>x
        y =. y + largers * y=0
        d =. d+1
end. y
)

This conserves space but takes much more time.

Dare a matrix approach (without looping):

NLN4 =: 3 : 0
ns =. ->:i.#y
right =. }:ns {."0 1 y
left =. }:|."1 ns {."0 1 |.y 
larger =. left >. right
mask =. larger >"1 y
new =. larger *&{: mask
first1 =. |. </\ |.mask *"1 -.{:mask
old =. +/ larger * first1
new + old
)

Slow and very fat.

Now try a Boolean approach within an inner loop:

NLN5 =: 3 : 0
n =. #y
i =. 0
all =. i.0
while. i<n 
do.     x =. i{y
        d =. 1
        whilst. {./:~neighbors<:x
        do.     neighbors =. d < (i+1),(n-i)
                neighbors =. neighbors # i + d * _1 1
                neighbors =. neighbors { y
                d =. d+1
        end.
        i =. i+1
        all =. all,>./neighbors         NB. __ default
end.
)

Slim, but not so fast.

Use Infix \ (scan) to search like a spider stepping simultaneously to both sides in increasingly larger steps:

NLN6 =: 3 : 0
n =. #y
all =. n#0
smaller =. y < >./y
half =. -:n
d =. 1
while.  d<half
do.     dd =. >:+:d
        largers =. dd Spider\ y
        largers =. (d{.d}.y),largers,(-d){.(-d)}.y
        larger =. smaller*.largers > y
        all =. all+larger*largers
        smaller =. larger~:smaller
        d =. d+1
end.
while.  d>0
   do.  largers =. (-n){.d{.y
        largers =. largers >. n{.(-d){.y
        larger =. smaller*.largers > y
        all =. all+larger*largers
        smaller =. larger~:smaller
        d =. d-1
end.    all
)
Spider =: {. >. {:

Very slow in lots of space.

The Nearer Larger Neighbor problem can be solved recursively:

   EACH =: “0 _
   ELSE =: `
   WHEN =: @.
NLN7 =: All {.@NLNr EACH ]
   All =: i.@#
   NLNr =: Leftr Nearerr Rightr
      Leftr =: |.@{. , {                NB.  item on end
      Rightr =: 1: |. }.        
      Nearerr =: Nearerr&}. ELSE 0: WHEN Done ELSE Largerr WHEN (Largerr>N)
         Largerr =: >.&{.
         N =: >.&{: 
         Done =: +.&# = 0:

Not efficient, of course.

Comparisons

Now compare the most competitive programs by ratios of time, space, and length -- where 1.00 is the best – for the same n=1e6 non-distinct random positive integers. Finally, sum the ratios for a composite of overall program efficiency. See Appendix below for benchmark details and copies of these programs.

            Time     Space    Length   Overall


     A1a     32.9     1.67     1.21     35.8


     A2a     10.1     1.00     1.40     12.5


     A2b     10.4     1.42     1.00     12.8


     NLN3     1.03    4.08     1.25      6.37


     NLN3a    1.00    4.75     1.17      6.92


     NLN5    19.2     1.00     1.40     21.6

So, the winner is NLN3, using the Masking approach.

Readers may wish to use a higher n, average over several samples, and weight the three criteria.

Conclusion

The Nearer Larger Neighbor problem was introduced and programmed in J. Definitions of programs were compared, and the best was determined. (It was not the fastest, nor slimmest, nor shortest.) To follow up, see Essays/NearestLargerNeighbor for programs for 2-D, 3-D, and n-D arrays.

References

[1] T. Asano, S. Bereg, and D. Kirkpatrick, "Finding nearest larger neighbors: a case study in algorithm design and analysis", Lecture Notes in Computer Science, Efficient Algorithms, pp. 249-260, Albers, Alt & Naeher (eds.) Springer, 2009

[2] Weisstein, Eric W. “Nearest Neighbor Problem”, http://mathworld.wolfram.com

[3] Lifshits, Y. “The Homepage of Nearest Neighbors and Similarity Search”, http://simsearch.yury.name

Appendix

Benchmarks for time and space were obtained on a Dell Latitude E6410 laptop (64-bit OS M620 at 2.67 GHz with 4GB RAM) running J6.02.

Length = total number of characters in a program definition body, counting only 1 for each name.

Utility programs used:

Time =: 6 !: 2
Space =: 7 !: 2
Odom =: #~ #: i.@^

Example to match all lists of 6 integers:

   *./(NLN3a -: NLN3)"1 Odom~6
1

Example plots:

   load 'plot'
   plot n ; (Time,10&^.@Space)"1 'NLN0 ?#~',"1 ":,.n=.1000*>:i.10     NB. Note log space for common scale.

Special cases:

For an empty input, all programs return an empty output. For a list of constants, all programs return defaults.

Programs compared:

A1a =: 3 : 0
n =. #y
all =. i.0
for_i.  i.n
do.     left =. right =. i
        while.  look=.left>:0
        do.     large =. left{y
                if. large>i{y do. break. end.
                left =. <:left
        end.
        left =. left%look
        large =. large*look
        while.  look=.right<n
        do.     larger =. right{y
                if. larger>i{y do. break. end.
                right =. >:right
        end.
        right =. right%look
        larger =. larger*look
        nearer =. (i-left)(>,>:)right-i
        all =. all,>./nearer{large,larger
end.
)

A2a =: 3 : 0
n =. #y
max =. >./y
i =. 0
all =. i.0
while. i<n
   do.  it =. i{y
        if. it=max do. larger=.0 goto_next. end.
        d =. i
        while.  it>:larger=.>./d{y
           do.  d =. d + _1 1 * d~:0,n-1
          end.
label_next.     i =. i+1
                all =. all,larger
  end.
)

A2b =: 3 : 0
n =. #y
all =. y<>./y
for_i. all#i.n
   do.  d =. i
        while.  (i{y)>:larger=.>./d{y
           do.  d =. d+_1 1*d~:0,n-1
          end.
        all =. larger i} all
  end.
)

NLN3 =: 3 : 0
n =. #y
all =. y < >./y 
alli =. all # i.n       
d =. 1  
while.  0<#alli
do.     y =. y,0
        largers =. (alli-d){y
        largers =. largers >. (alli+d){y
        all =. largers alli} all
        largers =. largers <: alli{y
        alli =. largers#alli
        d =. d+1
end.    all
)

NLN3a =: 3 : 0
n =. #y
all =. y < >./y
alli =. all # i.n
d =. 0
while. (#alli)>0
   do.  y =. y,0
        d =. d + 1 _1
        largers =. >./ (d +/ alli){y
        all =. largers alli} all
        largers =. largers <: alli{y
        alli =. largers#alli
  end.  all
)

NLN5 =: 3 : 0
n =. #y
i =. 0
all =. i.0
while. i<n 
do.     x =. i{y
        d =. 1
        whilst. {./:~neighbors<:x
        do.     neighbors =. d < (i+1),(n-i)
                neighbors =. neighbors # i + d * _1 1
                neighbors =. neighbors { y
                d =. d+1
        end.
        i =. i+1
        all =. all,>./ neighbors                NB. __ default
end.
)

Contributed by HowardPeelle

Essays/NearerLargerNeighbor (last edited 2011-09-15 12:17:21 by HowardPeelle)