Excuse me, this is my first Stackoverflow question, so, any tips/advice on what I can do to improve it would be wonderful, in addition to some help.


The problem:

I have a slice that I am trying to group into smaller slices by certain criteria. I then need to merge the newly created slices with each other if they contain any of the same values in the slice. (Essentially, appending slices together that have "overlapping" values).

Some additional notes about the problem:

  • The number of items in the original slice will likely be between 1-50, in most cases, with outliers rarely exceeding 100.

  • Once gropued, the size of the 'inside' slices will be between 1-10 values.

  • Performance is a factor, as this operation will be run as part of a webservice where a single request will perform this operation 20+ times, and there can be many (hundreds - thousands) of requests per minute at peak times. However, clarity of code is also important.

  • My implementation is using ints, the final implementation would have more complex structs, though I was considering making a map and then use the implementation shown below based upon the keys. Is this a good idea?

I have broken the problem down into a few steps:

  1. Create a 2D slice with groupings of values, based up criteria (the initial grouping phase)
  2. Attempt to merge slices in place if they include a duplicated value.

I am running into two problems:

First, I think my implementation might not scale super well, as it tends to have some nested loops (however, these loops will be iterating on small slices, so that might be ok)

Second, my implementation is requiring an extra step at the end to remove duplicated values, ideally we should remove it.


Input: [ 100, 150, 300, 350, 600, 700 ] Expected Output: [[100 150 300 350] [600 700]]

This is with the 'selection criteria' of grouping values that are within 150 units of at least one other value in the slice.

package main

import (
    "fmt"
    "sort"
)

func filter(vs []int, f func(int) bool) []int {
    vsf := make([]int, 0)
    for _, v := range vs {
        if f(v) {
            vsf = append(vsf, v)
        }
    }
    return vsf
}

func unique(intSlice []int) []int {
    keys := make(map[int]bool)
    list := []int{} 
    for _, entry := range intSlice {
        if _, value := keys[entry]; !value {
            keys[entry] = true
            list = append(list, entry)
        }
    }    
    return list
}

func contains(intSlice []int, searchInt int) bool {
    for _, value := range intSlice {
        if value == searchInt {
            return true
        }
    }
    return false
}

func compare(a, b []int) bool {
    if len(a) != len(b) {
        return false
    }

    if (a == nil) != (b == nil) {
        return false
    }

    b = b[:len(a)] 
    for i, v := range a {
        if v != b[i] { 
            return false
        }
    }

    return true
}



func main() {
    fmt.Println("phase 1 - initial grouping")
    s := []int{100, 150, 300, 350, 600, 700}
    g := make([][]int, 0)

    // phase 1
    for _, v := range s {
        t := filter(s, func(i int) bool { return i - v >= -150 && i - v <= 150 })
        for _, v1 := range t {
            t1 := filter(s, func(i int) bool { return i - v1 >= -150 && i - v1 <= 150})
            t = unique(append(t, t1...))
            sort.Ints(t)
        }

        g = append(g, t)
        fmt.Println(g)
    }
    // phase 2
    fmt.Println("phase 2 - merge in place")

    for i, tf := range g {
        for _, death := range tf {
            if i < len(g) - 1 && contains(g[i+1], death) {
                g[i+1] = unique(append(g[i], g[i+1]...))
                g = g[i+1:] 
            } else if i == len(g) - 1 {
                fmt.Println(g[i], g[i-1])
                // do some cleanup to make sure the last two items of the array don't include duplicates
                if compare(g[i-1], g[i]) {
                    g = g[:i]
                }
            }
        }
        fmt.Println(i, g)
    }
}