[Gambas-user] Get distinct array from large array

Tobias Boege tobs at taboege.de
Fri Dec 10 11:50:57 CET 2021


On Fri, 10 Dec 2021, Tobias Boege via User wrote:
> In theory, the following should be faster (at the expense of using more
> memory for the added Collection), but please do try it out on real data:
> 
>   Public Function Deduplicate(xxx As String[]) As String[]
>     Dim yyy As New String[]
>     Dim stash As New Collection
> 
>     For Each x As String in xxx
>       If stash.Exist(x) Then Continue
>       stash[x] = True
>       yyy.Add(x)
>     Next
>     Return yyy
>   End
> 
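
To try it on real data, a rough timing sketch could look like the
following. It assumes Deduplicate lives in the same module; the Main
routine, the variable names and the generated test data are made up for
illustration, so substitute your own array:

  Public Sub Main()
    Dim data As New String[]
    Dim result As String[]
    Dim i As Integer
    Dim t As Float

    ' Hypothetical test data: 100000 strings, only 1000 distinct values
    For i = 1 To 100000
      data.Add("item" & CStr(i Mod 1000))
    Next

    ' Timer returns the seconds elapsed since the program started
    t = Timer
    result = Deduplicate(data)
    Print "distinct: "; result.Count; ", seconds: "; Timer - t
  End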

PS: Since Gambas 3.17, Collections have a Keys property, which makes the
array yyy redundant:

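  Public Function Deduplicate(xxx As String[]) As String[]
    Dim stash As New Collection

    ' Assigning to an existing key just overwrites its value, so every
    ' distinct string ends up as exactly one key.
    For Each x As String In xxx
      stash[x] = True
    Next
    Return stash.Keys
  End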

I think the current implementation of Collection even ensures that Keys
returns the keys in the order in which you inserted them. That order is
documented at least for iteration using For Each [1], so I would say that
you can rely on it in future Gambas versions as well. This may or may not
matter to you: in the worst case, the code above returns the deduplicated
items in an unspecified order.
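
A quick way to check the ordering on your own Gambas version is a small
sketch like this one (the Main routine and the sample words are made up
for illustration, and Deduplicate is assumed to be in the same module):

  Public Sub Main()
    Dim words As String[] = ["pear", "apple", "pear", "fig", "apple"]
    Dim unique As String[] = Deduplicate(words)

    ' If insertion order is preserved, this prints: pear, apple, fig
    Print unique.Join(", ")
  End

If the order turns out to matter and you would rather not depend on it,
the first version with the separate yyy array keeps the order of first
occurrence by construction.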

Best,
Tobias

[1] http://gambaswiki.org/wiki/comp/gb/collection/_next

