[Gambas-user] I need a hint on how to delete duplicate items in an array
nando_f at ...951...
Sun Jul 2 01:24:56 CEST 2017
There are much faster ways... just a little more intricate.
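For example (just a sketch, assuming the words are already collected in a
String[] called MatchedWords and the results go into a String[] called
UniqWords, as in your code): let a Collection do the counting in a single
pass, then sort only the unique entries.

  ' Sketch: count occurrences with a Collection (hash table) in one pass.
  Dim sWord As String
  Dim vCount As Variant
  Dim cCounts As New Collection(gb.IgnoreCase)

  For Each sWord In MatchedWords
    If cCounts.Exist(sWord) Then
      cCounts[sWord] = cCounts[sWord] + 1
    Else
      cCounts[sWord] = 1
    Endif
  Next

  ' Build the same "count#word" entries, then one descending sort.
  For Each vCount In cCounts
    UniqWords.Push(Format(vCount, "0###") & "#" & cCounts.Key)
  Next
  UniqWords.Sort(gb.Descent + gb.Language + gb.IgnoreCase)

This way the big array is never sorted at all; only the unique words are.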
Nando
--
Open WebMail Project (http://openwebmail.org)
---------- Original Message -----------
From: Fernando Cabral <fernandojosecabral at ...626...>
To: mailing list for gambas users <gambas-user at lists.sourceforge.net>
Sent: Sat, 1 Jul 2017 00:18:42 -0300
Subject: Re: [Gambas-user] I need a hint on how to delete duplicate items in an array
> I thank you guys for the hints on counting and eliminating duplicates.
> In the end, I resorted to something that is very simple and does the
> trick in three steps. In the first step I sort the array. In the second
> step I count the number of occurrences and prepend it to the word
> itself (with a separator). In the third step I sort the array again, so
> now I have it sorted by the number of occurrences, from the largest to
> the smallest.
>
> That is all I need.
>
> Nevertheless, I am concerned with the performance. For 69,725 words, of
> which 8,987 were unique, it took 28 seconds for the code below to
> execute. I will survive these 28 seconds if I have to, but I would
> still like to find a faster solution.
>
> On the other hand, I think I am close to the fastest possible solution.
> Basically, the array will be traversed once only, no matter how many
> terms and how many repetitions it may have.
>
> (What do you think about this efficiency, Tobi?)
>
> MatchedWords.Sort(gb.Ascent + gb.Language + gb.IgnoreCase)
> For i = 0 To MatchedWords.Max
>   n = 1
>   For j = i + 1 To MatchedWords.Max
>     If (Comp(MatchedWords[i], MatchedWords[j], gb.Language + gb.IgnoreCase) = 0) Then
>       n += 1
>     Else
>       Break
>     Endif
>   Next
>   UniqWords.Push(Format(n, "0###") & "#" & MatchedWords[i])
>   i += (n - 1)
> Next
>
> UniqWords.Sort(gb.Descent + gb.Language + gb.IgnoreCase)
> For i = 0 To UniqWords.Max
>   Print UniqWords[i]
> Next
>
> 2017-06-30 15:10 GMT-03:00 Gianluigi <bagonergi at ...626...>:
>
> > Just out of curiosity: on my computer, my function (double) processes
> > 10 million strings (first and last name) in about 3 seconds. It is a
> > very naive measurement, using Timers and a limited number of names
> > and surnames, e.g. Willy Weber came up 11051 times.
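> > (The measurement was nothing more than something like the sketch
> > below; CountDuplicates is only a placeholder name for the function
> > being timed:)
> >
> >   ' Naive timing sketch: Timer returns seconds since program start.
> >   Dim fStart As Float
> >   fStart = Timer
> >   CountDuplicates(MatchedWords)   ' placeholder for the measured call
> >   Print "Elapsed: "; Timer - fStart; " seconds"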
> >
> > To demonstrate the soundness of Tobias' arguments (about 1 million
> > strings in 3 hundredths of a second), I really understood (I hope)
> > what he wanted to say.
> >
> > Sorry for my response times, but today my modem works worse than my
> > brain.
> >
> > Regards
> > Gianluigi
> >
> > 2017-06-30 17:58 GMT+02:00 Gianluigi <bagonergi at ...626...>:
> >
> > > Sorry Tobias,
> > > other explanations are not necessary.
> > > I would not be able to understand :-(
> > > I accept what you have already explained to me as dogma, and I will
> > > try to put it into practice by copying your code :-).
> > >
> > > Thanks again.
> > >
> > > Gianluigi
> > >
> > > 2017-06-30 17:44 GMT+02:00 Gianluigi <bagonergi at ...626...>:
> > >
> > >>
> > >> 2017-06-30 17:21 GMT+02:00 Tobias Boege <taboege at ...626...>:
> > >>
> > >>>
> > >>> I wouldn't say there is anything *wrong* with it, but it also has
> > >>> quadratic worst-case running time. You use String[].Push(), which
> > >>> is just another name for String[].Add(). Adding an element to an
> > >>> array (the straightforward way) is done by extending the space of
> > >>> that array by one further element and storing the value there. But
> > >>> extending the space of an array could potentially require copying
> > >>> the whole array somewhere else (where there is enough free memory
> > >>> at the end of the array to enlarge it). Doing worst-case analysis,
> > >>> we have to assume that this bad case always occurs.
> > >>>
> > >>> If you fill an array with n values, e.g.
> > >>>
> > >>>     Dim a As New Integer[]
> > >>>     For i = 1 To n
> > >>>       a.Add(i)
> > >>>     Next
> > >>>
> > >>> then you loop n times, and in the i-th iteration there are already
> > >>> i elements in your array. Adding one further element will, in the
> > >>> worst case, require i copy operations. 9-year-old C.F. Gauss will
> > >>> tell you that the total number of store operations is then about
> > >>> 1 + 2 + ... + n = n(n + 1)/2, i.e. on the order of n^2.
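> > >>> (If you know n in advance, a sketch of how to avoid all that
> > >>> copying is to allocate the array once and assign by index instead
> > >>> of calling Add():)
> > >>>
> > >>>     ' Sketch: pre-sized array, so no reallocation while filling.
> > >>>     Dim a As New Integer[n]   ' n elements allocated up front
> > >>>     For i = 1 To n
> > >>>       a[i - 1] = i            ' index assignment, no Add()/Push()
> > >>>     Next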
> > >>>
> > >>>
> > >> Tobias, you are always kind, and thank you very much.
> > >> Is it possible for you to explain this more simply, for me (a
> > >> poorly educated boy :-))?
> > >>
> > >>
> > >>
> > >>> And your function does two jobs simultaneously but only returns
> > >>> the result of one of the jobs. The output you get is only worth
> > >>> half the time you spent.
> > >>>
> > >>>
> > >> I put two functions in one just to save space; this is a simple
> > >> example. :-)
> > >>
> > >> Regards
> > >> Gianluigi
> > >>
> > >
> > >
>
> --
> Fernando Cabral
> Blog: http://fernandocabral.org
> Twitter: http://twitter.com/fjcabral
> e-mail: fernandojosecabral at ...626...
> Facebook: f at ...3654...
> Telegram: +55 (37) 99988-8868
> Wickr ID: fernandocabral
> WhatsApp: +55 (37) 99988-8868
> Skype: fernandojosecabral
> Landline: +55 (37) 3521-2183
> Mobile: +55 (37) 99988-8868
>
> As long as there is a single person in the world without a home or
> without food, no politician or scientist will be able to boast of
> anything.
------- End of Original Message -------