[Gambas-user] problems in trie class
Charlie Reinl
Karl.Reinl at ...2345...
Mon Dec 1 22:54:35 CET 2014
On Monday, 01.12.2014, at 17:47 +0100, Tobias Boege wrote:
> On Sat, 29 Nov 2014, Charlie Reinl wrote:
> > On Saturday, 29.11.2014, at 20:05 +0100, Tobias Boege wrote:
> > > On Tue, 18 Nov 2014, Karl Reinl wrote:
> > > > Salut Tobi,
> > > >
> > > > I played with your trie example (trietest). It crashes if
> > > > p = h.GetPrefix("texte") finds nothing (p = Null), even when I change
> > > > it to p = h.GetPrefix("Texte").
> > > >
> > > > My change is from h["texte"] to h["Texte"] (source attached).
> > >
> > > Can you run your tests with #6688?
> > >
> > > Indeed there were bit width errors (I think) but what caused your particular
> > > error here was that a TriePrefix object would erroneously drop reference
> > > counts of its parent Trie object if it couldn't be created. Total nonsense.
> > >
> > > Thanks for your report!
> > >
> > > As for the leak you showed in a follow-up, I couldn't reproduce that. I'll
> > > try harder if the problem persists with #6688. But I can tell you that this
> > > does not necessarily indicate a severe problem / corruption, as I don't
> > > terminate strings in my trie backend code and it could just be some length
> > > calculations that went wrong. (Most probably that is because the Gambas
> > > string functions automatically use strlen() to determine a string's length
> > > when I give 0 as a length parameter. However, strlen() must not be used on
> > > the strings from my trie.)
> > >
> > > Regards,
> > > Tobi
> > >
> >
> > Salut Tobi,
> >
> > Yes, there is no more crash now, only an error is raised, and that's OK.
> > I can no longer reproduce the leak I showed... BUT
> > now TriePrefix is case-sensitive. In my follow-up the TriePrefix "d"
> > showed me both the "D" and the "d" entries; now it shows only the "d"
> > ones. Maybe that's how a trie normally works,
> >
>
> It should be case-sensitive. If it wasn't before on your system, that was
> a bug (I can't imagine where it came from, though).
>
> > but for my purposes case-insensitive behaviour would be better (or
> > a switch to enable it).
> > And we talked about something like <trie>.Add(Value, Key) to simplify
> > filling.
> >
>
> OK, you get an Add() and Remove() method in #6699, similar to what
> Collection has.
>
> As for the case-insensitivity: I can add an optional constructor argument,
> Mode, which can be gb.Binary or gb.IgnoreCase (just as Collection has).
>
> (After spending about half an hour writing up how hard it would be to get
> case-insensitivity right and efficient) I just had a magnificent idea: I
> will extend the native Trie class in Gambas and do something like this:
>
> ' Written from scratch, may contain syntax, etc. errors
>
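> Public Struct _Trie_Entry
>   Key As String
>   Value As Variant
> End Struct
>
> Public Sub Add(Value As Variant, Key As String)
>   Dim hEntry As New _Trie_Entry
>   Dim sKey As String = Key
>
>   ' Normalise the lookup key only in case-insensitive mode
>   If $iMode = gb.IgnoreCase Then sKey = String.Upper(Key)
>   ' Keep the original key next to the value
>   hEntry.Key = Key
>   hEntry.Value = Value
>   ' Store the entry (not the bare value) under the normalised key
>   Super.Add(hEntry, sKey)
> End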
>
> Similarly I can augment _get, _put, etc. so that you won't notice that the
> _Trie_Entry structure exists at all.
>
> If you request a case-insensitive Trie, all keys are upper-cased internally
> and the real keys are saved as part of the stored object. So you can get
> your original key back later in an enumeration and I don't have to add
> branches in the hot paths of the trie code to support case-insensitivity
> (which would make the whole thing slower -- I don't know if it would be
> noticeable, but...). I will see if I can do that (_next() may pose a
> small problem, or maybe not).
>
> @Benoit: I am not familiar with the caveats of UTF-8 strings. If I want to
> implement case-insensitivity by internally converting all characters to
> upper-case, is it sufficient to use String.Upper() or are there hidden
> pitfalls?
>
> Regards,
> Tobi
>
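On the String.Upper() question above, a small sketch may help. It is written in Python rather than Gambas, and it only demonstrates what Python's str.upper() does; whether Gambas' String.Upper() behaves the same way on these inputs has not been checked here. The helper names upper_key and nfc_upper_key are made up for the example. It shows three general Unicode effects that matter when upper-cased strings become trie keys: the length can change, distinct keys can collide, and keys that look identical can stay distinct unless they are normalised first.

```python
import unicodedata


def upper_key(key: str) -> str:
    """Hypothetical key normalisation: plain upper-casing."""
    return key.upper()


def nfc_upper_key(key: str) -> str:
    """Hypothetical variant: compose to NFC first, then upper-case."""
    return unicodedata.normalize("NFC", key).upper()


# 1. The length can change: U+00DF (sharp s) upper-cases to two letters.
word = "stra\u00dfe"
print(len(word), len(upper_key(word)))

# 2. Distinct keys can collide: dotless i (U+0131) and "i" both map to "I".
print(upper_key("\u0131") == upper_key("i"))

# 3. Look-alike keys can stay distinct: precomposed e-acute (U+00E9)
#    versus "e" followed by a combining acute accent (U+0301).
composed, decomposed = "\u00e9", "e\u0301"
print(upper_key(composed) == upper_key(decomposed))
print(nfc_upper_key(composed) == nfc_upper_key(decomposed))
```

Storing the original key next to the value, as proposed above, keeps the first two effects harmless for enumeration, since the upper-cased form is never shown to the user.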
Salut Tobi,
I think it would be best to stay close to the original definition of a
trie: "should be case-sensitive".
Everything else I can do myself.
It was just a question because of the first output.
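
Doing it on my side could look roughly like the following. This is only a sketch under assumptions: it is written in Python rather than Gambas, a plain dict stands in for the case-sensitive Trie, and CaseInsensitiveMap, add, get and with_prefix are names made up for the example. The idea is the one from the quoted mail: look up by an upper-cased key and keep the original key next to the value.

```python
class CaseInsensitiveMap:
    """Case-insensitive lookups on top of a case-sensitive store.

    A plain dict stands in for the case-sensitive Trie here.
    """

    def __init__(self):
        # normalised key -> (original key, value)
        self._store = {}

    @staticmethod
    def _norm(key):
        # Upper-case the key so that "Dog" and "dog" share one slot.
        return key.upper()

    def add(self, value, key):
        # Keep the original spelling so it can be shown again later.
        self._store[self._norm(key)] = (key, value)

    def get(self, key):
        return self._store[self._norm(key)][1]

    def with_prefix(self, prefix):
        """Yield (original key, value) for keys starting with prefix, ignoring case."""
        wanted = self._norm(prefix)
        for norm_key, (orig_key, value) in self._store.items():
            if norm_key.startswith(wanted):
                yield orig_key, value


m = CaseInsensitiveMap()
m.add(1, "Dog")
m.add(2, "door")
m.add(3, "cat")
print(sorted(m.with_prefix("d")))
```

With the prefix "d" this lists both the "Dog" and the "door" entry with their original spelling, which is the behaviour I saw in the first output.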
--
Amicalement
Charlie