[Gambas-user] problems in trie class

Charlie Reinl Karl.Reinl at ...2345...
Mon Dec 1 22:54:35 CET 2014


Am Montag, den 01.12.2014, 17:47 +0100 schrieb Tobias Boege:
> On Sat, 29 Nov 2014, Charlie Reinl wrote:
> > Am Samstag, den 29.11.2014, 20:05 +0100 schrieb Tobias Boege:
> > > On Tue, 18 Nov 2014, Karl Reinl wrote:
> > > > Salut Tobi,
> > > > 
> > > > played with you trie example (trietest) it crash if 
> > > > p = h.GetPrefix("texte") find nothing (p=null), even when change
> > > > to p = h.GetPrefix("Texte")
> > > > 
> > > > My change is h["texte"] to h["Texte"] (source attached)
> > > 
> > > Can you run your tests with #6688?
> > > 
> > > Indeed there were bit width errors (I think) but what caused your particular
> > > error here was that a TriePrefix object would erroneously drop reference
> > > counts of its parent Trie object if it couldn't be created. Total nonsense.
> > > 
> > > Thanks for your report!
> > > 
> > > As for the leak you showed in a follow-up, I couldn't reproduce that. I'll
> > > try harder if the problem persists with #6688. But I can tell you that this
> > > does not necessarily indicate a severe problem / corruption, as I don't
> > > terminate strings in my trie backend code and it could just be some length
> > > calculations that went wrong. (Most probably that is because the Gambas
> > > string functions automatically use strlen() to determine a string's length
> > > when I give 0 as a length parameter. However, strlen() must not be used on
> > > the strings from my trie.)
> > > 
> > > Regards,
> > > Tobi
> > > 
> > 
> > Salut Tobi,
> > 
> > yes, now no more crash, only an error raises, thats oK.
> > The leak shown, I can't reproduce any more now..... BUT
> > Now TriePrefix is case sensitive, in my follow-up the TriePrefix "d"
> > showed me "D" AND "d"entries, now only the "d", may be thats how trie
> > work normally,
> >
> 
> It should be case-sensitive. If it wasn't before on your system, that was
> a bug (I can't imagine where it came from, though).
> 
> > but for my behaves non case sensitive would be better (or
> > a switch to do like that) 
> > And we talked about something like <trie>.Add(Value,Key) for simplifing
> > filling.
> > 
> 
> OK, you get an Add() and Remove() method in #6699, similar to what
> Collection has.
> 
> As for the case-insensitivity: I can add an optional constructor argument,
> Mode, which can be gb.Binary or gb.IgnoreCase (just as Collection has).
> 
> (After writing about half an hour complaining how hard it would be to get
> case-insensitivity right and efficient) I just had a magnificent idea: I
> will extend the native Trie class in Gambas and do something like that:
> 
>   ' Written from scratch, may contain syntax, etc. errors
> 
>   Public Struct _Trie_Entry
>     Key As String
>     Value As Variant
>   End Struct
> 
>   Public Sub Add(Value As Variant, Key As String)
>     Dim hEntry As New _Trie_Entry
>     Dim sKey As String = Key
> 
>     If $iMode = gb.IgnoreCase Then sKey = String.Upper(Key)
>     hEntry.Key = Key
>     hEntry.Value = Value
>     Super.Add(Value, sKey)
>   End
> 
> Similarly I can augment _get, _put, etc. so that you won't notice that the
> _Trie_Entry structure exists at all.
> 
> If you request a case-insensitive Trie, all keys are upper-case'd internally
> and the real keys are saved as part of the stored object. So you can get
> your original key back later in an enumeration and I don't have to add
> branches in the hot paths of the trie code to support case-insensitivity
> (which would make the whole thing slower -- I don't know if it would be
> noticeable, but...). I will see if I can do that (_next() may impose a
> little problem or maybe not).
> 
> @Benoit: I am not familiar with the caveats of UTF-8 strings. If I want to
> implement case-insensitivity by internally converting all characters to
> upper-case, is it sufficient to use String.Upper() or are there hidden
> pitfalls?
> 
> Regards,
> Tobi
> 

Salut Tobi,

I think it would be best, to stay close to the original definition of
trie..."should be case-sensitive".
Every thing else, I can do by myself.
It was just a question because off the first outputs.

-- 
Amicalement
Charlie





More information about the User mailing list