[Gambas-user] problems in trie class

Tobias Boege taboege at ...626...
Mon Dec 1 23:39:26 CET 2014


On Mon, 01 Dec 2014, Charlie Reinl wrote:
> Am Montag, den 01.12.2014, 17:47 +0100 schrieb Tobias Boege:
> > On Sat, 29 Nov 2014, Charlie Reinl wrote:
> > > Am Samstag, den 29.11.2014, 20:05 +0100 schrieb Tobias Boege:
> > > > On Tue, 18 Nov 2014, Karl Reinl wrote:
> > > > > Salut Tobi,
> > > > > 
> > > > > played with you trie example (trietest) it crash if 
> > > > > p = h.GetPrefix("texte") find nothing (p=null), even when change
> > > > > to p = h.GetPrefix("Texte")
> > > > > 
> > > > > My change is h["texte"] to h["Texte"] (source attached)
> > > > 
> > > > Can you run your tests with #6688?
> > > > 
> > > > Indeed there were bit width errors (I think) but what caused your particular
> > > > error here was that a TriePrefix object would erroneously drop reference
> > > > counts of its parent Trie object if it couldn't be created. Total nonsense.
> > > > 
> > > > Thanks for your report!
> > > > 
> > > > As for the leak you showed in a follow-up, I couldn't reproduce that. I'll
> > > > try harder if the problem persists with #6688. But I can tell you that this
> > > > does not necessarily indicate a severe problem / corruption, as I don't
> > > > terminate strings in my trie backend code and it could just be some length
> > > > calculations that went wrong. (Most probably that is because the Gambas
> > > > string functions automatically use strlen() to determine a string's length
> > > > when I give 0 as a length parameter. However, strlen() must not be used on
> > > > the strings from my trie.)
> > > > 
> > > > Regards,
> > > > Tobi
> > > > 
> > > 
> > > Salut Tobi,
> > > 
> > > yes, now no more crash, only an error raises, thats oK.
> > > The leak shown, I can't reproduce any more now..... BUT
> > > Now TriePrefix is case sensitive, in my follow-up the TriePrefix "d"
> > > showed me "D" AND "d"entries, now only the "d", may be thats how trie
> > > work normally,
> > >
> > 
> > It should be case-sensitive. If it wasn't before on your system, that was
> > a bug (I can't imagine where it came from, though).
> > 
> > > but for my behaves non case sensitive would be better (or
> > > a switch to do like that) 
> > > And we talked about something like <trie>.Add(Value,Key) for simplifing
> > > filling.
> > > 
> > 
> > OK, you get an Add() and Remove() method in #6699, similar to what
> > Collection has.
> > 
> > As for the case-insensitivity: I can add an optional constructor argument,
> > Mode, which can be gb.Binary or gb.IgnoreCase (just as Collection has).
> > 
> > (After writing about half an hour complaining how hard it would be to get
> > case-insensitivity right and efficient) I just had a magnificent idea: I
> > will extend the native Trie class in Gambas and do something like that:
> > 
> >   ' Written from scratch, may contain syntax, etc. errors
> > 
> >   Public Struct _Trie_Entry
> >     Key As String
> >     Value As Variant
> >   End Struct
> > 
> >   Public Sub Add(Value As Variant, Key As String)
> >     Dim hEntry As New _Trie_Entry
> >     Dim sKey As String = Key
> > 
> >     If $iMode = gb.IgnoreCase Then sKey = String.Upper(Key)
> >     hEntry.Key = Key
> >     hEntry.Value = Value
> >     Super.Add(Value, sKey)
> >   End
> > 
> > Similarly I can augment _get, _put, etc. so that you won't notice that the
> > _Trie_Entry structure exists at all.
> > 
> > If you request a case-insensitive Trie, all keys are upper-case'd internally
> > and the real keys are saved as part of the stored object. So you can get
> > your original key back later in an enumeration and I don't have to add
> > branches in the hot paths of the trie code to support case-insensitivity
> > (which would make the whole thing slower -- I don't know if it would be
> > noticeable, but...). I will see if I can do that (_next() may impose a
> > little problem or maybe not).
> > 
> > @Benoit: I am not familiar with the caveats of UTF-8 strings. If I want to
> > implement case-insensitivity by internally converting all characters to
> > upper-case, is it sufficient to use String.Upper() or are there hidden
> > pitfalls?
> > 
> > Regards,
> > Tobi
> > 
> 
> Salut Tobi,
> 
> I think it would be best, to stay close to the original definition of
> trie..."should be case-sensitive".
> Every thing else, I can do by myself.
> It was just a question because off the first outputs.
> 

Attached is a project that does this. It should be a complete case-
insensitive version of the Trie class. Should work properly with the
latest revision (the Trie had a memory leak when objects were replaced
until a few minutes ago).

Regards,
Tobi

-- 
"There's an old saying: Don't change anything... ever!" -- Mr. Monk
-------------- next part --------------
A non-text attachment was scrubbed...
Name: InsensitiveTrie-0.0.1.tar.gz
Type: application/octet-stream
Size: 5232 bytes
Desc: not available
URL: <http://lists.gambas-basic.org/pipermail/user/attachments/20141201/fd4c75a9/attachment.obj>


More information about the User mailing list