[Gambas-devel] [Gambas-user] gb.adt component

Sun Aug 19 21:20:45 CEST 2012

Le 18/08/2012 14:47, Tobias Boege a écrit :
> Hi,
>
> as of rev5052 you will notice a new gb.adt component as a playground for
> abstract datatypes implementations.
> I had most of the algorithms already lying around from another project.
> Everything is, however, written from scratch and the code depends on nothing
> but the C library.
>
> Only hurdle was to incorporate the algorithms into the Gambas memory
> management: You can imagine, that it is tricky to implement circular
> double-linked lists of objects without creating circular references on the
> list nodes.
>
> The implementation of linked lists features two new classes:
> - List, which represents a single list node. Technically speaking this list
>    node is a fully capable linked list itself because it is linked to itself.
> - ListRoot which is just a List object marked special so that special
>    properties can be applied to it.
>
> About List:
> The first I want to state is that List nodes can be either used standalone
> and created to _carry_ an object or they can be embedded into a new class
> and _be carried_ by the object. You must tell the List constructor what kind
> of mode your List node shall perform in. This is important because an
> embedded List node which references its container creates unresolvable
> circular references - which we don't want - otherwise a non-embedded list
> nodes which does not reference its linked object may see the latter vanish
> away because of its reference count becoming zero - not good either.
>
> List nodes have a Next-element, Prev-element and Data-element pointer. As
> said above, they are linked lists itself because initially their Next and
> Prev pointers point to themselves.
> Note carefully that the List.Add*() functions add _entire lists_. This is
> normally not a problem because most people will add one node (which is
> linked to itself into an empty list) but it is also possible to:
>
> ' List1 -> List2
> List1.AddAfter(List2)
>
> ' GlobalList -> List1 -> List2
> GlobalList.AddAfter(List1)
>
> As far as a List node is in a non-empty list (i.e. a list which is not only
> made up of itself), it gets a reference count. By the nature of circular
> linked lists, this creates circular references. Consequently, if you create
> a list, you must tidy it up by means of List.Clear() or differently,
> depending on your application.
>
> As imaginable with circular double-linked lists, you can do everything from
> any node in the list: Add new lists/nodes, traverse the list, clear the
> list, enumerate all objects in a list, unlink single nodes, extract lists
> from others, etc..
>
> About ListRoot:
> Don't fear the ListRoot because it is the means I intended to solve the
> memory obstacle of circular references. Actually ListRoot may be a
> misleading name (help me native speakers!). ListRoot semantics differ from
> normal List nodes in the following details:
>
> - ListRoots don't carry data
> - In a 'For Each data In List' they are skipped (s.a.)
> - When linked in a list they get _no_ reference count.
> - They cannot be linked to a list but the list must be linked to them. This
>    helps prevent multiple ListRoots in a List infrastructure.
>
> You got the two main reasons for which the List and ListRoot classes are
> native now: I play tricks with reference counts (with embedded Lists and
> ListRoots).
> If you like to adhere to just two constraints you can have your lists be
> tidied up for you automatically:
> 1. Link a ListRoot to your list (the For-Each detail above tries to hide the
>     presence of ListRoot from your normal work)
> 2. Always keep the ListRoot in a variable as long as you want the list to be
>     existent. When the ListRoot leaves scope, is thus freed, it calls
>     List.Clear() in its _free() special method so that all List nodes are
>     freed properly. It is possible for a linked ListRoot to get a zero
>     reference count because it just doesn't get a ref when in a list - in
>     contrast to normal List nodes, as stated above.
>
> You will see how it works in the attached test project.
>
> Bruce, I hope you will never have to write linked lists from scratch
> anymore ;-) Tell me if the current implementation is useful if you find
> time.
>
> Next are the classes Deque, Stack, Queue and PrioQueue (priority queue)
> which are quite self-explanatory and nothing special. Except, maybe, that
> they are built on top of the List interface so you can decide if you need to
> maintain highly dynamic sort of Deque (use my implementation) or a
> fixed-size one (use Variant[] then).
>
> Last so far is the Circular class. It provides a fixed-size circular/ring
> buffer with one reader and one writer.
>
> I invite you to try these classes out. Report problems and suggestions. I
> haven't read this Knuth so alternate ideas and interfaces are likely to be
> around and are welcome!
>
> Beno�t, there is a datastructure CLIST_INTF declared in gb.adt/src/c_list.h.
> I have not yet worried about GB.GetInterface() or something so please excuse
> me taking an unduly independent line. I saw that there already is a
> rudimentary linked list in main/share/gb_list_temp.h. I personally use
> linked lists quite often and thought it could be part of the interpreter API
> just as Memory and Gambas arrays are. What do you think about that?
>
> Regards,
> Tobi
>

Hi Tobi,

If Deque, Stack... classes can be useful, I'm not sure about your List 
and ListRoot classes.

I think you tried to mimic in Gambas the way linked list are implemented 
in C: consequently, the Gambas interface is over-complicated.

First, you must know that linked list are far less useful nowadays. As 
they trash the CPU cache, using an array is way faster, the speed gain 
usually compensating the need for reallocation. Bet let's ignore that point.

I think a List class should have the following interface:
- Create a new List. (_new)
- Put one Variant at the beginning. (Add / Append)
- Put one Variant at the end. (Prepend)
- Enumerate the contents. (_next)
- Have an internal cursor, on which we could do a MoveNext() and 
MovePrevious(). (MoveNext, MovePrevious, MoveFirst, MoveLast).
- Have a current item, and be allow to retrieve it. (Current)
- A Count property. (Count)
- An array method to retrieve the i-th element (slowly of course). 
(_get), and eventually to modify it (_put)
- Take the current element from the list (i.e. removing it from the list 
and returning it). (Take)
- Find a value inside the list. (Find or FindFirst, maybe FindLast too)
- Find a value from the current one, forward or backward. (FindNext, 
FindPrevious).

Internally, the List can be managed with a double-linked list of nodes, 
each node having a GB_VARIANT_VALUE inside.

But, for performance reasons, it can later be implemented as a 
double-linked list of arrays of Variant. Faster, but harder to implement.

As for "embedding" the list inside an object, i.e. doing what we usually 
do in C, I think I understood what you want to do, but I don't think it 
is really useful in Gambas.

The only advantage I see is being able to find the next (or previous) 
object of the list as soon as you have a reference on the object. Which 
is not possible with the previous List class.

To do that, maybe I would create another class (something like 
'StaticList'). But anyway it will not be easy to have a practical interface!

As for having a list interface in the interpreter API: why not. But it 
would be done if this is really useful, as implementation of linked list 
is done in a few lines of codes, and usually need to be adapted for 
performance reasons. Anyway see '/trunk/main/share/gb_list.h'
and '/trunk/main/share/gb_list_temp.h', which is my own implementation 
of embedded linked lists in C.

So tell me what you think.

Regards,

-- 
Benoît Minisini