[Gambas-user] Size of a result in memory

Christof Thalhofer chrisml at deganius.de
Thu Oct 19 09:26:38 CEST 2017


Hello Tobias,

Am 18.10.2017 um 17:23 schrieb Tobias Boege:

> You know, I was about to write that the result object, or the library
> that is used by the database driver in Gambas, probably only does some
> prefetching of a fixed number > 1 of results to reduce the overhead
> incurred by network communication. But you were right: I was only assuming
> what I said last time and since I have a table at hand, in a local SQLite3
> database, which is 5.5 GiB in size, I thought I could provide numbers as
> well. I wrote a Gambas program that connects to the database and does a
> "SELECT * FROM table", then iterates through the entire result.
> 
> Guess what, watching its memory usage in htop, I see that it consumes
> about 2 GiB, then segfaults. This is not what I expected, even having
> the SQLite3 limits[1] in mind. In fact, my perl6 program, which usually
> operates this database, has no trouble doing exactly the same thing,
> and it uses a constant amount of memory the whole time.

A sql SELECT, that fetches 2GB and more is pushing the thing to the
extreme. AFAIK for such sizes there exist better tools (depending on the
dbms vendor) like postgresql COPY, mysql LOAD DATA or maybe an
equivalent in sqlite.

> I would dare to say that there is something wrong in some part of gb.db.
> I'm curious about that but my leisure time for today was used up by these
> experiments.

For now I am quite happy with the fact that Gambas fetches the output of
a SELECT in one piece, as you see: I can cache it for later usage.

Anything else would be obscure to me.

The segfault you are watching could indeed be a sign that there is sth
wrong. But when Gambas segfaults on a sql query, that fetches 2 GB, it
is nothing that disturbs my sleep.

> My proposal would be this: take the *row count* as an approximation to the
> cache size and limit the cache in terms of records. You can easily[*] get
> the row count of a result and maintain the number of rows stored in your
> cache.

This is as good idea. I can calculate the average size of the first 10
rows by iterating through all fields and then deliver it * rowcount as
Size_Maybewrong.

:-)


Alles Gute

Christof Thalhofer

-- 
Dies ist keine Signatur

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: OpenPGP digital signature
URL: <http://lists.gambas-basic.org/pipermail/user/attachments/20171019/19e85fa2/attachment.sig>


More information about the User mailing list