
Re: New routine in the development version : Tokenize()


Hi,
Very useful. I'm working on a parser for SVG files and have a working
version.
Compared to XML, it's very slow (3-4 times slower), but the resulting files
are 20% smaller. For me, size is more important than speed.

I've done a proof of concept with 'Tokenize()': it is fast and simplifies
development.
The results with a tiny SVG file:
- My parser: 1.02 ms
- XML: 0.31 ms
- Using 'Tokenize()': 0.43 ms

It's a huge improvement!
Best regards.

On Fri, 3 Oct 2025 at 14:25, Fabien Bodard (<gambas.fr@xxxxxxxxx>)
wrote:

> Hi, Benoit
>
> Wow... a really useful thing; I use this kind of thing so often.
>
> The first example that comes to mind is parsing custom settings files or
> custom file formats.
>
> But it could also be wiki meta code or anything else (implementing YAML
> settings read/write)...
>
> Thank you for this improvement :-).
>
>
> On Fri, 3 Oct 2025 at 10:04, Benoît Minisini <
> benoit.minisini@xxxxxxxxxxxxxxxx> wrote:
>
>> On 03/10/2025 at 09:43, Benoît Minisini wrote:
>> > Hi,
>> >
>> > I have just added a general-purpose 'Tokenize()' routine to Gambas that
>> > splits a string into tokens (identifiers, numbers, strings...)
>> > according to a few arguments.
>> >
>> > The documentation is at: https://gambaswiki.org/wiki/lang/tokenize
>> >
>> > You must log in to get the latest version, which fixes many typos.
>> >
>> > The aim of that routine is to be useful without taking too many
>> > arguments: a real dynamic parser, like 'gb.highlight', needs a
>> > compiler!
>> >
>> > Please tell me what you think about 'Tokenize()': whether you find it
>> > useful, whether you think some options could be added or removed, and
>> > so on.
>> >
>> > Regards,
>> >
>>
>> Maybe 'Tokenize()' will be renamed as 'Parse()'. I'm not sure yet.
>>
>> --
>> Benoît Minisini.
>>
>>
>>
>
> --
> Fabien Bodard
>
