[Gambas-devel] gb.pcre 0.0.4

Benoit Minisini gambas at ...1...
Wed Oct 6 17:14:07 CEST 2004


On Tuesday 05 October 2004 17:07, Rob wrote:
> On Tuesday 05 October 2004 10:51, Benoit Minisini wrote:
> > Good work, Rob.
> > I suggest you the following interface:
> > 1) Rename the class from 'Regex' to 'RegExp'.
>
> OK, I only made it "Regex" because I can't physically say the
> word "RegExp" ;)
>
> > 2) Make a virtual class '.RegExpSubmatches' and make a
> > property 'SubMatch' that returns it.
>
> [...]
>
> > This way, we could have the following syntax:
> > MyRegExp = NEW RegExp("...", "...")
> > PRINT MyRegExp.SubMatch[1].Text
>
> Actually, with my existing code it's shorter:
>
> MyRegExp = NEW RegEx("...", "...")
> PRINT MyRegExp.SubMatch(1)

I know. It is just that I try to use the following implicit logic:

- Use [] when you get data from something already computed, or that is very
fast to compute.

- Use () when there is some processing behind (heavy or not), that may access 
external data.

In a few oversimplified words: [] means fast and () means slow.

>
> But I see the value of making it a virtual class.  I almost did
> that originally but I just wanted to get the code working.  Is
> there any way I can make a "default property" of a virtual class
> such that
>
> MyRegExp.Submatch[1]
>
> returns the submatch text the same as MyRegExp.Submatch[1].Text,
> but
>
> MyRegExp.Submatch[1].Offset
>
> would still work?
>

This is not possible, as there is no default property in Gambas.

> > If you are not easy with virtual classes, look at the ListBox
> > code to see how it returns ListBox items.
>
> Thank you, I certainly will be consulting the Qt component for
> all this "virtual class within virtual class" stuff.
>
> > As for a 'MATCH' keyword, this would need support from the
> > interpreter, exactly like the Eval() subroutine. So I'm not
> > sure I will do it now.
>
> I don't see how you could do it anyway unless I made a static
> version of the class.  I have made a module (which i added to my
> local copy of the IDE) to do a match or replace in 5 or 6
> different ways anyway, so maybe once gambas code can be used in
> components I can make a gb.pcre.match component or something.

This is not a problem. Look at how the interpreter internally creates an
Expression object each time you call Eval().

More important, the following points:

1) Why store the text or the pattern if you don't use them once the
expression is executed?

You could store the pattern, it is rarely a very long string. But storing the 
text...

I suggest you split the compilation process and the execution process in the 
RegExp class, the same way I did in the Expression class in the gb.eval 
component.

MyRegExp = NEW RegExp
MyRegExp.Pattern = "the pattern"
MyRegExp.Compile()
MyRegExp.Execute("the text") ' or .Exec("the text"), or .Matches("the text")
==> returns the return value of pcre_exec().

Of course, you can keep the 'NEW RegExp(Text, Pattern)' syntax with optional 
arguments.
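
To make the compile/execute split concrete, here is a minimal sketch in C. It
is not the gb.pcre code: POSIX <regex.h> stands in for libpcre, and the names
RegExpSketch, regexp_compile(), regexp_execute() and regexp_free() are
hypothetical. The point is only that the object keeps the compiled pattern
and never the subject text.

```c
/* Sketch only: POSIX <regex.h> stands in for libpcre here. */
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical object: holds the compiled pattern, never the subject. */
typedef struct {
    regex_t code;   /* compiled pattern */
    int compiled;   /* non-zero once regexp_compile() has succeeded */
} RegExpSketch;

/* Compile step: returns 0 on success, like regcomp(). */
int regexp_compile(RegExpSketch *re, const char *pattern)
{
    assert(re != NULL && pattern != NULL);
    int rc = regcomp(&re->code, pattern, REG_EXTENDED);
    re->compiled = (rc == 0);
    return rc;
}

/* Execute step: returns 0 on a match, REG_NOMATCH otherwise.
 * The text is only borrowed for the duration of the call. */
int regexp_execute(const RegExpSketch *re, const char *text,
                   regmatch_t *match)
{
    assert(re != NULL && re->compiled);
    assert(text != NULL && match != NULL);
    return regexec(&re->code, text, 1, match, 0);
}

/* Release the compiled pattern; safe to call more than once. */
void regexp_free(RegExpSketch *re)
{
    assert(re != NULL);
    if (re->compiled) {
        regfree(&re->code);
        re->compiled = 0;
    }
}
```

With this shape, one compiled pattern can be executed against any number of
texts, and nothing but the pattern has to live inside the object.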

2) pcre_exec() wants a pointer and a length. So you don't need to call
GB.ToZeroString(), which you should use only if you need a zero-terminated
string.

Instead of:

       THIS->rc = pcre_exec(code,
                            NULL,
                            GB.ToZeroString(ARG(subject)),
                            LENGTH(subject),
                            0,
                            0,
                            ovector,
                            99);

Do:

       THIS->rc = pcre_exec(code,
                            NULL,
                            STRING(subject),
                            LENGTH(subject),
                            0,
                            0,
                            ovector,
                            99);
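
As a general illustration of why a pointer-plus-length interface needs no
terminator, here is a small sketch in plain C. It has nothing to do with
libpcre or the Gambas API; count_byte() is a hypothetical helper. It works
directly on a slice in the middle of a larger buffer, with no zero byte at
the end of the slice and no temporary copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count occurrences of byte c in the len bytes
 * starting at s. It never reads past s + len, so the data does not
 * have to be zero-terminated. */
size_t count_byte(const char *s, size_t len, char c)
{
    assert(s != NULL);
    size_t n = 0;
    const char *end = s + len;

    while (s < end) {
        const char *hit = memchr(s, c, (size_t)(end - s));
        if (hit == NULL)
            break;
        n++;
        s = hit + 1;   /* continue just after the byte found */
    }
    return n;
}
```

For example, with the buffer "aXbXcXd", calling count_byte(buf + 2, 3, 'X')
looks only at the three bytes "bXc".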

3) There are many options and error constants in libpcre. You should create
Gambas constants for them: RegExp.NotBOL, RegExp.NotEmpty... This way, the
complete power of libpcre is accessible to the user! :-) Especially UTF-8...
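
To show what such an option changes in practice, here is a small sketch in C.
It uses POSIX <regex.h> and its REG_NOTBOL flag as a stand-in for the
corresponding libpcre option, and matches_with_flags() is a hypothetical
helper, not part of any library.

```c
/* Sketch only: POSIX REG_NOTBOL stands in for the libpcre option. */
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if pattern matches text under the
 * given execution flags, 0 if it does not, -1 on a compile error. */
int matches_with_flags(const char *pattern, const char *text, int eflags)
{
    assert(pattern != NULL && text != NULL);
    regex_t code;

    if (regcomp(&code, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    int rc = regexec(&code, text, 0, NULL, eflags);
    regfree(&code);
    return rc == 0;
}
```

With no flags, "^abc" matches "abcdef". With REG_NOTBOL the start of the
text no longer counts as a beginning of line, so the same call finds no
match. This is the kind of control the user loses if the constants are not
exported.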

Regards,

-- 
Benoit Minisini
mailto:gambas at ...1...



