[Gambas-devel] gb.pcre 0.0.4
Benoit Minisini
gambas at ...1...
Wed Oct 6 17:14:07 CEST 2004
On Tuesday 05 October 2004 17:07, Rob wrote:
> On Tuesday 05 October 2004 10:51, Benoit Minisini wrote:
> > Good work, Rob.
> > I suggest you the following interface:
> > 1) Rename the class from 'Regex' to 'RegExp'.
>
> OK, I only made it "Regex" because I can't physically say the
> word "RegExp" ;)
>
> > 2) Make a virtual class '.RegExpSubmatches' and make a
> > property 'SubMatch' that returns it.
>
> [...]
>
> > This way, we could have the following syntax:
> > MyRegExp = NEW RegExp("...", "...")
> > PRINT MyRegExp.SubMatch[1].Text
>
> Actually, with my existing code it's shorter:
>
> MyRegExp = NEW RegEx("...", "...")
> PRINT MyRegExp.SubMatch(1)
I know. It is just that I try to follow this implicit logic:
- Use [] when you get data from something already computed, or that is very
fast to compute.
- Use () when there is some processing behind it (heavy or not) that may
access external data.
In a few, not entirely accurate, words: [] means fast and () means slow.
>
> But I see the value of making it a virtual class. I almost did
> that originally but I just wanted to get the code working. Is
> there any way I can make a "default property" of a virtual class
> such that
>
> MyRegExp.Submatch[1]
>
> returns the submatch text the same as MyRegExp.Submatch[1].Text,
> but
>
> MyRegExp.Submatch[1].Offset
>
> would still work?
>
This is not possible, as there is no default property in Gambas.
> > If you are not easy with virtual classes, look at the ListBox
> > code to see how it returns ListBox items.
>
> Thank you, I certainly will be consulting the Qt component for
> all this "virtual class within virtual class" stuff.
>
> > As for a 'MATCH' keyword, this would need support from the
> > interpreter, exactly like the Eval() subroutine. So I'm not
> > sure I will do it now.
>
> I don't see how you could do it anyway unless I made a static
> version of the class. I have made a module (which i added to my
> local copy of the IDE) to do a match or replace in 5 or 6
> different ways anyway, so maybe once gambas code can be used in
> components I can make a gb.pcre.match component or something.
This is not a problem. Look at how the interpreter internally creates an
Expression object each time you call Eval().
More importantly, here are the following points:
1) Why store the text or the pattern if you don't use them once the
expression is executed?
You could store the pattern, as it is rarely a very long string. But storing
the text...
I suggest you split the compilation process and the execution process in the
RegExp class, the same way I did in the Expression class in the gb.eval
component.
MyRegExp = NEW RegExp
MyRegExp.Pattern = "the pattern"
MyRegExp.Compile()
MyRegExp.Execute("the text") ' or .Exec("the text"), or .Matches("the text")
==> returns the return value of pcre_exec().
Of course, you can keep the 'NEW RegExp(Text, Pattern)' syntax with optional
arguments.
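The compile/execute split suggested in point 1 can be sketched in C as follows. This is only an illustration under stated assumptions: it uses the POSIX regcomp()/regexec() calls instead of libpcre so that it builds with the C library alone, and the names RegExpSketch, regexp_compile(), regexp_execute() and regexp_free() are hypothetical, not part of gb.pcre. The point is that the object keeps the compiled pattern only, and the subject text is merely borrowed for the duration of one call.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical wrapper: stores the compiled pattern, never the subject text. */
typedef struct {
    regex_t code;   /* compiled form of the pattern */
    int compiled;   /* non-zero once regexp_compile() has succeeded */
} RegExpSketch;

/* Compile step: done once per pattern. Returns 0 on success. */
static int regexp_compile(RegExpSketch *re, const char *pattern)
{
    int rc = regcomp(&re->code, pattern, REG_EXTENDED);
    re->compiled = (rc == 0);
    return rc;
}

/* Execute step: done once per subject; the text is not stored anywhere.
   Returns the number of slots in use (whole match plus submatches),
   or 0 when the text does not match. */
static int regexp_execute(const RegExpSketch *re, const char *text,
                          regmatch_t *slots, size_t nslots)
{
    int count = 0;
    size_t i;

    assert(re->compiled);
    if (regexec(&re->code, text, nslots, slots, 0) != 0)
        return 0;
    for (i = 0; i < nslots; i++)
        if (slots[i].rm_so != -1)
            count = (int)i + 1;   /* remember the highest slot that was set */
    return count;
}

/* Release the compiled pattern. */
static void regexp_free(RegExpSketch *re)
{
    if (re->compiled) {
        regfree(&re->code);
        re->compiled = 0;
    }
}
```

With this shape, one compiled pattern can be executed against any number of texts, which is the same reason for separating Compile() from Execute() in the Gambas class.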
2) pcre_exec() wants a pointer and a length, so you don't need to call
GB.ToZeroString(). Use that function only when you really need a
zero-terminated string.
Instead of:
  THIS->rc = pcre_exec(code,
                       NULL,
                       GB.ToZeroString(ARG(subject)),
                       LENGTH(subject),
                       0,
                       0,
                       ovector,
                       99);
Do:
  THIS->rc = pcre_exec(code,
                       NULL,
                       STRING(subject),
                       LENGTH(subject),
                       0,
                       0,
                       ovector,
                       99);
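To illustrate why a pointer plus a length is enough, here is a small C sketch. It does not call libpcre: find_in_span() is a hypothetical stand-in for any length-aware function such as pcre_exec(), and it assumes, as point 2 implies, that the subject may be a slice of a larger buffer with no terminating zero of its own. The function never reads past the given length, so no zero-terminated copy is needed.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a length-aware API: search for `needle` in the
   first `len` bytes of `subject`. Bytes beyond `len` are never read.
   Returns the offset of the first occurrence, or -1 if there is none. */
static long find_in_span(const char *subject, size_t len, const char *needle)
{
    size_t nlen = strlen(needle);
    size_t i;

    if (nlen == 0)
        return 0;            /* the empty needle matches at offset 0 */
    if (nlen > len)
        return -1;           /* needle cannot fit in the span */
    for (i = 0; i + nlen <= len; i++)
        if (memcmp(subject + i, needle, nlen) == 0)
            return (long)i;
    return -1;
}
```

For instance, with the buffer "foobarbaz", the span starting at offset 3 with length 3 is "bar" even though the byte that follows it is 'b' and not a zero, and the function handles it without copying anything.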
3) There are many option and error constants in libpcre. You should create
Gambas constants for them: RegExp.NotBOL, RegExp.NotEmpty... This way, the
full power of libpcre is accessible to the user! :-) Especially UTF-8...
Regards,
--
Benoit Minisini
mailto:gambas at ...1...