[Gambas-user] gb.pcre: RegExp.Replace() problem

Tobias Boege taboege at ...626...
Sun Dec 21 16:18:34 CET 2014


On Sun, 21 Dec 2014, Benoît Minisini wrote:
> On 21/12/2014 14:33, Benoît Minisini wrote:
> > On 20/12/2014 09:27, Tobias Boege wrote:
> >> Hi,
> >>
> >> attached is a script which essentially executes
> >>
> >>    Print RegExp.Replace(" * * a *", "^[ ]*\\*", "'")
> >>
> >> On my system the output is
> >>
> >>    '' a *
> >>
> >> which I don't understand. As the regular expression indicates, I want only
> >> the one match at the beginning of the line to be replaced, not the second
> >> one behind it. The "a" acts like a border, to see if the "^" is completely
> >> ignored -- but it isn't.
> >>
> >> The error seems to come from
> >>
> >> --8<-- [ trunk/gb.pcre/src/regexp.c ]---------------------------------------
> >> 389                         r.subject = &subject[offset];
> >> --8<------------------------------------------------------------------------
> >>
> >>
> >> I guess that this should prevent recursive (indefinite) substitutions in the
> >> replacement string... But if you ask me, the result of my script is wrong.
> >>
> >> Regards,
> >> Tobi
> >>
> >>
> >
> > Mmm. I'm far from being a regexp user, but I will try to check that...
> >
> 
> I think you should not use Replace() for that. It tries to mimic the
> behaviour of Replace$(), i.e. finding each match and replacing it, by
> going forward in the string.
>
> Maybe I could add a special case for patterns beginning with "^", which
> then means "just replace once"?
> 

Hmm, that would solve my problem and indeed "^" is a special case ("$"
should not have this problem). I would say that this addition is worth it
if one can document the new behaviour in just a few sentences that are not
utterly complicated.
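
For reference, the behaviour I reported can be modelled outside Gambas. The
following is only a sketch using Python's `re` module, not the gb.pcre code.
The function name `replace_forward()` is hypothetical, and it assumes that the
loop re-runs the match on the remaining substring (as the line
`r.subject = &subject[offset]` suggests), so that "^" matches again at every
new start. A "$" pattern goes through the same loop without trouble.

```python
import re


def replace_forward(subject, pattern, replacement):
    """Hypothetical model: replace each match, restarting on the remainder."""
    regex = re.compile(pattern)
    out = []
    offset = 0
    while offset <= len(subject):
        # Assumed analogue of `r.subject = &subject[offset]`: the remainder
        # is treated as a fresh subject, so "^" matches at its first character.
        rest = subject[offset:]
        m = regex.search(rest)
        if m is None:
            out.append(rest)
            break
        out.append(rest[:m.start()])
        out.append(replacement)
        if m.end() == 0:
            # Empty match at the start: copy one character to make progress.
            out.append(rest[:1])
            offset += 1
        else:
            offset += m.end()
    return "".join(out)


# Re-anchoring model: the second " *" is replaced as well.
print(replace_forward(" * * a *", r"^[ ]*\*", "'"))   # '' a *
# A single pass over the whole subject replaces only the leading match.
print(re.sub(r"^[ ]*\*", "'", " * * a *"))            # ' * a *
# "$" is unaffected by the restart, since the end of the subject never moves.
print(replace_forward(" * * a *", r"\*$", "'"))       #  * * a '
```

The first call reproduces the output from my script, which is why I suspect
the restart on the substring and not the pattern itself.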

But I see you have already committed. What is the concrete behaviour now? If
the pattern starts with "^", then
  (a) the rest of the pattern is replaced only once (not necessarily at the
      beginning of the subject); or
  (b) the pattern is replaced once at the beginning of the subject?

In (a), the "^" acts merely as a flag and in (b) it acts as a flag AND
retains its anchor meaning.
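
To make the difference between (a) and (b) concrete, here is a small sketch,
again in Python's `re` and not in Gambas. The names `replace_a()` and
`replace_b()` are made up for illustration and say nothing about what the
commit actually does. The two readings only differ when the pattern does not
match at the very beginning of the subject.

```python
import re


def replace_a(subject, pattern, replacement):
    """Reading (a): "^" is only a flag meaning "replace once, anywhere"."""
    if not pattern.startswith("^"):
        raise ValueError('pattern must start with "^"')
    # Drop the "^" and replace the first match of the rest of the pattern.
    return re.sub(pattern[1:], replacement, subject, count=1)


def replace_b(subject, pattern, replacement):
    """Reading (b): "^" is a flag and still anchors at the subject start."""
    return re.sub(pattern, replacement, subject, count=1)


pattern = r"^[ ]*\*"

# Subject with a match at the beginning: both readings agree.
print(replace_a(" * * a *", pattern, "'"))   # ' * a *
print(replace_b(" * * a *", pattern, "'"))   # ' * a *

# Subject without a match at the beginning: the readings differ.
print(replace_a("a * *", pattern, "'"))      # a' *
print(replace_b("a * *", pattern, "'"))      # a * *
```

Personally I would expect (b), since there the pattern keeps the meaning it
has everywhere else.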

Regards,
Tobi

-- 
"There's an old saying: Don't change anything... ever!" -- Mr. Monk



