[Gambas-user] Reg expression still beating me up

Tobias Boege taboege at ...626...
Mon May 29 00:30:56 CEST 2017


On Sun, 28 May 2017, Fernando Cabral wrote:
> In the piece of code below, RegExp.Replace will never return.
> 
> Sentencas[i] = "Test string."
> Print "Before replacing"
> Sentencas[i] = RegExp.Replace(Sentencas[i], "[.:!?;]*[ ]*?\n*?", "", RegExp.UTF8)
> Print "After replacing"
> 
> It beats me, because what it should do is very simple: optionally find one
> of the punctuation marks (.:?!;) optionally followed by any number of white
> space, optionally followed by any number of "\n" (end of line). Replace
> whatever is found with an empty string.
> 
> In the test string, it should find the dot (.) and replace it with nothing.
> So, the returned string should be "Test string".
> 
> Alas! It will never come back. Same if I replace the test string with
> "Test string. \n" or "Test string.\n"
> 
> Now, this works as expected, but this is not what I need:
> "[.:!?;][ ]*?\n*?", ""
> To my eyes, "[.:!?;]*[ ]*?\n*?" is a perfectly valid regular expression.
> 
> Any hints?
> 

RegExp.Replace() wants to replace *all* occurrences of the expression.
It is basically a loop of RegExp.Exec() followed by a substitution, as
long as the RegExp.Exec() call finds something.

Now look at your expression. Since everything is optional, your expression
matches the empty string. RegExp.Exec() will always find the empty string
and replace it with itself, giving you an infinite loop.
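
To make the loop concrete, here is a minimal sketch in Python (whose regex
syntax agrees with PCRE for these patterns). It is only a model of the
"Exec, substitute, repeat" behaviour described above, not the actual Gambas
implementation; naive_replace and max_rounds are names made up for this
example, and the round limit stands in for "never returns".

```python
import re

def naive_replace(subject, pattern, replacement, max_rounds=1000):
    """Find the first match, substitute it, repeat while anything matches.

    Returns the final string, or None if the loop is still finding
    matches after max_rounds (a stand-in for "never returns").
    """
    regex = re.compile(pattern)
    for _ in range(max_rounds):
        match = regex.search(subject)
        if match is None:
            return subject
        subject = subject[:match.start()] + replacement + subject[match.end():]
    return None

# All-optional pattern: the first match is the empty string at offset 0.
print(re.search(r"[.:!?;]*[ ]*?\n*?", "Test string.").span())    # (0, 0)
print(naive_replace("Test string.", r"[.:!?;]*[ ]*?\n*?", ""))   # None
# With a mandatory punctuation mark the loop ends after one substitution.
print(naive_replace("Test string.", r"[.:!?;][ ]*?\n*?", ""))    # Test string
```

The empty match at offset 0 is replaced by the empty string, so the subject
never changes and the next search finds the same empty match again.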

I think that the behaviour of RegExp.Replace() in this case is sound. You
should use an expression that is guaranteed either to match a string of
positive length or not to match at all.
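
One possible expression of that kind, sketched in Python under the
assumption that the goal is to strip punctuation marks together with any
spaces and newlines that follow them: require at least one punctuation mark
with "+", and make the trailing quantifiers greedy. A lazy quantifier at the
very end of a pattern matches as little as possible, so "[ ]*?\n*?" there
never consumes anything. PATTERN and strip_punctuation are hypothetical
names used only for this illustration.

```python
import re

# Assumed intent: remove punctuation marks plus any spaces and newlines
# that follow them. "+" forces every match to be at least one character.
PATTERN = re.compile(r"[.:!?;]+[ ]*\n*")

def strip_punctuation(text):
    """Remove every occurrence of PATTERN from text."""
    return PATTERN.sub("", text)

# The pattern cannot match the empty string.
print(PATTERN.search("") is None)                  # True

for sample in ("Test string.", "Test string. \n", "Test string.\n"):
    print(repr(strip_punctuation(sample)))         # 'Test string' each time
```

The same pattern string should carry over to RegExp.Replace() unchanged,
since it only uses character classes and the "+" and "*" quantifiers.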

Regards,
Tobi

-- 
"There's an old saying: Don't change anything... ever!" -- Mr. Monk



