[Gambas-user] Problem with lazy regexp
taboege at ...626...
Mon Apr 24 09:57:37 CEST 2017
On Sun, 23 Apr 2017, T Lee Davidson wrote:
> According to http://gambaswiki.org/wiki/doc/pcre , using "*?" in a regular
> expression should lazily match 0 or more characters. However, it appears to
> act greedily.
> I am trying to do some very simple HTML tag stripping with
> 'Regex.Replace(sText, "<.*?>", "")', and it takes out way more than just the
> Have I misunderstood the documentation?
I believe you are correct. I get the same greedy behaviour from "<.*?>".
The Gambas wiki page seems to be copied from the libpcre documentation,
and the entry under QUANTIFIERS:
  *?   0 or more, lazy
hardly leaves room for misinterpretation. I just tried the following line:
  RegExp.Replace("<tag abc=\"xyz\">content</tag>", "<.*>", "", RegExp.Ungreedy)
which correctly delivers "content", if you are interested in a workaround.
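For comparison, here is a minimal sketch of the greedy versus lazy behaviour one would expect from the documentation. It is written in Python rather than Gambas, on the assumption that Python's re module treats "*" and "*?" the same way PCRE does for this simple pattern. The variable names are illustrative only, and it says nothing about what gb.pcre does internally.

```python
import re

# Same sample string as in the Gambas line above.
text = '<tag abc="xyz">content</tag>'

# Greedy: ".*" extends to the last ">" in the string,
# so the single match covers the whole line.
greedy = re.sub(r"<.*>", "", text)

# Lazy: ".*?" stops at the first ">" it can,
# so each tag is matched and removed on its own.
lazy = re.sub(r"<.*?>", "", text)

print(repr(greedy))  # ''
print(repr(lazy))    # 'content'
```

The lazy result is what 'Regex.Replace(sText, "<.*?>", "")' should give according to the wiki. The greedy result is what we both seem to be getting instead.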
If no one else gets to it first, I will try to remember to have a look at
gb.pcre this evening.
"There's an old saying: Don't change anything... ever!" -- Mr. Monk