[Gambas-user] Regular expressions

Gianluigi bagonergi at gmail.com
Thu Dec 30 23:49:05 CET 2021


On Thu, 30 Dec 2021 at 20:39, Tobias Boege via User <
user at lists.gambas-basic.org> wrote:

> On Thu, 30 Dec 2021, T Lee Davidson wrote:
> > On 12/30/21 07:37, Hans Lehmann wrote:
> > > Hello.
> > >
> > > I am looking for 3 regular pattern expressions that check in a DokuWiki
> > >
> > > Hans
> >
> > DokuWiki stores all its data in UTF-8.[1] This is almost certainly the
> > reason it is not working since LIKE deals only with ASCII strings.[2]
> >
> > You should use RegExp.Replace (gb.pcre) [3] as LIKE is not a valid
> > solution for this particular scenario.
> >
>
> In addition, the LIKE patterns quoted above fall into a very common trap
> with regular expressions (or similar patterns): if you match against
>
>   [_]{2}(.*)[_]{2}
>
> (a straightforward translation of the given pattern to PCRE) as suggested
> in another email, then problems will arise if there is more than one
> underline markup in your string, because the `.*` in the middle is
> "greedy" by default. The line
>
>   The __installation__ of a __SSH server__ on the remote __computer__ is worthwhile in any case!
>
> will get everything from the first __ to the last __ on the line replaced.
> This spans three different markups which are left unchanged!
>
> In gb.pcre, you would use the "frugal" quantifier `*?` instead of `*`:
>
>   [_]{2}(.*?)[_]{2}
>
> Since Gambas 3.5, you can use the convenient RegExp.Replace() function,
> which compiles quantifiers frugally by default. See the documentation for
> more information! Here is the solution, also incorporating the hint by
> T Lee about UTF8 (I don't know if that flag is the default in gb.pcre):
>
>   RegExp.Replace(sDokuWiki, "[_]{2}(.*)[_]{2}", "<uuu>&1<uuu>", gb.UTF8)
>
> Best,
> Tobias
>
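
The greedy trap described in the quoted message is easy to reproduce outside
Gambas. Here is a minimal sketch in Python's re module (an assumption made
only for illustration; it is not gb.pcre, and the default greediness of
RegExp.Replace may differ as noted above). It runs both quantifiers against
the example line from the quote:

```python
import re

line = ("The __installation__ of a __SSH server__ on the remote "
        "__computer__ is worthwhile in any case!")

# Greedy: .* runs to the LAST "__" on the line, so only one match is found.
greedy = re.sub(r"_{2}(.*)_{2}", r"<uuu>\1<uuu>", line)

# Lazy ("frugal"): .*? stops at the NEXT "__", so each markup matches on its own.
lazy = re.sub(r"_{2}(.*?)_{2}", r"<uuu>\1<uuu>", line)

print(greedy)
print(lazy)
```

The greedy version inserts only one pair of tags around the whole span, while
the lazy version tags each of the three markups separately.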

Hi Tobias and Lee,

I admit that this solution is not very elegant, but it avoids replacing
anything where it is not needed, and it is also compatible with UTF-8:

Public Sub Main()

  Dim sLine As String = "The //installation// of a **SSH server on the remote __computer__ is worthwhile in any case!"
  Dim sItalic, sBold, sUnderline As String

  Try sItalic = Scan(sLine, "*//*//*")[1]
  Try sBold = Scan(sLine, "*{**}*{**}*")[1]
  Try sUnderline = Scan(sLine, "*__*__*")[1]

  If sItalic Then sLine = Replace(sLine, "//", "<iii>")
  If sUnderline Then sLine = Replace(sLine, "__", "<uuu>")
  If sBold Then sLine = Replace(sLine, "**", "<bbb>")

  Print sLine

End

This is definitely more elegant, but how do I keep control over each
replacement when needed?

Public Sub Main()

  Dim sLine As String = "The //installation// of a **SSH server on the remote __computer__ is worthwhile in any case!"

  sLine = RegExp.Replace(sLine, "[_]{2}(.*)[_]{2}", "<uuu>&1<uuu>")
  sLine = RegExp.Replace(sLine, "[/]{2}(.*)[/]{2}", "<iii>&1<iii>")
  sLine = RegExp.Replace(sLine, "[*]{2}(.*)[*]{2}", "<bbb>&1<bbb>")

  Print sLine

End
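
One general way to get per-match control is a replacement callback. The
sketch below uses Python's re module rather than gb.pcre, so it shows the
idea and is not a Gambas answer. The names TAGS, PATTERN, convert and keep
are hypothetical and made up for this example. A single lazy pattern with a
backreference requires the same marker to open and close, so the unpaired
** is left alone:

```python
import re

# Hypothetical marker-to-tag table; the tags themselves follow the thread.
TAGS = {"__": "<uuu>", "//": "<iii>", "**": "<bbb>"}

# One lazy pattern for all three markers; \1 requires the same marker to close.
PATTERN = re.compile(r"(__|//|\*\*)(.*?)\1")

def convert(line, keep=lambda marker, text: True):
    """Replace paired markers; `keep` decides per match whether to replace."""
    def repl(m):
        marker, text = m.group(1), m.group(2)
        if not keep(marker, text):
            return m.group(0)  # leave this occurrence untouched
        return f"{TAGS[marker]}{text}{TAGS[marker]}"
    return PATTERN.sub(repl, line)

line = ("The //installation// of a **SSH server on the remote "
        "__computer__ is worthwhile in any case!")

all_pairs = convert(line)
no_italic = convert(line, keep=lambda marker, text: marker != "//")

print(all_pairs)
print(no_italic)
```

The keep callback sees the marker and the enclosed text of every match, so
any rule (skip italics, skip empty spans, and so on) can be applied there.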

Now I'm going to sleep ;-D

Good night & Regards

Gianluigi