[Gambas-user] Isn't bracket regular expression compatible with UTF8?
Tobias Boege
taboege at ...626...
Wed Jul 5 11:37:59 CEST 2017
On Tue, 04 Jul 2017, Fernando Cabral wrote:
> I have been trying something like *poder[^[:alpha:]* so I could find the
> word "poder " ("poder" followed by a space) but not "poderão" ("ã" being
> an alpha character in Portuguese.)
>
> In English it could be like finding "power" but not "powerless".
>
> Problem is that it seems [^[alpha]] includes accented characters like "á",
> "é", "ã".
>
> That is, accented characters are treated not as alpha, but as non-alpha.
>
> Please, note that I have compiled it with the UTF8 flag:
> * re.Compile(poder[^[:alpha]], RegExp.utf8)*
>
> Any hints?
>
In your mail I can see three distinct attempts at writing a
negated character class: [^[:alpha:], [^[alpha]], and [^[:alpha]].
None of them is right; the correct syntax is
[[:^alpha:]]
You should check this first.
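For illustration, here is a minimal sketch of how the corrected pattern
might be compiled and tried out. It assumes the gb.pcre component is
enabled in the project and that RegExp.Offset is negative when nothing
matched. The Main procedure and the two test strings are made up for this
example. PCRE should also accept the equivalent form [^[:alpha:]], with
both closing brackets present.

```gambas
' Sketch only: assumes gb.pcre is enabled in the project.
Public Sub Main()

  Dim re As New RegExp
  Dim subject As String

  ' "poder" followed by exactly one non-alphabetic character.
  ' The pattern has to be passed as a quoted string.
  re.Compile("poder[[:^alpha:]]", RegExp.UTF8)

  ' Hypothetical test strings, ASCII only on purpose.
  For Each subject In ["poder ", "poderes"]
    re.Exec(subject)
    ' Assumption: Offset is -1 when there was no match.
    If re.Offset >= 0 Then
      Print "'"; subject; "' matches"
    Else
      Print "'"; subject; "' does not match"
    Endif
  Next

End
```

With the syntax fixed, "poder " should match and "poderes" should not.
Whether an accented letter such as "ã" counts as alpha is a separate
question that may depend on how PCRE was built and on the compile options,
so test "poderão" only after the class itself is written correctly.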
Regards,
Tobi
--
"There's an old saying: Don't change anything... ever!" -- Mr. Monk