[Gambas-user] String processing in Wiki
Eilert
eilert-sprachen at ...221...
Wed Mar 29 08:46:58 CEST 2006
> An ASCII string is a valid UTF-8 string. And nothing prevents you from using
> native ASCII string functions with UTF-8 strings, provided that you know what
> you do. There is no need to "convert" a UTF-8 string to ASCII!
When you have German umlauts (or French accented letters) in the
string, and... ok :-) I don't remember what the reason was, but I was
missing a function in the String class, so I had to convert the strings
back and forth to do the job with the native string functions. This
happened in one of my apps, and as far as I remember, what I missed were
InStr and RInStr functions for UTF-8.
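
To illustrate what such functions would have to do, here is a minimal
sketch in Python (not Gambas). The names utf8_instr and utf8_rinstr are
made up for this example and are not part of any library. The sketch
assumes both arguments are valid UTF-8, searches at the byte level, and
then turns the byte offset into a 1-based character position by
counting the bytes that are not continuation bytes.

```python
def _char_pos(haystack: bytes, offset: int) -> int:
    """Convert a byte offset into a 1-based character position."""
    # Continuation bytes look like 10xxxxxx; every other byte starts a character.
    return sum(1 for b in haystack[:offset] if b & 0xC0 != 0x80) + 1


def utf8_instr(haystack: bytes, needle: bytes) -> int:
    """Hypothetical InStr: first match, 1-based character position, 0 if absent."""
    offset = haystack.find(needle)
    return 0 if offset < 0 else _char_pos(haystack, offset)


def utf8_rinstr(haystack: bytes, needle: bytes) -> int:
    """Hypothetical RInStr: last match, 1-based character position, 0 if absent."""
    offset = haystack.rfind(needle)
    return 0 if offset < 0 else _char_pos(haystack, offset)


text = "über für".encode("utf-8")
print(utf8_instr(text, "ü".encode("utf-8")))   # first "ü"
print(utf8_rinstr(text, "ü".encode("utf-8")))  # last "ü"
```

Because UTF-8 is self-synchronizing, a byte-level match of a valid
UTF-8 needle always starts on a character boundary, so no conversion to
another charset is needed for the search itself.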
>
>> There is one drawback, however. If there is a string with characters
>> that are similar to UTF-8 but do not mean UTF-8, the function will have
>> to know how to handle the string. An additional flag would be necessary
>> to implement this, right? Just like gb.ASCII or gb.UTF-8 with a default
>> set to whatever one thinks is used more often?
>>
>
> What are you talking about? Are you sure that you know exactly how UTF-8
> works? Please give details about what you want to do exactly...
I was a bit tired last night, sorry ;-) I meant bytes. Characters with
codes of 128 and above take two or more bytes in UTF-8, so the function
has to recognize where those characters are.
But there are cases when you do not need "characters" but pure "bytes"
to be found in a string, and a universal function for this would have to
be told how to interpret the string: as a text string, or as a string of
bytes. Do you agree?
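
A rough sketch of what I mean, written in Python only for illustration:
find_in and its as_text flag are hypothetical names, not an existing
Gambas or Python function. With as_text set, the result is a 0-based
character index and the data must be valid UTF-8; without it, the
result is the plain 0-based byte offset.

```python
def find_in(data: bytes, needle: bytes, as_text: bool = True) -> int:
    """Return the 0-based position of needle in data, or -1 if absent.

    as_text=True  -> position counted in UTF-8 characters
    as_text=False -> position counted in raw bytes
    """
    offset = data.find(needle)
    if offset < 0:
        return -1
    if not as_text:
        return offset
    # Decoding the prefix raises UnicodeDecodeError if the data before the
    # match is not valid UTF-8, e.g. when the match starts inside a character.
    return len(data[:offset].decode("utf-8"))


cheese = "Käse".encode("utf-8")            # b'K\xc3\xa4se'
print(find_in(cheese, b"s"))                # character index
print(find_in(cheese, b"s", as_text=False)) # byte offset
```

The same call gives different answers depending on the flag, and a
search for a single byte such as b"\xa4" only makes sense in byte mode,
because that byte is the second half of "ä" and not a character by
itself.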
Rolf