[Gambas-user] String processing in Wiki

Eilert eilert-sprachen at ...221...
Wed Mar 29 08:46:58 CEST 2006


> An ASCII string is a valid UTF-8 string. And nothing prevents you from using 
> native ASCII string functions with UTF-8 strings, provided that you know what 
> you do. There is no need to "convert" a UTF-8 string to ASCII!

When you have German umlauts (or French accented letters) in the 
string, and... ok :-) I don't remember what the reason was, but I missed 
one function in the string class, and so had to convert the strings 
back and forth to do the job with native string functions. This happened 
in one of my apps, and as far as I remember, what I missed was an InStr 
and RInStr function for UTF-8.

> 
>> There is one drawback, however. If there is a string with characters
>> that are similar to UTF-8 but do not mean UTF-8, the function will have
>> to know how to handle the string. An additional flag would be necessary
>> to implement this, right? Just like gb.ASCII or gb.UTF-8 with a default
>> set to whatever one thinks is used more often?
>>
> 
> What are you talking about? Are you sure that you know exactly how UTF-8 
> works? Please give details about what you want to do exactly...

Was a bit tired last evening, sorry ;-) I meant bytes. In UTF-8, 
characters above 127 take two or more bytes, so the function has to 
recognize where those multi-byte characters are.
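
To illustrate what "recognizing" means here, a minimal sketch in Python 
(not Gambas; the helper names is_continuation and char_starts are made up 
for this example). It relies only on the UTF-8 rule that continuation 
bytes have the bit pattern 10xxxxxx, so every other byte starts a character:

```python
def is_continuation(byte: int) -> bool:
    # UTF-8 continuation bytes always look like 10xxxxxx in binary.
    return (byte & 0xC0) == 0x80


def char_starts(data: bytes) -> list[int]:
    # Byte offsets at which a new character begins.
    return [i for i, b in enumerate(data) if not is_continuation(b)]


data = "Grüße".encode("utf-8")
print(len(data))          # 7 bytes for 5 characters
print(char_starts(data))  # [0, 1, 2, 4, 6]
```

The umlaut and the sharp s each occupy two bytes, which is why the byte 
offsets skip 3 and 5.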

But there are cases when you do not need "characters" but pure "bytes" 
to be found in a string, and a universal function for this would have to 
be told how to interpret the string: as a text string, or as a string of 
bytes. Do you agree?
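
Roughly what I have in mind, again sketched in Python rather than Gambas. 
The function name find and the as_text flag are hypothetical, and the 
sketch assumes both strings are valid UTF-8. The search itself is always 
done on bytes; the flag only decides how the result is counted:

```python
def find(haystack: bytes, needle: bytes, as_text: bool = True) -> int:
    """Return the 0-based position of needle in haystack, or -1.

    as_text=True  : position counted in UTF-8 characters
    as_text=False : position counted in raw bytes
    """
    pos = haystack.find(needle)
    if pos < 0 or not as_text:
        return pos
    # Count the characters before the match: every byte that is not a
    # continuation byte (10xxxxxx) starts one character.
    return sum(1 for b in haystack[:pos] if (b & 0xC0) != 0x80)


s = "Müller & Söhne".encode("utf-8")
needle = "Söhne".encode("utf-8")
print(find(s, needle))                 # 9  (characters)
print(find(s, needle, as_text=False))  # 10 (bytes)
```

The two results differ by one because the "ü" before the match takes two 
bytes. For pure ASCII strings both modes give the same answer.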

Rolf




