[Gambas-user] String processing in Wiki
Eilert
eilert-sprachen at ...221...
Tue Mar 28 17:57:57 CEST 2006
Benoit Minisini wrote:
> On Tuesday 28 March 2006 11:44, Eilert wrote:
>> Today I've translated the functions starting with C into German.
>>
>> There are some functions where there is a special hint that they only
>> deal with ASCII, but Gambas deals with UTF-8.
>>
>> Now there are two general questions of mine :-)
>>
>> - Why do some of the functions use ASCII and some do not, and are there
>> plans to make it uniform (all using UTF-8),
>
> No.
Ok...
>
>> or are there other important
>> reasons not to do so?
>
> Yes.
>
> There is a difference between a string to translate (that will be displayed
> somewhere), and a string that must not be translated, i.e. that is just used
> by the program logic (a collection key, an English-only syntax...).
I had a case where heavy string processing was necessary in one
of my apps, and converting back and forth to be able to use the
functions was a bit tricky. So there are times when the programmer
wishes that all string processing were uniform. Sometimes you need
to use them with strings that originate from UTF-8.
There is one drawback, however. If a string contains bytes that look
like UTF-8 but are not meant as UTF-8, the function will have
to know how to handle the string. An additional flag would be necessary
to implement this, right? Just like gb.ASCII or gb.UTF-8 with a default
set to whatever one thinks is used more often?
>
> UTF-8 strings are heavier to process than ASCII strings, as the size of a
> character is not necessarily one byte.
And this would slow down those native functions?
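To illustrate the point about character size, here is a minimal sketch in Python (used only for illustration, it is not Gambas; the sample word is an arbitrary choice). It shows that the byte length of a UTF-8 string can differ from its character count, so cutting at a byte offset is not the same as cutting at a character position.

```python
# Illustration only: Python, not Gambas. The sample text is an arbitrary choice.
text = "Größe"                        # 5 characters, two of them non-ASCII
utf8_bytes = text.encode("utf-8")     # the same text as raw UTF-8 bytes

char_count = len(text)                # number of characters
byte_count = len(utf8_bytes)          # number of bytes; "ö" and "ß" take 2 bytes each

# A byte-oriented cut after 3 bytes lands in the middle of "ö".
# Decoding the fragment yields a replacement character instead of "ö".
broken = utf8_bytes[:3].decode("utf-8", errors="replace")

# A character-oriented cut keeps "ö" intact.
intact = text[:3]

print(char_count, byte_count, repr(broken), repr(intact))
```

A byte-oriented function can jump straight to an offset, while a UTF-8-aware one has to inspect the bytes to find character boundaries, which is the extra work mentioned above.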
>
>> - If it has to be as it is, shouldn't we add a little hint into every
>> string function to make clear if it is ASCII or UTF-8? Like a symbol or
>> so...
>
> All native string functions are ASCII-only, and all UTF-8 string functions
> were put in the String static class.
Yes, I know, I use it a lot. But a little hint for the newbies or people
like me who tend to forget those tricky things wouldn't be wrong, would it?
>
> The String class is not complete at the moment, and some of its methods do not
> have well-chosen names.
But I could use it very well.
>
>> And a special question:
>>
>> - I was wondering why the hint "Be careful! The current localization is
>> not used by this function." is mentioned for functions like CSng(). What
>> do such functions have to do with localisation, and why doesn't
>> CShort() mention it then?
>
> The decimal separator can be different when the language changes.
>
> Integer numbers seem to be written the same way in every language.
Aaah yes, of course :-) I see.
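The decimal separator issue can be sketched as follows, again in Python rather than Gambas. The helper parse_decimal is hypothetical and only swaps the separator; it deliberately ignores thousands separators and other locale details.

```python
# Illustration only: Python, not Gambas. parse_decimal is a hypothetical helper.
def parse_decimal(text, decimal_sep="."):
    """Parse a number written with the given decimal separator."""
    return float(text.replace(decimal_sep, "."))

english = parse_decimal("3.14")                    # written with a point
german = parse_decimal("3,14", decimal_sep=",")    # written with a comma

# A conversion that ignores the localization accepts only the point form.
try:
    float("3,14")
    comma_accepted = True
except ValueError:
    comma_accepted = False

# An integer has no decimal separator, so no such ambiguity arises here.
whole = int("42")

print(english, german, comma_accepted, whole)
```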
Rolf