[Gambas-user] String processing in Wiki

Eilert eilert-sprachen at ...221...
Tue Mar 28 17:57:57 CEST 2006


Benoit Minisini wrote:
> On Tuesday 28 March 2006 11:44, Eilert wrote:
>> Today I've translated the functions starting with C into German.
>>
>> There are some functions where there is a special hint that they only
>> deal with ASCII, but Gambas deals with UTF-8.
>>
>> Now there are two general questions of mine :-)
>>
>> - Why do some of the functions use ASCII and some do not, and are there
>> plans to make it uniform (all using UTF-8), 
> 
> No.

Ok...

> 
>> or are there other important 
>> reasons not to do so?
> 
> Yes.
> 
> There is a difference between a string to translate (that will be displayed 
> somewhere), and a string that must not be translated, i.e. that is just used 
> by the program logic (a collection key, an English-only syntax...).

I had a case where one of my apps needed heavy string processing, and 
converting back and forth to be able to use the functions was a bit 
tricky. So there are times when the programmer wishes that all string 
processing were uniform. Sometimes you need to use the native functions 
on strings that originate from UTF-8.

There is one drawback, however. If a string contains bytes that look 
like UTF-8 but are not meant as UTF-8, the function would have to know 
how to handle the string. An additional flag would be necessary to 
implement this, right? Something like gb.ASCII or gb.UTF-8, with the 
default set to whatever one thinks is used more often?

> 
> UTF-8 strings are heavier to process than ASCII strings, as the size of a 
> character is not necessarily one byte.

And this would slow down those native functions?
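
To make the point about character size concrete, here is a small Python 
sketch (not Gambas code, and not how Gambas is implemented). It assumes 
valid UTF-8 input and uses a hypothetical helper, utf8_char_count, to 
show that counting characters means inspecting every byte, while the 
byte length is known directly.

```python
def utf8_char_count(data: bytes) -> int:
    """Count characters in valid UTF-8 data.

    Continuation bytes have the bit pattern 10xxxxxx, so every byte
    that is NOT a continuation byte starts a new character.
    """
    return sum(1 for b in data if (b & 0xC0) != 0x80)


text = "Grüße"
encoded = text.encode("utf-8")

byte_length = len(encoded)               # number of bytes
char_length = utf8_char_count(encoded)   # number of characters, needs a scan

print(byte_length, char_length)
```

For pure ASCII text both numbers are equal, which is why byte-oriented 
functions can skip the scan.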

> 
>> - If it has to be as it is, shouldn't we add a little hint into every
>> string function to make clear if it is ASCII or UTF-8? Like a symbol or
>> so...
> 
> All native string functions are ASCII-only, and all UTF-8 string functions 
> were put in the String static class.

Yes, I know, I use it a lot. But a little hint for the newbies or people 
like me who tend to forget those tricky things wouldn't be wrong, would it?
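
As an analogy for why such a hint matters, this Python sketch shows the 
same kind of split between a byte-oriented operation that only knows 
ASCII and a character-oriented one. It illustrates the general pitfall 
only; the Gambas functions themselves may behave differently in detail.

```python
raw = "über".encode("utf-8")

# bytes.upper() only converts the ASCII letters a-z; the two bytes that
# encode "ü" are left untouched.
ascii_only = raw.upper().decode("utf-8")

# Decoding first lets the operation work on characters instead of bytes.
utf8_aware = raw.decode("utf-8").upper()

print(ascii_only)
print(utf8_aware)
```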

> 
> The String class is not complete at the moment, and some of its methods do not 
> have well-chosen names.

But I could use it very well.

> 
>> And a special question:
>>
>> - I was wondering why the hint "Be careful! The current localization is
>> not used by this function." is mentioned for functions like CSng(). What
>> do such functions have to do with localisation, and why doesn't CShort()
>> mention it then?
> 
> The decimal separator can be different when the language changes.
> 
> Integer numbers seem to be written the same way in every language.


Aaah yes, of course :-) I see.
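
For the record, a small Python sketch of the same issue (an analogy, not 
Gambas code). Python's float() always expects "." as the decimal 
separator regardless of the user's language, much like a conversion that 
ignores the current localization. The helper parse_with_separator is 
hypothetical and deliberately ignores thousands separators.

```python
def parse_with_separator(text: str, decimal_sep: str = ",") -> float:
    """Parse a number written with a language-specific decimal separator."""
    return float(text.replace(decimal_sep, "."))


# Integers look the same either way, so no separator handling is needed.
whole = int("42")

# A German-style decimal fails with the locale-independent conversion...
try:
    float("3,14")
    plain_parse_failed = False
except ValueError:
    plain_parse_failed = True

# ...but works once the separator is taken into account.
value = parse_with_separator("3,14")

print(whole, plain_parse_failed, value)
```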


Rolf




