[Gambas-devel] Comparing Gambas strings

Benoît Minisini gambas at ...1...
Fri Aug 16 01:04:36 CEST 2013


On 16/08/2013 00:07, Tobias Boege wrote:
> Hi Benoit,
>
> I'm a little uncertain about the following code:
>
> ---
> BEGIN_METHOD(Class_Method, GB_STRING a; GB_STRING b)
>
> 	char *b = GB.NewString(STRING(b), LENGTH(b));
>
> 	int res = strncmp(STRING(a), b, LENGTH(a) + 1);
>
> END_METHOD
> ---
>
> I just want to compare (case-insensitively) the given Gambas strings 'a' and
> 'b'. As I saw in the sources of STRING_new(), the resulting string 'b' is
> always NUL terminated. (I need it GB.NewString'd anyway.)
>
> So this seems correct to me: strncmp() compares the maybe-not-NUL-terminated
> string 'a' and 'b' until either string reaches a NUL or the given number of
> bytes was searched.
>
> That there is a NUL in 'b' is sure. And for 'a', the NUL is either at
> LENGTH(a) + 1 or the string is part of a larger one, right? As if we called
>
> Class.Method(Mid$("String containing 'a'", 3, 3), "string b")
>
> In any case it will *not* produce a segfault if I instruct strncmp() to
> examine LENGTH(a) + 1 bytes. Is this correct?
>
> Additionally, it will give me a correct comparison of the two strings, or
> not? I fear that something like this could produce false positives:
>
> Class.Method(Mid$("sample", 1, 2), "sample")
>
> Should I also compare the strings' lengths? Sounds kind of inconvenient
> because all I want is to compare ASCII strings case-insensitively. Maybe I
> have overlooked an API or I don't get the right idea just now.
>
> Regards,
> Tobi
>

GB.NewString() always allocates a null-terminated string. But strings 
passed as arguments are not necessarily null-terminated (this is why you 
get both a pointer and a length!).

-> Note: this allows Mid$(), Left$(), Right$(), Trim$(), LTrim$(), 
RTrim$() not to allocate anything, and so to be fast.
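
To illustrate why such a substring cannot be null-terminated, here is a 
minimal sketch in plain C. The str_view type and the view_mid() function 
are hypothetical names made up for this example; they are not the real 
GB_STRING layout nor a Gambas API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a string argument: a pointer into some
   buffer plus a byte count. This is NOT the real GB_STRING layout. */
typedef struct {
    const char *ptr;
    size_t len;
} str_view;

/* Sketch of a Mid$-like operation; 'start' is 1-based as in Gambas.
   Nothing is allocated: the result points into the source buffer. */
static str_view view_mid(const char *s, size_t start, size_t count)
{
    size_t n = strlen(s);
    str_view v = { s, 0 };              /* empty view by default */

    if (start < 1 || start > n)
        return v;

    size_t avail = n - (start - 1);     /* bytes left from 'start' on */
    v.ptr = s + (start - 1);
    v.len = (count > avail) ? avail : count;
    return v;
}
```

With view_mid("sample", 1, 2), the view has length 2, but the byte just 
after it is 'm', not NUL. Any function that relies on a terminator would 
keep reading into the rest of "sample".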

Don't use the libc strcasecmp() and strncasecmp() (nor tolower() and 
toupper()), because they do a locale-aware comparison. You must use 
GB.StrCaseCmp(), GB.StrNCaseCmp(), GB.ToLower() and GB.ToUpper() instead.

-> This is because of the Turkish language, where 'i' and 'I' are not the 
same letter if you do a locale-aware case-insensitive comparison.

-> Fortunately, if you include "gambas.h", it will do the substitution 
automatically. So you can continue using strcasecmp(), strncasecmp(), 
toupper() and tolower().

So you have to do:

LENGTH(a) == LENGTH(b) && GB.StrNCaseCmp(STRING(a), STRING(b), LENGTH(a)) == 0

to test case-insensitive equality.
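
Here is a minimal standalone sketch of that test, so that it can be tried 
outside of a component. ascii_ncasecmp() is a hypothetical stand-in for 
GB.StrNCaseCmp(), modelled on POSIX strncasecmp() but folding only ASCII 
letters; the real Gambas implementation may differ. str_equal_nocase() is 
also a made-up name.

```c
#include <stddef.h>

/* Fold ASCII 'A'..'Z' to lower case without consulting the C locale. */
static int ascii_lower(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* Hypothetical stand-in for GB.StrNCaseCmp(): compares at most n bytes,
   ignoring ASCII case, and stops early at a NUL like strncasecmp(). */
static int ascii_ncasecmp(const char *s1, const char *s2, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int c1 = ascii_lower((unsigned char)s1[i]);
        int c2 = ascii_lower((unsigned char)s2[i]);

        if (c1 != c2)
            return c1 - c2;
        if (c1 == 0)
            break;
    }
    return 0;
}

/* Case-insensitive equality of two (pointer, length) strings.
   The length test comes first, so at most len_a == len_b bytes are
   read from each buffer and no terminator is needed. */
static int str_equal_nocase(const char *a, size_t len_a,
                            const char *b, size_t len_b)
{
    return len_a == len_b && ascii_ncasecmp(a, b, len_a) == 0;
}
```

For the Mid$("sample", 1, 2) versus "sample" case, comparing only the 
first 2 bytes returns 0, which is the false positive. The length test 
is what rejects it.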

If you need fast comparisons, look at the 
'main/share/gb_common_string_temp.h' Gambas source file.

Regards,

-- 
Benoît Minisini



