[Gambas-devel] Comparing Gambas strings

Tobias Boege tobias at ...692...
Fri Aug 16 09:50:29 CEST 2013


On Fri, 16 Aug 2013, Benoît Minisini wrote:
> On 16/08/2013 00:07, Tobias Boege wrote:
> > Hi Benoit,
> >
> > I'm a little uncertain about the following code:
> >
> > ---
> > BEGIN_METHOD(Class_Method, GB_STRING a; GB_STRING b)
> >
> > 	char *b = GB.NewString(STRING(b), LENGTH(b));
> >
> > 	int res = strncmp(STRING(a), b, LENGTH(a) + 1);
> >
> > END_METHOD
> > ---
> >
> > I just want to compare (case-insensitively) the given Gambas strings 'a' and
> > 'b'. As I saw in the sources of STRING_new(), the resulting string 'b' is
> > always NUL-terminated. (I need it GB.NewString'd anyway.)
> >
> > So this seems correct to me: strncmp() compares the maybe-not-NUL-terminated
> > string 'a' and 'b' until either string reaches a NUL or the given number of
> > bytes was searched.
> >
> > That there is a NUL in 'b' is sure. And for 'a', the NUL is either at
> > LENGTH(a) + 1 or the string is part of a larger one, right? As if we called
> >
> > Class.Method(Mid$("String containing 'a'", 3, 3), "string b")
> >
> > In any case it will *not* produce a segfault if I instruct strncmp() to
> > examine LENGTH(a) + 1 bytes. Is this correct?
> >
> > Additionally, will it give me a correct comparison of the two strings, or
> > not? I fear that something like this could produce false positives:
> >
> > Class.Method(Mid$("sample", 1, 2), "sample")
> >
> > Should I also compare the strings' lengths? Sounds kind of inconvenient
> > because all I want is to compare ASCII strings case-insensitively. Maybe I
> > have overlooked an API or I don't get the right idea just now.
> >
> > Regards,
> > Tobi
> >
> 
> GB.NewString() always allocates a null-terminated string. But strings
> passed as arguments are not necessarily null-terminated (this is why you
> get both a pointer and a length!).
> 
> -> Note: this allows Mid$(), Left$(), Right$(), Trim$(), LTrim$(), 
> RTrim$() not to allocate anything, and so to be fast.
> 
> Don't use strcasecmp() and strncasecmp() (and tolower() and toupper()),
> because they do a locale-aware comparison. You must use GB.StrCaseCmp(),
> GB.StrNCaseCmp(), GB.ToLower() and GB.ToUpper() instead.
> 
> -> This is because of the Turkish language, where 'i' and 'I' are not the
> same letter if you do a locale-aware case-insensitive comparison.
> 
> -> Fortunately, if you include "gambas.h", it will do the substitution
> automatically. So you can continue using strcasecmp(), strncasecmp(),
> toupper() and tolower().
> 
> So you have to do:
> 
> LENGTH(a) == LENGTH(b) && GB.StrNCaseCmp(STRING(a), STRING(b), LENGTH(a)) == 0
> 
> to test case-insensitive equality.
> 
> If you need fast comparisons, look at the 
> 'main/share/gb_common_string_temp.h' Gambas source file.
> 
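The locale point in the quoted reply can be illustrated outside of Gambas. The following is only a minimal sketch in plain C, not the Gambas implementation: the helpers ascii_lower() and ascii_equal_nocase() are hypothetical names made up for this example. They fold only the ASCII range 'A'..'Z', never consult the C locale, and take pointer + length so that neither string has to be NUL-terminated.

```c
#include <stddef.h>

/* Hypothetical helper: fold an ASCII upper-case letter to lower case
   without consulting the C locale. All other bytes pass through. */
static int ascii_lower(int c)
{
	return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* Hypothetical helper: case-insensitive equality of two strings given as
   pointer + length. Neither string needs to be NUL-terminated.
   Returns 1 if equal, 0 otherwise. */
static int ascii_equal_nocase(const char *a, size_t la,
                              const char *b, size_t lb)
{
	size_t i;

	/* Different lengths can never be equal; this also guarantees that
	   the loop below stays inside both buffers. */
	if (la != lb)
		return 0;

	for (i = 0; i < la; i++)
	{
		if (ascii_lower((unsigned char)a[i]) != ascii_lower((unsigned char)b[i]))
			return 0;
	}

	return 1;
}
```

Because the folding is hard-coded to ASCII, 'i' and 'I' always compare equal here, whatever the process locale is set to.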

Phew, it took me quite some time to figure out that I meant to say: I want
to compare case-sensitively, not case-insensitively! My mistake...

For a case-insensitive comparison I would have used GB.StrNCaseCmp(), but
here case matters. I guess the additional length comparison is the solution
I need. If not, please correct me.
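
As a rough sketch of that idea in plain C (bytes_equal() is a hypothetical helper name, not part of the Gambas API): compare the lengths first, then the bytes with memcmp(). memcmp() never looks for a NUL and never reads past the given length, so it is safe for arguments that are slices of a larger string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: case-sensitive equality of two strings given as
   pointer + length. Neither string needs to be NUL-terminated.
   Returns 1 if equal, 0 otherwise. */
static int bytes_equal(const char *a, size_t la, const char *b, size_t lb)
{
	/* The length check comes first, so memcmp() only runs when both
	   buffers hold at least 'la' bytes. */
	return la == lb && memcmp(a, b, la) == 0;
}
```

Inside the method from the original question this would presumably be called as bytes_equal(STRING(a), LENGTH(a), STRING(b), LENGTH(b)). With the Mid$("sample", 1, 2) example, the lengths 2 and 6 differ, so the false positive cannot occur.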

Regards,
Tobi




