[Gambas-user] Problem with Conv() and XmlWrite

Benoit Minisini gambas at ...1...
Thu Dec 18 02:09:00 CET 2008


On Sunday 14 December 2008, Luigi Carlotto wrote:
> In my application I save some data to an XML file.
> Before saving, each value is converted with Conv(string, "UTF-8", "ASCII").
> During my tests I noticed the following errors:
> 1) if the string is longer than about 2000 characters,
> XmlWriter.Attribute() crashes;
> 2) if the language is set to Chinese (UTF-8), the conversion with
> Conv() crashes; the same happens with XmlWriter.Attribute() or
> XmlWriter.Element() when no conversion is performed.
> The errors occur only when System.Language=zh_CN
> and System.Charset=UTF-8; with other languages everything seems to
> work.
>
> Error message:
>
> encoding error : output conversion failed due to conv error, bytes 0xE5
> 0x31 0x32 0xE5
> I/O error : encoder error
>
> The bytes shown in the message do not seem to match, or be contained
> in, any of the strings that are saved to the XML file.
>
> Note that reading from the XML file works, while writing ends with
> the error described in my previous mail when the XML file is written
> with System.Language=zh_CN (Chinese); with other languages (Italian,
> French, English and Spanish) everything is OK.
>
> From some tests it initially seemed that the problem also depended on
> the size of the string, but that would cause the crash with the other
> language settings as well, which does not happen. The strange thing I
> noticed is that the size of the string content, with Language=zh_CN,
> is much smaller than usual; I imagine this depends on how the String
> object manages the data, but as far as I know Chinese characters also
> use values above 256.
>
> In any case, the application is too large and complex to be sent and
> described in a mail… However, the errors occur (DEBUG) in the Save()
> method, so for now I am sending only the functions involved. If more
> information is needed, no problem.
>
> The toString() function works and returns the expected string, but
> the conversion then ends in error.
>
> '---
> ' Save settings
> '
> PUBLIC SUB Save()
>   DIM oGroup AS pgConfigGroup
>   DIM oItem AS pgConfigItem
>   DIM oXml AS XmlWriter
>   'open file for writing
>   oXml = NEW XmlWriter
>   oXml.Open($filename, TRUE, "UTF8")
>   'write header
>   oXml.StartElement(pgApplication.Name)
>   oXml.Attribute("Type", Str("Config"))
>   oXml.Attribute("Version", pgUtil.SetVersion(VERSION))
>   oXml.Attribute("Date", Str(Now()))
>   'write elements
>   FOR EACH oGroup IN $data
>     oXml.StartElement(oGroup.Name)
>     FOR EACH oItem IN oGroup.Items
>       oXml.Attribute(oItem.Name, Conv(oItem.toString(), "UTF-8", "ASCII"))
>     NEXT
>     oXml.EndElement
>   NEXT
>   'write footer
>   oXml.EndElement
>   oXml.EndDocument
> END
> ...
> ...
> ...
> '---
> ' Convert any value into single string
> '
> PUBLIC FUNCTION toString() AS String
>   DIM iPos AS Integer
>   DIM aText AS String[]
>   SELECT CASE $type
>   CASE TYPE_STRING 'gb.String
>     RETURN Str(IIf(IsNull($value), "", $value))
>   CASE TYPE_INTEGER 'gb.Integer
>     RETURN Str(IIf(IsNull($value), "0", $value))
>   CASE TYPE_BOOLEAN 'gb.Boolean
>     RETURN Str(IIf(IsNull($value), "false", IIf($value, "true", "false")))
>   CASE TYPE_STRINGARRAY 'String[] object
>     IF (IsNull($value)) OR IF ($value.Count = 0) THEN RETURN Str("")
>     RETURN Str($value.Join(","))
>   CASE TYPE_INTEGERARRAY 'Integer[] object
>     IF (IsNull($value)) OR IF ($value.Count = 0) THEN RETURN Str("")
>     aText = NEW String[]
>     FOR EACH iPos IN $value
>       aText.Add(Str(iPos))
>     NEXT
>     RETURN Str(aText.Join(","))
>   CASE TYPE_COLOR 'custom color object
>     RETURN Str(IIf(IsNull($value), "0,0,0", pgColor.ColorToStr($value)))
>   END SELECT
> END
>
> Help?

As explained privately, the conversion from UTF-8 to ASCII should fail as
soon as the string contains a non-ASCII character. But it should not crash
(i.e. signal #11).

Anyway, there is no point in converting UTF-8 to ASCII, because there are only
two possibilities: either the UTF-8 string has only ASCII characters, in which
case it is already an ASCII string and the conversion changes nothing, or the
UTF-8 string has some non-ASCII characters, and the conversion is impossible
and fails.
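
The two cases can be illustrated with a minimal sketch. IsAscii() and
Main() are hypothetical names that are not part of the original code, and
the sketch assumes that Conv$() raises a trappable error on an unconvertible
character, as it is expected to, instead of crashing:

```gambas
' Hypothetical helper (not in the original code): returns TRUE when every
' byte of sText is below 128, i.e. the UTF-8 string is already plain ASCII.
PRIVATE FUNCTION IsAscii(sText AS String) AS Boolean
  DIM iPos AS Integer
  ' Len() and Asc() work on bytes, which is what matters here.
  FOR iPos = 1 TO Len(sText)
    IF Asc(sText, iPos) > 127 THEN RETURN FALSE
  NEXT
  RETURN TRUE
END

PUBLIC SUB Main()
  DIM sText AS String
  DIM sAscii AS String

  sText = "中文"

  ' Case 1: ASCII-only input, the conversion would return the same bytes.
  PRINT IsAscii("abc")
  ' Case 2: non-ASCII input, the conversion cannot succeed.
  PRINT IsAscii(sText)

  ' Trap the expected conversion error instead of letting it propagate.
  TRY sAscii = Conv$(sText, "UTF-8", "ASCII")
  IF ERROR THEN PRINT "Not convertible to ASCII, keep the UTF-8 string"
END
```

In Save(), this would mean passing oItem.toString() to oXml.Attribute()
directly, without the Conv() call.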

Regards,

-- 
Benoit Minisini



