Problem with XAM_GetString and UTF-8

description

Hello. First, sorry for my English.
XAM_GetString is declared as
 
public static extern int XAM_GetString (XAMHandle inHandle, string inName, StringBuilder outValue);
 
but when it is used to get the string value in XFieldInfo.Value, the returned string, which should be UTF-8, is decoded incorrectly. For example, the Centera metadata contains the string "Barnetová", but when I read this metadata through XFieldInfo.Value as an XString, I get "BarnetXX", where "XX" stands for broken characters.
 
XFieldInfo fieldInfo = xSet.GetFieldInfo((XString)xit.Current);
if (!fieldInfo.Property)
 continue;
if (fieldInfo.Type.ToString().Equals("application/vnd.snia.xam.string"))
{
 Encoding utf8Encoding = Encoding.UTF8;
 Encoding defaultEncoding = Encoding.Default;
 string originalString = fieldInfo.Value.ToString();
 byte[] srcBytes = utf8Encoding.GetBytes(originalString);
 byte[] dstBytes = Encoding.Convert(utf8Encoding, defaultEncoding, srcBytes);
 string defaultString = defaultEncoding.GetString(dstBytes);
 string mujString = StaproXAMHelper.GetXAMString(fieldInfo.xamHandle, fieldInfo.Name.ToString());
 MessageBox.Show(string.Format("Original string = {0}\nRecoded string = {1}\nNew implementation = {2}", originalString, defaultString, mujString));
}
 
The field originalString contains the wrongly decoded string, and the field defaultString, which is the recoded string, is wrong too. In my case defaultEncoding is Western European. The attached picture shows what I get from the XAMSDK XAM_GetString and what I get with my modification.
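
One possible reason the recoding step cannot help: encoding an already decoded string to UTF-8, converting those bytes to the default code page, and decoding them again is a round trip that returns the same text. The sketch below illustrates this under two assumptions that are not confirmed by the XAM SDK: that the damage comes from UTF-8 bytes being decoded as single-byte ANSI text, and that Latin-1 is a fair stand-in for Encoding.Default. The class and method names are hypothetical.

```csharp
using System;
using System.Text;

internal static class RecodeDemo
{
    // Assumption: Latin-1 stands in for the machine's ANSI code page (Encoding.Default).
    private static readonly Encoding Ansi = Encoding.GetEncoding("iso-8859-1");

    // Simulates the suspected fault: UTF-8 bytes decoded as single-byte ANSI text.
    public static string Garble(string text)
    {
        return Ansi.GetString(Encoding.UTF8.GetBytes(text));
    }

    // Same steps as the recoding in the post. This is a round trip: it returns
    // its input whenever every character exists in the ANSI code page.
    public static string RecodeAsInPost(string s)
    {
        byte[] src = Encoding.UTF8.GetBytes(s);
        byte[] dst = Encoding.Convert(Encoding.UTF8, Ansi, src);
        return Ansi.GetString(dst);
    }

    // Opposite direction: recover the raw bytes first, then decode them as UTF-8.
    // Only works if no byte was lost when the string was first decoded.
    public static string Repair(string garbled)
    {
        return Encoding.UTF8.GetString(Ansi.GetBytes(garbled));
    }
}
```

With these helpers, RecodeAsInPost leaves a garbled string unchanged, while Repair recovers "Barnetová" from its garbled form. If the marshaller has already replaced bytes it could not map, even Repair fails, which is why reading the raw bytes is the safer route.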
 
I modified XFieldInfo so that xamHandle is public, and wrote this class:
 
// Requires: using System.Runtime.InteropServices; using System.Text;
internal sealed class StaproXAMHelper
{
 // Receive the value as raw bytes so the marshaller does not decode it.
 [DllImport("xam.dll")]
 public static extern int XAM_GetString(long inHandle, string inName, byte[] outValue);
 
 const int XAM_MAX_STRING = 512;
 
 public static string GetXAMString(XAMHandle inHandle, string inName)
 {
      byte[] srcBytes = new byte[XAM_MAX_STRING];
      // The return code is stored but not checked.
      int i = XAM_GetString((long)inHandle, inName, srcBytes);
      Encoding utf8Encoding = Encoding.UTF8;
      Encoding defEncoding = Encoding.Default;
      // Convert the whole buffer from UTF-8 to the default code page.
      byte[] dstBytes = Encoding.Convert(utf8Encoding, defEncoding, srcBytes);
      return defEncoding.GetString(dstBytes);
 }
}
 
This implementation isn't optimal, because dstBytes contains random data after the actual string.
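
One way to avoid the trailing data, sketched here under the assumption that XAM_GetString writes a NUL-terminated UTF-8 string into the buffer (the usual C convention, not confirmed by this post), is to decode only the bytes before the first zero byte. Decoding straight to a .NET string also avoids the detour through the default code page. The helper name DecodeNullTerminatedUtf8 is hypothetical.

```csharp
using System;
using System.Text;

internal static class Utf8BufferDecoder
{
    // Hypothetical helper: decodes the UTF-8 bytes that precede the first
    // zero byte. Assumes the native side NUL-terminates the string.
    public static string DecodeNullTerminatedUtf8(byte[] buffer)
    {
        if (buffer == null)
            throw new ArgumentNullException("buffer");

        // Index of the terminator; fall back to the full buffer if none is found.
        int length = Array.IndexOf(buffer, (byte)0);
        if (length < 0)
            length = buffer.Length;

        // .NET strings are UTF-16, so no conversion to Encoding.Default is needed.
        return Encoding.UTF8.GetString(buffer, 0, length);
    }
}
```

In GetXAMString this would replace the last two statements with a single return Utf8BufferDecoder.DecodeNullTerminatedUtf8(srcBytes). Checking the return code of XAM_GetString before decoding would be a sensible addition.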

file attachments

comments

gstuartemc wrote Apr 1, 2011 at 2:38 PM

Thanks! Well spotted - very interesting point. Given that you have shown people how to work around it then I don't see a need to rush this out as a patch. I will, however, look at incorporating your "utf8-safe" version in the next release.

wrote Feb 2, 2013 at 1:32 AM