Proper character encoding?
Hi, I'm trying to get my database to support Chinese. I've tried everything but nothing seems to work, so I'm asking you guys to see if someone actually can help me with this.
Character set is set to utf8, collation is utf8_general_ci.
If I change my motto to 您好!, the data column for motto is æ,¨å¥½ï¼.
Now, inside the client it works fine. Inside the client it still says 您好!, so it encodes/decodes the characters fine inside the client (?). But if I want to display the motto on my website, I'll get some weird results.
If I do
Code:
utf8_decode(MOTTO);
It returns �,�好!. So, it can't decode the first character?
http://puu.sh/frbth/fa83d84c12.png
I've also tried these things:
Code:
// The motto as read from the incoming packet.
string motto = ButterflyEnvironment.FilterInjectionChars(this.Request.PopFixedString());
Encoding Windows1252 = Encoding.GetEncoding("Windows-1252"); // declared but never used below
Encoding Utf8 = Encoding.UTF8;
// Attempt 1: encode to UTF-8 and decode straight back (a round trip, so query equals motto).
byte[] utf8Bytes = Utf8.GetBytes(motto);
string query = Utf8.GetString(utf8Bytes);
// Attempt 2: transcode the UTF-8 bytes to ISO-8859-1, then decode them as ISO-8859-1.
Encoding iso = Encoding.GetEncoding("ISO-8859-1");
Encoding utf8 = Encoding.UTF8;
byte[] utfBytes = utf8.GetBytes(motto);
byte[] isoBytes = Encoding.Convert(utf8, iso, utfBytes);
string msg = iso.GetString(isoBytes);
// Print the untouched motto and both attempts for comparison.
Console.WriteLine(motto);
Console.WriteLine(msg);
Console.WriteLine(query);
But they all output the same (almost):
æ‚¨å¥½ï¼ | æ‚¨å¥½ï¼ | æ,¨å¥½ï¼
Re: Proper character encoding?
utf8_encode is for converting Latin-1 encoded strings to UTF-8, not for Chinese like you're trying.
http://kunststube.net/frontback/ - You should get a good understanding here. Otherwise, there are a few good threads on Stack Overflow that can also point you in the right direction.
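To see why only the first character breaks: utf8_decode converts UTF-8 to ISO-8859-1, but the stored text looks like UTF-8 bytes that were read as Windows-1252 (an assumption based on the column contents posted above). The second byte of 您 is 0x82, which Windows-1252 maps to ‚ (U+201A), a character ISO-8859-1 does not have, so it comes back as "?". Here is a minimal Python sketch of that, using only the first two characters of the motto. The helper names are made up for illustration, and the latin-1 "replace" handler only approximates what utf8_decode does.

```python
# Sketch: text that was UTF-8 but got read as Windows-1252, then "repaired"
# with the wrong target charset (ISO-8859-1) versus the right one (cp1252).

original = "您好"                       # first two characters of the motto
utf8_bytes = original.encode("utf-8")   # e6 82 a8 e5 a5 bd

# What a Windows-1252 reader makes of those UTF-8 bytes.
mojibake = utf8_bytes.decode("cp1252")  # 'æ‚¨å¥½'


def to_latin1_lossy(text: str) -> bytes:
    """Approximate utf8_decode: characters outside ISO-8859-1 become '?'."""
    return text.encode("latin-1", errors="replace")


def to_cp1252(text: str) -> bytes:
    """Reverse the mis-decoding with the code page that actually caused it."""
    return text.encode("cp1252")


lossy = to_latin1_lossy(mojibake)
exact = to_cp1252(mojibake)

print(lossy.hex())                                # the 0x82 byte is now 0x3f ('?')
print(exact.hex())                                # identical to the original UTF-8 bytes
print(lossy.decode("utf-8", errors="replace"))    # first character is damaged
print(exact.decode("utf-8"))                      # original text recovered
```

In PHP terms this suggests converting to Windows-1252 rather than ISO-8859-1 (for example with mb_convert_encoding or iconv). Bytes that Windows-1252 leaves undefined, such as the 0x81 at the end of the full-width !, may still need special handling, so fixing the encoding where the string is first read is probably the cleaner route.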
Re: Proper character encoding?
Quote:
Originally Posted by
FatalLulz
utf8_encode is for converting Latin-1 encoded strings to UTF-8, not for Chinese like you're trying.
http://kunststube.net/frontback/ - You should get a good understanding here. Otherwise, there are a few good threads on Stack Overflow that can also point you in the right direction.
I've done everything on that website, as shown by the examples in my thread, but the characters still turn out weird.
I'm starting to believe that the client scrambles the characters because they get encoded inside the SWF (?). If so, what kind of encoding is it?
Re: Proper character encoding?
Quote:
Originally Posted by
Matata
I've done everything on that website, as shown by the examples in my thread, but the characters still turn out weird.
I'm starting to believe that the client scrambles the characters because they get encoded inside the SWF (?). If so, what kind of encoding is it?
Bit confused. It works fine for your motto inside the client yes? But not on the CMS? Or doesn't work fine at all?
If you open the .swf with something like SoThink SWF editor, there is a text-related section. If you dig through those files, I'm pretty sure you'd find the encoding it uses. I'll go take a look now as well.
Re: Proper character encoding?
Quote:
Originally Posted by
FatalLulz
Bit confused. It works fine for your motto inside the client yes? But not on the CMS? Or doesn't work fine at all?
If you open the .swf with something like SoThink SWF editor, there is a text-related section. If you dig through those files, I'm pretty sure you'd find the encoding it uses. I'll go take a look now as well.
The motto is set to 您好! and that works fine. It stays that way.
http://puu.sh/frkKl/ecafcfe9e5.png
The column data for motto is æ,¨å¥½ï¼.
If you do utf8_decode(MOTTO); on the homepage, it doesn't decode it correctly. It decodes to �,�好!
So my guess is that the client itself encodes it in some way and that messes it up.
I will have a look with SoThink, but I'm not that experienced.
--
Everything works fine, as long as it stays with those weird characters.
If I set the column data to 您好!, the motto shows up as ???.
And if I set the column data to 您好! encoded as UTF8, it shows up as æ?¨好! but then it shows up correctly on the website.
You can see the difference in this picture:
http://puu.sh/frbth/fa83d84c12.png
Re: Proper character encoding?
I have not seen how the Encoding module is implemented -- only its interface through the snippets you provided. That being said, assuming the Encoding module and your database are implemented to support UTF-8, check whether your web server is responding to clients with the correct content-type. That is, have you done the following?
Code:
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
Edit: (based on the following)
Quote:
Originally Posted by
Matata
If I set the column data to 您好!, the motto shows up as ???.
And if I set the column data to 您好! encoded as UTF8, it shows up as æ?¨好! but then it shows up correctly on the website.
If that statement is true, then check whether the entry the emulator inserts into the database (i.e. the UTF-8 encoded string) is an exact match to the UTF-8 encoded string you inserted. From there you can better pinpoint where the problem originates.
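Comparing the two entries is easiest at the byte level, since the rendered characters can look alike or get mangled by whatever tool displays them. In MySQL the HEX() function can show the stored bytes of a column. The same comparison can be sketched in Python as below; the helper name is hypothetical, and the "double encoded" value assumes the emulator read the UTF-8 bytes as Windows-1252 before storing them, which is not confirmed at this point in the thread.

```python
# Sketch: compare the bytes of a cleanly stored string with a double-encoded one.

def utf8_hex(text: str) -> str:
    """Return the UTF-8 bytes of text as a hex string."""
    return text.encode("utf-8").hex()


clean = "您好"                                        # first two characters of the motto
# Assumed failure mode: UTF-8 bytes mis-read as Windows-1252, then stored as UTF-8.
double_encoded = clean.encode("utf-8").decode("cp1252")

clean_hex = utf8_hex(clean)
double_hex = utf8_hex(double_encoded)

print(clean_hex)     # 6 bytes, 3 per character
print(double_hex)    # 13 bytes, each original byte expanded to 2 or 3 bytes
```

If the emulator's row has the longer byte sequence and the manually inserted row has the shorter one, the string was already mis-decoded before it reached the database.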
Re: Proper character encoding?
Quote:
Originally Posted by
TheJacob
I have not seen how the Encoding module is implemented -- only its interface through the snippets you provided. That being said, assuming the Encoding module and your database are implemented to support UTF-8, check whether your web server is responding to clients with the correct content-type. That is, have you done the following?
Code:
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
Edit: (based on the following)
If that statement is true, then check whether the entry the emulator inserts into the database (i.e. the UTF-8 encoded string) is an exact match to the UTF-8 encoded string you inserted. From there you can better pinpoint where the problem originates.
The emulator tries to insert the exact same string that appears in the column data. That is why I believe it's something that the client itself does.
Or I've messed something up very bad.
Re: Proper character encoding?
The Habbo client uses UTF8 for encoding strings. (http://help.adobe.com/en_US/FlashPla...tml#writeUTF()) Try checking if Request.PopFixedString() uses that too.
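For reference, Flash's writeUTF writes a 16-bit length prefix (the byte count) followed by the UTF-8 bytes. The sketch below mimics that layout in Python and reads it back with two different encodings. It assumes big-endian byte order (the Flash default) and assumes PopFixedString reads the same length-prefixed layout, which has not been confirmed in this thread; the function names are made up for illustration.

```python
import struct


def write_utf(text: str) -> bytes:
    """Mimic writeUTF: 16-bit big-endian byte length, then the UTF-8 bytes."""
    payload = text.encode("utf-8")
    return struct.pack(">H", len(payload)) + payload


def pop_fixed_string(buf: bytes, encoding: str) -> str:
    """Hypothetical reader: take the length prefix, decode that many bytes."""
    (length,) = struct.unpack_from(">H", buf, 0)
    return buf[2:2 + length].decode(encoding)


packet = write_utf("您好")                       # first two characters of the motto

as_utf8 = pop_fixed_string(packet, "utf-8")      # decoded the way the client wrote it
as_cp1252 = pop_fixed_string(packet, "cp1252")   # decoded with a Western European code page

print(as_utf8)
print(as_cp1252)
```

The bytes on the wire are the same in both cases; only the encoding used on the reading side decides whether you get the Chinese text or the æ‚¨å¥½ style output.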
Re: Proper character encoding?
Quote:
Originally Posted by
AWA
PopFixedString() uses DefaultEncoding, which is set to Encoding.Default;
If I change it to Encoding.UTF8, everything starts acting very weird. My achievement points were around 150 before, which was correct but now they show up as -4137832983. Also, opening the catalog results in a disconnection. If I set my motto to "您好!", it just shows up as ??!.
EDIT:
I just printed out the default encoding (which it's set to by default) and it's Western European (Windows).
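"Western European (Windows)" is the display name of code page 1252, which matches the æ‚¨å¥½ style output seen earlier. As for the broken achievement points after switching the default to UTF8, one possible explanation (purely a guess, since the emulator's packet code is not shown here) is that the same encoding is also used to turn raw binary data into strings and back. A single-byte code page can carry such bytes through unchanged, while UTF-8 replaces anything that is not valid UTF-8. Here is a small Python sketch of that difference, using 150 as the example value:

```python
import struct

# 150 as a 32-bit big-endian integer: 00 00 00 96
raw = struct.pack(">i", 150)


def round_trip(data: bytes, encoding: str) -> bytes:
    """Decode bytes to text and encode them back, replacing anything invalid."""
    text = data.decode(encoding, errors="replace")
    return text.encode(encoding, errors="replace")


via_cp1252 = round_trip(raw, "cp1252")   # 0x96 maps to a real character and back
via_utf8 = round_trip(raw, "utf-8")      # 0x96 alone is invalid UTF-8 and gets replaced

print(raw.hex())
print(via_cp1252.hex())
print(via_utf8.hex())
```

If that is what is happening, leaving the default encoding alone and decoding only user-entered strings such as the motto as UTF-8 might avoid the side effects.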