Solution For MySQL UTF-8 Encoded Data Unreadable In Web Pages.


Introduction
This blog entry is about technical issue regarding data manipulation in mysql, with tool like phpmyadmin. Data stored in Mysql is retrived/input with PHP pages. If you can’t understand the topic at all, please don’t continue.

Problem
I’ve been dealing with web projects for years. Mostly with PAM(php + apache + mysql) environment. When it comes to something that requires user’s interactions( comment, forum, etc), I always use UTF-8 as the default charset in web pages. This ensures that different characters(simplified chinese, traditional chinese, japanese etc)can be displayed on same page nicely without users have to change the character encoding in the browser.

From time to time, Webmasters are surely going to perform various kinds of database maintenance, such as backup, create indices, restore and many more depends on the requirements(site query speed, space used etc). Now the problem I’ve encountered few days back, was the data in tables that contains chinese characters, became unreadable in web pages after a database cloning.(site A to site B, completely same scripts)

Solution
I’ve solved the problem with phpmyadmin, with the inspiration from this MySQL Doc. I am not sure if my host has upgraded the MySQL server recently or not, but this method worked out for me. Note that I’ve tried to import the same database into my local debian box, the problem still occured and same work needed to be done. Now here’s how it’s solved.Warning, backup everything before making any changes, universal rule!

First, you have to know where’s the data location that needs to be fixed(Database, Table, then Column). With phpmyadmin, I selected the database name, then the related table, and check on the field that needed the workaround. Edit that field DATA TYPE, to “Binary”. It is important that you make sure the LENGTH is maintained same. If the field’s data type was TEXT, MEDIUMTEXT, or LONGTEXT, change it to “BLOB”, with the original length.(This is to ensure that your data won’t get truncated). Collation here is not important since data is going to be converted to binary.

You should now see the field’s data type being changed to “binary”, “varbinary”, or “blob” depending on the length. Now the second step is, change the data type back to what it was before the binary conversion. Note that, a correct “Collation” must be set this time. For my case I’m sure the data is in UTF-8, so I’d use “utf8_general_ci” for the collation. After making this conversion, the data that contains chinese characters is displayed flawlessly on the new site. It’s just that easy.

Note#1: I recommend that the conversion is done field by field for better control. Please be cautious when performing such conversion, you wouldn’t want any data loss.

Note#2: Although I found out this workaround for the problem, but I still don’t understand why is the database behave like this. Like, both databases have the exactly same settings(fields, collations etc), but extra works need to be done when copying data among each and another.(one server to another). Some feedbacks would be appreciated.

Technorati , ,
Share and Enjoy: These icons link to social bookmarking sites where readers can share and discover new web pages.
  • del.icio.us
  • digg
  • Reddit
  • YahooMyWeb
Ratings:
1 Votes | Average: 5 out of 51 Votes | Average: 5 out of 51 Votes | Average: 5 out of 51 Votes | Average: 5 out of 51 Votes | Average: 5 out of 5 (1 votes, average: 5 out of 5)
Loading ... Loading ...

Text Link Ads


Related Post(s)

One Response to “Solution For MySQL UTF-8 Encoded Data Unreadable In Web Pages.”

  1. Archie Rios Says:

    exhaustlessly spastically canto faipule vampyrella symbiotrophic germanic dicephalous
    PSP Global
    http://www.eaabillings.com/

Leave a Reply