MaxStocker.com

MySQL, JDBC, Unicode and You
Posted : Sunday November 29th, 2009

A few notes on my experiences using MySQL from JDBC with Unicode support that might help others.

Yes, MySQL and the JDBC driver fully support UTF-8, and it really is quite easy to set up, but there are a few things to be aware of.

First, you are best off creating your database with utf8 as the character set from the get-go. If the database defaults to another character set, you have to be careful that your CREATE/ALTER TABLE statements specify utf8 explicitly. The driver can also get confused, if you're not careful, when the database and table character sets don't match.
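To make that concrete, here is a rough sketch of the kind of statements I mean, issued through plain JDBC. The database name (appdb), the table (notes), and the utf8_general_ci collation are made up for illustration; swap in your own.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

class Utf8Schema {
    // Hypothetical names: "appdb" and "notes" are placeholders, not from any real schema.
    static final String CREATE_DATABASE =
            "CREATE DATABASE appdb CHARACTER SET utf8 COLLATE utf8_general_ci";

    // Spelling out the character set on the table keeps it correct even if
    // the database default is something else.
    static final String CREATE_TABLE =
            "CREATE TABLE notes (id INT PRIMARY KEY, body TEXT) DEFAULT CHARACTER SET utf8";

    // For an existing table that was created with another character set.
    static final String CONVERT_TABLE =
            "ALTER TABLE notes CONVERT TO CHARACTER SET utf8";

    // Runs the DDL on an already-open connection (requires a live MySQL server).
    static void createSchema(Connection conn) throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.executeUpdate(CREATE_DATABASE);
            conn.setCatalog("appdb"); // switch to the new database
            st.executeUpdate(CREATE_TABLE);
        }
    }

    public static void main(String[] args) {
        // Without a server at hand, just print the statements for review.
        System.out.println(CREATE_DATABASE);
        System.out.println(CREATE_TABLE);
        System.out.println(CONVERT_TABLE);
    }
}
```

If the database already exists with a different default, the ALTER TABLE ... CONVERT TO form is the one to reach for on each table, since changing the database default alone does not touch existing tables.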

Second, on the driver front, you should add the following parameters to your JDBC connection URL. (I am assuming you are using the MySQL-supplied JDBC driver; I can't imagine why you wouldn't.)

useUnicode=true&characterEncoding=UTF-8

These parameters make sure that the driver uses the correct encoding. As mentioned above, if the database is set to use utf8 the driver will auto-detect this, but MySQL distinguishes between database and table character sets, which can trip you up if you're not careful. Setting these parameters ensures it will work regardless of your setup.
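As a minimal sketch, the full URL could be assembled like this. The host, port, and database name are placeholders I picked for the example; only the two query parameters come from the notes above.

```java
class MySqlUrl {
    // Builds a MySQL JDBC URL with the two Unicode parameters appended.
    // host, port, and database are whatever your own setup uses.
    static String utf8Url(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?useUnicode=true&characterEncoding=UTF-8";
    }

    public static void main(String[] args) {
        // Hypothetical local setup; pass the result to DriverManager.getConnection(url, user, password).
        System.out.println(utf8Url("localhost", 3306, "appdb"));
    }
}
```

One thing to watch: if the URL lives in an XML configuration file rather than in Java code, the & between the parameters has to be written as &amp;amp; or the file will not parse.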

In my opinion you should never use a character set in MySQL other than latin1 or utf8. If you are only ever going to store "ASCII" text then latin1 is fine, but if you are supporting any other character set or sets, just use utf8. It keeps things simple.

And to wrap up, a few notes on testing and displaying. If you are having problems with UTF-8 data and your database, test the data before you insert it to make sure it is what you think it is. A large number of JDBC-related encoding issues are caused by the data being mangled well before it is even stored in the database. Also make sure that you are identifying the data correctly on the way out to display.
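One way to check the data before it goes anywhere near the database is to dump its code points and UTF-8 bytes. This is just a sketch with helper names I made up (codePoints, utf8Hex), and it needs Java 8 or later; the idea is that already-mangled text shows up immediately as the wrong code points.

```java
import java.nio.charset.StandardCharsets;

class Utf8Check {
    // Lists the Unicode code points in the string, e.g. "U+00E9".
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    // Lists the bytes the string would occupy when encoded as UTF-8.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String good = "\u00e9";          // e with acute accent, as intended
        String mangled = "\u00c3\u00a9"; // the same character after a bad decode upstream
        System.out.println(codePoints(good) + " -> " + utf8Hex(good));
        System.out.println(codePoints(mangled) + " -> " + utf8Hex(mangled));
    }
}
```

If a single accented character shows up as two code points starting with U+00C3, the damage was done before JDBC ever saw the string, and no driver setting will fix it.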

For quick and dirty stand-alone tests, JOptionPanes work well for viewing data (as long as you have fonts that can display the glyphs in question). For web projects there are two common mistakes. One is not correctly identifying the outbound data as UTF-8, which you can do very simply in your JSP page directive. The second is mangling the data on the way in, which is solved by setting the request encoding correctly in your servlet:

request.setCharacterEncoding("UTF-8");
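To see why that one line matters, here is a small stand-alone sketch (no servlet container needed) of what happens when UTF-8 request bytes are decoded with the wrong charset. I am assuming the container falls back to ISO-8859-1 when no request encoding is set, which is the traditional servlet default; the decodeAs helper is made up for the demonstration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class RequestDecoding {
    // Stand-in for what a container does when it turns request bytes into a String.
    static String decodeAs(byte[] body, Charset charset) {
        return new String(body, charset);
    }

    public static void main(String[] args) {
        // What a browser would send for the text "cafe" with an accented final e.
        byte[] sent = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        // Decoded with the assumed default: the accented character becomes two characters.
        String wrong = decodeAs(sent, StandardCharsets.ISO_8859_1);
        // Decoded as UTF-8, which is what setCharacterEncoding("UTF-8") asks for.
        String right = decodeAs(sent, StandardCharsets.UTF_8);

        System.out.println("wrong length: " + wrong.length());
        System.out.println("right length: " + right.length());
    }
}
```

Note that setCharacterEncoding only takes effect if it is called before the first read of the request parameters or body, so it belongs at the very top of the servlet method (or in a filter that runs first).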

Tags

JDBC  MySQL  UTF8 

Categories

Technical  Java 

  © 2008 Max Stocker