Hi all, I have a small problem with a Java application I am writing.
The application connects to MySQL via JDBC. Records are saved correctly (if I type "México" into a text box and send it to the database, it is stored as "México"), but when I read them back with ResultSet.getString(), the "special" characters (accented letters and "ñ") come out wrong ("México" is read as "M��xico").
I think it has something to do with character encoding, but I don't know exactly what. The MySQL database uses the utf8_spanish_ci collation (and therefore the utf8 character set), and Charset.defaultCharset() returns UTF-8.
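To illustrate the kind of mismatch I suspect, here is a minimal sketch (not a diagnosis of what the driver actually does): it encodes "México" as UTF-8 and then decodes the same bytes with a different charset. The class and method names are made up for the example. Decoding as US-ASCII turns the two bytes of "é" into two replacement characters, which matches the symptom above; decoding as ISO-8859-1 gives the other classic garbling, "MÃ©xico".

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Encode the text as UTF-8, then decode those same bytes with another charset.
    static String reinterpret(String text, Charset wrongCharset) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, wrongCharset);
    }

    public static void main(String[] args) {
        // "México", written with an escape so the source file encoding does not matter.
        String original = "M\u00e9xico";

        // Each of the two UTF-8 bytes of 'é' is invalid ASCII and becomes U+FFFD.
        System.out.println(reinterpret(original, StandardCharsets.US_ASCII));

        // Each of the two UTF-8 bytes of 'é' maps to its own Latin-1 character.
        System.out.println(reinterpret(original, StandardCharsets.ISO_8859_1));
    }
}
```

What the console shows also depends on the console's own encoding, so comparing the returned strings in code is more reliable than reading the printed output.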
So my specific question is: how can I get strings read from MySQL that contain special characters (and were stored correctly) to display correctly in the Java application?
Update (partial fix):
After some more searching, I found this question and its answer, which helped me. It says that when opening the connection you have to specify the character set to use; in my case:
DriverManager.getConnection(
    "jdbc:mysql://" + host + "/" + dbName
        + "?useUnicode=true&characterEncoding=utf8_spanish_ci",
    user, pass);
However, this only partially fixes the problem: special characters in VARCHAR fields are now read correctly, but values read from JSON fields that contain special characters are still garbled.
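A side note on the connection string in this update: utf8_spanish_ci is a collation name, while, as far as I understand the Connector/J documentation, the characterEncoding property expects a Java-style charset name such as UTF-8 and the collation goes in a separate connectionCollation property. The following is a sketch under that assumption, not a verified fix; the class name and the host and database values are placeholders.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

class Utf8Connection {
    // Builds a Connector/J URL that requests UTF-8 for the connection and
    // utf8_spanish_ci as the session collation (assumed property names:
    // characterEncoding and connectionCollation).
    static String buildUrl(String host, String dbName) {
        return "jdbc:mysql://" + host + "/" + dbName
                + "?useUnicode=true"
                + "&characterEncoding=UTF-8"
                + "&connectionCollation=utf8_spanish_ci";
    }

    // Opens the connection; requires the MySQL driver on the classpath.
    static Connection open(String host, String dbName, String user, String pass)
            throws SQLException {
        return DriverManager.getConnection(buildUrl(host, dbName), user, pass);
    }

    public static void main(String[] args) {
        // Only prints the URL, so it runs without a database.
        System.out.println(buildUrl("localhost", "test"));
    }
}
```

Keeping the URL construction in its own method makes it easy to check the exact string being passed to the driver.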
Update (final):
The problem has to do with the encoding MySQL uses to store JSON data; the answer below shows the procedure I followed to fix it.
After digging into the problem a bit more, I found this reference in the MySQL user manual:
So no matter what character encoding I use, MySQL automatically converts the JSON string to utf8mb4... which isn't a big deal when saving, but is when reading back :(
My solution (and I suspect it's not the best) was the following: write the conversion to the required encoding inside the query itself. Something like this:
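The exact query is not shown here. As a minimal sketch of the idea, assuming a hypothetical table places with a JSON column data, the conversion can be done with MySQL's CONVERT(... USING ...) so that the driver receives an ordinary character string instead of a raw JSON value. The table, column, alias, and class names are all made up for illustration, and utf8 is my guess at the "required encoding" (utf8mb4 may be the better choice if the data can contain characters outside the Basic Multilingual Plane).

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class JsonColumnReader {
    // Hypothetical table (places) and JSON column (data); the conversion
    // happens in SQL, and the result is exposed under the alias data_text.
    static final String QUERY =
            "SELECT CONVERT(data USING utf8) AS data_text FROM places";

    // Reads every converted JSON value as a Java String.
    static List<String> readAll(Connection conn) throws SQLException {
        List<String> rows = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(QUERY);
             ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                rows.add(rs.getString("data_text"));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Only prints the query, so it runs without a database.
        System.out.println(QUERY);
    }
}
```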
With this "tweak", the data is read perfectly (all accented characters come out correct).
I guess there may be a simpler way to fix this, but so far this solution has worked for me.