Why is reading the character ® (U+00AE) different in Java 6 and Java 7?

This is my first time asking on Stack Overflow. I'm not good at English, so please excuse me.

I'm having a problem: my application returns a strange character.

PlayStation\ufffd\ufffd4 Pro

It has to be like this:

PlayStation®4 Pro

I think the '\ufffd' character is U+FFFD, 'REPLACEMENT CHARACTER'.

My application uses JDK 1.6.

I found that when I switch my application's JDK to 1.7, it prints the character correctly.

PlayStation®4 Pro

More Information

My application uses iBATIS, and the problem occurs after queryForObject.

public class A {
    private String content;

    public String getContent() {
        return content;
    }
}

// In the calling method:
A a = (A) queryForObject("mapper.getSomething", params);
return a;
// jdk1.6 - a.getContent() : PlayStation\ufffd\ufffd4 Pro
// jdk1.7 - a.getContent() : PlayStation®4 Pro

JDBC connection property is like this.

More Information 2
  • I tested without iBATIS or anything else, using a JDBC connection directly, and got the same result.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class CharacterEncodeTest {
    // JDBC driver name and database URL
    static final String DB_URL = "jdbc:mysql://{IPADDRESS}/{DBTNAME}?Unicode=true&characterEncoding=MS949&zeroDateTimeBehavior=convertToNull&socketTimeout=500000&connectTimeout=500000";

    //  Database credentials
    static final String USER = "{USER}";
    static final String PASS = "{PASSWORD}";

    public static void main(String[] args) {
        Connection conn = null;
        Statement stmt = null;
        try {
            //STEP 2: Register JDBC driver

            //STEP 3: Open a connection
            System.out.println("Connecting to a selected database...");
            conn = DriverManager.getConnection(DB_URL, USER, PASS);
            System.out.println("Connected database successfully...");

            //STEP 4: Execute a query
            System.out.println("Creating statement...");
            stmt = conn.createStatement();

            String sql = "SELECT * from TABLE";
            ResultSet rs = stmt.executeQuery(sql);
            //STEP 5: Extract data from result set
            while (rs.next()) {
                //Retrieve by column name
                String content = rs.getString("content");

                //Display values
                System.out.print("content: " + content);
                // jdk1.6 : PlayStation\ufffd\ufffd4 Pro
                // jdk1.7 : PlayStation®4 Pro
            }
        } catch (SQLException se) {
            // something
        } finally {
            // something
        }//end try
    }
}

The only difference is the JDK version.

  1. What difference between JDK 1.6 and 1.7 causes this problem?

  2. Is there any way to solve this problem on JDK 1.6?

No idea what \ufffd is, but the ® symbol is \u00ae: https://www.fileformat.info/info/unicode/char/00ae/index.htm

No idea, but I think JDK 1.6 and JDK 1.7 use different character encodings. Please see the links below:

Does Java 1.7 use a different character encoding?

Why is my String returning "\ufffd\ufffdN a m e"
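
One way to test this guess is to run the same small program under both JDKs and compare what each JVM reports. The sketch below is only an illustration (the class name DefaultCharsetCheck is made up). It shows the JVM's default charset, not the encoding the JDBC driver negotiates with the server, so identical output here would point away from the JDK defaults and toward the driver or connection settings.

```java
import java.nio.charset.Charset;

public class DefaultCharsetCheck {
    // Collects the settings that most often differ between two JVM installations.
    static String describe() {
        return "java.version=" + System.getProperty("java.version")
             + ", file.encoding=" + System.getProperty("file.encoding")
             + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // Run once with the JDK 1.6 java binary and once with the JDK 1.7 one.
        System.out.println(describe());
    }
}
```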


You got two question mark characters initially. This looks like there was one UTF-8 character, but your code was not able to read the 4-byte sequence and thus showed 2 question marks, each representing an unknown 2-byte character. Are you sure that the data did not change, while your code was never able to process UTF-8 in the first place? It might have been this 4-byte character before: https://en.wikipedia.org/wiki/Enclosed_R ?

  • Please explain what you are doing to get this string (maybe with a few lines of code).
  • Hello @Henry ! I edited little bit more code. I'm not sure it'll be helpful. When more information is needed, please tell me. Thanks for your comment.
  • This means the string is already wrong when you get it back from queryForObject. The problem must be in there or in even deeper layers. Use a debugger to track down where exactly it goes wrong.
  • How do you know that System.out.print prints \ufffd\ufffd? I don't know of any terminal or console that outputs Unicode escapes. And have you considered that, since the result comes not from the JDK but from the MySQL driver, the problem may be in there?
  • \ufffd is the replacement character, it's used when a Unicode aware system cannot process a character
  • Both Java 6 and 7 use the same encoding for String: UTF-16 (see Javadoc: docs.oracle.com/javase/6/docs/api/java/lang/String.html). Source code files can be encoded in the encoding of your choice; you just have to tell the Java compiler which one is used with the -encoding option (see docs.oracle.com/javase/6/docs/technotes/tools/windows/…).
  • The character in question is 2 bytes in UTF-8, not 4 bytes. The 2 question marks (more accurately, replacement chars) are more likely due to those 2 bytes being misinterpreted individually.
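
The points raised in the comments can be checked without a database. The sketch below is a standalone illustration, not the code path inside iBATIS or the MySQL driver; the class and method names are hypothetical. It encodes ® as UTF-8, decodes the bytes once as a whole and once byte by byte, and prints the results as Unicode escapes so the output does not depend on the console encoding. It sticks to APIs available in Java 6.

```java
import java.nio.charset.Charset;

public class ReplacementCharDemo {
    // Charset.forName works on Java 6; StandardCharsets only exists from Java 7.
    static final Charset UTF8 = Charset.forName("UTF-8");

    // Renders each UTF-16 code unit as a backslash-u escape (four hex digits).
    static String toEscapes(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            sb.append(String.format("\\u%04x", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    // Decodes every byte on its own, mimicking a layer that does not
    // treat the two bytes of the character as one sequence.
    static String decodeEachByteSeparately(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(new String(new byte[] { b }, UTF8));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] utf8 = "\u00ae".getBytes(UTF8); // the two bytes 0xC2 0xAE
        System.out.println("UTF-8 length:     " + utf8.length);
        System.out.println("decoded whole:    " + toEscapes(new String(utf8, UTF8)));
        System.out.println("decoded per byte: " + toEscapes(decodeEachByteSeparately(utf8)));
    }
}
```

Calling toEscapes(content) on the value returned by rs.getString("content") under each JDK would similarly show which code units the driver actually produced, independent of how the console renders them.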