Thai language is not showing in Java output

Unable to print Thai string value in Java console

public static void main(String[] args) {
    String engParam = "Beautiful";
    String thaiParam = "สวย";
    System.out.println("Output :" + engParam + ":::" + thaiParam);
}

Output is showing like:

Output :Beautiful:::à?ªà??à?¢

I think System.out.println cannot print UTF-8 characters with the default console settings. Is there any other way to resolve this issue? Any help is appreciated.

One cannot easily change the Windows console encoding, so write to a .txt file instead. For Windows to detect the UTF-8 encoding, you can write an invisible BOM character, "\ufeff", at the beginning:

// Requires: import java.nio.file.*; import java.util.Collections;
String text = "\uFEFF" + "Output :" + engParam + ":::" + thaiParam;
Path path = Paths.get("temp.txt");
Files.write(path, Collections.singletonList(text)); // Files.write encodes as UTF-8 by default

The problem is not in Java. When encoded as UTF-8, the Thai string "สวย" gives the bytes 0xe0, 0xb8, 0xaa, 0xe0, 0xb8, 0xa7, 0xe0, 0xb8, 0xa2.

In Latin-1, 0xe0 is à, 0xaa is ª, and 0xa2 is ¢. The other bytes (0xb8 and 0xa7) evidently could not be represented by the console, which is where the ? characters come from.

That means that println has done its part of the job, but the component that should have displayed the characters (terminal window or IDE) either cannot process UTF-8 or was not instructed to.
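The byte analysis above can be reproduced with a few lines of Java. This is only a sketch: the class and method names (Utf8BytesDemo, hexOfUtf8, misreadAsLatin1) are made up for illustration, and the Thai string is written with Unicode escapes so that the result does not depend on the encoding of the source file.

```java
import java.nio.charset.StandardCharsets;

public class Utf8BytesDemo {

    // Hex dump of the UTF-8 encoding of s, bytes separated by spaces.
    static String hexOfUtf8(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Simulates a display that interprets UTF-8 bytes as Latin-1 (ISO-8859-1):
    // every byte becomes one character.
    static String misreadAsLatin1(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The Thai string from the question, as escapes (U+0E2A U+0E27 U+0E22).
        String thai = "\u0e2a\u0e27\u0e22";
        System.out.println(hexOfUtf8(thai)); // prints: e0 b8 aa e0 b8 a7 e0 b8 a2
        // Three Thai characters turn into nine Latin-1 characters.
        System.out.println(misreadAsLatin1(thai).length()); // prints: 9
    }
}
```

Both lines of output are plain ASCII, so they should look the same on any console regardless of its code page.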


Unfortunately, the Windows console is not really Unicode-friendly. Recent versions (Windows 7 and later) support a so-called UTF-8 code page (chcp 65001), which correctly processes UTF-8 byte strings provided the console's underlying character set can display the characters. For example, after typing chcp 65001 my French system successfully displays all accented characters (éèùïêçàâ...) when they are UTF-8 encoded, but cannot display your example Thai string.

If you need a truly UTF-8 capable console on Windows, you can try the excellent ConEmu.
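If the console is switched to code page 65001, the program also has to emit UTF-8 bytes. A minimal sketch of that, based on the PrintStream suggestion that appears in the comments, could look like this. The class name Utf8ConsoleDemo and the helper utf8Stream are hypothetical, and whether the Thai text actually renders still depends on the console font.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

public class Utf8ConsoleDemo {

    // Wraps any output stream in an auto-flushing PrintStream that encodes text as UTF-8.
    static PrintStream utf8Stream(OutputStream out) {
        try {
            return new PrintStream(out, true, StandardCharsets.UTF_8.name());
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every Java platform must support, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String engParam = "Beautiful";
        // The Thai string from the question, written as escapes
        // so the source file encoding does not matter.
        String thaiParam = "\u0e2a\u0e27\u0e22";

        PrintStream ps = utf8Stream(System.out);
        ps.println("Output :" + engParam + ":::" + thaiParam);
    }
}
```

Run it from a console where chcp 65001 has been issued first. With a different code page, the UTF-8 bytes will be misinterpreted just as in the question.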

This answer to a similar question might apply to your case if you are using Eclipse (it should be almost the same in IntelliJ).

This answer assumes that:

  1. You are using Windows.
  2. The "Java console" you mention is an instance of Command Prompt (you may not be aware of this if you are using an IDE; cmd and IntelliJ IDEA certainly do this, though I don't know whether Eclipse or other IDEs do).
  3. My guess was right :-)

Open Registry Editor (regedit), navigate to "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Command Processor", and create a REG_EXPAND_SZ value named AutoRun with the data chcp 65001. Then try again (no reboot required).

Actually, this is an example of creating and using an "init script" for cmd.exe. It may be a way to change the de facto default console encoding to UTF-8 (code page 65001) without changing much of the system configuration.

To restore the previous behavior, simply delete this AutoRun value.
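After changing the code page, it can help to check which encodings the JVM itself picked up at startup. The following is only a diagnostic sketch with a made-up class name (EncodingReport). Charset.defaultCharset() and the file.encoding property are standard, while sun.stdout.encoding and stdout.encoding are assumptions here: they are set only by some JDK versions, and System.getProperty simply returns null when a property is absent.

```java
import java.nio.charset.Charset;

public class EncodingReport {

    // Builds a small ASCII-only report of the encodings the running JVM reports.
    static String describe() {
        StringBuilder sb = new StringBuilder();
        sb.append("default charset = ").append(Charset.defaultCharset()).append('\n');
        // The last two keys are assumptions: not every JDK version defines them.
        String[] keys = {"file.encoding", "sun.stdout.encoding", "stdout.encoding"};
        for (String key : keys) {
            sb.append(key).append(" = ").append(System.getProperty(key)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

Running it once before and once after chcp 65001, each time in a fresh console window, shows whether the JVM's view of the console changed.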

Comments
  • Most likely there is a problem with your console. Which console are you using: the IDE's built-in one, the Windows command prompt, or something else? Try playing with its settings.
  • Windows command prompt
  • Windows command prompt / PowerShell. Let me clarify the whole scenario. Yes, I can print that in the Eclipse IDE with some IDE-specific configuration changes, but I can't use the IDE on a cloud server or in a deployment environment (though creating a WAR file and deploying it to a Tomcat server is a good option). That's why I'm trying a standalone program run from Windows PowerShell or the Windows command prompt.
  • Appreciate your effort.
  • Looks like it's not. From Windows PowerShell I'm still getting "UTF-8: Beautiful:::???" (with chcp 65001).
  • OK. [1] What font are you using in PowerShell? [2] What about if you run from the Command window (cmd.exe) when using a font that can render Thai characters?
  • I just ran from PowerShell and it worked for me using font Courier Mono Thai. Perhaps open a new PowerShell window after changing the font? Also, note that if you use chcp 65001 in the PowerShell or Command window, then you must use PrintStream ps = new PrintStream(System.out, true, StandardCharsets.UTF_8.name()); in your code. And if you use chcp 874 then you must use System.out.println("Default: " + engParam + ":::" + thaiParam); in your code, as shown in my answer.
  • With chcp 65001, it works for me with the font you have mentioned. Thanks a lot :). Appreciate your work.
  • Tried. Within the temp.txt, I found "Output :John_help:::สวย"
  • Opened with Notepad?
  • The Java console seems okay. In general, the Java compiler javac could be using a different encoding than the editor, but as Serge Ballesta investigated, UTF-8 seems to be used (fine). Also try a programmer's editor like the Java console or Notepad++.
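
The point in the last comment, that javac may read the source file with a different encoding than the editor saved it in, can be checked without relying on the console at all. The sketch below (hypothetical class name LiteralCheck) prints the code points of a string using ASCII output only. As written it uses Unicode escapes; to test your own toolchain, replace the escapes with the Thai literal typed directly, as in the question.

```java
public class LiteralCheck {

    // Renders each code point of s as U+XXXX, separated by spaces (ASCII-only output).
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Replace the escapes with the directly typed Thai literal to test your setup.
        String thaiParam = "\u0e2a\u0e27\u0e22";
        System.out.println(thaiParam.length() + " chars: " + codePoints(thaiParam));
    }
}
```

If the literal was compiled correctly, the program reports 3 chars (U+0E2A U+0E27 U+0E22). A report of 9 chars would suggest that javac decoded the UTF-8 source with a single-byte encoding, in which case compiling with javac -encoding UTF-8 is worth trying.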