Force MariaDB clients to use utf8mb4


I'm running into an issue where I get differently ordered results when querying with PHP versus the command line. From my research, it appears that in some cases bad encoding can cause problems with the order of the results.

That said, all my DB tables are encoded as utf8mb4, with the collation utf8mb4_general_ci. However, it doesn't seem that the MySQL variables are set correctly.

I'm on MySQL 5.5.5-10.1.26-MariaDB.

Here are my CNF settings, but to be honest I don't know what I'm doing here:

[client]
default-character-set=utf8mb4

[mysql]
default-character-set=utf8mb4

[mariadb]


[mysqld]

character-set-server=utf8mb4
character_set_client=utf8mb4
collation-server=utf8mb4_general_ci

The variables output from mysql:

character_set_client        utf8
character_set_connection    utf8
character_set_database      utf8mb4
character_set_filesystem    binary
character_set_results       utf8
character_set_server        utf8mb4
character_set_system        utf8
collation_connection        utf8_general_ci
collation_database          utf8mb4_unicode_ci
collation_server            utf8mb4_general_ci

Update: Someone asked how I'm connecting to the database:

$this->connection = new PDO('mysql:host='.DB_SERVER.';dbname='.DB_NAME.';port='.DB_PORT, DB_USER, DB_PASS, $options);

Update: I've switched to utf8mb4_unicode_ci (as per suggestions in answers below).


You should probably use utf8mb4_unicode_ci instead of utf8mb4_general_ci, as it's more accurate, unless you're running MariaDB on a system with an old/limited CPU and performance is a huge concern.

That being said, the solution is to set init_connect in your MariaDB configuration (or --init-connect on the command line):

init_connect = "SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci"

Either way is equally valid; I am not recommending one over the other.

Your MariaDB configuration may be in my.cnf or in a file included by my.cnf, typically found under /etc/mysql. Check your system documentation for details. Because init_connect is a server variable, as the MariaDB documentation indicates, you should set it in the server part of the configuration file. An INI section is denoted by a keyword surrounded by square brackets, e.g. "[section]", and the server sections are the ones whose names end in "d". The "d" stands for "daemon", the standard UNIX term for a server process. You can set the variable in either the [mysqld] section or the [mariadb] section. Because init_connect is common to both MySQL and MariaDB, I would recommend putting it under [mysqld].

I see that you are setting character_set_client=utf8mb4 in your pasted configuration. You don't need to do this. You can delete or comment out the line. Comments are lines starting with a pound symbol (#), also known as a hash mark, octothorp, or number sign.

Every client that connects to the server executes these statements before any others, with one exception: the server skips init_connect for users that have the SUPER privilege, such as root.



You want to have character-set-client-handshake = FALSE as well.

With the following in /etc/my.cnf.d/character-set.cnf:

# https://scottlinux.com/2017/03/04/mysql-mariadb-set-character-set-and-collation-to-utf8/
# https://mariadb.com/kb/en/library/setting-character-sets-and-collations/
# https://medium.com/@adamhooper/in-mysql-never-use-utf8-use-utf8mb4-11761243e434
# https://stackoverflow.com/questions/47566730/force-mariadb-clients-to-use-utf8mb4

[client]
default-character-set = utf8mb4

[mysql]
default-character-set = utf8mb4

[mysqld]
character-set-client-handshake = FALSE
collation-server = utf8mb4_unicode_ci
init-connect = 'SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci'
character-set-server = utf8mb4

I get everything to be utf8mb4 (1):

MariaDB [(none)]> show variables like 'char%'; show variables like 'collation%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8mb4                    |
| character_set_connection | utf8mb4                    |
| character_set_database   | utf8mb4                    |
| character_set_filesystem | binary                     |
| character_set_results    | utf8mb4                    |
| character_set_server     | utf8mb4                    |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

+----------------------+--------------------+
| Variable_name        | Value              |
+----------------------+--------------------+
| collation_connection | utf8mb4_unicode_ci |
| collation_database   | utf8mb4_unicode_ci |
| collation_server     | utf8mb4_unicode_ci |
+----------------------+--------------------+
3 rows in set (0.00 sec)

MariaDB [(none)]>

However, without the character-set-client-handshake line, some are still utf8:

MariaDB [(none)]> show variables like 'char%'; show variables like 'collation%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8mb4                    |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8mb4                    |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

+----------------------+--------------------+
| Variable_name        | Value              |
+----------------------+--------------------+
| collation_connection | utf8_general_ci    |
| collation_database   | utf8mb4_unicode_ci |
| collation_server     | utf8mb4_unicode_ci |
+----------------------+--------------------+
3 rows in set (0.01 sec)

MariaDB [(none)]>

(1) character_set_system is always utf8.
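
Comparing two listings like these by eye is error-prone. The following is a minimal sketch, not part of the original answer, that flags the variables which are still not utf8mb4. The function name non_utf8mb4 and the IGNORED set are hypothetical; the sample values are copied from the second listing above (the one without the character-set-client-handshake line).

```python
# Variables that are not expected to report utf8mb4 even on a correct setup
# (character_set_system is always utf8, as noted above).
IGNORED = {"character_set_system", "character_set_filesystem", "character_sets_dir"}


def non_utf8mb4(variables):
    """Return the charset/collation variables whose value is not utf8mb4-based."""
    return {
        name: value
        for name, value in variables.items()
        if name not in IGNORED and not value.startswith("utf8mb4")
    }


# Values copied from the listing taken without character-set-client-handshake.
without_handshake = {
    "character_set_client": "utf8",
    "character_set_connection": "utf8",
    "character_set_database": "utf8mb4",
    "character_set_filesystem": "binary",
    "character_set_results": "utf8",
    "character_set_server": "utf8mb4",
    "character_set_system": "utf8",
    "character_sets_dir": "/usr/share/mysql/charsets/",
    "collation_connection": "utf8_general_ci",
    "collation_database": "utf8mb4_unicode_ci",
    "collation_server": "utf8mb4_unicode_ci",
}

for name, value in sorted(non_utf8mb4(without_handshake).items()):
    print(f"{name} is still {value}")
```

In practice the dictionary would be filled from the rows returned by SHOW VARIABLES on the connection you actually care about (for example the PHP one), since these are per-connection values.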



init_connect is not performed by anyone connecting as root, so it is not as universal as you would like.

SET NAMES utf8mb4 sets 3 things; experiment to see that. You need all 3.
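
To make the "3 things" concrete, here is a small illustrative sketch that spells out the per-variable statements SET NAMES stands for, following the equivalence given in the MySQL manual. The helper name expand_set_names is hypothetical, and the code only builds the statements as strings; it does not talk to a server.

```python
def expand_set_names(charset, collation=None):
    """List the per-variable SET statements that SET NAMES is shorthand for."""
    statements = [
        f"SET character_set_client = {charset}",
        f"SET character_set_results = {charset}",
        f"SET character_set_connection = {charset}",
    ]
    if collation is not None:
        # A COLLATE clause additionally pins the connection collation;
        # without it the charset's default collation is used.
        statements.append(f"SET collation_connection = {collation}")
    return statements


for statement in expand_set_names("utf8mb4", "utf8mb4_unicode_ci"):
    print(statement + ";")
```

These are the same three character_set_* variables that show up as utf8 in the question's output, which is why setting only one of them in the configuration is not enough.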

If you weren't as far back as 5.5, I would recommend utf8mb4_unicode_520_ci as being a better collation: "Unicode collation names now may include a version number to indicate the Unicode Collation Algorithm (UCA) version on which the collation is based. Initial collations thus created use version UCA 5.2.0. For example, utf8_unicode_520_ci is based on UCA 5.2.0. UCA-based Unicode collation names that do not include a version number are based on version 4.0.0."

MySQL 8.0 has collations based on the Unicode 9.0 standard.

Back to the question: There is no perfect solution; the user can override whatever you do -- either through ignorance or through malice.

You could police the tables created, but that won't keep users from connecting incorrectly. Or correctly, but with a different charset. It is valid to do SET NAMES latin1 and then provide latin1-encoded bytes. MySQL will convert as it stores/fetches.

But if they have utf8-encoded bytes, but say SET NAMES latin1, you get "double encoding". This "bug" destroys any chance of collating correctly, but is otherwise (usually) transparent. That is, stuff is messed up as it is stored, then un-messed up as it is fetched.
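
The round trip can be simulated outside the database. The sketch below is only an illustration of the byte-level effect, not of MySQL's internals. It assumes Python's latin-1 codec as a stand-in for MySQL's latin1 (which is closer to cp1252); for the character used here the two agree.

```python
original = "é"

# The client really sends UTF-8 bytes...
utf8_bytes = original.encode("utf-8")

# ...but the connection claims latin1, so the server reads two separate
# characters and stores them re-encoded as UTF-8: four bytes instead of two.
misread = utf8_bytes.decode("latin-1")
stored = misread.encode("utf-8")

# Fetching through the same mislabeled connection reverses the damage,
# which is why the problem usually goes unnoticed.
fetched = stored.decode("utf-8").encode("latin-1").decode("utf-8")

print(repr(misread))        # what the server believes the text is
print(len(utf8_bytes), len(stored))
print(fetched == original)
```

The server sorts and compares the misread text, not the original, which is how double encoding breaks collation even though the fetched value looks right.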



To fix this warning, edit /etc/my.cnf (my.ini on Windows) and add or set the following:

[client]
default-character-set=utf8mb4

[mysql]
default-character-set=utf8mb4

[mysqld]
collation-server=utf8mb4_unicode_ci
init-connect='SET NAMES utf8mb4'
character-set-server=utf8mb4






