How to use UTF-8 encoded data in an ERB template

I have a data file stored in UTF-8, and I want to embed its data in an ERB template. The data file explicitly declares UTF-8 at the top. But when I run the ERB engine, I get an Encoding::CompatibilityError.

I thought that since Ruby's default encoding is ASCII, the ERB template must also be encoded as ASCII. I explicitly changed it to UTF-8, but that did not help.

Here is the data-file:

# coding: utf-8

samples: [
    { name: '北京', city: '北京' }
]

Here is the ERB template:

<% # -*- coding: UTF-8 -*- %>
#...
<p><%= samples.first[:name] %></p>

In the scenario where you have an ERB template rendering strings from another file that is in UTF-8, adding the following to the top of the ERB template solved it for me:

<%# coding: UTF-8 %>

(instead of <% # -*- coding: UTF-8 -*- %>)
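
A minimal sketch of this fix, assuming a reasonably recent Ruby in which ERB recognises a `<%# coding: ... %>` comment at the very start of the template. The `render_city` helper and the inline `TEMPLATE` string are hypothetical stand-ins for the real template file and data file. The likely reason the original form had no effect is that `<% # ... %>` (with a space) is an ordinary code tag containing a Ruby comment, not an ERB comment tag.

```ruby
require 'erb'

# Hypothetical template; in practice this would be read from a .erb file.
# The magic comment uses the ERB comment tag `<%#` (no space before `#`).
TEMPLATE = <<~ERB_SOURCE
  <%# coding: UTF-8 %>
  <p><%= name %></p>
ERB_SOURCE

# Render the template with a UTF-8 string, as it would come from the data file.
def render_city(name)
  ERB.new(TEMPLATE).result(binding)
end

html = render_city('北京')
puts html
puts html.encoding
```

If the rendered string reports an encoding other than UTF-8, the template source itself is probably being read in a different encoding before it reaches ERB.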

If you're using Rails, have you configured the default encoding in application.rb? For example:

config.encoding = "utf-8"

My Rails (3.2.1) project does not contain any configuration other than that.

Another thing to check is whether your data file is really in UTF-8. On a Unix-like system, you can use the 'nkf' command to guess the encoding:

nkf --guess FILE_NAME
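
If nkf is not available, the same check can be approximated in Ruby itself. This is only a sketch: `utf8_file?` is a hypothetical helper name, and a `true` result only means the bytes are valid UTF-8 (plain ASCII passes too), while `false` means the file is definitely not UTF-8.

```ruby
# Returns true when the raw bytes of the file form a valid UTF-8 sequence.
def utf8_file?(path)
  # Read the bytes untouched, tag them as UTF-8, then validate.
  File.binread(path).force_encoding(Encoding::UTF_8).valid_encoding?
end
```

Calling `utf8_file?('FILE_NAME')` on the data file then gives a quick yes/no answer before looking at the template.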

Specify <meta http-equiv="content-type" content="text/html;charset=UTF-8" /> in the header of the template.

The problem is the ERB code in the Ruby 1.9 distribution. When it compiles the template, it forces the 'ASCII-8BIT' encoding. If the template contains multibyte characters, the compiled template is returned as an 'ASCII-8BIT' string, and when that string is concatenated with a 'UTF-8' string that also contains multibyte characters, the exception is raised because strings in these encodings are only
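
The clash described here can be reproduced without ERB at all. The sketch below is an illustration rather than ERB's actual code path: `try_concat` is a hypothetical helper, and `String#b` is used to produce a copy of the same bytes tagged as ASCII-8BIT.

```ruby
# Returns the concatenated string, or the exception object if the
# two encodings cannot be combined.
def try_concat(left, right)
  left + right
rescue Encoding::CompatibilityError => e
  e
end

utf8   = '北京'   # multibyte text tagged as UTF-8
binary = utf8.b   # same bytes, tagged as ASCII-8BIT

# Two strings with non-ASCII bytes and different encodings clash.
puts try_concat(binary, utf8).class

# An ASCII-only binary string is still compatible with UTF-8.
puts try_concat('abc'.b, utf8)

# Re-tagging the binary string as UTF-8 removes the clash.
puts try_concat(binary.force_encoding(Encoding::UTF_8), utf8)
```

This is why a template with only ASCII text can work while the same template fails as soon as multibyte data is interpolated.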

Comments
  • I just want to confirm that this is on Ruby 1.9, correct? Encoding behavior changed between 1.8 & 1.9.
  • I am not using Rails. I was using erb in general :)
  • Going to write another answer:)
  • I think you're confusing the server-side exception that RoR throws when source files use the wrong encoding with telling the client (browser) what the text encoding of the file is.