How to convert the Python 2 unicode() function into correct Python 3.x syntax

I enabled the compatibility check in my Python IDE and now I realize that the inherited Python 2.7 code has a lot of calls to unicode() which are not allowed in Python 3.x.

I looked at the docs of Python2 and found no hint how to upgrade:

I don't want to switch to Python3 now, but maybe in the future.

The code contains about 500 calls to unicode()

How to proceed?


The comment of user vaultah to read the pyporting guide has received several upvotes.

My current solution is this (thanks to Peter Brittain):

from builtins import str

I could not find this hint in the pyporting docs.

As has already been pointed out in the comments, there is already advice on porting from 2 to 3.

Having recently had to port some of my own code from 2 to 3 and maintain compatibility for each for now, I wholeheartedly recommend using python-future, which provides a great tool to help update your code (futurize) as well as clear guidance for how to write cross-compatible code.

In your specific case, I would simply convert all calls to unicode to use str and then import str from builtins. Any IDE worth its salt these days will do that global search and replace in one operation.
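
A minimal sketch of what that replacement could look like, assuming the python-future package is installed when running under Python 2 (on Python 3, builtins is part of the standard library, so the import changes nothing). The variable names and sample values are illustrative only:

```python
# On Python 2 this import needs the third-party `future` package;
# on Python 3 `builtins` ships with the standard library.
from builtins import str

# Was: label = unicode(42)
label = str(42)

# Was: name = unicode(raw_bytes, 'utf-8')
raw_bytes = b'caf\xc3\xa9'      # hypothetical UTF-8 input, 5 bytes
name = str(raw_bytes, 'utf-8')  # decoded text, 4 characters

print(label)
print(len(raw_bytes), len(name))
```

Calls that pass an encoding keep working unchanged, because str() accepts the same (bytes, encoding) arguments that unicode() did.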

Of course, that's the sort of thing futurize should catch too, if you just want to use automatic conversion (and to look for other potential issues in your code).

You can test whether there is such a function as unicode() in the version of Python that you're running. If not, you can create a unicode() alias for the str() function, which does in Python 3 what unicode() did in Python 2, as all strings are unicode in Python 3.

# Python 3 compatibility hack
try:
    unicode  # exists on Python 2; raises NameError on Python 3
except NameError:
    unicode = str

Note that a more complete port is probably a better idea; see the porting guide for details.

Short answer: Replace all unicode calls with str calls.

Long answer: In Python 3, the separate unicode type is gone: str is always Unicode text, so the unicode() built-in no longer exists. The following solution should work if you are only using Python 3:

unicode = str
# the rest of your code goes here

If the code has to run on both Python 2 and Python 3, use this instead:

import sys
if sys.version_info.major >= 3:
    unicode = str
# the rest of your code goes here

Another option is to run 2to3 from the command line (the -w flag writes the changes back to the source files):

$ 2to3 package -w

First, as a strategy, I would take a small part of your program and try to port it. The number of unicode calls you are describing suggests to me that your application cares about string representations more than most, and each use case is often different.

The important consideration is that all strings are unicode in Python 3. If you are using the str type to store "bytes" (for example, if they are read from a file), then you should be aware that in Python 3 str values are not bytes; they are Unicode text to begin with, and raw bytes have their own bytes type.

Let's look at a few cases.

First, if you do not have any non-ASCII characters at all and really are not using the Unicode character set, it is easy. Chances are you can simply change the unicode() function to str(). That will ensure that any object passed as an argument is properly converted. However, it is wishful thinking to assume it's that easy.

Most likely, you'll need to look at the argument to unicode() to see what it is, and determine how to treat it.
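
One way to make that decision explicit is a small helper that inspects its argument. This is only a sketch written for Python 3; the name to_text and the UTF-8 default are assumptions for illustration, not something from the porting guide:

```python
def to_text(value, encoding='utf-8'):
    """Return value as text, decoding byte strings explicitly."""
    if isinstance(value, bytes):
        # Bytes must be decoded; str(value) would give "b'...'" instead.
        return value.decode(encoding)
    return str(value)


decoded = to_text(b'caf\xc3\xa9')  # bytes in, text out
number = to_text(3.5)              # other objects go through str()
pitfall = str(b'abc')              # what a blind unicode -> str swap does

print(ascii(decoded))
print(number)
print(pitfall)
```

The last line shows why a plain search and replace is not always enough: on Python 3, str() applied to bytes without an encoding returns the repr-style text b'abc' instead of decoding it.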

For example, if you are reading UTF-8 characters from a file in Python 2 and converting them to Unicode, your code would look like this:

data = open('somefile', 'r').read()
# Without an explicit encoding, unicode() assumes ASCII and fails on non-ASCII bytes.
udata = unicode(data, 'UTF-8')

However, in Python 3, read() on a file opened in text mode returns Unicode data to begin with, and the encoding should be specified when opening the file (otherwise a platform-dependent default is used):

udata = open('somefile', 'r', encoding='UTF-8').read()
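
If the same code has to run on both versions during the transition, io.open is one option: it accepts the encoding argument on Python 2.6+ and is the same function as the built-in open on Python 3. A self-contained sketch, using a temporary file and made-up sample text:

```python
import io
import os
import tempfile

# Create a small UTF-8 file to read back (hypothetical sample data).
path = os.path.join(tempfile.mkdtemp(), 'somefile')
with io.open(path, 'w', encoding='UTF-8') as f:
    f.write(u'na\u00efve')

# Text mode plus an explicit encoding yields decoded text directly.
with io.open(path, 'r', encoding='UTF-8') as f:
    udata = f.read()

# Binary mode yields the raw, undecoded bytes.
with io.open(path, 'rb') as f:
    raw = f.read()

print(len(udata), len(raw))
```

The text has five characters but six bytes on disk, because the accented letter takes two bytes in UTF-8.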

As you can see, how to translate a unicode() call when porting depends heavily on how and why the application is doing Unicode conversions, where the data has come from, and where it is going to.

Python3 brings greater clarity to string representations, which is welcome, but can make porting daunting. For example, Python3 has a proper bytes type, and you convert byte-data to unicode like this:

udata = bytedata.decode('UTF-8')

or convert Unicode data to byte form using the opposite transform:

bytedata = udata.encode('UTF-8')
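
A short round trip illustrates the two directions, along with the fact that Python 3 refuses to mix the two types implicitly. The sample string is made up for the example:

```python
udata = u'\u20ac 10'              # EURO SIGN, a space, then "10"
bytedata = udata.encode('UTF-8')  # text -> bytes
roundtrip = bytedata.decode('UTF-8')  # bytes -> text

print(len(udata), len(bytedata))  # characters vs. bytes
print(roundtrip == udata)

# On Python 3, concatenating text and bytes is an error, not a silent coercion.
try:
    udata + bytedata
    mixed_ok = True
except TypeError:
    mixed_ok = False
print(mixed_ok)
```

On Python 2 the same concatenation attempts an implicit ASCII decode instead, which is a common source of the UnicodeDecodeError surprises that Python 3 was designed to avoid.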

I hope this at least helps determine a strategy.
