Windows cmd encoding change causes Python crash
First I change Windows CMD encoding to utf-8 and run Python interpreter:
chcp 65001
python
Then I try to print a unicode string inside it, and when I do this Python crashes in a peculiar way (I just get a cmd prompt in the same window).
>>> import sys
>>> print u'ëèæîð'.encode(sys.stdin.encoding)
Any ideas why it happens and how to make it work?
UPD2: It just occurred to me that the issue might be connected with the fact that utf-8 is a multi-byte encoding (kcwu made a good point on that). I tried running the whole example with 'windows-1250' and got 'ëeaî?'. Windows-1250 is a single-byte character set, so it worked for those characters it understands. However, I still have no idea how to make 'utf-8' work here.
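The UPD2 observation can be checked from Python itself. What follows is only a minimal sketch, not something from the original question: the helper name `representable` is made up for illustration, and the code is written for Python 3. Python's codec does no best-fit substitution, so it cannot reproduce the 'e' and 'a' the console printed; it only shows which of the characters windows-1250 (cp1250) actually contains, and how many bytes the same text needs in utf-8.

```python
# -*- coding: utf-8 -*-

def representable(text, encoding):
    """Return the characters of `text` that `encoding` can encode exactly."""
    kept = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            continue  # this character has no code point in the target charset
        kept.append(ch)
    return kept


sample = u"ëèæîð"

# How many of the five characters exist in windows-1250.
print(len(representable(sample, "cp1250")))

# Number of bytes the same five characters take in utf-8.
print(len(sample.encode("utf-8")))
```

Only the characters that survive the check are ones the single-byte code page can show unchanged; everything else has to be substituted or replaced by the console.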
UPD3: Oh, I found out it is a known Python bug. I guess what happens is that Python copies the cmd encoding as 'cp65001' to sys.stdin.encoding and tries to apply it to all the input. Since it fails to understand 'cp65001', it crashes on any input that contains non-ascii characters.
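One way to test the guess in UPD3 is to ask the codec registry whether it knows the name the console reported. This is a hedged sketch rather than a fix: `encoding_is_known` is a hypothetical helper name, and the code is written for Python 3 (the same `codecs.lookup` call exists in Python 2).

```python
import codecs
import sys


def encoding_is_known(name):
    """Return True if Python's codec registry can resolve `name`."""
    if not name:
        # sys.stdin.encoding can be None, e.g. when input is redirected.
        return False
    try:
        codecs.lookup(name)
    except LookupError:
        # This is the "unknown encoding" error behind the crash.
        return False
    return True


# Report whether the encoding attached to stdin is one Python can use.
stdin_encoding = getattr(sys.stdin, "encoding", None)
print(encoding_is_known(stdin_encoding))
```

If this prints False right after `chcp 65001`, the interpreter in use does not recognise the console's code page name.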
Set the PYTHONIOENCODING environment variable:
> chcp 65001
> set PYTHONIOENCODING=utf-8
> python example.py
Encoding is utf-8
example.py is simple:
import sys
print "Encoding is", sys.stdin.encoding
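The effect of the variable can also be observed without a console session, by starting a child interpreter with PYTHONIOENCODING set. This is a sketch under assumptions: `child_stdout_encoding` is a hypothetical helper, it is written for Python 3, and it inspects `sys.stdout.encoding` of the child rather than `sys.stdin.encoding`.

```python
import os
import subprocess
import sys


def child_stdout_encoding(io_encoding):
    """Run a child Python with PYTHONIOENCODING set and return the
    encoding that the child reports for its standard output."""
    env = dict(os.environ)  # keep the rest of the environment intact
    env["PYTHONIOENCODING"] = io_encoding
    output = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
    )
    return output.decode("ascii").strip()


print(child_stdout_encoding("utf-8"))
```

The child should report the encoding it was given, regardless of the code page of the console that started it.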
Do you want Python to encode to UTF-8?
>>> print u'ëèæîð'.encode('utf-8')
Ã«Ã¨Ã¦Ã®Ã°
Python will not recognize cp65001 as UTF-8.
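The garbled output above is what UTF-8 bytes look like when a console interprets them with a single-byte Western code page. The sketch below reproduces it, assuming the console was using cp1252 (the thread does not state which code page was active); `as_seen_in_cp1252` is a made-up helper name and the code is written for Python 3.

```python
# -*- coding: utf-8 -*-

def as_seen_in_cp1252(text):
    """Encode `text` as UTF-8, then decode the bytes as cp1252,
    which is what a console stuck on that code page would display."""
    return text.encode("utf-8").decode("cp1252")


garbled = as_seen_in_cp1252(u"ëèæîð")

# Each original character became two displayed characters.
print(len(garbled))
```

Every character in the sample takes two bytes in UTF-8, and the console shows each byte as its own character, which is why the output doubles in length.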
I had this annoying issue, too, and I hated not being able to run my unicode-aware scripts the same way in MS Windows as in Linux. So, I managed to come up with a workaround.
Take this script (say, uniconsole.py in your site-packages or whatever):
import sys, os

if sys.platform == "win32":

    class UniStream(object):
        __slots__ = ("fileno", "softspace",)

        def __init__(self, fileobject):
            self.fileno = fileobject.fileno()
            self.softspace = False

        def write(self, text):
            os.write(self.fileno, text.encode("utf_8") if isinstance(text, unicode) else text)

    sys.stdout = UniStream(sys.stdout)
    sys.stderr = UniStream(sys.stderr)
This seems to work around the python bug (or win32 unicode console bug, whatever). Then I added in all related scripts:
try:
    import uniconsole
except ImportError:
    sys.exc_clear()  # could be just pass, of course
else:
    del uniconsole  # reduce pollution, not needed anymore
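The UniStream script above is Python 2 only (it relies on the `unicode` type). For readers on Python 3, the same idea of bypassing the text layer and writing UTF-8 bytes straight to the file descriptor might look like the sketch below. `Utf8FdWriter` and `demo` are hypothetical names, and the demo writes into a pipe rather than a real console so that the bytes can be inspected.

```python
# -*- coding: utf-8 -*-
import os


class Utf8FdWriter(object):
    """Minimal file-like object that writes UTF-8 bytes to a descriptor."""

    def __init__(self, fd):
        self.fd = fd

    def write(self, text):
        # Encode text to UTF-8; pass bytes through unchanged.
        data = text.encode("utf-8") if isinstance(text, str) else bytes(text)
        os.write(self.fd, data)
        return len(text)

    def flush(self):
        pass  # os.write is unbuffered, nothing to flush


def demo():
    """Write a sample through the wrapper and return the raw bytes."""
    read_fd, write_fd = os.pipe()
    try:
        Utf8FdWriter(write_fd).write(u"ëèæîð\n")
    finally:
        os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        return reader.read()


print(len(demo()))
```

To use it the way the answer does, one would assign `Utf8FdWriter(sys.stdout.fileno())` to `sys.stdout`. Whether the console then renders the bytes correctly still depends on the code page and font.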
Finally, I just run my scripts as needed in a console where chcp 65001 is run and the font is Lucida Console. (How I wish that DejaVu Sans Mono could be used instead… but hacking the registry and selecting it as a console font reverts to a bitmap font.)
This is a quick-and-dirty stdout and stderr replacement, and it does not handle any raw_input related bugs (obviously, since it doesn't touch sys.stdin at all). And, by the way, I've added the cp65001 alias for utf_8 in the encodings\aliases.py file of the standard lib.
For me setting this env var before execution of python program worked:
- Can you print sys.stdin.encoding? What does it return?
- It's easy for python to know how to deal with the 'cp65001' codec: one has to add a line to Lib/encodings/aliases.py , mapping 'cp65001' to 'utf_8'. I created a patch for that, and also updated the bug you mention, Alex. There are still issues, though.
- related: Python, Unicode, and the Windows console
- +1 because your answer is worthy, plus a virtual +1 for the font issue suggestion, even though it's too late (I and Windows have had a break-up with lots of fights; I don't think we'll ever be together again but for brief encounters at friends' computers :) Thanks.
- @David-Sarah: Thanks for the very useful code! Do you happen to know if there's a corresponding way to fix console input (so that e.g. copy-pasted unicode characters Just Work, irrespective of codepage etc.) This would presumably involve ReadConsoleW?
- I try to solve the problems around bugs.python.org/issue1602 for Python 3 in my project github.com/Drekin/win-unicode-console. The package is on PyPI: pypi.python.org/pypi/win_unicode_console. It actually builds on code from the issue, which originates in your code here.
- @KevinThibedeau: use colorama package.
- @KevinThibedeau: I don't know why cp65001 should matter. On the other hand, there is no need to use cp65001.
- I tried this in Python 2.7.5, and while the encoding was reported as utf-8, it didn't generate the proper output. It showed each byte of output as individual characters instead of combining them into codepoints.
- You can run python -c "import sys; print('Encoding='+sys.stdin.encoding)" instead of making a file.