UnicodeDecodeError: 'utf8' codec can't decode byte 0xa5 in position 0: invalid start byte

I am using Python 2.6 CGI scripts and found this error in the server log while calling json.dumps():

Traceback (most recent call last):
  File "/etc/mongodb/server/cgi-bin/getstats.py", line 135, in <module>
    print json.dumps(__getdata())
  File "/usr/lib/python2.7/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/python2.7/json/encoder.py", line 201, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python2.7/json/encoder.py", line 264, in iterencode
    return _iterencode(o, 0)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xa5 in position 0: invalid start byte

Here, the __getdata() function returns a dictionary {}.

Before posting this question I referred to this question on SO.


UPDATES

The following lines are what break the JSON encoder:

now = datetime.datetime.now()
now = datetime.datetime.strftime(now, '%Y-%m-%dT%H:%M:%S.%fZ')
print json.dumps({'current_time': now})  # this is the culprit

I got a temporary fix for it

print json.dumps( {'old_time': now.encode('ISO-8859-1').strip() })

But I am not sure whether this is the correct way to do it.

The error occurs because the dictionary contains a string with a non-ASCII character that can't be encoded/decoded. One simple way to avoid this error is to encode such strings with the encode() function, as follows (where a is the string containing the non-ASCII character):

a.encode('utf-8').strip()
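As a rough sketch of the same idea in Python 3 (the question itself uses Python 2), byte strings inside the dictionary can be decoded to text before calling json.dumps(). The helper name to_text and the cp1252 fallback are assumptions for illustration only; the real encoding of the offending bytes depends on where the data comes from.

```python
import json


def to_text(value, encodings=("utf-8", "cp1252")):
    """Recursively turn bytes inside dicts/lists into str (hypothetical helper)."""
    if isinstance(value, bytes):
        for enc in encodings:
            try:
                return value.decode(enc)
            except UnicodeDecodeError:
                continue
        # Last resort: mark undecodable bytes with U+FFFD instead of failing.
        return value.decode("utf-8", errors="replace")
    if isinstance(value, dict):
        return {to_text(k, encodings): to_text(v, encodings) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_text(v, encodings) for v in value]
    return value


# Sample data (made up): 0xa5 is not a valid UTF-8 start byte.
data = {"current_time": b"2013\xa5", "values": [b"ok", 1]}
clean = to_text(data)
encoded = json.dumps(clean)
```

Decoding once, close to where the data enters the program, tends to be easier to reason about than patching individual values just before serialization.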

How to fix "UnicodeDecodeError: 'ascii' codec can't decode byte": the quick fix is: don't decode/encode willy-nilly; don't assume your strings are UTF-8 encoded; and try to convert strings to Unicode strings as soon as possible. For example:

>>> b = unicode(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 9: ordinal not in range(128)

This didn't work because the default encoding in Python 2 is ASCII, so Python could not decode a under the assumption that it was ASCII-encoded.

Try the below code snippet:

with open(path, 'rb') as f:
  text = f.read()
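Reading in binary mode only postpones the decoding step: the result is bytes, which still have to be decoded with the right codec. A minimal Python 3 sketch, assuming the file is either UTF-8 or cp1252 (the function name read_text and the candidate list are hypothetical):

```python
import os
import tempfile


def read_text(path, encodings=("utf-8", "cp1252")):
    """Read a file as bytes, then decode with the first encoding that fits."""
    with open(path, "rb") as f:
        raw = f.read()
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings matched")


# Demo with a throwaway file containing a byte sequence that is invalid as UTF-8.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"caf\xe9")
text, used = read_text(demo_path)
os.remove(demo_path)
```

The order of the candidates matters: UTF-8 is tried first because it rejects most byte sequences that were written in a single-byte encoding, whereas cp1252 accepts almost anything.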

Python 3 does its best to give you text as valid Unicode strings. When it hits an invalid byte sequence (according to the charset in use), it has two choices: drop the value or raise a UnicodeDecodeError.
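When decoding explicitly in Python 3, that choice is exposed through the errors argument of bytes.decode(). A small sketch with a made-up byte string:

```python
raw = b"price: \xa5100"  # 0xa5 is not a valid UTF-8 start byte

try:
    raw.decode("utf-8")  # errors="strict" is the default
    outcome = "decoded"
except UnicodeDecodeError:
    outcome = "raised"

dropped = raw.decode("utf-8", errors="ignore")    # silently drops the bad byte
replaced = raw.decode("utf-8", errors="replace")  # substitutes U+FFFD
```

Both "ignore" and "replace" lose information, so they are probably best kept for cases where the true encoding cannot be found.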

I fixed this simply by specifying a different codec in the read_csv() call:

encoding = 'unicode_escape'

Your string has a non ascii character encoded in it.

Not being able to decode with utf-8 may happen if you've needed to use other encodings in your code. For example:

>>> 'my weird character \x96'.decode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x96 in position 19: invalid start byte

In this case, the encoding is windows-1252 so you have to do:

>>> 'my weird character \x96'.decode('windows-1252')
u'my weird character \u2013'

Now that you have unicode, you can safely encode into utf-8.
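For readers on Python 3, where str has no decode() method, the same round trip starts from a bytes object. A minimal sketch of the example above:

```python
raw = b"my weird character \x96"

text = raw.decode("windows-1252")  # bytes -> str, using the real source encoding
utf8_bytes = text.encode("utf-8")  # str -> UTF-8 encoded bytes
```

Here text contains an en dash (U+2013), and utf8_bytes holds its three-byte UTF-8 form.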

When reading the CSV, I added an encoding argument:

import pandas as pd
dataset = pd.read_csv('sample_data.csv', header=0, encoding='unicode_escape')
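Note that 'unicode_escape' avoids the error by treating the bytes as Latin-1 and also interpreting backslash escapes, so naming the file's actual encoding may be the safer choice when it is known. The sketch below uses only the standard library so it runs without pandas; the sample bytes and the cp1252 guess are assumptions for illustration. The same encoding name can be passed as the encoding argument of read_csv().

```python
import csv
import io

# Bytes as a cp1252-encoded CSV file would hold them (made-up sample data).
raw = b"name,city\nJos\xe9,M\xe1laga\n"

# Wrap the bytes as a text stream, naming the encoding explicitly.
stream = io.TextIOWrapper(io.BytesIO(raw), encoding="cp1252", newline="")
rows = list(csv.reader(stream))
```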

"UnicodeDecodeError: 'ascii' codec can't decode byte" generally happens when you try to convert a Python 2.x str that contains non-ASCII bytes to a Unicode string without specifying the encoding of the original string. In brief, Unicode strings are an entirely separate type of Python string that does not carry any encoding.

Comments
  • It looks like you have some string data in the dictionary that can't be encoded/decoded. What's in the dict?
  • @mgilson yup master I understood the issue but donno how to deal with it..dict has list, dict, python timestamp value
  • @Pilot -- Not really. The real problem is buried somewhere in __getdata. I don't know why you're getting a non-decodable character. You can try to come up with patches on the dict to make it work, but those are mostly just asking for more problems later. I would try printing the dict to see where the non-ascii character is. Then figure out how that field got calculated/set and work backward from there.
  • Possible duplicate of UnicodeDecodeError: 'utf8' codec can't decode byte 0x9c.
  • I had that same error when trying to read a .csv file which had some non-ascii characters in it. Removing those characters (as suggested below) solved the issue.
  • Since UTF-8 is back-compatible with the oldschool 7-bit ASCII you should just encode everything. For characters in the 7-bit ASCII range this encoding will be an identity mapping.
  • This doesn't seem real clear. When importing a csv file how do you use this code?
  • I had r instead of rb. thanks for the reminder to add b!
  • I got "AttributeError: 'str' object has no attribute 'decode'". Not sure what went wrong?
  • Did you include the b in "rb"? The b opens the file in binary mode, so you get bytes. If you just use r, you get a string, which has no decode method, so leave decode out.