How do I .decode('string-escape') in Python 3?

I have some escaped strings that need to be unescaped. I'd like to do this in Python.

For example, in Python 2.7 I can do this:

>>> "\\123omething special".decode('string-escape')
'Something special'
>>> 

How do I do it in Python 3? This doesn't work:

>>> b"\\123omething special".decode('string-escape')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: unknown encoding: string-escape
>>> 

My goal is to be able to take a string like this:

s\000u\000p\000p\000o\000r\000t\000@\000p\000s\000i\000l\000o\000c\000.\000c\000o\000m\000

And turn it into:

"support@psiloc.com"

After I do the conversion, I'll probe to see if the string I have is encoded in UTF-8 or UTF-16.

If you want str-to-str decoding of escape sequences, so both input and output are Unicode:

def string_escape(s, encoding='utf-8'):
    return (s.encode('latin1')         # To bytes, required by 'unicode-escape'
             .decode('unicode-escape') # Perform the actual octal-escaping decode
             .encode('latin1')         # 1:1 mapping back to bytes
             .decode(encoding))        # Decode original encoding

Testing:

>>> string_escape('\\123omething special')
'Something special'

>>> string_escape(r's\000u\000p\000p\000o\000r\000t\000@'
                  r'\000p\000s\000i\000l\000o\000c\000.\000c\000o\000m\000',
                  'utf-16-le')
'support@psiloc.com'

You'll have to use unicode_escape instead:

>>> b"\\123omething special".decode('unicode_escape')

If you start with a str object instead (the equivalent of the Python 2.7 unicode type), you'll need to encode it to bytes first, then decode with unicode_escape.

If you need bytes as the end result, you'll have to encode again with a suitable encoding (.encode('latin1'), for example, if you need to preserve literal byte values; the first 256 Unicode code points map one-to-one onto bytes).
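
As a minimal sketch of the str case described above (the helper name unescape_str_to_bytes is made up for illustration, and the input is assumed to contain only characters below U+0100):

```python
def unescape_str_to_bytes(text):
    """Interpret backslash escapes in a str and return the raw bytes.

    Hypothetical helper; assumes every character in `text` fits in Latin-1.
    """
    raw = text.encode('latin1')             # str -> bytes, one byte per character
    decoded = raw.decode('unicode_escape')  # interpret \ooo, \xhh, \n, ...
    return decoded.encode('latin1')         # code points 0-255 -> same byte values


print(unescape_str_to_bytes(r's\000u\000'))  # b's\x00u\x00'
```

The result is a bytes object, so it can be passed on to whatever decoding you want to try next.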

Your example is actually UTF-16 data with escapes. Decode with unicode_escape, encode back to latin1 to preserve the bytes, then decode as utf-16-le (UTF-16 little-endian without a BOM):

>>> value = b's\\000u\\000p\\000p\\000o\\000r\\000t\\000@\\000p\\000s\\000i\\000l\\000o\\000c\\000.\\000c\\000o\\000m\\000'
>>> value.decode('unicode_escape').encode('latin1')  # convert to bytes
b's\x00u\x00p\x00p\x00o\x00r\x00t\x00@\x00p\x00s\x00i\x00l\x00o\x00c\x00.\x00c\x00o\x00m\x00'
>>> _.decode('utf-16-le') # decode from UTF-16-LE
'support@psiloc.com'
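
Since the question mentions probing for the encoding afterwards, here is one rough way that could look once you have real bytes. The function name probe_encodings and the candidate list are assumptions for the sake of the example, not something from the original program:

```python
def probe_encodings(raw, candidates=('utf-8', 'utf-16-le', 'latin1')):
    """Return {encoding: text} for each candidate that decodes `raw` cleanly."""
    results = {}
    for name in candidates:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            pass  # this candidate does not fit the data
    return results


raw = b's\x00u\x00p\x00'
hits = probe_encodings(raw)
print(hits['utf-16-le'])  # sup
```

Note that a clean decode is not proof of the right encoding: NUL bytes are valid UTF-8 and every byte is valid Latin-1, so several candidates succeed for this input and you still need some heuristic to choose between them.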

The old "string-escape" codec maps bytestrings to bytestrings, and there's been a lot of debate about what to do with such codecs, so it isn't currently available through the standard encode/decode interfaces.

BUT, the code is still there in the C-API (as PyBytes_En/DecodeEscape), and this is still exposed to Python via the undocumented codecs.escape_encode and codecs.escape_decode.

>>> import codecs
>>> codecs.escape_decode(b"ab\\xff")
(b'ab\xff', 6)
>>> codecs.escape_encode(b"ab\xff")
(b'ab\\xff', 3)

These functions return the transformed bytes object, plus a number indicating how many bytes were processed... you can just ignore the latter.

>>> value = b's\\000u\\000p\\000p\\000o\\000r\\000t\\000@\\000p\\000s\\000i\\000l\\000o\\000c\\000.\\000c\\000o\\000m\\000'
>>> codecs.escape_decode(value)[0]
b's\x00u\x00p\x00p\x00o\x00r\x00t\x00@\x00p\x00s\x00i\x00l\x00o\x00c\x00.\x00c\x00o\x00m\x00'
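
If you use this in more than one place, a thin wrapper keeps the tuple unpacking out of the way. This is only a sketch: the name string_escape_decode is invented here, and it relies on the same undocumented codecs.escape_decode shown above, so it may not be available on every Python implementation.

```python
import codecs


def string_escape_decode(data):
    """Bytes-to-bytes unescape via the undocumented codecs.escape_decode."""
    unescaped, _consumed = codecs.escape_decode(data)  # second item is the input length
    return unescaped


value = b's\\000u\\000p\\000p\\000o\\000r\\000t\\000'
raw = string_escape_decode(value)
print(raw.decode('utf-16-le'))  # support
```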

You can't use unicode_escape on byte strings (or rather, you can, but it doesn't always return the same thing as string_escape does on Python 2) – beware!
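
One case where the two differ, as a small illustration: unicode_escape also interprets \uXXXX escapes, which Python 2's string_escape left alone because they are not byte-string escapes. The result can then fall outside Latin-1, so the round trip back to bytes fails:

```python
data = b'\\u20ac'  # six ASCII bytes: backslash, u, 2, 0, a, c

text = data.decode('unicode_escape')
print(ascii(text))  # '\u20ac', a single character

try:
    text.encode('latin1')
    round_trip_ok = True
except UnicodeEncodeError:
    # U+20AC has no Latin-1 byte, so the bytes cannot be recovered this way
    round_trip_ok = False

print(round_trip_ok)  # False
```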

This function implements string_escape using a regular expression and custom replacement logic.

import re

def unescape(text):
    regex = re.compile(b'\\\\(\\\\|[0-7]{1,3}|x.[0-9a-f]?|[\'"abfnrt]|.|$)')
    def replace(m):
        b = m.group(1)
        if len(b) == 0:
            raise ValueError("Invalid character escape: '\\'.")
        i = b[0]
        if i == 120:
            v = int(b[1:], 16)
        elif 48 <= i <= 55:
            v = int(b, 8)
        elif i == 34: return b'"'
        elif i == 39: return b"'"
        elif i == 92: return b'\\'
        elif i == 97: return b'\a'
        elif i == 98: return b'\b'
        elif i == 102: return b'\f'
        elif i == 110: return b'\n'
        elif i == 114: return b'\r'
        elif i == 116: return b'\t'
        else:
            s = b.decode('ascii')
            raise UnicodeDecodeError(
                'stringescape', text, m.start(), m.end(), "Invalid escape: %r" % s
            )
        return bytes((v, ))
    return regex.sub(replace, text)

At least in my case this was equivalent:

Py2: my_input.decode('string_escape')
Py3: bytes(my_input.decode('unicode_escape'), 'latin1')

convertutils.py:

def string_escape(my_bytes):
    return bytes(my_bytes.decode('unicode_escape'), 'latin1')

Comments
  • Are you absolutely certain those are escapes and not literal bytes?
  • They are literal bytes! There is a backslash, then a 0, then another 0, then a third 0... I have a program that reads a binary file and outputs information like this. It outputs the binary that is actually in the file. Sometimes the content of the file is UTF-8 coded and it just passes through. But if it isn't valid UTF-8 it gets encoded this way.
  • Same question, but does not specify version. The lowest voted answer there answers for Py3.
  • That turns my binary object into a Unicode object. I want to keep it a binary object. Any way to do that?
  • @vy32: Encode it after decoding? What encoding do you expect this to fit in? ASCII, Latin 1?
  • It could be anything. The program probes a variety of possible codings. It might be ASCII, UTF-8, UTF-16, Latin 1, or a dozen other possibilities.
  • @vy32: Then convert to 'proper' bytes by decoding from unicode_escape, then back to bytes via latin1 (which has the happy coincidence of mapping 1-on-1). You then have bytes to try decodings on.