golang - display the character, not its escape sequence: '&', not '\u0026'

This is my test code. It starts a simple HTTP server and generates JSON whose value is "&". The result is not what I want; it is shown below the code block.

package main

import (
    "encoding/json"
    "fmt"
    "log"
    "net/http"
)

func testFunc(w http.ResponseWriter, r *http.Request) {
    data := make(map[string]string)
    data["key"] = "&"
    bytes, err := json.Marshal(data)
    if err != nil {
        fmt.Fprintln(w, "JSON generation error")
    } else {
        // print to the console
        fmt.Println(string(bytes))
        fmt.Println("&")
        // print to the browser
        fmt.Fprintln(w, string(bytes))
        fmt.Fprintln(w, "&")
    }
}

func main() {
    http.HandleFunc("/", testFunc)
    err := http.ListenAndServe(":9090", nil)
    if err != nil {
        log.Fatal("ListenAndServe: ", err)
    }

}

Result: the Chrome browser shows:

{"key":"\u0026"}

&

The console shows the same:

{"key":"\u0026"}

&

When '&' is not inside JSON, both the browser and the console print '&'.

Go 1.7 added a new option to fix this:

encoding/json: add Encoder.DisableHTMLEscaping This provides a way to disable the escaping of <, >, and & in JSON strings.

The relevant method is

func (enc *Encoder) SetEscapeHTML(on bool)

It has to be called on an Encoder:

enc := json.NewEncoder(os.Stdout)
enc.SetEscapeHTML(false)

A modified version of stupidbodo's example: https://play.golang.org/p/HnWGJAjqPA
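As a minimal sketch of how the two lines above could replace a json.Marshal call, the following program encodes into a bytes.Buffer and returns the bytes. The helper name marshalNoEscape is hypothetical (it is not part of encoding/json); the sketch assumes Go 1.7 or later.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// marshalNoEscape is a hypothetical helper, not part of encoding/json.
// It behaves like json.Marshal but leaves <, > and & unescaped.
func marshalNoEscape(v interface{}) ([]byte, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(false)
	if err := enc.Encode(v); err != nil {
		return nil, err
	}
	// Encode appends a trailing newline; trim it to match json.Marshal.
	return bytes.TrimRight(buf.Bytes(), "\n"), nil
}

func main() {
	b, err := marshalNoEscape(map[string]string{"key": "&"})
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	fmt.Println(string(b)) // {"key":"&"}
}
```

Note that the output is no longer protected against being misread as HTML, so it is probably best served with a JSON content type rather than embedded in an HTML page.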

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
)

type Search struct {
    Query string `json:"query"`
}

func main() {
    data := &Search{Query: "http://google.com/?q=stackoverflow&ie=UTF-8"}
    responseJSON, _ := JSONMarshal(data, true)
    fmt.Println(string(responseJSON))

}

func JSONMarshal(v interface{}, safeEncoding bool) ([]byte, error) {
    b, err := json.Marshal(v)

    if safeEncoding {
        b = bytes.Replace(b, []byte("\\u003c"), []byte("<"), -1)
        b = bytes.Replace(b, []byte("\\u003e"), []byte(">"), -1)
        b = bytes.Replace(b, []byte("\\u0026"), []byte("&"), -1)
    }
    return b, err
}

Results:

JSONMarshal(data, true)
{"query":"http://google.com/?q=stackoverflow&ie=UTF-8"}

JSONMarshal(data, false)
{"query":"http://google.com/?q=stackoverflow\u0026ie=UTF-8"}

Credits: https://github.com/clbanning/mxj/blob/master/json.go#L20

Playground: http://play.golang.org/p/c7M32gICl8

From the docs (emphasis by me):

String values encode as JSON strings. InvalidUTF8Error will be returned if an invalid UTF-8 sequence is encountered. The angle brackets "<" and ">" are escaped to "\u003c" and "\u003e" to keep some browsers from misinterpreting JSON output as HTML. Ampersand "&" is also escaped to "\u0026" for the same reason.

Apparently if you want to send '&' as is, you'll need to either create a custom Marshaler, or use RawMessage type like this: http://play.golang.org/p/HKP0eLogQX.
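It may also help to see that the escaping described in the docs quoted above is lossless: any JSON parser decodes "\u0026" back to "&". The sketch below round-trips a string through json.Marshal and json.Unmarshal; the helper name roundTrip is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// roundTrip is a hypothetical helper: it marshals s to JSON and decodes
// the result again, returning both the encoded and the decoded form.
func roundTrip(s string) (encoded, decoded string, err error) {
	b, err := json.Marshal(s)
	if err != nil {
		return "", "", err
	}
	var out string
	if err := json.Unmarshal(b, &out); err != nil {
		return "", "", err
	}
	return string(b), out, nil
}

func main() {
	enc, dec, err := roundTrip("<&>")
	if err != nil {
		fmt.Println("round trip error:", err)
		return
	}
	fmt.Println(enc) // "\u003c\u0026\u003e"
	fmt.Println(dec) // <&>
}
```

So the escaped output only looks different when the raw JSON text is viewed directly; a client that parses it receives the original characters.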

Another way to solve the problem is to convert the escape sequences in the json.RawMessage back into the UTF-8 characters they stand for, after the json.Marshal() call.

You can use strconv.Quote() and strconv.Unquote() to do so.

package main

import (
    "encoding/json"
    "fmt"
    "strconv"
    "strings"
)

func _UnescapeUnicodeCharactersInJSON(_jsonRaw json.RawMessage) (json.RawMessage, error) {
    str, err := strconv.Unquote(strings.Replace(strconv.Quote(string(_jsonRaw)), `\\u`, `\u`, -1))
    if err != nil {
        return nil, err
    }
    return []byte(str), nil
}

func main() {
    // Both are valid JSON.
    var jsonRawEscaped json.RawMessage   // json raw with escaped unicode chars
    var jsonRawUnescaped json.RawMessage // json raw with unescaped unicode chars

    // '\u263a' == '☺'
    jsonRawEscaped = []byte(`{"HelloWorld": "\uC548\uB155, \uC138\uC0C1(\u4E16\u4E0A). \u263a"}`) // "\\u263a"
    jsonRawUnescaped, _ = _UnescapeUnicodeCharactersInJSON(jsonRawEscaped)                        // "☺"

    fmt.Println(string(jsonRawEscaped))   // {"HelloWorld": "\uC548\uB155, \uC138\uC0C1(\u4E16\u4E0A). \u263a"}
    fmt.Println(string(jsonRawUnescaped)) // {"HelloWorld": "안녕, 세상(世上). ☺"}
}

https://play.golang.org/p/pUsrzrrcDG-

Hope this helps.

Comments
  • OK, I edited it just now. This is the complete code.
  • What is wrong with the output?
  • What's wrong with that? var o = {"key":"\u0026"}; o.key === '&' returns true.
  • I read some documentation and searched Google. As you say, the output is not wrong, but how can I see '&' when it is a value in JSON? The browser shows \u0026 every time.
  • The string is valid JSON. See the RFC at ietf.org/rfc/rfc4627.txt, Section 2.5, Strings.
  • What would be the case if you wanted to support more UTF-8 characters? Do you have to replace each character manually? I don't think this should be done manually; there has to be another way.
  • SetEscapeHTML from Go 1.7 should probably be used instead of manually replacing characters.
  • Any other solution?