Cyrillic symbols in URL

The app crashes with the following URL:

let jsonUrl = "http://api.com/алматы/events"
let session = NSURLSession.sharedSession()
let shotsUrl = NSURL(string: jsonUrl)
let task = session.dataTaskWithURL(shotsUrl!)

Log:

fatal error: unexpectedly found nil while unwrapping an Optional value

It's because of the Cyrillic symbols in the URL. How can I solve this issue? Thanks for your help!

Try this:

let encodedUrl = jsonUrl.stringByAddingPercentEncodingWithAllowedCharacters(NSCharacterSet.URLQueryAllowedCharacterSet())

Swift 4, using a String extension: create a Swift file named String+Extension.swift and paste in this code:

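import Foundation

extension String {
    // Percent-encodes every character not allowed in a URL query (non-ASCII included).
    // Falls back to the original string if encoding fails.
    var encodeUrl: String {
        return self.addingPercentEncoding(withAllowedCharacters: .urlQueryAllowed) ?? self
    }

    // Reverses percent-encoding; falls back to the original string
    // if it contains malformed escape sequences.
    var decodeUrl: String {
        return self.removingPercentEncoding ?? self
    }
}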

Use it like so (sample based on the question):

"http://api.com/алматы/events".encodeUrl

Something like this:

let apiHost = "http://api.com/"
let apiPath = "алматы/events"
let escapedPath = apiPath.stringByAddingPercentEncodingWithAllowedCharacters(NSCharacterSet.URLPathAllowedCharacterSet())
let url = NSURL(string: "\(apiHost)\(escapedPath!)")

Obviously you should do something smarter than just force unwrap escapedPath.
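
One way to avoid the force unwrap is to return an optional and let the caller decide what to do on failure. The following is a minimal sketch in current Swift syntax; the helper name makeURL and its parameters are hypothetical and not part of the original answer.

```swift
import Foundation

// Hypothetical helper: joins a host prefix and an unescaped path into a URL.
// Returns nil instead of crashing when escaping or parsing fails.
func makeURL(host: String, path: String) -> URL? {
    guard let escapedPath = path.addingPercentEncoding(withAllowedCharacters: .urlPathAllowed) else {
        return nil
    }
    return URL(string: host + escapedPath)
}

if let url = makeURL(host: "http://api.com/", path: "алматы/events") {
    print(url.absoluteString)
} else {
    print("Could not build a URL")
}
```

The caller handles the nil case explicitly, so a malformed path leads to an error branch rather than a crash.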

Using the Wikipedia page for Swift as an example:

https://ru.wikipedia.org/wiki/Swift_(язык_программирования) becomes https://ru.wikipedia.org/wiki/Swift_(%D1%8F%D0%B7%D1%8B%D0%BA_%D0%BF%D1%80%D0%BE%D0%B3%D1%80%D0%B0%D0%BC%D0%BC%D0%B8%D1%80%D0%BE%D0%B2%D0%B0%D0%BD%D0%B8%D1%8F) which when pasted into the browser takes you to the right page (and most browsers will conveniently render the UTF-8 characters for you).
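
The escaping is reversible, which is why browsers can show the readable form again. A small sketch in current Swift syntax, assuming the last path segment of the Wikipedia URL as sample input (the constant names are illustrative only):

```swift
import Foundation

// Sample path segment taken from the Wikipedia URL above.
let readable = "Swift_(язык_программирования)"

// Encode: each Cyrillic letter becomes its percent-escaped UTF-8 bytes.
let encoded = readable.addingPercentEncoding(withAllowedCharacters: .urlPathAllowed) ?? readable

// Decode: the escapes are interpreted as UTF-8 again.
let decoded = encoded.removingPercentEncoding ?? encoded

print(encoded)
print(decoded == readable)  // prints true
```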

Non-ASCII characters (and many special characters) need to be escaped in a URL. Chrome and other browsers do it automatically, and they unescape the URLs in the address bar for a nicer display.

So if you have a static URL, just paste it into the address bar, press Enter, select the URL again, and copy and paste it into your app.

So instead of:

let jsonUrl = "http://api.com/алматы/events"

You'll get:

let jsonUrl = "http://api.com/%D0%B0%D0%BB%D0%BC%D0%B0%D1%82%D1%8B/events"
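
To see where those %XX sequences come from: each Cyrillic letter is two bytes in UTF-8, and every byte is written as a percent sign followed by two hexadecimal digits. The sketch below does this by hand; the function name percentEncodeAllBytes is hypothetical, and unlike a real encoder it escapes ASCII bytes too.

```swift
// Hypothetical helper for illustration: percent-encodes every UTF-8 byte,
// including ASCII bytes that a real URL encoder would leave alone.
func percentEncodeAllBytes(_ text: String) -> String {
    return text.utf8.map { byte -> String in
        let hex = String(byte, radix: 16, uppercase: true)
        // Pad single-digit values so every byte is exactly two hex digits.
        return "%" + (hex.count == 1 ? "0" + hex : hex)
    }.joined()
}

print(percentEncodeAllBytes("алматы"))
// %D0%B0%D0%BB%D0%BC%D0%B0%D1%82%D1%8B
```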

Try stringByAddingPercentEncodingWithAllowedCharacters: defined on NSString. You may see people suggesting stringByAddingPercentEscapesUsingEncoding:, but that method is deprecated in iOS 9.

There are also a few predefined NSCharacterSets in Foundation, such as URLHostAllowedCharacterSet and URLPathAllowedCharacterSet. Therefore, if you really have to parse the unescaped URL in code (using preprocessed URLs, mentioned in the accepted answer, is usually a much better idea), you can write a helper method like this:

import Foundation

// Builds a URL from unescaped parts, escaping the host and the path separately.
func url(scheme scheme: String, host: String, path: String) -> NSURL? {
    let components = NSURLComponents()
    components.scheme = scheme
    // Use the percentEncoded* setters here: the plain host and path setters expect
    // unescaped input and would escape the "%" signs again (%D0 would become %25D0).
    components.percentEncodedHost = host.stringByAddingPercentEncodingWithAllowedCharacters(NSCharacterSet.URLHostAllowedCharacterSet())
    components.percentEncodedPath = path.stringByAddingPercentEncodingWithAllowedCharacters(NSCharacterSet.URLPathAllowedCharacterSet())
    return components.URL
}

// evaluates to http://api.com/%D0%B0%D0%BB%D0%BC%D0%B0%D1%82%D1%8B/events
url(scheme: "http", host: "api.com", path: "/алматы/events")

Note that the documentation for this method mentions that

This method is intended to percent-encode an URL component or subcomponent string, NOT an entire URL string.

That's because, according to RFC 3986, not all parts of a URL can be percent-encoded (e.g. the scheme: http, https, etc.).
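
In current Swift, URLComponents applies this per-component rule for you when it is given unescaped values. The following sketch reuses the question's URL; it is an illustration of that behavior, not code from the original answers.

```swift
import Foundation

var components = URLComponents()
components.scheme = "http"               // the scheme is never percent-encoded
components.host = "api.com"
components.path = "/алматы/events"       // unescaped; URLComponents escapes it

// The escaped form of the path component alone.
print(components.percentEncodedPath)
// /%D0%B0%D0%BB%D0%BC%D0%B0%D1%82%D1%8B/events

// The assembled URL, or "invalid" if the components do not form one.
print(components.url?.absoluteString ?? "invalid")
```

Because each component is escaped with its own rules, the slashes in the path and the colon after the scheme stay intact.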

Comments
  • the method is deprecated
  • Thanks, it's a very useful answer.
  • Most welcome. Please comment if you have any more requests, and send me the link. I have created a bunch of Xcode extensions to make development easy and fast.
  • @aaisataev I provided a basic answer that should be enough for you to figure it out from there.
  • Look at this for example: ru.wikipedia.org/wiki/Арктическая_экспедиция_Грили No Punycode on Wikipedia. Percent-escaped characters will be interpreted as UTF-8, as per the NSURL documentation.
  • "Punycode is intended for the encoding of labels in the Internationalized Domain Names in Applications (IDNA) framework, such that these domain names may be represented in the ASCII character set allowed in the Domain Name System of the Internet. " So what you are saying is only true for the domain part of the URL, not the rest.