List of all special characters that need to be escaped in a regex

I am trying to create an application that matches a message template with a message that a user is trying to send. I am using Java regex for matching the message. The template/message may contain special characters.

How would I get the complete list of special characters that need to be escaped in order for my regex to work and match in the maximum possible cases?

Is there a universal solution for escaping all special characters in Java regex?

You can look at the javadoc of the Pattern class: http://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html

You need to escape any char listed there if you want the regular char and not the special meaning.

A possibly simpler solution is to put the template between \Q and \E: everything between them is treated as literal text.
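As a rough illustration of the \Q...\E approach, here is a minimal sketch. The class and method names are made up for this example, and it assumes the template never contains the sequence \E itself.

```java
import java.util.regex.Pattern;

public class QuoteBlockDemo {
    // Wraps the template in \Q...\E so every character inside is taken literally.
    // Assumption: the template does not contain the sequence \E.
    public static Pattern literalPattern(String template) {
        return Pattern.compile("\\Q" + template + "\\E");
    }

    public static void main(String[] args) {
        Pattern p = literalPattern("1+1=2 (approx.)");
        // The quoted template only matches the exact same text.
        System.out.println(p.matcher("1+1=2 (approx.)").matches()); // true
        System.out.println(p.matcher("11=2 (approx.)").matches());  // false
    }
}
```

Without the quoting, the + would act as a quantifier and the parentheses as a group, so the second input could match depending on the rest of the pattern.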

If you want to use any of these characters as a literal in a regex, you need to escape them with a backslash. There are several characters that need to be escaped to be taken literally (at least outside character classes):

Brackets: []
Parentheses: ()
Curly braces: {}
Operators: *, +, ?, |
Anchors: ^, $
Others: ., \

In order to use a literal ^ at the start or a literal $ at the end of a regex, the character must be escaped.

The backslash is a special character in regexps (just like in regular strings), and there are other characters that have special meaning in a regexp as well. I don't know the complete set of characters, but I wouldn't rely on that knowledge anyway, and I wouldn't put it into code. Instead, I would use a library escaping function (such as Regex.Escape in .NET) whenever I wanted some literal text that I wasn't sure about.

To escape a whole string, you can use this (available since Java 1.5):

Pattern.quote("$test");

This will match exactly the string $test.
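A small sketch of how Pattern.quote might be used in practice; the class and the helper name containsLiteral are hypothetical and only serve the example.

```java
import java.util.regex.Pattern;

public class PatternQuoteDemo {
    // Returns true if the input contains the template as literal text.
    public static boolean containsLiteral(String template, String input) {
        Pattern p = Pattern.compile(Pattern.quote(template));
        return p.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(containsLiteral("$test", "price: $test")); // true
        // Unquoted, $ is an end-of-input anchor, so nothing can follow it.
        System.out.println(Pattern.compile("$test").matcher("price: $test").find()); // false
    }
}
```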

Saying that backslash is the "escape" character is a bit misleading. In order to use a literal backslash anywhere in a regex, it must be escaped by another backslash.

According to the String Literals / Metacharacters documentation page, they are:

<([{\^-=$!|]})?*+.>

It would also be nice to have that list referenced somewhere in code, but I don't know where that could be...
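One way to keep that list in code is a single constant plus a small helper. This is only a sketch: the class name, the constant METACHARACTERS, and the method escape are invented here and are not part of the JDK. It relies on the Pattern javadoc rule that a backslash may precede any non-alphabetic character.

```java
import java.util.regex.Pattern;

public class MetacharEscaper {
    // The metacharacters quoted above, kept in one place (hypothetical constant).
    static final String METACHARACTERS = "<([{\\^-=$!|]})?*+.>";

    // Prefixes every listed metacharacter with a backslash.
    public static String escape(String text) {
        StringBuilder sb = new StringBuilder(text.length() * 2);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (METACHARACTERS.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String escaped = escape("a.b*(c)");
        System.out.println(escaped);                             // a\.b\*\(c\)
        System.out.println(Pattern.matches(escaped, "a.b*(c)")); // true
    }
}
```

For most uses Pattern.quote does the same job with less code; a hand-written helper like this is mainly useful when the escaped text has to be embedded in a larger, hand-built pattern.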

Some characters have special meanings within regexes; to use any of them as a literal character, escape it. Characters that have no special meaning in regular expressions simply match themselves.

On @Sorin's suggestion of the Java Pattern docs, it looks like chars to escape are at least:

\.[{(*+?^$|

There are multiple types of regular expressions, and the set of special characters depends on the particular type. Some special characters are part of the regular expression language and are treated as metacharacters, so it is best practice to escape them whenever they are meant literally.

Depending on what type of regex is being used (PCRE, .NET, and so on), the special characters may differ (see, for example, the Boost documentation). From this list of characters, "><=/_-. , only . has to be escaped.

Example: "[^0-9]" matches any character that is not an ASCII digit. Don't try to remember the list: soon we'll deal with each of them separately and you'll know them by heart automatically.

Escaping: let's say we want to find a literal dot. Not "any character", but just a dot. To use a special character as a regular one, prepend it with a backslash: \.
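In Java the backslash also has to be doubled inside a string literal, so the regex \. is written "\\." in source code. A minimal sketch of the difference between an unescaped and an escaped dot (the class and method names are illustrative only):

```java
import java.util.regex.Pattern;

public class DotEscapeDemo {
    // Thin wrapper so the three cases below read the same way.
    public static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("a.c", "abc"));    // true: . matches any character
        System.out.println(matches("a\\.c", "abc"));  // false: \. matches only a dot
        System.out.println(matches("a\\.c", "a.c"));  // true
    }
}
```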

The next step up in complexity is ., which matches any character except a newline. You need to use an "escape" to tell the regular expression you want to match it exactly.

Because we want to do more than simply search for literal pieces of text, we need to reserve certain characters for special use. In the regex flavors discussed in this tutorial, there are 12 characters with special meanings, including the backslash \, the caret ^, the dollar sign $, the period or dot ., the vertical bar or pipe symbol |, the question mark ?, the asterisk or star *, the plus sign +, the opening parenthesis (, the closing parenthesis ), and the opening square bracket [.

Comments
  • If you find \Q and \E hard to remember you can use instead Pattern.quote("...")
  • I wish you'd actually stated them
  • Why, @AleksandrDubinsky ?
  • @Sorin Because it is the spirit (nay, policy?) of Stack Exchange to state the answer in your answer rather than just linking to an off-site resource. Besides, that page doesn't have a clear list either. A list can be found here: docs.oracle.com/javase/tutorial/essential/regex/literals.html, yet it states "In certain situations the special characters listed above will not be treated as metacharacters," without explaining what will happen if one tries to escape them. In short, this question deserves a good answer.
  • "everything between them [\Q and \E] is considered as escaped" — except other \Q's and \E's (which potentially may occur within original regex). So, it's better to use Pattern.quote as suggested here and not to reinvent the wheel.
  • Is there any way to not escape but allow those characters?
  • Escaping a character means to allow the character instead of interpreting it as an operator.
  • An unescaped - within [] may not always work, since it is used to define ranges. It's safer to escape it. For example, the patterns [-] and [-)] match the string -, but [(-)] does not.
  • Even though the accepted answer does answer the question, this answer was more helpful to me when I was just looking for a quick list.
  • Why is this not the most highly rated answer? It solves the problem without going into the complex details of listing all characters that need escaping and it's part of the JDK - no need to write any extra code! Simple!
  • String escaped = tnk.replaceAll("[\\<\\(\\[\\{\\\\\\^\\-\\=\\$\\!\\|\\]\\}\\)\\?\\*\\+\\.\\>]", "\\\\$0");
  • The Pattern javadoc says it is an error to use a backslash prior to any alphabetic character that does not denote an escaped construct, but a backslash may be used prior to a non-alphabetic character regardless of whether that character is part of an unescaped construct. Therefore a much simpler regex will suffice: s.replaceAll("[\\W]", "\\\\$0") where \W designates non-word characters.
  • String escaped = regexString.replaceAll("([\\\\\\.\\[\\{\\(\\*\\+\\?\\^\\$\\|])", "\\\\$1");
  • ) also has to be escaped, and depending on whether you are inside or outside of a character class, there can be more characters to escape, in which case Pattern.quote does quite a good job at escaping a string for use both inside and outside of character class.
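The comments above compare three ways of making a template literal: wrapping it in \Q...\E by hand, Pattern.quote, and backslash-escaping every non-word character. The sketch below tries all three on a template that itself contains \E. The class and method names are hypothetical, and the naive variant catches PatternSyntaxException because the leftover \E may be rejected when the pattern is compiled.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class LiteralEscapeComparison {
    // Naive wrapping: breaks when the template itself contains \E.
    public static boolean naiveMatches(String template, String input) {
        try {
            return Pattern.matches("\\Q" + template + "\\E", input);
        } catch (PatternSyntaxException e) {
            return false; // the stray trailing \E may be rejected
        }
    }

    // Pattern.quote takes care of an embedded \E.
    public static boolean quoteMatches(String template, String input) {
        return Pattern.matches(Pattern.quote(template), input);
    }

    // Backslash-escapes every non-word character, as suggested in the comments.
    public static boolean backslashMatches(String template, String input) {
        String escaped = template.replaceAll("[\\W]", "\\\\$0");
        return Pattern.matches(escaped, input);
    }

    public static void main(String[] args) {
        String template = "a\\Eb+"; // the five characters a \ E b +
        System.out.println(naiveMatches(template, template));     // false
        System.out.println(quoteMatches(template, template));     // true
        System.out.println(backslashMatches(template, template)); // true
    }
}
```

In the naive version the embedded \E ends the quoted section early, so b+ is interpreted as a regex again, which is why the template no longer matches itself.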