regex not working correctly when the test is fine

For my database, I have a list of company numbers where some of them start with two letters. I have created a regex which should eliminate these from a query and according to my tests, it should. But when executed, the result still contains the numbers with letters.

Here is my regex, which I've tested on https://www.regexpal.com

([^A-Z+|a-z+].*)

I've tested it against numerous variations such as SC08093, ZC000191 and NI232312 which shouldn't match and don't in the tests, which is fine.

My SQL query looks like this:

SELECT companyNumber FROM company_data 
WHERE companyNumber ~ '([^A-Z+|a-z+].*)' order by companyNumber desc

To summarise, strings like SC08093 should not match as they start with letters.

I've read through the documentation for postgres but I couldn't seem to find anything regarding this. I'm not sure what I'm missing here. Thanks.

The ~ '([^A-Z+|a-z+].*)' condition does not work because ~ is a regex matching operator that returns true even on a partial match (a regex match does not require the full string to match, so the pattern can match anywhere in the string). [^A-Z+|a-z+].* matches any single character other than a letter from A to Z, +, | or a letter from a to z, and then any zero or more characters, anywhere inside the string.
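The partial-match behaviour can be reproduced outside the database. The following is a minimal sketch in Python, assuming that re.search behaves like the PostgreSQL ~ operator for these simple patterns (both look for a match anywhere in the string). The sample value 08093123 and the variable names are made up for illustration.

```python
import re

# The pattern from the question, without the unnecessary capturing group.
QUESTION_PATTERN = re.compile(r"[^A-Z+|a-z+].*")
# Anchored alternative: the first character must not be an ASCII letter.
ANCHORED_PATTERN = re.compile(r"^[^A-Za-z]")

samples = ["SC08093", "ZC000191", "NI232312", "08093123"]

for value in samples:
    unanchored = QUESTION_PATTERN.search(value)
    anchored = ANCHORED_PATTERN.search(value)
    # Show which part of the string the unanchored pattern latched onto.
    matched_part = unanchored.group(0) if unanchored else None
    print(value, matched_part, bool(anchored))
```

For SC08093 the unanchored pattern skips the two letters and matches 08093, which is why the row is still returned. The anchored pattern rejects it because the very first character is a letter.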

You may use

WHERE companyNumber NOT SIMILAR TO '[A-Za-z]{2}%'

See the online demo

Here, NOT SIMILAR TO returns the inverse of the SIMILAR TO result. The SIMILAR TO operator accepts patterns that are a mix of regex syntax and regular LIKE wildcards. NOT SIMILAR TO '[A-Za-z]{2}%' means that all records that start with two ASCII letters ([A-Za-z]{2}) followed by anything (%) are NOT returned, and all others are returned. Note that SIMILAR TO requires a full string match, same as LIKE.
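As a rough illustration of the full-match requirement, here is a small Python sketch. It assumes that SIMILAR TO '[A-Za-z]{2}%' can be approximated by translating % to .* and requiring the whole string to match with re.fullmatch. The helper name not_similar_to_two_letters and the sample values 08093123 and A1234567 are hypothetical.

```python
import re

# Assumed translation of SIMILAR TO '[A-Za-z]{2}%': % becomes .* and the
# whole string has to match (re.fullmatch), as with LIKE.
TWO_LETTER_PREFIX = re.compile(r"[A-Za-z]{2}.*", re.DOTALL)


def not_similar_to_two_letters(value: str) -> bool:
    """Return True when the value does NOT start with two ASCII letters."""
    return TWO_LETTER_PREFIX.fullmatch(value) is None


company_numbers = ["SC08093", "ZC000191", "NI232312", "08093123", "A1234567"]
kept = [n for n in company_numbers if not_similar_to_two_letters(n)]
print(kept)  # ['08093123', 'A1234567']
```

A value with only one leading letter, such as A1234567, is kept, because the pattern asks for exactly two letters at the start.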

Your pattern: [^A-Z+|a-z+].* means "a string where at least some characters are not A-Z" - to extend that to the whole string you would need to use an anchored regex as shown by S-Man (the group defined with (..) isn't really necessary btw)

I would probably use a regex that specifies what the valid pattern is and then use !~ instead.

where company !~ '^[0-9].*$'

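^[0-9].*$ matches any string that starts with a digit (the .* allows any characters after it, so it does not require digits only), and !~ means "does not match". As written, the condition therefore returns the rows that do not start with a digit; to keep the rows that do start with a digit, use ~ with the same pattern.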

or

where not (company ~ '^[0-9].*$')
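To check what ^[0-9].*$ actually accepts, here is a quick Python sketch, assuming Python's re module treats these simple anchored patterns the same way PostgreSQL does. The stricter ^[0-9]+$ pattern and the sample value 1ABC are added here only for comparison and are not part of the answer above.

```python
import re

# Pattern from the answer: a digit first, then anything.
STARTS_WITH_DIGIT = re.compile(r"^[0-9].*$")
# Comparison pattern (an assumption, not from the answer): digits only.
ONLY_DIGITS = re.compile(r"^[0-9]+$")

results = {}
for value in ["08093123", "1ABC", "SC08093"]:
    results[value] = (
        bool(STARTS_WITH_DIGIT.search(value)),
        bool(ONLY_DIGITS.search(value)),
    )
    print(value, results[value])
```

1ABC matches the first pattern but not the second, so .* after the leading digit lets letters through. If the column must contain digits only, the stricter pattern is the one to negate or match.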

Excluding values that start with a letter can be done with:

WHERE company ~ '^[^A-Za-z].*'

demo: db<>fiddle

The first ^ marks the beginning. The [^A-Za-z] says "no letter" (including small and capital letters).


Edit: Changed [A-z] into the more precise [A-Za-z] (Why is this regex allowing a caret?)
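The difference between [A-z] and [A-Za-z] comes from the ASCII table: six punctuation characters sit between Z and a. A short Python sketch (the variable names are illustrative) lists them and confirms that the wider range accepts a caret:

```python
import re

# Characters whose code points lie strictly between 'Z' (90) and 'a' (97).
between = [chr(code) for code in range(ord("Z") + 1, ord("a"))]
print(between)  # ['[', '\\', ']', '^', '_', '`']

# [A-z] accepts the caret, [A-Za-z] does not.
caret_in_wide_range = re.fullmatch(r"[A-z]", "^") is not None
caret_in_letter_range = re.fullmatch(r"[A-Za-z]", "^") is not None
print(caret_in_wide_range, caret_in_letter_range)  # True False
```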

Comments
  • Try WHERE companyNumber NOT SIMILAR TO '[A-Za-z]{2}%'
  • Thank you, that worked. If you can, could you perhaps explain why that did work and my regex didn't perhaps?
  • Your regex probably doesn't do what you intend - [^A-Z+|a-z+] will match a single character that is neither a lowercase letter nor an uppercase letter nor a literal + nor a literal '|' .
  • @KieranDee I hope I added enough details. I think SIMILAR TO is the best fit for your task. If your conditions become more specific, you may consider moving to the ~ operator. Then, do not forget about ^ to mark the start of string and in case you need that, $ as the end of string anchor.
  • Thanks for your explanations. So even if any part of the string matches the regex, postgres will return it basically. Meaning I have to match the entire string with the regex rather than part of it, at least for my regex.
  • [^A-z] does not actually mean no letter. It means no letter, [, \, ], ^, _ and ` chars. See the [A-z] char range.
  • @WiktorStribiżew yes of course you're right. In that case the other solution is wrong as well because it always requires a leading digit. That's what happens when the OP does not exactly define the use case.