Number of regex matches

I'm using the finditer function in the re module to match some things and everything is working.

Now I need to find out how many matches I've got. Is that possible without looping through the iterator twice (once to find the count, and then again for the real iteration)?

Some code:

imageMatches = re.finditer(r'<img src="(?P<path>[-/\w.]+)"', response[2])
# <Here I need to get the number of matches>
for imageMatch in imageMatches:
    doStuff

Everything works, I just need to get the number of matches before the loop.

If you know you will want all the matches, you could use the re.findall function. It will return a list of all the matches. Then you can just do len(result) for the number of matches.
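A minimal sketch of the findall approach, using a made-up sample string (`html` is hypothetical and stands in for `response[2]`). One thing to keep in mind: when the pattern contains a single group, findall returns that group's text rather than the whole match.

```python
import re

# Hypothetical sample input standing in for response[2]
html = '<img src="/a/one.png"> text <img src="/b/two.png">'

# With one capturing group, findall returns a list of that group's strings
paths = re.findall(r'<img src="(?P<path>[-/\w.]+)"', html)

print(len(paths))  # number of matches, known before any loop
print(paths)
```

The list holds plain strings, so match-object features such as `group('path')` are not available with this approach.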

If you always need to know the length, and you just need the content of the match rather than the other info, you might as well use re.findall. Otherwise, if you only need the length sometimes, you can use e.g.

matches = re.finditer(...)
...
matches = tuple(matches)

to store the iteration of the matches in a reusable tuple. Then just do len(matches).

Another option, if you just need to know the total count after doing whatever with the match objects, is to use

matches = enumerate(re.finditer(...))

which will return an (index, match) pair for each of the original matches. You can then store the first element of each pair in a variable as you loop; because the indices start at 0, the total count is the last index plus 1.
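A sketch of the enumerate variant, assuming you only need the total once the loop has finished. The sample text and pattern are illustrative only. Passing `start=1` makes the last index equal to the count, and initialising `count` beforehand covers the case where nothing matches.

```python
import re

text = 'i like python jython and dython'  # illustrative input

count = 0  # stays 0 if the loop body never runs
for count, match in enumerate(re.finditer(r'.ython', text), start=1):
    print(count, match.group())

print(count)  # total number of matches
```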

But if you need the length first of all, and you need match objects as opposed to just the strings, you should just do

matches = tuple(re.finditer(...))
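Applied to the pattern from the question, that could look like the sketch below. The `html` string is a hypothetical stand-in for `response[2]`.

```python
import re

# Hypothetical sample input standing in for response[2]
html = '<img src="/a/one.png"> text <img src="/b/two.png">'

# Materialise the iterator once so it can be measured and then looped over
image_matches = tuple(re.finditer(r'<img src="(?P<path>[-/\w.]+)"', html))

print(len(image_matches))  # the count is known before the loop
for image_match in image_matches:
    print(image_match.group('path'))  # named groups are still available
```

The trade-off is memory: every match object is held at once, which is usually fine unless the input is very large.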

If you find you need to stick with finditer(), you can simply use a counter while you iterate through the iterator.

Example:

>>> import re
>>> pattern = re.compile(r'.ython')
>>> string = 'i like python jython and dython (whatever that is)'
>>> count = 0
>>> for match in pattern.finditer(string):
...     count += 1
...
>>> count
3

Use this method if you need the match objects that finditer() yields (for example, to read named groups). Note that findall() and finditer() both return non-overlapping matches.

# An example of counting the groups in a single match (not the number of matches)
import re

pattern = re.compile(r'(\w+).(\d+).(\w+).(\w+)', re.IGNORECASE)
search_str = "My 11 Char String"

res = pattern.match(search_str)
if res is not None:  # match() returns None when nothing matches
    print(len(res.groups()))  # 4
    print(res.group(1))  # My
    print(res.group(2))  # 11
    print(res.group(3))  # Char
    print(res.group(4))  # String

For those moments when you really want to avoid building lists:

import re
import operator
from functools import reduce

# The initial value 0 keeps reduce() from raising TypeError when there are no matches.
count = reduce(operator.add, (1 for _ in re.finditer(my_pattern, my_string)), 0)

Sometimes you might need to operate on huge strings. This might help.
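The same idea can be written with the built-in sum() over a generator expression, which also avoids building a list and returns 0 when nothing matches. This is a minimal sketch. The helper name `count_matches` and the sample strings are made up for illustration.

```python
import re

def count_matches(pattern, text):
    """Count non-overlapping matches without building a list of them."""
    return sum(1 for _ in re.finditer(pattern, text))

print(count_matches(r'.ython', 'i like python jython and dython'))
print(count_matches(r'.ython', 'no snakes here'))
```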

Comments
  • @Rafe Kettler: findall finds non-overlapping. From the documentation: Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found.
  • @Rafe Kettler & JoshD: Thanks for the clarification, the parts will never overlap so that won't be a problem for me in this case. The only annoyance with re.findall is that I lose my named groups, but it works so it's good enough.
  • Okay, I posted my answer anyway. Happy trails.
  • Yeah, I thought of doing that, but because of things in my "doStuff" code it won't work without adding a lot of extra code in various places. Thanks for the tip anyway :)
  • I would use for count, match in enumerate(iterator): in the case of Rafe's code.
  • @Tony: thanks, forgot about enumerate. If you do use enumerate, though, it will give you the highest index, not the actual number of matches; for that, you'd have to add 1.
  • for count, match in enumerate(iterator) breaks when there are no matches, because count is never assigned. Setting count = -1 before the loop may be an acceptable solution.
  • I think len(res.groups()) throws an exception when the regex matches nothing, because re.match returns None.