Why is this regular expression only capturing the last digit?

From my code it should be pretty easy to see what I'm trying to do:

import glob
import re
from shutil import copyfile

for path in glob.glob("orig_data/*.*"):
    pattern = ".*(\d+\.).*"
    new_name = re.sub(pattern, r'\1txt', path)
    copyfile(path, 'orig_data_renamed/'+new_name)

I just want to keep the numbers that are immediately before the "." in the filename, but it's not having it.

Here's an example input, followed by the output I get:

some_folder/asdf321428.txt
8.txt

The problem is clearly with the '+' but I'm not sure what it wants instead.

Here is an re.sub solution that also uses string split. We split the input path on the separator / and take the last element, which is the filename. We then call re.sub on it to isolate the digits occurring just before the dot.

import re

path = "some_folder/asdf321428.txt"
nums = re.sub(r'^.*?(\d+)\.\w+$', '\\1', path.split("/")[-1])
print(nums)

This prints:

321428

If you want the new filename including the extension (321428.txt), then try this version:

path = "some_folder/asdf321428.txt"
nums = re.sub(r'^.*?(?=\d+\.\w+$)', '', path.split("/")[-1])
print(nums)
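
As for why the original pattern returns 8.txt: the leading .* is greedy, so it consumes as much of the path as it can and leaves only a single digit for \d+. A minimal sketch comparing the greedy and lazy forms on the sample path (the variable names greedy and lazy are only for illustration):

```python
import re

path = "some_folder/asdf321428.txt"

# Greedy: the leading .* takes as much as possible, leaving \d+ one digit.
greedy = re.sub(r".*(\d+\.).*", r"\1txt", path)

# Lazy: .*? stops at the first position where \d+\. can match.
lazy = re.sub(r".*?(\d+\.).*", r"\1txt", path)

print(greedy)  # 8.txt
print(lazy)    # 321428.txt
```

The '+' itself is fine; it is the quantifier in front of the group that decides how many digits are left over for it.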

Maybe,

(\S*?)(\d*)\.txt

might work OK here.

Test
import re

string = '''
some_folder/asdf321428.txt
8.txt
some_folder123/asdf321428.txt
'''

expression = r'(?m)(\S*?)(\d*)\.txt'


print(re.findall(expression, string))
Output
[('some_folder/asdf', '321428'), ('', '8'), ('some_folder123/asdf', '321428')]

If you wish to simplify, modify, or explore the expression, regex101.com explains it in its top right panel. You can also use it to see how the expression matches against some sample inputs.
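
One thing to watch with this expression: \d* also matches zero digits, so a name with no digits before .txt still matches, with an empty second group. A rough sketch of turning the captured groups into new names while skipping that case (the sample string and the new_names list are made up for illustration):

```python
import re

expression = r'(\S*?)(\d*)\.txt'
sample = "some_folder/asdf321428.txt 8.txt notes.txt"

new_names = []
for prefix, digits in re.findall(expression, sample):
    if digits:  # skip names that have no digits right before ".txt"
        new_names.append(digits + ".txt")

print(new_names)  # ['321428.txt', '8.txt']
```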

You say you want to keep the numbers, but re.sub replaces whatever the pattern matches. What you want to do instead is find the pattern and take the first match (indexing with [0] raises an IndexError when nothing matches, so handle that case yourself):

new_name = re.findall(pattern, path)[0]  + "txt" 

Output:

321428.txt

Note that this output requires capturing all the digits before the dot. With the original pattern, the greedy .* still leaves only the last digit, so change the pattern to:

pattern = r"\D(\d+\.)"
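
The error handling left to the reader could look roughly like this. It is a minimal sketch that assumes the \D(\d+\.) pattern and uses re.search instead of re.findall so that a missing match can be checked explicitly; numbered_name is a hypothetical helper, not part of the original code:

```python
import re

pattern = r"\D(\d+\.)"

def numbered_name(path):
    """Return '<digits>.txt' for the digits before a dot, or None if absent."""
    match = re.search(pattern, path)
    if match is None:
        return None  # no digits directly before a dot
    return match.group(1) + "txt"

print(numbered_name("some_folder/asdf321428.txt"))  # 321428.txt
print(numbered_name("no_digits.txt"))               # None
```

Because \D requires a non-digit character in front of the digits, a bare name such as 8.txt is not matched by this pattern and also returns None.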

Comments
  • Try .*?(\d+\.).* instead. .* matches greedily.
  • I still get the same result. In the above, I'd like the new name to come out as '321428.txt'
  • @financial_physician Is your expected output 321428, or is it 321428.txt ?
  • the latter, but it's really easy to get there from your answer below. Out of curiosity, what does "r" do?
  • clean, I like it. Thank you!
  • For some reason, when I run yours I still get errors. Tim's works for me. I don't know why yours doesn't work for me. It looks good on the regex101.com page
  • Thank you! This fills me in on a lot