I am doing some scraping and I want to extract a certain part of a srcset attribute, but I am not sure how to do this with regex. Are there any regex ninjas here who can help me?

srcset=" 150w, 300w, 600w, 1200w"

I want the URL that comes right before 1200w. So the outcome should be:

Why do you need regex for this? You can split the string and take the last element:

a = 'srcset=" 150w, 300w, 600w, 1200w"'

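# Strip the attribute name and quotes, then split into "url width" candidates
a = a.replace('srcset=', '').replace('"', '').split(',')
# The last candidate is the 1200w one; its first token is the URL
done = a[-1].strip().split(' ')[0]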
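The split-based approach above could be wrapped in a small function. This is only a sketch: the name `last_srcset_url` and the example.com URLs are hypothetical placeholders, since the real URLs are not shown in the question.

```python
def last_srcset_url(attr: str) -> str:
    """Return the URL of the last candidate in a srcset="..." string."""
    # Drop the attribute name and quotes, then split into "url width" candidates.
    candidates = attr.replace('srcset=', '').replace('"', '').split(',')
    # The last candidate looks like "url 1200w"; the URL is its first token.
    return candidates[-1].strip().split(' ')[0]


# Hypothetical input with placeholder URLs.
sample = ('srcset="https://example.com/a-150.jpg 150w, '
          'https://example.com/a-300.jpg 300w, '
          'https://example.com/a-600.jpg 600w, '
          'https://example.com/a-1200.jpg 1200w"')
print(last_srcset_url(sample))
```

Note that this assumes the URLs themselves contain no commas and that the 1200w candidate is always listed last.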

You can use this regex:


If you search for r"600w, (.*) 1200w", Group 1 should return the URL you are looking for.
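A minimal sketch of that search with `re.search`, assuming the 1200w entry directly follows the 600w entry. The example.com URLs are hypothetical placeholders, not the ones from the question.

```python
import re

# Hypothetical input with placeholder URLs.
srcset_attr = ('srcset="https://example.com/a-600.jpg 600w, '
               'https://example.com/a-1200.jpg 1200w"')

# Group 1 captures whatever sits between "600w, " and " 1200w".
found = re.search(r"600w, (.*) 1200w", srcset_attr)
if found:
    print(found.group(1))
```

This only works while the candidate before 1200w is 600w; a different set of widths would need a different pattern.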

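The pattern .+?(?=1200w) will match any character except a newline one or more times, as few as possible, until what is on the right is 1200w. That means it matches everything that precedes 1200w, not only the last URL.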
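To see how broad that lookahead match is, here is a small sketch. The input string and its example.com URLs are hypothetical placeholders.

```python
import re

# Hypothetical input with placeholder URLs.
attr_text = ('srcset="https://example.com/a-600.jpg 600w, '
             'https://example.com/a-1200.jpg 1200w"')

# The match starts at the beginning of the string and stops just before "1200w".
lookahead_match = re.search(r".+?(?=1200w)", attr_text)
if lookahead_match:
    print(repr(lookahead_match.group(0)))
```

The printed text begins with `srcset="` and includes every earlier candidate, which is why a capturing group is the more targeted choice.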

To get a more specific match using a regex, you could use a capturing group:

\bsrcset="[^"]* (https?://\S+)\s+1200w"


For example:

import re

regex = r'\bsrcset="[^"]* (https?://\S+)\s+1200w"'
test_str = """srcset=\" 150w, 300w, 600w, 1200w\""""

# Group 1 holds the URL that sits right before 1200w
matches = re.search(regex, test_str)
if matches:
    print(matches.group(1))
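If the width you need may change, one possible variation is to collect every URL and width pair with `re.findall` and pick the one you want. This is a sketch under assumptions: the helper name `url_for_width` and the example.com URLs are hypothetical, and it assumes every candidate uses an http(s) URL followed by a `w` descriptor.

```python
import re
from typing import Optional


def url_for_width(attr: str, width: int) -> Optional[str]:
    """Return the URL whose width descriptor equals `width`, or None."""
    # Each candidate is an http(s) URL, whitespace, then digits followed by "w".
    for url, w in re.findall(r'(https?://\S+)\s+(\d+)w', attr):
        if int(w) == width:
            return url
    return None


# Hypothetical input with placeholder URLs.
srcset_sample = ('srcset="https://example.com/a-150.jpg 150w, '
                 'https://example.com/a-300.jpg 300w, '
                 'https://example.com/a-600.jpg 600w, '
                 'https://example.com/a-1200.jpg 1200w"')
print(url_for_width(srcset_sample, 1200))
```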

