How to do a non-greedy match in grep?

I want to grep the shortest match and the pattern should be something like:

<car ... model=BMW ...>
...
...
...
</car>

... means any character and the input is multiple lines.

You're looking for a non-greedy (or lazy) match. To get a non-greedy match in regular expressions you need to use the modifier ? after the quantifier. For example you can change .* to .*?.

By default grep doesn't support non-greedy modifiers, but you can use grep -P to use the Perl syntax.
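A minimal sketch of the difference, assuming GNU grep built with PCRE support (so that -P is available). The sample string and the variable names are made up for illustration:

```shell
#!/bin/sh
# Hypothetical sample input with two possible end points.
input='start a end b end'

# Greedy: .* runs to the last "end" on the line.
greedy=$(printf '%s\n' "$input" | grep -oE 'start.*end')

# Lazy: .*? (Perl syntax, enabled by -P) stops at the first "end".
lazy=$(printf '%s\n' "$input" | grep -oP 'start.*?end')

printf 'greedy: %s\n' "$greedy"
printf 'lazy:   %s\n' "$lazy"
```

The -o flag prints only the matched text, which makes the difference between the two patterns visible.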

You need grep with PCRE (Perl Compatible Regular Expressions) support. GNU grep, for example, has this; it is enabled with the -P option.

Actually, .*? only works in Perl-style regex syntax. I am not sure what the equivalent grep extended regexp syntax would be. Fortunately you can use Perl syntax with grep, so grep -P would work, but grep -E, which is the same as egrep, would not (it would be greedy).

See also: http://blog.vinceliu.com/2008/02/non-greedy-regular-expression-matching.html

My grep that works after trying out stuff in this thread:

echo "hi how are you " | grep -shoP ".*? "

Just make sure you append a space to each one of your lines

(Mine was a line by line search to spit out words)
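A small sketch of that word-splitting approach next to a negated-class alternative that does not need the trailing space. It assumes GNU grep with -P support; the variable names are illustrative only:

```shell
#!/bin/sh
line='hi how are you'

# Lazy match up to the next space. A space is appended to the line so
# that the last word is also terminated by one.
lazy_words=$(printf '%s \n' "$line" | grep -oP '.*? ')

# Negated class: runs of non-space characters, no trailing space needed
# and no -P required.
class_words=$(printf '%s\n' "$line" | grep -oE '[^ ]+')

printf '%s\n' "$class_words"
```

With the lazy pattern each printed word keeps its trailing space; the negated-class version prints the bare words.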

Agree with Kyle. However, in this case, you could do this: egrep "\[#([^]])*#\]" . -Rohis and get what you're looking for. The [^]]* matches only non-] characters, so the match cannot run past the first ].

You need a non-greedy match here, to stop at the first occurrence. But since grep doesn't support non-greedy matching by default, you can use a negated character class: echo "word word" | grep -o 'w[^r]*rd'. If you have GNU grep, you can use the -P option to enable Perl regex syntax.

For non-greedy match in grep you could use a negated character class. In other words, try to avoid wildcards.

For example, to fetch all links to jpeg files from the page content, you'd use:

grep -o '"[^" ]\+\.jpg"'
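As a rough illustration of why the negated class helps, here is the same idea run against a made-up line of HTML (the input and variable names are assumptions). Note that \+ in a basic regular expression is a GNU extension:

```shell
#!/bin/sh
# Hypothetical page content with two quoted jpeg links on one line.
html='<img src="a.jpg"> <img src="b.jpg">'

# Greedy wildcard: runs from the first quote to the last .jpg" on the line.
greedy=$(printf '%s\n' "$html" | grep -o '".*\.jpg"')

# Negated class: cannot cross a quote or a space, so each link is
# matched separately.
links=$(printf '%s\n' "$html" | grep -o '"[^" ]\+\.jpg"')

printf 'greedy:\n%s\n' "$greedy"
printf 'links:\n%s\n' "$links"
```

The dot before jpg is escaped here so that it matches only a literal period.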

To deal with multiple lines, pipe the input through xargs first. For better performance, use ripgrep.
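For multi-line input like the <car> example in the question, a different technique from the xargs pipe is GNU grep's -z option, which treats the whole input as one record so a pattern can span newlines. The following is only a sketch, assuming GNU grep with -P support; the sample data and attribute names are invented:

```shell
#!/bin/sh
# Hypothetical input: two multi-line <car> elements, one of them a BMW.
xml='<car id=1 model=Audi year=2001>
  <wheel/>
</car>
<car id=2 model=BMW year=2002>
  <wheel/>
</car>'

# -z   : records are NUL-separated, so the whole input is one record
# (?s) : let . match newlines
# [^>]*: stay inside the opening tag while looking for model=BMW
# .*?  : lazily stop at the first closing </car>
# tr   : with -z, grep ends each match with NUL; turn that into a newline
bmw=$(printf '%s\n' "$xml" |
  grep -Pzo '(?s)<car[^>]*model=BMW[^>]*>.*?</car>' |
  tr '\0' '\n')

printf '%s\n' "$bmw"
```

The negated class inside the opening tag keeps the pattern from starting at the Audi element and running forward to the BMW one.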

You are using .* properly, but as you noticed it greedily eats up as many characters as it can, because . matches any character.

Even if your regular expression engine supports non-greedy matching, it's better to spell out what you actually mean. If this is what you mean, you should probably say so, instead of relying on non-greedy matching to (hopefully, probably) Do What I Mean.

By using non-greedy Perl-style regular expressions, you can prevent this from occurring and stop the search as soon as the search criteria have been satisfied.

If you're on an engine that does not support non-greedy matching, you can use a trick to achieve it. Note: I will be using GNU grep (2.25) in a ...

The trick to get non-greedy matching in sed is to match all characters excluding the one that terminates the match. I know, a no-brainer, but I wasted precious minutes on it and shell scripts should be, after all, quick and easy. So in case somebody else might need it:

Greedy matching

% echo "<b>foo</b>bar" | sed 's/<.*>//g'
bar

Non greedy matching
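A minimal sketch of the sed trick described above: replace the wildcard with a class that excludes the terminating character. The sample input `<b>foo</b>bar` and the variable names are assumptions chosen for illustration:

```shell
#!/bin/sh
# Hypothetical input containing two tags.
html='<b>foo</b>bar'

# Greedy: <.*> spans from the first < to the last >, removing "foo" too.
greedy=$(printf '%s\n' "$html" | sed 's/<.*>//g')

# Negated class: [^>]* cannot cross a >, so each tag is removed on its own.
non_greedy=$(printf '%s\n' "$html" | sed 's/<[^>]*>//g')

printf 'greedy:     %s\n' "$greedy"
printf 'non-greedy: %s\n' "$non_greedy"
```

This works in plain POSIX sed, which has no lazy quantifiers at all.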

Comments
  • stackoverflow.com/questions/1732348/1732454#1732454
  • eegg: dot all modifier is also known as multiline. It's a modifier that changes the "." match behavior to include newlines (normally it doesn't). There's no such modifier in grep, but there is in pcregrep.
  • Correction: In most of the regex flavors that support it, the mode that allows . to match newlines is called DOTALL or single-line mode; Ruby is the only one that calls it multiline. In the other flavors, multiline is the mode that allows the anchors (^ and $) to match at line boundaries. Ruby has no equivalent mode because in Ruby they always work that way.
  • -P was a complete new one on me, I've been happily grepping away for years, and only using -E ... so many wasted years! - Note to self: Re-read Man pages as a (even more!) regular thing, you never digest enough switches and options.
  • On some platforms (like Mac OS X) grep does not support -P, but if you use egrep you can use the .*? pattern to achieve the same result. egrep -o 'start.*?end' text.html
  • As an extension to @SaltyNuts comment, Mac OS X does not support -P but -E would call egrep hence the suggested .*? works just fine.
  • grep -P does not work in GNU grep 2.9 -- just tried it (it doesn't error, it just silently doesn't apply the ?). Interestingly, neither does the negated class, e.g.: env|grep '[^\=]*\='
  • There's no grep -P option or pgrep command in Darwin/OS X 10.8 Mountain Lion, but egrep works great.
  • There's a pgrep command on my OS X 10.9 box, but it's a completely different program whose purpose is to "find or signal processes by name".
  • @robertotomás Responding to a 6-year-old comment here, but... I thought this as well and then realized I was getting multiple non-greedy matches. For instance, on a color terminal you can see that `echo "bbbbb" | grep -P 'b.*?b'` returns 2 matches.
  • -shoP nice mnemonic :)
  • echo "bbbbb" | grep -shoP 'b.*?b' is a little bit of a learning experience. Only thing that worked for me in terms of explicitly lazy as well.
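
The "bbbbb" case from the comments can be made visible without a color terminal by adding -o, which prints each match on its own line. A short sketch, assuming GNU grep with -P support (variable names are illustrative):

```shell
#!/bin/sh
# Greedy: one match covering the whole string.
greedy=$(echo "bbbbb" | grep -oP 'b.*b')

# Lazy: each match is the shortest possible "bb", so the line yields
# two separate matches and the fifth b is left over.
lazy=$(echo "bbbbb" | grep -oP 'b.*?b')

printf 'greedy:\n%s\n' "$greedy"
printf 'lazy:\n%s\n' "$lazy"
```

Without -o, grep prints the whole matching line either way, which is why the lazy quantifier can look as if it had no effect.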