I am trying to find the sentences between a pipe | and a dot ., e.g.

| This is one. This is two.

The regex pattern I use:

preg_match_all('/(:\s|\|+)(.*?)(\.|!|\?)/s', $file0, $matches);

So far I could not manage to capture both sentences. The regex I use captures only the first sentence.

How can I solve this problem?

EDIT: as may be seen from the regex, I am trying to find the sentences BETWEEN (: or |) AND (. or ! or ?)

A colon or pipe indicates the starting point for sentences. The sentences might be:

: Sentence one. Sentence two. Sentence three. 
| Sentence one. Sentence two? 
| Sentence one. Sentence two! Sentence three?

I would keep it simple and just match on:

\s*[^.|]+\s*

This matches any run of characters other than pipes and full stops. Note that surrounding whitespace is included in each match, so apply trim() to the results if you need it removed.

$input = "| This is one. This is two.";
preg_match_all('/\s*[^.|]+\s*/s', $input, $matches);
print_r($matches[0]);

This prints:

    [0] =>  This is one
    [1] =>  This is two
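
If the leading whitespace in those results is unwanted, or sentences may also end in ! or ?, the same idea can be sketched as follows. This is a variation rather than the pattern shown above: the character class is widened to `[^.!?|:]` (assuming a colon only ever marks the start of a block, as in the question's edit), and the `trim` and `strlen` clean-up steps are additions made here for illustration. Like the original, it does not check that a sentence actually follows a pipe or colon.

```php
<?php
// Sketch: match runs of characters that are neither a start marker (| or :)
// nor a sentence terminator (. ! ?).
$input = "| Sentence one. Sentence two! Sentence three?";
preg_match_all('/[^.!?|:]+/', $input, $matches);

// Trim each match, drop whitespace-only matches, and reindex the array.
$sentences = array_values(array_filter(array_map('trim', $matches[0]), 'strlen'));

print_r($sentences);
```

With the sample input this yields the three sentences without surrounding spaces or terminators.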

This does the job:

$str = '| This is one. This is two.';
preg_match_all('/(?:\s|\|)+(.*?)(?=[.!?])/', $str, $m);
print_r($m);

Output:

    Array
    (
        [0] => Array
            (
                [0] => | This is one
                [1] =>  This is two
            )

        [1] => Array
            (
                [0] => This is one
                [1] => This is two
            )

    )
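
Because that pattern accepts any whitespace as a starting point, it also picks up text that precedes the first pipe (for example ` loves Mary` in `John loves Mary. | ...`). If that matters, one possible refinement is to chain matches with `\G`, so that a sentence is accepted only right after a `:` or `|` marker or right after the terminator of the previous match. This is a sketch under that assumption, not the pattern used above, and the sample string is made up for illustration.

```php
<?php
// Sketch: only accept sentences that follow a ':' or '|' marker,
// either directly or via a chain of earlier sentences.
$str = 'John loves Mary. | This is one. This is two.';

// [:|]            a sentence may start right after a marker, or ...
// \G(?!^)[.!?]    ... right after the terminator where the previous match stopped
// ([^.!?:|]+)     the sentence text itself
// (?=[.!?])       which must be followed by a terminator (not consumed)
preg_match_all('/(?:[:|]|\G(?!^)[.!?])\s*([^.!?:|]+)(?=[.!?])/', $str, $m);

print_r($m[1]);
```

Here `$m[1]` holds only `This is one` and `This is two`; the text before the pipe is skipped.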



To keep it simple, find everything between | and . and then split:

$input = "John loves Mary. | This is one. This is two. | Sentence 1. Sentence 2.";
preg_match_all('/\|\s*([^|]+)\./', $input, $matches);
if ($matches) {
    foreach ($matches[1] as $match) {
        // Split each captured block into its individual sentences.
        print_r(preg_split('/\.\s*/', $match));
    }
}

Output:

    Array
    (
        [0] => This is one
        [1] => This is two
    )
    Array
    (
        [0] => Sentence 1
        [1] => Sentence 2
    )
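
The same two-step idea can be stretched to cover the question's edit, where a block starts at `:` or `|` and sentences end in `.`, `!` or `?`. The following is a minimal sketch, assuming every sentence inside a block ends with one of those terminators (trailing text without a terminator would be returned as if it were a sentence). The sample input and the `$blocks` and `$result` names are made up for illustration.

```php
<?php
// Sketch: blocks start at ':' or '|' and run until the next marker.
$input = "John loves Mary. | Sentence one. Sentence two! Sentence three? : Another one. And two?";

// Step 1: capture the text of each block that follows a ':' or '|'.
preg_match_all('/[:|]\s*([^:|]+)/', $input, $blocks);

$result = [];
foreach ($blocks[1] as $block) {
    // Step 2: split on each terminator plus trailing whitespace,
    // discarding the empty piece left after the last terminator.
    $result[] = preg_split('/[.!?]\s*/', $block, -1, PREG_SPLIT_NO_EMPTY);
}

print_r($result);
```

Text before the first marker (`John loves Mary.`) is ignored, and each block comes back as its own array of sentences with the terminators removed.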

  • What is your expected output here?
  • It's simple but not my interpretation of what the OP is looking for. If $input is "John loves Mary.| This is one. This is two.", then it also matches John loves Mary, which is not between a | and a .
  • I consider John loves Mary to in fact be a sentence in this case.
  • We agree it's a sentence but it is not between | and .. It comes before the pipe character.
  • You probably only want to print the second array element, and also you may want to add some non capturing groups to your regex.
  • @TimBiegeleisen: It's up to OP to choose what they want to keep, I've printed the whole match to show how it captures the strings. A non capture group is already in place.
  • On input John loves Mary. | This is one. This is two., you also match loves Mary.
  • This is two. and Sentence 2. are not between | and .
  • | This is one. This is two. I have highlighted the characters that I believe qualify. The OP wants the (multiple) sentences between these two characters and complained that he could not capture both of them.
  • Conclusion: there is more than one interpretation of the question, so clarification from the OP is needed.