How to split a string, but also keep the delimiters?

I have a multiline string which is delimited by a set of different delimiters:

(Text1)(DelimiterA)(Text2)(DelimiterC)(Text3)(DelimiterB)(Text4)

I can split this string into its parts using String.split, but it seems that I can't get the actual strings that matched the delimiter regex.

In other words, this is what I get:

  • Text1
  • Text2
  • Text3
  • Text4

This is what I want:

  • Text1
  • DelimiterA
  • Text2
  • DelimiterC
  • Text3
  • DelimiterB
  • Text4

Is there any JDK way to split the string using a delimiter regex but also keep the delimiters?

You can use lookahead and lookbehind, like this:

System.out.println(Arrays.toString("a;b;c;d".split("(?<=;)")));
System.out.println(Arrays.toString("a;b;c;d".split("(?=;)")));
System.out.println(Arrays.toString("a;b;c;d".split("((?<=;)|(?=;))")));

And you will get:

[a;, b;, c;, d]
[a, ;b, ;c, ;d]
[a, ;, b, ;, c, ;, d]

The last one is what you want.

((?<=;)|(?=;)) matches the empty (zero-width) position just after a ; or just before a ;, so the split happens on both sides of each delimiter without consuming it.

Hope this helps.

EDIT: Fabian Steeg's comment on readability is valid. Readability is always a problem with regexes. One thing I do to ease this is to create a variable whose name represents what the regex does, and use Java's String.format to fill in the delimiter. Like this:

static public final String WITH_DELIMITER = "((?<=%1$s)|(?=%1$s))";
...
public void someMethod() {
...
final String[] aEach = "a;b;c;d".split(String.format(WITH_DELIMITER, ";"));
...
}
...

This helps a little bit. :-D
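
For reference, here is a minimal self-contained sketch of the format-string idea above. The class name KeepDelimiterDemo and the helper splitKeeping are hypothetical names chosen for illustration. Wrapping the delimiter in Pattern.quote is an added assumption, so that a literal delimiter such as "." is not read as a regex metacharacter; drop it if you want to pass a regex instead.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class KeepDelimiterDemo {
    // Zero-width match just after or just before the delimiter.
    static final String WITH_DELIMITER = "((?<=%1$s)|(?=%1$s))";

    // Hypothetical helper: splits on a literal delimiter and keeps
    // each delimiter as its own array element.
    static String[] splitKeeping(String text, String literalDelimiter) {
        // Assumption: the delimiter is literal text, so quote it.
        String quoted = Pattern.quote(literalDelimiter);
        return text.split(String.format(WITH_DELIMITER, quoted));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitKeeping("a;b;c;d", ";")));
        System.out.println(Arrays.toString(splitKeeping("one.two", ".")));
    }
}
```

With the inputs above this should print [a, ;, b, ;, c, ;, d] and [one, ., two].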

A very naive solution that doesn't involve regex would be to perform a string replace on your delimiter, along these lines (assuming a comma as the delimiter):

fullString.replace(",", "~,~")

Here you can replace the tilde (~) with any marker that is guaranteed not to occur in the text.

If you then split on that marker, I believe you will get the desired result.
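
A minimal sketch of that replace-then-split idea, under the assumption that the marker never appears in the input. The class name MarkerSplitDemo and the helper splitKeeping are hypothetical names, not part of the JDK.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class MarkerSplitDemo {
    // Hypothetical helper: surrounds every delimiter with a marker,
    // then splits on the marker so the delimiter survives as an element.
    // Assumption: the marker does not occur anywhere in the text.
    static String[] splitKeeping(String text, String delimiter, String marker) {
        // String.replace works on literal text, not on a regex.
        String marked = text.replace(delimiter, marker + delimiter + marker);
        // String.split takes a regex, so the marker is quoted.
        return marked.split(Pattern.quote(marker));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitKeeping("a,b,c", ",", "~")));
    }
}
```

For "a,b,c" this should print [a, ,, b, ,, c]. Note that two adjacent delimiters leave an empty string between them, because the two markers end up side by side.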

import java.util.regex.*;
import java.util.LinkedList;

public class Splitter {
    private static final Pattern DEFAULT_PATTERN = Pattern.compile("\\s+");

    private Pattern pattern;
    private boolean keep_delimiters;

    public Splitter(Pattern pattern, boolean keep_delimiters) {
        this.pattern = pattern;
        this.keep_delimiters = keep_delimiters;
    }
    public Splitter(String pattern, boolean keep_delimiters) {
        this(Pattern.compile(pattern==null?"":pattern), keep_delimiters);
    }
    public Splitter(Pattern pattern) { this(pattern, true); }
    public Splitter(String pattern) { this(pattern, true); }
    public Splitter(boolean keep_delimiters) { this(DEFAULT_PATTERN, keep_delimiters); }
    public Splitter() { this(DEFAULT_PATTERN); }

    public String[] split(String text) {
        if (text == null) {
            text = "";
        }

        int last_match = 0;
        LinkedList<String> splitted = new LinkedList<String>();

        Matcher m = this.pattern.matcher(text);

        while (m.find()) {

            splitted.add(text.substring(last_match,m.start()));

            if (this.keep_delimiters) {
                splitted.add(m.group());
            }

            last_match = m.end();
        }

        splitted.add(text.substring(last_match));

        return splitted.toArray(new String[splitted.size()]);
    }

    public static void main(String[] argv) {
        if (argv.length != 2) {
            System.err.println("Syntax: java Splitter <pattern> <text>");
            return;
        }

        Pattern pattern = null;
        try {
            pattern = Pattern.compile(argv[0]);
        }
        catch (PatternSyntaxException e) {
            System.err.println(e);
            return;
        }

        Splitter splitter = new Splitter(pattern);

        String text = argv[1];
        int counter = 1;
        for (String part : splitter.split(text)) {
            System.out.printf("Part %d: \"%s\"\n", counter++, part);
        }
    }
}

/*
    Example:
    > java Splitter "\W+" "Hello World!"
    Part 1: "Hello"
    Part 2: " "
    Part 3: "World"
    Part 4: "!"
    Part 5: ""
*/

I don't really like the other way, where you get an empty element in front and back. A delimiter is usually not at the beginning or at the end of the string, thus you most often end up wasting two good array slots.

Edit: Fixed edge cases. Commented source with test cases can be found here: http://snippets.dzone.com/posts/show/6453

I got here late, but returning to the original question, why not just use lookarounds?

Pattern p = Pattern.compile("(?<=\\w)(?=\\W)|(?<=\\W)(?=\\w)");
System.out.println(Arrays.toString(p.split("'ab','cd','eg'")));
System.out.println(Arrays.toString(p.split("boo:and:foo")));

output:

[', ab, ',', cd, ',', eg, ']
[boo, :, and, :, foo]

EDIT: What you see above is what appears on the command line when I run that code, but I now see that it's a bit confusing. It's difficult to keep track of which commas are part of the result and which were added by Arrays.toString(). SO's syntax highlighting isn't helping either. In hopes of getting the highlighting to work with me instead of against me, here's how those arrays would look if I were declaring them in source code:

{ "'", "ab", "','", "cd", "','", "eg", "'" }
{ "boo", ":", "and", ":", "foo" }

I hope that's easier to read. Thanks for the heads-up, @finnw.

Comments
  • Come to think of it, where do you want to keep the delimiters? Along with words or separate? In the first case, would you attach them to preceding or following word? In the second case, my answer is what you need...
  • Just implemented a class which should help you achieve what you are looking for. See below
  • Very nice! Here we can see again the power of regular expressions!!
  • Nice to see there is a way to do this with String#split, though I wish there was a way to include the delimiters as there was for the StringTokenizer - split(";", true) would be so much more readable than split("((?<=;)|(?=;))").
  • That should be: String.format(WITH_DELIMITER, ";"); as format is a static method.
  • One complication I just encountered is variable-length delimiters (say [\\s,]+) that you want to match completely. The required regexes get even longer, as you need additional negative look{ahead,behind}s to avoid matching them in the middle, eg. (?<=[\\s,]+)(?![\\s,])|(?<![\\s,])(?=[\\s,]+).
  • What if I want to split by two delimiters, say ';' or '.'?
  • Note that this will only work for relatively simple expressions; I got a "Look-behind group does not have an obvious maximum length" trying to use this with a regex representing all real numbers.
  • FYI: Merged from stackoverflow.com/questions/275768/…
  • Wahoo... Thank you for participating! Interesting approach. I am not sure it can help consistently (with that, sometimes there is a delimiter, sometimes there is not), but +1 for the effort. However, you still need to properly address the edge cases (empty or null values).
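
Two of the questions raised in the comments above (splitting on either of two delimiters, and keeping a whole run of delimiter characters together) can be sketched as follows. This is an illustration, not code from any of the answers: the class name MultiDelimiterDemo and both helper names are hypothetical. The second pattern only looks at the single character on each side of a position, which is an assumption that sidesteps variable-length lookbehind.

```java
import java.util.Arrays;

class MultiDelimiterDemo {
    // Splits around any single character from a character class,
    // e.g. "[;.]", keeping each delimiter as its own element.
    static String[] splitOnAny(String text, String charClass) {
        return text.split("(?<=" + charClass + ")|(?=" + charClass + ")");
    }

    // Splits at the boundaries of runs of whitespace or commas, so a run
    // such as ", " stays together as one element. Each lookaround checks
    // exactly one character, so the lookbehind has a fixed length.
    static String[] splitOnRuns(String text) {
        String boundary = "(?<=[\\s,])(?![\\s,])|(?<![\\s,])(?=[\\s,])";
        return text.split(boundary);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnAny("a;b.c", "[;.]")));
        System.out.println(Arrays.toString(splitOnRuns("a, b ,c")));
    }
}
```

The first call should yield the elements "a", ";", "b", ".", "c", and the second "a", ", ", "b", " ,", "c". If the delimiter regex is too complex for lookarounds, a Matcher loop like the Splitter class above is the more robust route.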