C# Regex Split - commas outside quotes

I have quite a lot of strings (segments of SQL code, actually) in the following format:

('ABCDEFG', 123542, 'XYZ 99,9')

and I need to split this string, using C#, in order to get:

  • 'ABCDEFG'
  • 123542
  • 'XYZ 99,9'

I was originally using a simple Split(','), but the comma inside the last parameter wrecks the output, so I need a regex instead. The problem is that I'm still fairly new to regular expressions and can't seem to crack the pattern, mainly because the string may contain both numeric and alphanumeric parameters in any position.

What could I use to split that string at every comma outside the quotes? Cheers

You could split on all commas that are followed by an even number of quotes, using the following regex to find them:

",(?=(?:[^']*'[^']*')*[^']*$)"

You'd use it like

var result = Regex.Split(samplestring, ",(?=(?:[^']*'[^']*')*[^']*$)");
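As a rough end-to-end sketch of the call above applied to the sample from the question: the class and method names (QuoteAwareSplitter, SplitSqlTuple) are made up for illustration, and stripping the parentheses and trimming each part are assumptions about what the asker wants, not part of the original answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class QuoteAwareSplitter
{
    // Matches a comma only when an even number of single quotes follows it,
    // i.e. when the comma is not inside a quoted value.
    private static readonly Regex CommaOutsideQuotes =
        new Regex(",(?=(?:[^']*'[^']*')*[^']*$)");

    // Hypothetical helper: assumes the tuple is wrapped in one pair of parentheses.
    public static string[] SplitSqlTuple(string input)
    {
        string inner = input.Trim().TrimStart('(').TrimEnd(')');
        return CommaOutsideQuotes.Split(inner)
                                 .Select(part => part.Trim())
                                 .ToArray();
    }
}

public static class SplitDemo
{
    public static void Main()
    {
        // Prints 'ABCDEFG', 123542 and 'XYZ 99,9' on separate lines.
        foreach (string part in QuoteAwareSplitter.SplitSqlTuple("('ABCDEFG', 123542, 'XYZ 99,9')"))
        {
            Console.WriteLine(part);
        }
    }
}
```

The quotes stay on the values, which matches the output asked for in the question.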

// This regular expression splits a string on the separator character when it is NOT inside double quotes.
// separatorChar can be any character, such as a comma or a semicolon.
// Regex.Escape keeps separators that are regex metacharacters (e.g. '|') literal.
// Single quotes inside a value are allowed: e.g. "Mike's Kitchen","Jane's Room"
Regex regx = new Regex(Regex.Escape(separatorChar.ToString()) + "(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");
string[] line = regx.Split(stringToSplit);
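A self-contained version of this double-quote variant might look like the sketch below. The names QualifiedSplitter and Split are hypothetical, and the pipe separator in the demo is only there to show a separator that is also a regex metacharacter.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QualifiedSplitter
{
    // Splits on the separator wherever it is followed by complete pairs of
    // double quotes and then no further quote, i.e. outside quoted fields.
    public static string[] Split(string line, char separator)
    {
        string pattern = Regex.Escape(separator.ToString())
                         + "(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))";
        return Regex.Split(line, pattern);
    }
}

public static class QualifiedSplitDemo
{
    public static void Main()
    {
        string line = "\"Mike's Kitchen\"|\"Jane's Room\"|\"a|b\"";
        foreach (string field in QualifiedSplitter.Split(line, '|'))
        {
            Console.WriteLine(field);
        }
    }
}
```

The fields come back with their surrounding double quotes; trim them afterwards if you only need the values.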

Although I like a challenge some of the time too, this actually isn't one. Please read this article, http://secretgeek.net/csv_trouble.asp, and then go on and use http://www.filehelpers.com/

[Edit1, 3]: Or maybe this article can help too (the link only shows VB.NET sample code, but you can use it from C# as well): http://msdn.microsoft.com/en-us/library/cakac7e6.aspx

I've tried to port the sample to C# (add a reference to Microsoft.VisualBasic to your project):

using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
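            TextReader reader = new StringReader("('ABCDEFG', 123542, 'XYZ 99,9')");

            // Dispose the parser (and the underlying reader) when done.
            using (TextFieldParser fieldParser = new TextFieldParser(reader))
            {
                fieldParser.TextFieldType = FieldType.Delimited;
                fieldParser.SetDelimiters(",");
                // Note: quoted-field handling (HasFieldsEnclosedInQuotes) applies to
                // double quotes, so single-quoted values may need converting first.

                while (!fieldParser.EndOfData)
                {
                    try
                    {
                        // Read the next line and print each field on its own line.
                        string[] currentRow = fieldParser.ReadFields();

                        foreach (string currentField in currentRow)
                        {
                            Console.WriteLine(currentField);
                        }
                    }
                    catch (MalformedLineException e)
                    {
                        // Report the offending line number instead of dumping the exception.
                        Console.WriteLine("Line {0} is not valid and will be skipped.", e.LineNumber);
                    }
                }
            }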

        }
    }
}

[Edit2]: found another one which could be of help here: http://www.codeproject.com/KB/database/CsvReader.aspx

-- reinhard

I believe it's showing up because the regex processor finds the ", " after "asd," and splits "asd," out. It then finds another match (the "'Howdy, Howdy, Howdy'") and splits out everything between the ", " and the "'Howdy, Howdy, Howdy'" (an empty string). Note that the two commas after "asd" are not the problem.

Try (hacked from Jens') in the split method:

",(?:.*?'[^']*?')"

Or just add question marks after the *'s in Jens' pattern, which makes them lazy rather than greedy.

The positive lookahead ((?=)) ensures that there is an even number of quotes ahead of the comma to split on (i.e. either they occur in pairs, or there are none). [^"]* matches non-quote characters.
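To make the lookahead easier to read, the double-quote form of the pattern can be written out with RegexOptions.IgnorePatternWhitespace and inline comments. This is only an annotated restatement; the class name AnnotatedPattern is made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnnotatedPattern
{
    // In a verbatim string, "" stands for a single double-quote character.
    public static readonly Regex CommaOutsideQuotes = new Regex(@"
        ,                 # the comma to split on
        (?=               # ...but only if what follows is:
          (?:             #   any number of
            [^""]* ""     #     text up to an opening quote
            [^""]* ""     #     text up to its closing quote
          )*
          [^""]*          #   then trailing text without quotes
          $               #   up to the end of the input
        )", RegexOptions.IgnorePatternWhitespace);

    public static void Main()
    {
        // The comma inside "b,c" is followed by one quote, so it is not a split point.
        foreach (string part in CommaOutsideQuotes.Split("\"a\",\"b,c\",d"))
        {
            Console.WriteLine(part);
        }
    }
}
```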

... or you could install the NuGet package LumenWorks CsvReader and do something like the code below, which reads a CSV file with content such as

"hello","how","hello, how are you"
"hi","hello","greetings"
...

and process it like this

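using System.Collections.Generic;
using System.Data;
using System.IO;
using LumenWorks.Framework.IO.Csv;

public static class CsvImport
{
    public static void ProcessCsv()
    {
        var filename = @"your_file_path\filename.csv";
        DataTable dt = new DataTable("MyTable");
        List<string> product_codes = new List<string>();

        // The second argument (true) tells the reader that the first row holds the column headers.
        using (CsvReader csv = new CsvReader(new StreamReader(filename), true))
        {
            int fieldCount = csv.FieldCount;

            // Create one string column per header.
            string[] headers = csv.GetFieldHeaders();
            for (int i = 0; i < headers.Length; i++)
            {
                dt.Columns.Add(headers[i], typeof(string));
            }

            // Copy every record into the DataTable, collecting all values along the way.
            while (csv.ReadNextRecord())
            {
                DataRow dr = dt.NewRow();
                for (int i = 0; i < fieldCount; i++)
                {
                    product_codes.Add(csv[i]);
                    dr[i] = csv[i];
                }
                dt.Rows.Add(dr);
            }
        }
    }
}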

The idea is to split only on commas that have an even number of quotes, or none, after them. An alternative: use (C#) Regex.Matches to find the strings between quotes (fields containing commas should be wrapped in quotes) and replace the commas inside them with || before splitting each line into columns/fields. After splitting, loop over each line and column and replace || with commas again.
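A minimal sketch of that placeholder approach follows. It uses Regex.Replace with a match evaluator rather than Regex.Matches, which is a small variation on the description; the class name PlaceholderSplitter is hypothetical, and the code assumes that "||" never occurs in the real data.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PlaceholderSplitter
{
    // Assumption: this token never appears in the input.
    private const string Placeholder = "||";

    public static string[] Split(string line)
    {
        // Step 1: inside every double-quoted field, swap commas for the placeholder.
        string protectedLine = Regex.Replace(
            line,
            "\"[^\"]*\"",
            m => m.Value.Replace(",", Placeholder));

        // Step 2: a plain split is now safe. Step 3: restore the commas.
        return protectedLine.Split(',')
                            .Select(field => field.Replace(Placeholder, ","))
                            .ToArray();
    }
}

public static class PlaceholderDemo
{
    public static void Main()
    {
        foreach (string field in PlaceholderSplitter.Split("\"hello\",\"how\",\"hello, how are you\""))
        {
            Console.WriteLine(field);
        }
    }
}
```

Compared with the lookahead pattern, this scans the line a fixed number of times, at the cost of depending on a placeholder that must not collide with the data.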

The Regex.Split methods are similar to the String.Split method, except that Regex.Split splits the string at a delimiter determined by a regular expression instead of a set of characters. The count parameter specifies the maximum number of substrings into which the input string is split; the last string contains the unsplit remainder of the string.
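The count parameter can be combined with the quote-aware pattern, for example to peel off only the first value. The sketch below uses the instance overload Regex.Split(string, int); the wrapper name LimitedSplitter is made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LimitedSplitter
{
    private static readonly Regex CommaOutsideQuotes =
        new Regex(",(?=(?:[^']*'[^']*')*[^']*$)");

    // Returns at most maxParts substrings; the last one holds the unsplit remainder.
    public static string[] Split(string input, int maxParts)
    {
        return CommaOutsideQuotes.Split(input, maxParts);
    }
}

public static class LimitedSplitDemo
{
    public static void Main()
    {
        // The comma inside 'A,1' is skipped, so the first part is the whole quoted value.
        string[] parts = LimitedSplitter.Split("'A,1', 2, 'B,C', 3", 2);
        Console.WriteLine(parts[0]);
        Console.WriteLine(parts[1]);
    }
}
```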

Comments
  • possible duplicate of Java: splitting a comma-separated string but ignoring commas in quotes
  • Except it's in C#..............
  • Not to mention SO's search didn't show that thread either.
  • Sure, but the regex is practically the same and it's trivial to convert into C#. I just found it worth mentioning since the other thread contains a bit more explanation on the regex.
  • @Jens: Looks like it fails when you have a single quote inside the string value: e.g. 'op','ID','script','Mike's','Content-Length'
  • @Michael: Yes, it fails in that case. But I think most other parsers would fail in this case as well. You'd need to escape the quote (and correct the regex to respect that).
  • @MichaelNarinsky Single quotes within a quoted string isn't valid to begin with. The value 'op','ID','script','Mike's','Content-Length' is invalid and should be 'op','ID','script','Mike''s','Content-Length' which I believe still works. (According to SQL string escaping)
  • This isn't for CSVs, although Filehelpers looks interesting. Thanks
  • Although your sample string is not a CSV file, you could still treat it as one row from a CSV. Many others have pointed out to people trying to parse HTML with regex that regex is definitely not good for that. I just wanted to point out that for parsing CSV-like strings, too, it's better to use a parser/helper/whatever than a plain regex.
  • @Hal: just because the sample code is VB doesn't mean you can't use it in C# (add a reference to Microsoft.VisualBasic and add using Microsoft.VisualBasic.FileIO; and you're fine to use TextFieldParser)
  • @pastacool I've been having issues with a CSV for days and just came across this answer. It worked fantastic in my situation, great work!
  • @Hal Your values are separated by commas, so why isn't it CSV, which stands for Comma Separated Values? And why would it be a problem to use an assembly written in VB.NET? Besides, the VisualBasic namespace doesn't necessarily mean the assembly was compiled from Visual Basic; it could be any other .NET language.
  • You seem to be missing the point of Jens's regex. The part after the comma has to be a lookahead, and the lookahead has to account for all the remaining quotes. It has to be anchored with $, so non-greedy quantifiers are pointless, and it can't use . because that will make it lose count of the quotes.
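The point about escaped quotes in the comments can be checked directly. The sketch below runs Jens's pattern over both inputs quoted there; the class name EscapedQuoteCheck is hypothetical. With the SQL-style doubled quote the quote count after each top-level comma stays even, so every value is separated. With the unescaped quote the count becomes odd for the earlier commas and only the last one is treated as a split point.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapedQuoteCheck
{
    private static readonly Regex CommaOutsideQuotes =
        new Regex(",(?=(?:[^']*'[^']*')*[^']*$)");

    public static string[] Split(string input)
    {
        return CommaOutsideQuotes.Split(input);
    }

    public static void Main()
    {
        // Doubled quote (valid SQL escaping): five values.
        string escaped = "'op','ID','script','Mike''s','Content-Length'";
        Console.WriteLine(Split(escaped).Length);

        // Unescaped quote: the quotes no longer pair up, so only two pieces come back.
        string unescaped = "'op','ID','script','Mike's','Content-Length'";
        Console.WriteLine(Split(unescaped).Length);
    }
}
```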