Strange IndexOf on a string in C# returns -1

I have a string which looks like:

var a = @"DISC INFO:


I want to detect whether DISC INFO exists in that string.

I did it the simple way:

var index = a.IndexOf("disc info", StringComparison.OrdinalIgnoreCase);

It returns -1.

Why? I expected it to be found.

The entire C# code:

There is a U+2002 EN SPACE (also known as a "nut") between DISC and INFO.

I checked this with Notepad++, though I'm not sure whether you need any special settings to see the character.

So matching against a normal space (U+0020) won't work.
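If no editor is at hand, the same check can be done in code. The following is a minimal sketch, not part of the original answer: the class name CodePointDump and the method FindUnusualSpaces are hypothetical, and the sample string is an assumed reconstruction of the start of the asker's input. It lists every character that is in the Unicode SpaceSeparator category but is not the ordinary space.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper, named for illustration only.
public static class CodePointDump
{
    // Reports each character that is a Unicode space separator (category Zs)
    // other than the ordinary space U+0020, together with its index.
    public static List<string> FindUnusualSpaces(string text)
    {
        var found = new List<string>();
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            UnicodeCategory category = char.GetUnicodeCategory(c);
            if (c != ' ' && category == UnicodeCategory.SpaceSeparator)
            {
                found.Add($"index {i}: U+{(int)c:X4} ({category})");
            }
        }
        return found;
    }

    public static void Main()
    {
        // Assumed sample: an EN SPACE (U+2002) sits between DISC and INFO.
        string a = "DISC\u2002INFO:";
        foreach (string line in FindUnusualSpaces(a))
        {
            Console.WriteLine(line); // index 4: U+2002 (SpaceSeparator)
        }
    }
}
```

An empty result means every space-like separator in the string is a plain U+0020, so the cause of a failed IndexOf lies elsewhere.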

OK then, how do I detect that strange space-like character in .NET / C#?

Probably your best bet is to use a regex instead of a simple match; the \s token matches Unicode whitespace, not just the literal space character (ASCII 32):

var match = Regex.Match(a, @"disc\sinfo", RegexOptions.IgnoreCase);

(You can then look at match.Success, match.Index, and so on.)

Note, however, that not everything that looks and smells like a space is categorized as a space in the Unicode tables. Also, the Unicode tables evolve over time, so the result depends on which Unicode version Regex is using on your runtime and operating system. Mostly it'll work, though.
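To make the \s approach concrete, here is a small sketch. The class name RegexSpaceDemo and the helper IndexOfDiscInfo are hypothetical, and the sample strings are assumptions rather than the asker's real data. It returns the match index, or -1 when there is no match, mirroring IndexOf.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class RegexSpaceDemo
{
    // Case-insensitive search for "disc", one whitespace character, then "info".
    // Returns the index of the match, or -1 if nothing matched.
    public static int IndexOfDiscInfo(string text)
    {
        Match match = Regex.Match(text, @"disc\sinfo", RegexOptions.IgnoreCase);
        return match.Success ? match.Index : -1;
    }

    public static void Main()
    {
        // EN SPACE (U+2002) and NO-BREAK SPACE (U+00A0) are both in category Zs,
        // which \s covers.
        Console.WriteLine(IndexOfDiscInfo("DISC\u2002INFO:")); // 0
        Console.WriteLine(IndexOfDiscInfo("DISC\u00A0INFO:")); // 0

        // ZERO WIDTH SPACE (U+200B) is a format character (Cf), not a separator,
        // so \s is not expected to match it on current runtimes.
        Console.WriteLine(IndexOfDiscInfo("DISC\u200BINFO:"));
    }
}
```

The last call illustrates the caveat: a character can look like a space (or like nothing at all) without being classified as whitespace.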

The problem is with your original string.

  • An index of -1 means it wasn't found. I'm looking at the fiddle now.
  • I know what it means. I expected it to be found, since it appears at the beginning of the string.
  • I found the issue. The space between disc and info isn't an actual space! If you remove that "space" and put a normal one there, your program will work.
  • @EpicKip beat me to it; you should post that as an answer, IMO
  • You can also just copy/paste the text into my little tool at
  • A different approach would be a = new string(a.ToCharArray().Select(c => char.GetUnicodeCategory(c) == UnicodeCategory.SpaceSeparator ? ' ' : c).ToArray()); which replaces only characters in the SpaceSeparator category. That category doesn't include newlines, tabs, and the like, as those are control characters.
  • OK then, how do I detect that strange space-like character in .NET / C#?
  • You will have to special-case it if you expect this to happen again. You can replace every character in the string whose Unicode category is SpaceSeparator with an actual space, something like a = new string(a.ToCharArray().Select(c => char.GetUnicodeCategory(c) == UnicodeCategory.SpaceSeparator ? ' ' : c).ToArray()); You can probably write a more efficient version of this using a StringBuilder.
  • @LasseVågsætherKarlsen or... 1 line with regex?
  • Depends on what you want :) Personally I would use the space separator approach, though I would probably write it differently, to avoid replacing newlines and such.
  • I guess you could do something like this in your regex: use \p{Zs} instead of \s to limit the match to space separator characters only.
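The two suggestions from the comments (a StringBuilder version of the SpaceSeparator replacement, and \p{Zs} in a regex) might look like the sketch below. The class name SpaceNormalizer, its method names, and the sample string are hypothetical, introduced only for illustration.

```csharp
using System;
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class SpaceNormalizer
{
    // Replaces every character in the SpaceSeparator category (Zs) with U+0020.
    // Tabs and newlines are control characters (Cc), so they are left untouched.
    public static string NormalizeSpaces(string text)
    {
        var builder = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            bool isSpaceSeparator =
                char.GetUnicodeCategory(c) == UnicodeCategory.SpaceSeparator;
            builder.Append(isSpaceSeparator ? ' ' : c);
        }
        return builder.ToString();
    }

    // The one-line regex equivalent, using the \p{Zs} category escape.
    public static string NormalizeSpacesRegex(string text)
    {
        return Regex.Replace(text, @"\p{Zs}", " ");
    }

    public static void Main()
    {
        // Assumed sample: EN SPACE between DISC and INFO, followed by a line break.
        string a = "DISC\u2002INFO:\r\nsecond line";
        string normalized = NormalizeSpaces(a);

        // The original ordinal search now succeeds.
        Console.WriteLine(
            normalized.IndexOf("disc info", StringComparison.OrdinalIgnoreCase)); // 0

        // The line break survives normalization.
        Console.WriteLine(normalized.Contains("\r\n")); // True
    }
}
```

Normalizing once up front keeps later IndexOf calls simple; the regex variant is shorter, while the StringBuilder loop avoids the regex engine for large inputs.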