C# Sanitize File Name


I have recently been moving a bunch of MP3s from various locations into a repository. I had been constructing the new file names using the ID3 tags (thanks, TagLib-Sharp!), and I noticed that I was getting a System.NotSupportedException:

"The given path's format is not supported."

This was generated by either File.Copy() or Directory.CreateDirectory().

It didn't take long to realize that my file names needed to be sanitized. So I did the obvious thing:

public static string SanitizePath_(string path, char replaceChar)
{
    // GetDirectoryName returns null when path is a root directory
    string dir = Path.GetDirectoryName(path) ?? string.Empty;
    foreach (char c in Path.GetInvalidPathChars())
        dir = dir.Replace(c, replaceChar);

    string name = Path.GetFileName(path);
    foreach (char c in Path.GetInvalidFileNameChars())
        name = name.Replace(c, replaceChar);

    // Path.Combine restores the separator that GetDirectoryName strips
    return Path.Combine(dir, name);
}

To my surprise, I continued to get exceptions. It turned out that ':' is not in the set returned by Path.GetInvalidPathChars(), because it is valid in a path root. I suppose that makes sense, but this has to be a pretty common problem. Does anyone have some short code that sanitizes a path? The most thorough version I've come up with is this, but it feels like it is probably overkill.

    // replaces invalid characters with replaceChar
    public static string SanitizePath(string path, char replaceChar)
    {
        // construct a list of characters that can't show up in filenames.
        // need to do this because ":" is not in InvalidPathChars
        if (_BadChars == null)
        {
            _BadChars = new List<char>(Path.GetInvalidFileNameChars());
            _BadChars.AddRange(Path.GetInvalidPathChars());
            _BadChars = Utility.GetUnique<char>(_BadChars);
        }

        // remove root
        string root = Path.GetPathRoot(path);
        path = path.Remove(0, root.Length);

        // split on the directory separator character. Need to do this
        // because the separator is not valid in a filename.
        List<string> parts = new List<string>(path.Split(new char[]{Path.DirectorySeparatorChar}));

        // check each part to make sure it is valid.
        for (int i = 0; i < parts.Count; i++)
        {
            string part = parts[i];
            foreach (char c in _BadChars)
            {
                part = part.Replace(c, replaceChar);
            }
            parts[i] = part;
        }

        return root + Utility.Join(parts, Path.DirectorySeparatorChar.ToString());
    }

Any improvements to make this function faster and less baroque would be much appreciated.
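For reference, here is a minimal sketch of the same per-segment idea that does not depend on the Utility helper class. It is only an illustration, not the poster's code: the class name PathSanitizerSketch is made up, and LINQ plus a HashSet stand in for Utility.GetUnique and Utility.Join. I believe older .NET Framework versions validate the path inside Path.GetPathRoot, so some inputs may still throw there.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical name; a sketch of the per-segment approach described above.
public static class PathSanitizerSketch
{
    // Union of both invalid-character sets. A HashSet drops duplicates
    // and gives constant-time lookups.
    private static readonly HashSet<char> BadChars = new HashSet<char>(
        Path.GetInvalidFileNameChars().Concat(Path.GetInvalidPathChars()));

    public static string SanitizePath(string path, char replaceChar)
    {
        // Keep the root (for example "C:\") untouched, since ':' is legal there.
        string root = Path.GetPathRoot(path) ?? string.Empty;
        string rest = path.Substring(root.Length);

        // Clean each segment separately so the separators themselves survive.
        IEnumerable<string> parts = rest
            .Split(Path.DirectorySeparatorChar)
            .Select(part => new string(
                part.Select(c => BadChars.Contains(c) ? replaceChar : c).ToArray()));

        return root + string.Join(Path.DirectorySeparatorChar.ToString(), parts);
    }
}
```

A call such as PathSanitizerSketch.SanitizePath(somePath, '_') returns the path with the root and separators preserved and every other invalid character replaced. Which characters count as invalid depends on the platform the code runs on.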

To clean up a file name you could do this:

private static string MakeValidFileName( string name )
{
   string invalidChars = System.Text.RegularExpressions.Regex.Escape( new string( System.IO.Path.GetInvalidFileNameChars() ) );
   string invalidRegStr = string.Format( @"([{0}]*\.+$)|([{0}]+)", invalidChars );

   return System.Text.RegularExpressions.Regex.Replace( name, invalidRegStr, "_" );
}
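To see what the two alternatives in that pattern do, here is a small sketch. The first alternative matches a run of dots at the end of the name (together with any invalid characters directly before them), the second matches any other run of invalid characters, and each match becomes a single underscore. As an assumption for reproducibility, the sketch hard-codes a subset of the characters Windows rejects (control characters are left out) instead of calling Path.GetInvalidFileNameChars(); the class name is hypothetical.

```csharp
using System.Text.RegularExpressions;

// Hypothetical name; mirrors MakeValidFileName with a fixed character set.
public static class RegexFileNameSketch
{
    // Assumption: a subset of the characters Windows rejects, hard-coded
    // so the result does not vary by platform.
    private const string WindowsInvalid = "\"<>|:*?\\/";

    public static string MakeValidFileName(string name)
    {
        string invalidChars = Regex.Escape(WindowsInvalid);
        // Alternative 1: trailing dots, plus invalid characters just before them.
        // Alternative 2: any other run of invalid characters.
        string pattern = string.Format(@"([{0}]*\.+$)|([{0}]+)", invalidChars);
        return Regex.Replace(name, pattern, "_");
    }
}
```

With this set, "AC/DC: Back in Black.mp3" becomes "AC_DC_ Back in Black.mp3", "track..." becomes "track_", and "what?." becomes "what_".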


A shorter solution:

var invalids = System.IO.Path.GetInvalidFileNameChars();
var newName = String.Join("_", origFileName.Split(invalids, StringSplitOptions.RemoveEmptyEntries) ).TrimEnd('.');
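One thing to be aware of with the Split/Join approach: because of StringSplitOptions.RemoveEmptyEntries, consecutive invalid characters collapse into a single underscore, invalid characters at the start or end simply disappear, and a name made up only of invalid characters or dots comes back as an empty string. The sketch below wraps the same two lines and adds a fallback for that last case. The fixed character set, the class name, and the fallback parameter are my own assumptions for illustration.

```csharp
using System;

// Hypothetical name; wraps the Split/Join one-liner with an empty-result guard.
public static class SplitJoinSketch
{
    // Assumption: a subset of the characters Windows rejects, hard-coded
    // so the result does not vary by platform.
    private static readonly char[] WindowsInvalid = "\"<>|:*?\\/".ToCharArray();

    public static string Clean(string origFileName, string fallback = "_")
    {
        // Split on every invalid character and drop the empty pieces.
        string[] parts = origFileName.Split(WindowsInvalid, StringSplitOptions.RemoveEmptyEntries);
        string newName = string.Join("_", parts).TrimEnd('.');

        // Nothing usable left (for example "???" or "..."): return the fallback.
        return newName.Length == 0 ? fallback : newName;
    }
}
```

For example, Clean("a<<b>.txt.") gives "a_b_.txt", Clean("?name") gives "name", and Clean("???") gives "_".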


Based on Andre's excellent answer but taking into account Spud's comment on reserved words, I made this version:

/// <summary>
/// Strip illegal chars and reserved words from a candidate filename (should not include the directory path)
/// </summary>
/// <remarks>
/// http://stackoverflow.com/questions/309485/c-sharp-sanitize-file-name
/// </remarks>
public static string CoerceValidFileName(string filename)
{
    var invalidChars = Regex.Escape(new string(Path.GetInvalidFileNameChars()));
    var invalidReStr = string.Format(@"[{0}]+", invalidChars);

    var reservedWords = new []
    {
        "CON", "PRN", "AUX", "CLOCK$", "NUL", "COM0", "COM1", "COM2", "COM3", "COM4",
        "COM5", "COM6", "COM7", "COM8", "COM9", "LPT0", "LPT1", "LPT2", "LPT3", "LPT4",
        "LPT5", "LPT6", "LPT7", "LPT8", "LPT9"
    };

    var sanitisedNamePart = Regex.Replace(filename, invalidReStr, "_");
    foreach (var reservedWord in reservedWords)
    {
        // Escape the word first: the '$' in "CLOCK$" is a regex anchor otherwise
        var reservedWordPattern = string.Format("^{0}\\.", Regex.Escape(reservedWord));
        sanitisedNamePart = Regex.Replace(sanitisedNamePart, reservedWordPattern, "_reservedWord_.", RegexOptions.IgnoreCase);
    }

    return sanitisedNamePart;
}

And these are my unit tests:

[Test]
public void CoerceValidFileName_SimpleValid()
{
    var filename = @"thisIsValid.txt";
    var result = PathHelper.CoerceValidFileName(filename);
    Assert.AreEqual(filename, result);
}

[Test]
public void CoerceValidFileName_SimpleInvalid()
{
    var filename = @"thisIsNotValid\3\\_3.txt";
    var result = PathHelper.CoerceValidFileName(filename);
    Assert.AreEqual("thisIsNotValid_3__3.txt", result);
}

[Test]
public void CoerceValidFileName_InvalidExtension()
{
    var filename = @"thisIsNotValid.t\xt";
    var result = PathHelper.CoerceValidFileName(filename);
    Assert.AreEqual("thisIsNotValid.t_xt", result);
}

[Test]
public void CoerceValidFileName_KeywordInvalid()
{
    var filename = "aUx.txt";
    var result = PathHelper.CoerceValidFileName(filename);
    Assert.AreEqual("_reservedWord_.txt", result);
}

[Test]
public void CoerceValidFileName_KeywordValid()
{
    var filename = "auxillary.txt";
    var result = PathHelper.CoerceValidFileName(filename);
    Assert.AreEqual("auxillary.txt", result);
}
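The reserved-word step can also be written as one pattern instead of a loop. The sketch below is a variation on CoerceValidFileName, not the original method: it handles only the reserved-word part, uses the same word list folded into a single regex, and follows the suggestion from the comments to also match a reserved word with no extension (such as a bare COM1). The class and method names are hypothetical.

```csharp
using System.Text.RegularExpressions;

// Hypothetical name; covers only the reserved-word step.
public static class ReservedNameSketch
{
    // Same reserved words as the list above, folded into one pattern.
    // Group 2 is either the dot before the extension or the end of the name.
    private static readonly Regex Reserved = new Regex(
        @"^(CON|PRN|AUX|CLOCK\$|NUL|COM[0-9]|LPT[0-9])(\.|$)",
        RegexOptions.IgnoreCase);

    public static string ReplaceReservedName(string filename)
    {
        // Put the dot (if any) back after the placeholder.
        return Reserved.Replace(filename, "_reservedWord_$2");
    }
}
```

Here "aUx.txt" becomes "_reservedWord_.txt", a bare "COM1" becomes "_reservedWord_", and names that merely start with a reserved word, such as "auxillary.txt" or "COM10.txt", are left alone.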


string clean = String.Concat(dirty.Split(Path.GetInvalidFileNameChars()));


I'm using the System.IO.Path.GetInvalidFileNameChars() method to check invalid characters and I've got no problems.

I'm using the following code:

foreach( char invalidchar in System.IO.Path.GetInvalidFileNameChars())
{
    filename = filename.Replace(invalidchar, '_');
}



Here's the function that I am using now (thanks jcollum for the C# example):

public static string MakeSafeFilename(string filename, char replaceChar)
{
    foreach (char c in System.IO.Path.GetInvalidFileNameChars())
    {
        filename = filename.Replace(c, replaceChar);
    }
    return filename;
}


Comments
  • possible duplicate of How to remove illegal characters from path and filenames?
  • The question was about paths, not filenames, and the invalid characters for these are different.
  • Maybe, but this code certainly helped me when I had the same problem :)
  • And another potentially great SO user goes walking... This function is great. Thank you Adrevdm...
  • Great method. Don't forget though that reserved words will still bite you, and you will be left scratching your head. Source: Wikipedia Filename reserved words
  • Periods are invalid characters only if they are at the end of the file name, so GetInvalidFileNameChars does not include them. Windows does not throw an exception; it just strips them off, but that could cause unexpected behavior if you are expecting the period to be there. I modified the regex to handle that case, so that . is treated as one of the invalid characters if it is at the end of the string.
  • @PeterMajeed: TIL that line-counting starts at zero :-)
  • This is better than the top answer especially for ASP.NET Core which might return different characters based on platform.
  • This is an extremely complete answer, at least to the filename part of the question, and deserves more upvotes.
  • Minor suggestion since it looks like the method was going this direction: Add a this keyword and it becomes a handy extension method. public static String CoerceValidFileName(this String filename)
  • Small bug: this method doesn't change reserved words without file extensions (eg. COM1), which are also disallowed. Suggested fix would be to change the reservedWordPattern to "^{0}(\\.|$)" and the replacement string to "_reservedWord_$1"
  • consider String.Concat(dirty...) instead of Join(String.Empty...
  • DenNukem already suggested this answer : stackoverflow.com/a/13617375/244916 (same consider comment, though).
  • note that there are many characters that look similar to those, like the fullwidth forms ！＂＃＄％＆＇（）＊＋，－．／：；＜＝＞？＠｛｜｝～ or other forms of them like / SOLIDUS and ⁄ FRACTION SLASH that can be used directly in filenames without problem
  • I love one liners :)
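
As a sketch of the fullwidth idea from the comments: the fullwidth forms U+FF01 to U+FF5E line up with ASCII U+0021 to U+007E, so adding 0xFEE0 to an ASCII character gives its fullwidth lookalike. The code below swaps each invalid character for that lookalike instead of an underscore. The fixed character set and the names are assumptions for illustration only, and whether lookalike characters are acceptable in your file names is a separate decision.

```csharp
using System.Linq;

// Hypothetical name; replaces invalid characters with fullwidth lookalikes.
public static class FullwidthSketch
{
    // Assumption: a subset of the characters Windows rejects, hard-coded
    // so the result does not vary by platform. All are printable ASCII.
    private const string WindowsInvalid = "\"<>|:*?\\/";

    // Offset between printable ASCII and the Unicode fullwidth forms block.
    private const int FullwidthOffset = 0xFEE0;

    public static string ToLookalikeFileName(string name)
    {
        return new string(name
            .Select(c => WindowsInvalid.IndexOf(c) >= 0 ? (char)(c + FullwidthOffset) : c)
            .ToArray());
    }
}
```

For example, "AC/DC: Live?" keeps its overall look, with '/', ':' and '?' replaced by U+FF0F, U+FF1A and U+FF1F.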