How to remove illegal characters from path and filenames?

Time: 2021-08-13 22:16:38

I need a robust and simple way to remove illegal path and file characters from a simple string. I've used the code below, but it doesn't seem to do anything. What am I missing?

using System;
using System.IO;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            string illegal = "\"M<>\"\\a/ry/ h**ad:>> a\\/:*?\"<>| li*tt|le|| la\"mb.?";

            illegal = illegal.Trim(Path.GetInvalidFileNameChars());
            illegal = illegal.Trim(Path.GetInvalidPathChars());

            Console.WriteLine(illegal);
            Console.ReadLine();
        }
    }
}

23 answers

#1


414  

Try something like this instead:

string illegal = "\"M<>\"\\a/ry/ h**ad:>> a\\/:*?\"<>| li*tt|le|| la\"mb.?";
string invalid = new string(Path.GetInvalidFileNameChars()) + new string(Path.GetInvalidPathChars());

foreach (char c in invalid)
{
    illegal = illegal.Replace(c.ToString(), ""); 
}

But I have to agree with the comments: I'd probably try to deal with the source of the illegal paths, rather than mangle an illegal path into a legitimate but probably unintended one.

Edit: Or a potentially 'better' solution, using a regular expression:

string illegal = "\"M<>\"\\a/ry/ h**ad:>> a\\/:*?\"<>| li*tt|le|| la\"mb.?";
string regexSearch = new string(Path.GetInvalidFileNameChars()) + new string(Path.GetInvalidPathChars());
Regex r = new Regex(string.Format("[{0}]", Regex.Escape(regexSearch)));
illegal = r.Replace(illegal, "");

Still, it has to be asked why you're doing this in the first place.

#2


184  

I use LINQ to clean up filenames. You can easily extend this to check for valid paths as well.

private static string CleanFileName(string fileName)
{
    return Path.GetInvalidFileNameChars().Aggregate(fileName, (current, c) => current.Replace(c.ToString(), string.Empty));
}

Update

Some comments indicate this method is not working for them, so I've included a link to a DotNetFiddle snippet so you can validate the method.

https://dotnetfiddle.net/nw1SWY

#3


173  

public string GetSafeFilename(string filename)
{

    return string.Join("_", filename.Split(Path.GetInvalidFileNameChars()));

}

This answer was posted on another thread by Ceres. I really like it: neat and simple.
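To make the Split/Join one-liner above concrete, here is a small self-contained sketch. The class name `SafeFilenameDemo` and the sample input are made up for illustration. The input uses `/` because it is reported as invalid on both Windows and Unix-like systems, whereas most other entries in the array depend on the platform.

```csharp
using System.IO;

public static class SafeFilenameDemo
{
    // Same technique as GetSafeFilename above: split the input on every
    // invalid character, then glue the pieces back together with "_".
    public static string GetSafeFilename(string filename)
    {
        return string.Join("_", filename.Split(Path.GetInvalidFileNameChars()));
    }
}
```

With this in place, `SafeFilenameDemo.GetSafeFilename("report/2021/final.txt")` returns `report_2021_final.txt`. Note that each invalid character becomes its own underscore, so a run of invalid characters produces a run of underscores.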

#4


80  

You can remove illegal chars using LINQ like this:

var invalidChars = Path.GetInvalidFileNameChars();

var invalidCharsRemoved = stringWithInvalidChars
.Where(x => !invalidChars.Contains(x))
.ToArray();

EDIT
This is how it looks with the required edit mentioned in the comments:

var invalidChars = Path.GetInvalidFileNameChars();

string invalidCharsRemoved = new string(stringWithInvalidChars
  .Where(x => !invalidChars.Contains(x))
  .ToArray());

#5


23  

These are all great solutions, but they all rely on Path.GetInvalidFileNameChars, which may not be as reliable as you'd think. Notice the following remark in the MSDN documentation on Path.GetInvalidFileNameChars:

The array returned from this method is not guaranteed to contain the complete set of characters that are invalid in file and directory names. The full set of invalid characters can vary by file system. For example, on Windows-based desktop platforms, invalid path characters might include ASCII/Unicode characters 1 through 31, as well as quote ("), less than (<), greater than (>), pipe (|), backspace (\b), null (\0) and tab (\t).

The Path.GetInvalidPathChars method is no better; its documentation contains the exact same remark.
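Since the framework's array can differ between platforms and file systems, one option is to carry your own explicit set. The sketch below hard-codes the characters Windows rejects in file names (nine punctuation characters plus control characters 0 through 31). The class and method names are hypothetical, and the set is an assumption about the target file system rather than something returned by an API.

```csharp
using System.Text;

public static class ExplicitInvalidChars
{
    // Assumption: Windows-style rules are wanted regardless of the host OS.
    private const string Punctuation = "<>:\"/\\|?*";

    public static bool IsInvalid(char c)
    {
        // Control characters 0..31 plus the nine punctuation characters.
        return c < 32 || Punctuation.IndexOf(c) >= 0;
    }

    public static string Remove(string name)
    {
        var sb = new StringBuilder(name.Length);
        foreach (char c in name)
        {
            if (!IsInvalid(c))
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

This gives the same result on every platform, at the cost of maintaining the list yourself. It only deals with individual characters; it does not check for reserved device names or other file system rules.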

#6


18  

For starters, Trim only removes characters from the beginning or end of the string. Secondly, you should evaluate whether you really want to remove the offending characters, or fail fast and let the user know their filename is invalid. My choice is the latter, but my answer should at least show you how to do things the right AND wrong way:

See the Stack Overflow question showing how to check if a given string is a valid file name. Note that you can use the regex from that question to remove characters with a regular expression replacement (if you really need to do this).
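The difference between trimming and removing is easy to see on a short string. This is only an illustrative sketch: the class name `TrimVersusRemove` and the sample input are made up, and `/` is used because it is invalid on both Windows and Unix-like systems.

```csharp
using System;
using System.IO;
using System.Text;

public static class TrimVersusRemove
{
    // Trim strips matching characters only from the two ends of the string.
    public static string TrimOnly(string s)
    {
        return s.Trim(Path.GetInvalidFileNameChars());
    }

    // Visiting every character removes the interior ones as well.
    public static string RemoveAll(string s)
    {
        char[] invalid = Path.GetInvalidFileNameChars();
        var sb = new StringBuilder(s.Length);
        foreach (char c in s)
        {
            if (Array.IndexOf(invalid, c) < 0)
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

For the input `/a/b/`, `TrimOnly` returns `a/b` (the inner slash survives), while `RemoveAll` returns `ab`.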

#7


15  

I use regular expressions to achieve this. First, I dynamically build the regex:

string regex = string.Format(
                   "[{0}]",
                   Regex.Escape(new string(Path.GetInvalidFileNameChars())));
Regex removeInvalidChars = new Regex(regex, RegexOptions.Singleline | RegexOptions.Compiled | RegexOptions.CultureInvariant);

Then I just call removeInvalidChars.Replace to do the find and replace. This can obviously be extended to cover path chars as well.
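Put together, building the regex once and calling Replace might look like the sketch below. The wrapper class `InvalidCharRemover` and its `Clean` method are hypothetical names. The regex is built the same way as above, minus RegexOptions.Compiled, which is an optional optimization.

```csharp
using System.IO;
using System.Text.RegularExpressions;

public static class InvalidCharRemover
{
    // Built once and reused: a character class listing every invalid char.
    private static readonly Regex RemoveInvalidChars = new Regex(
        string.Format("[{0}]", Regex.Escape(new string(Path.GetInvalidFileNameChars()))),
        RegexOptions.Singleline | RegexOptions.CultureInvariant);

    public static string Clean(string fileName)
    {
        // Every match is a single invalid character; replace it with nothing.
        return RemoveInvalidChars.Replace(fileName, "");
    }
}
```

For example, `InvalidCharRemover.Clean("a/b")` returns `ab`, since `/` is in the invalid set on both Windows and Unix-like systems.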

#8


14  

I absolutely prefer Jeff Yates's idea. It will work perfectly if you modify it slightly:

string regex = String.Format("[{0}]", Regex.Escape(new string(Path.GetInvalidFileNameChars())));
Regex removeInvalidChars = new Regex(regex, RegexOptions.Singleline | RegexOptions.Compiled | RegexOptions.CultureInvariant);

The improvement is simply to escape the automatically generated regex.
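To see why the escaping matters: a raw backslash inside a character class is read as the start of an escape sequence, so it silently stops matching itself. The sketch below uses a hand-picked two-character set. The class name `EscapeDemo` and the set are assumptions for illustration; the real array from Path.GetInvalidFileNameChars differs by platform.

```csharp
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // A backslash followed by a forward slash.
    private static readonly string Chars = new string(new[] { '\\', '/' });

    // Pattern is [\/]: the backslash escapes the slash, so only '/' matches.
    public static string RemoveUnescaped(string s)
    {
        return Regex.Replace(s, "[" + Chars + "]", "");
    }

    // Pattern is [\\/]: Regex.Escape doubles the backslash, so both match.
    public static string RemoveEscaped(string s)
    {
        return Regex.Replace(s, "[" + Regex.Escape(Chars) + "]", "");
    }
}
```

For the input `a\b/c`, `RemoveUnescaped` returns `a\bc` (the backslash is left behind), while `RemoveEscaped` returns `abc`.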

#9


13  

For file names:

string cleanFileName = String.Join("", fileName.Split(Path.GetInvalidFileNameChars()));

For full paths:

string cleanPath = String.Join("", path.Split(Path.GetInvalidPathChars()));

#10


11  

Here's a code snippet that should help for .NET 3 and higher.

using System;
using System.IO;
using System.Text.RegularExpressions;

public static class PathValidation
{
    private static string pathValidatorExpression = "^[^" + string.Join("", Array.ConvertAll(Path.GetInvalidPathChars(), x => Regex.Escape(x.ToString()))) + "]+$";
    private static Regex pathValidator = new Regex(pathValidatorExpression, RegexOptions.Compiled);

    private static string fileNameValidatorExpression = "^[^" + string.Join("", Array.ConvertAll(Path.GetInvalidFileNameChars(), x => Regex.Escape(x.ToString()))) + "]+$";
    private static Regex fileNameValidator = new Regex(fileNameValidatorExpression, RegexOptions.Compiled);

    private static string pathCleanerExpression = "[" + string.Join("", Array.ConvertAll(Path.GetInvalidPathChars(), x => Regex.Escape(x.ToString()))) + "]";
    private static Regex pathCleaner = new Regex(pathCleanerExpression, RegexOptions.Compiled);

    private static string fileNameCleanerExpression = "[" + string.Join("", Array.ConvertAll(Path.GetInvalidFileNameChars(), x => Regex.Escape(x.ToString()))) + "]";
    private static Regex fileNameCleaner = new Regex(fileNameCleanerExpression, RegexOptions.Compiled);

    public static bool ValidatePath(string path)
    {
        return pathValidator.IsMatch(path);
    }

    public static bool ValidateFileName(string fileName)
    {
        return fileNameValidator.IsMatch(fileName);
    }

    public static string CleanPath(string path)
    {
        return pathCleaner.Replace(path, "");
    }

    public static string CleanFileName(string fileName)
    {
        return fileNameCleaner.Replace(fileName, "");
    }
}

#11


11  

The best way to remove illegal characters from user input is to replace them using the Regex class. Either create a method in the code-behind, or validate on the client side using a RegularExpressionValidator control.

public string RemoveSpecialCharacters(string str)
{
    return Regex.Replace(str, "[^a-zA-Z0-9_]+", "_", RegexOptions.Compiled);
}

OR

<asp:RegularExpressionValidator ID="regxFolderName" 
                                runat="server" 
                                ErrorMessage="Enter folder name with  a-z A-Z0-9_" 
                                ControlToValidate="txtFolderName" 
                                Display="Dynamic" 
                                ValidationExpression="^[a-zA-Z0-9_]*$" 
                                ForeColor="Red">
</asp:RegularExpressionValidator>

#12


8  

Most solutions above combine the illegal chars for both path and filename, which is wrong (even when both calls currently return the same set of chars). I would first split the path+filename into path and filename, then apply the appropriate set to each of them, and then combine the two again.

wvd_vegt
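The split, clean, and recombine idea can be sketched as follows. The class `SplitPathCleaner` and its method names are hypothetical. The split is done by hand on the last directory separator rather than with Path.GetDirectoryName, on the assumption that the input may contain characters that some framework versions reject in that call.

```csharp
using System;
using System.IO;
using System.Text;

public static class SplitPathCleaner
{
    public static string Clean(string fullPath)
    {
        // Everything after the last directory separator is the file name.
        int cut = fullPath.LastIndexOfAny(
            new[] { Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar });
        string directory = fullPath.Substring(0, cut + 1); // empty when cut == -1
        string fileName = fullPath.Substring(cut + 1);

        // Apply the path set to the directory part, the file name set to the rest.
        return Strip(directory, Path.GetInvalidPathChars())
             + Strip(fileName, Path.GetInvalidFileNameChars());
    }

    private static string Strip(string text, char[] invalid)
    {
        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            if (Array.IndexOf(invalid, c) < 0)
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

Keeping the two sets apart means that separators (and, on Windows, the drive colon) survive in the directory part, while the same characters are still stripped from the file name itself.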

#13


6  

If you remove the invalid characters, or replace them all with the same single character, you can get collisions:

<abc -> abc
>abc -> abc

Here is a simple method to avoid this:

public static string ReplaceInvalidFileNameChars(string s)
{
    char[] invalidFileNameChars = System.IO.Path.GetInvalidFileNameChars();
    foreach (char c in invalidFileNameChars)
        s = s.Replace(c.ToString(), "[" + Array.IndexOf(invalidFileNameChars, c) + "]");
    return s;
}

The result:

<abc -> [1]abc
>abc -> [2]abc

#14


5  

Throw an exception.

if (fileName.IndexOfAny(Path.GetInvalidFileNameChars()) > -1)
{
    throw new ArgumentException("The file name contains invalid characters.", nameof(fileName));
}

#15


3  

I think it is much easier to validate using a regex that specifies which characters are allowed, instead of trying to check for all bad characters. See these links: http://www.c-sharpcorner.com/UploadFile/prasad_1/RegExpPSD12062005021717AM/RegExpPSD.aspx http://www.windowsdevcenter.com/pub/a/oreilly/windows/news/csharp_0101.html

Also, do a search for "regular expression editors"; they help a lot. Some of them will even output the C# code for you.
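A whitelist check along these lines could look like the sketch below. The allowed set (ASCII letters, digits, underscore, hyphen and dot) is an assumption chosen for illustration, and the class name `WhitelistValidator` is hypothetical; adjust the character class to whatever your application accepts.

```csharp
using System.Text.RegularExpressions;

public static class WhitelistValidator
{
    // Assumption: only ASCII letters, digits, '_', '.' and '-' are allowed.
    // \A and \z anchor to the very start and end of the input, so a
    // trailing newline is rejected as well.
    private static readonly Regex Allowed = new Regex(@"\A[A-Za-z0-9_.-]+\z");

    public static bool IsAllowed(string fileName)
    {
        return Allowed.IsMatch(fileName);
    }
}
```

Here `IsAllowed("report_2021.txt")` is true, while `IsAllowed("a/b")` and `IsAllowed("")` are false. Because the check lists what is permitted, it does not depend on the platform-specific invalid character arrays at all.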

#16


3  

I wrote this monster for fun; it lets you round-trip:

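using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;

public static class FileUtility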
{
    private const char PrefixChar = '%';
    private static readonly int MaxLength;
    private static readonly Dictionary<char,char[]> Illegals;
    static FileUtility()
    {
        List<char> illegal = new List<char> { PrefixChar };
        illegal.AddRange(Path.GetInvalidFileNameChars());
        MaxLength = illegal.Select(x => ((int)x).ToString().Length).Max();
        Illegals = illegal.ToDictionary(x => x, x => ((int)x).ToString("D" + MaxLength).ToCharArray());
    }

    public static string FilenameEncode(string s)
    {
        var builder = new StringBuilder();
        char[] replacement;
        using (var reader = new StringReader(s))
        {
            while (true)
            {
                int read = reader.Read();
                if (read == -1)
                    break;
                char c = (char)read;
                if(Illegals.TryGetValue(c,out replacement))
                {
                    builder.Append(PrefixChar);
                    builder.Append(replacement);
                }
                else
                {
                    builder.Append(c);
                }
            }
        }
        return builder.ToString();
    }

    public static string FilenameDecode(string s)
    {
        var builder = new StringBuilder();
        char[] buffer = new char[MaxLength];
        using (var reader = new StringReader(s))
        {
            while (true)
            {
                int read = reader.Read();
                if (read == -1)
                    break;
                char c = (char)read;
                if (c == PrefixChar)
                {
                    reader.Read(buffer, 0, MaxLength);
                    var encoded =(char) ParseCharArray(buffer);
                    builder.Append(encoded);
                }
                else
                {
                    builder.Append(c);
                }
            }
        }
        return builder.ToString();
    }

    public static int ParseCharArray(char[] buffer)
    {
        int result = 0;
        foreach (char t in buffer)
        {
            int digit = t - '0';
            if ((digit < 0) || (digit > 9))
            {
                throw new ArgumentException("Input string was not in the correct format");
            }
            result *= 10;
            result += digit;
        }
        return result;
    }
}

#17


2  

This seems to be O(n) and does not spend too much memory on strings:

    private static readonly HashSet<char> invalidFileNameChars = new HashSet<char>(Path.GetInvalidFileNameChars());

    public static string RemoveInvalidFileNameChars(string name)
    {
        if (!name.Any(c => invalidFileNameChars.Contains(c))) {
            return name;
        }

        return new string(name.Where(c => !invalidFileNameChars.Contains(c)).ToArray());
    }

#18


1  

public static bool IsValidFilename(string testName)
{
    return !new Regex("[" + Regex.Escape(new String(System.IO.Path.GetInvalidFileNameChars())) + "]").IsMatch(testName);
}

#19


1  

public static class StringExtensions
{
    public static string RemoveUnnecessary(this string source)
    {
        string invalidChars = new string(Path.GetInvalidFileNameChars()) + new string(Path.GetInvalidPathChars());
        Regex reg = new Regex(string.Format("[{0}]", Regex.Escape(invalidChars)));
        return reg.Replace(source, "");
    }
}

You can then call the method cleanly on any string.

#20


0  

This will do what you want, and avoid collisions:

static string SanitiseFilename(string key)
{
    var invalidChars = Path.GetInvalidFileNameChars();
    var sb = new StringBuilder();
    foreach (var c in key)
    {
        // Find the position of c in the invalid set (-1 if it is valid).
        var invalidCharIndex = -1;
        for (var i = 0; i < invalidChars.Length; i++)
        {
            if (c == invalidChars[i])
            {
                invalidCharIndex = i;
            }
        }
        if (invalidCharIndex > -1)
        {
            // Encode an invalid character as "_" followed by its index.
            sb.Append("_").Append(invalidCharIndex);
            continue;
        }

        if (c == '_')
        {
            // Double a literal underscore so it is not mistaken for an escape.
            sb.Append("__");
            continue;
        }

        sb.Append(c);
    }
    return sb.ToString();
}

#21


0  

I think the question has not been fully answered yet. The answers only describe cleaning a filename OR a path, not both. Here is my solution:

private static string CleanPath(string path)
{
    string regexSearch = new string(Path.GetInvalidFileNameChars()) + new string(Path.GetInvalidPathChars());
    Regex r = new Regex(string.Format("[{0}]", Regex.Escape(regexSearch)));
    List<string> split = path.Split('\\').ToList();
    string returnValue = split.Aggregate(string.Empty, (current, s) => current + (r.Replace(s, "") + @"\"));
    returnValue = returnValue.TrimEnd('\\');
    return returnValue;
}

#22


0  

Scanning over the answers here, they all** seem to involve using a char array of invalid filename characters.

Granted, this may be micro-optimising, but for the benefit of anyone who needs to check a large number of values for being valid filenames, it's worth noting that building a HashSet of invalid chars will bring about notably better performance.

I have been very surprised (shocked) in the past at just how quickly a HashSet (or Dictionary) outperforms iterating over a list. With strings, the crossover comes at a ridiculously low number (about 5-7 items, from memory). With most other simple data (object references, numbers etc.) the magic crossover seems to be around 20 items.

There are about 40 invalid characters in the Path.GetInvalidFileNameChars "list". I did a search today and there's quite a good benchmark on Stack Overflow showing that the HashSet takes a little over half the time of an array/list for 40 items: https://*.com/a/10762995/949129

Here's the helper class I use for sanitising paths. I forget now why I had the fancy replacement option in it, but it's there as a cute bonus.

There's an additional bonus method, "IsValidLocalPath", too :)

(** those which don't use regular expressions)

public static class PathExtensions
{
    private static HashSet<char> _invalidFilenameChars;
    private static HashSet<char> InvalidFilenameChars
    {
        get { return _invalidFilenameChars ?? (_invalidFilenameChars = new HashSet<char>(Path.GetInvalidFileNameChars())); }
    }


    /// <summary>Replaces characters in <c>text</c> that are not allowed in file names with the 
    /// specified replacement character.</summary>
    /// <param name="text">Text to make into a valid filename. The same string is returned if 
    /// it is valid already.</param>
    /// <param name="replacement">Replacement character, or NULL to remove bad characters.</param>
    /// <param name="fancyReplacements">TRUE to replace quotes and slashes with the non-ASCII characters ” and ⁄.</param>
    /// <returns>A string that can be used as a filename. If the output string would otherwise be empty, "_" is returned.</returns>
    public static string ToValidFilename(this string text, char? replacement = '_', bool fancyReplacements = false)
    {
        StringBuilder sb = new StringBuilder(text.Length);
        HashSet<char> invalids = InvalidFilenameChars;
        bool changed = false;

        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (invalids.Contains(c))
            {
                changed = true;
                char repl = replacement ?? '\0';
                if (fancyReplacements)
                {
                    if (c == '"') repl = '”'; // U+201D right double quotation mark
                    else if (c == '\'') repl = '’'; // U+2019 right single quotation mark
                    else if (c == '/') repl = '⁄'; // U+2044 fraction slash
                }
                if (repl != '\0')
                    sb.Append(repl);
            }
            else
                sb.Append(c);
        }

        if (sb.Length == 0)
            return "_";

        return changed ? sb.ToString() : text;
    }


    /// <summary>
    /// Returns TRUE if the specified path is a valid, local filesystem path.
    /// </summary>
    /// <param name="pathString"></param>
    /// <returns></returns>
    public static bool IsValidLocalPath(this string pathString)
    {
        // From solution at https://*.com/a/11636052/949129
        Uri pathUri;
        Boolean isValidUri = Uri.TryCreate(pathString, UriKind.Absolute, out pathUri);
        return isValidUri && pathUri != null && pathUri.IsLoopback;
    }
}

#23


-5  

Or you can just do:

[YOUR STRING].Replace('\\', ' ').Replace('/', ' ').Replace('"', ' ').Replace('*', ' ').Replace(':', ' ').Replace('?', ' ').Replace('<', ' ').Replace('>', ' ').Replace('|', ' ').Trim();

#1


414  

Try something like this instead;

试试这样做吧;

string illegal = "\"M\"\\a/ry/ h**ad:>> a\\/:*?\"| li*tt|le|| la\"mb.?";
string invalid = new string(Path.GetInvalidFileNameChars()) + new string(Path.GetInvalidPathChars());

foreach (char c in invalid)
{
    illegal = illegal.Replace(c.ToString(), ""); 
}

But I have to agree with the comments, I'd probably try to deal with the source of the illegal paths, rather than try to mangle an illegal path into a legitimate but probably unintended one.

但我必须同意这些评论,我可能会尝试去处理非法途径的来源,而不是试图将非法途径弄成合法的,但可能是无意的。

Edit: Or a potentially 'better' solution, using Regex's.

编辑:或者使用正则表达式的“更好的”解决方案。

string illegal = "\"M\"\\a/ry/ h**ad:>> a\\/:*?\"| li*tt|le|| la\"mb.?";
string regexSearch = new string(Path.GetInvalidFileNameChars()) + new string(Path.GetInvalidPathChars());
Regex r = new Regex(string.Format("[{0}]", Regex.Escape(regexSearch)));
illegal = r.Replace(illegal, "");

Still, the question begs to be asked, why you're doing this in the first place.

尽管如此,这个问题还是要问,你为什么要这么做。

#2


184  

I use Linq to clean up filenames. You can easily extend this to check for valid paths as well.

我用Linq来清理文件名。您也可以很容易地将其扩展到检查有效路径。

private static string CleanFileName(string fileName)
{
    return Path.GetInvalidFileNameChars().Aggregate(fileName, (current, c) => current.Replace(c.ToString(), string.Empty));
}

Update

Some comments indicate this method is not working for them so I've included a link to a DotNetFiddle snippet so you may validate the method.

一些注释表明此方法对它们不起作用,因此我包含了一个到DotNetFiddle代码片段的链接,以便您可以验证该方法。

https://dotnetfiddle.net/nw1SWY

https://dotnetfiddle.net/nw1SWY

#3


173  

public string GetSafeFilename(string filename)
{

    return string.Join("_", filename.Split(Path.GetInvalidFileNameChars()));

}

This answer was on another thread by Ceres, I really like it neat and simple.

这个答案是Ceres在另一个线程上的,我真的很喜欢它简洁明了。

#4


80  

You can remove illegal chars using Linq like this:

你可以使用Linq来删除非法字符:

var invalidChars = Path.GetInvalidFileNameChars();

var invalidCharsRemoved = stringWithInvalidChars
.Where(x => !invalidChars.Contains(x))
.ToArray();

EDIT
This is how it looks with the required edit mentioned in the comments:

编辑这是在评论中提到的要求编辑的样子:

var invalidChars = Path.GetInvalidFileNameChars();

string invalidCharsRemoved = new string(stringWithInvalidChars
  .Where(x => !invalidChars.Contains(x))
  .ToArray());

#5


23  

These are all great solutions, but they all rely on Path.GetInvalidFileNameChars, which may not be as reliable as you'd think. Notice the following remark in the MSDN documentation on Path.GetInvalidFileNameChars:

这些都是很好的解决方案,但它们都依赖于路径。GetInvalidFileNameChars,它可能不像您想的那样可靠。请注意MSDN文档中关于Path.GetInvalidFileNameChars的下列注释:

The array returned from this method is not guaranteed to contain the complete set of characters that are invalid in file and directory names. The full set of invalid characters can vary by file system. For example, on Windows-based desktop platforms, invalid path characters might include ASCII/Unicode characters 1 through 31, as well as quote ("), less than (<), greater than (>), pipe (|), backspace (\b), null (\0) and tab (\t).

从该方法返回的数组不保证包含文件和目录名中无效的完整字符集。完整的无效字符集可以因文件系统而异。例如,在基于windows的桌面平台上,无效的路径字符可能包括ASCII/Unicode字符1到31,以及引号("),小于(<),大于(>),pipe (|), backspace (\b), null(\0)和tab (\t)。

It's not any better with Path.GetInvalidPathChars method. It contains the exact same remark.

这不是更好的路径。GetInvalidPathChars方法。它包含了完全相同的评论。

#6


18  

For starters, Trim only removes characters from the beginning or end of the string. Secondly, you should evaluate if you really want to remove the offensive characters, or fail fast and let the user know their filename is invalid. My choice is the latter, but my answer should at least show you how to do things the right AND wrong way:

对于初学者,Trim仅从字符串的开始或结尾删除字符。其次,你应该评估你是否真的想要删除攻击字符,或者快速失败,让用户知道他们的文件名是无效的。我的选择是后者,但我的回答至少应该告诉你如何做正确和错误的事情:

* question showing how to check if a given string is a valid file name. Note you can use the regex from this question to remove characters with a regular expression replacement (if you really need to do this).

显示如何检查给定字符串是否是有效文件名的*问题。注意,您可以从这个问题中使用regex来删除具有正则表达式替换的字符(如果您真的需要这样做)。

#7


15  

I use regular expressions to achieve this. First, I dynamically build the regex.

我使用正则表达式来实现这一点。首先,我动态构建regex。

string regex = string.Format(
                   "[{0}]",
                   Regex.Escape(new string(Path.GetInvalidFileNameChars())));
Regex removeInvalidChars = new Regex(regex, RegexOptions.Singleline | RegexOptions.Compiled | RegexOptions.CultureInvariant);

Then I just call removeInvalidChars.Replace to do the find and replace. This can obviously be extended to cover path chars as well.

然后我调用removeInvalidChars。替换为查找和替换。这显然可以扩展到覆盖路径字符。

#8


14  

I absolutely prefer the idea of Jeff Yates. It will work perfectly, if you slightly modify it:

我绝对喜欢杰夫·耶茨的想法。如果你稍微修改一下,它会很完美的:

string regex = String.Format("[{0}]", Regex.Escape(new string(Path.GetInvalidFileNameChars())));
Regex removeInvalidChars = new Regex(regex, RegexOptions.Singleline | RegexOptions.Compiled | RegexOptions.CultureInvariant);

The improvement is just to escape the automaticially generated regex.

改进只是为了避免自动生成的正则表达式。

#9


13  

For file names:

文件名:

string cleanFileName = String.Join("", fileName.Split(Path.GetInvalidFileNameChars()));

For full paths:

完整路径:

string cleanPath = String.Join("", path.Split(Path.GetInvalidPathChars()));

#10


11  

Here's a code snippet that should help for .NET 3 and higher.

这里有一个代码片段,可以帮助。net 3和更高版本。

using System.IO;
using System.Text.RegularExpressions;

public static class PathValidation
{
    private static string pathValidatorExpression = "^[^" + string.Join("", Array.ConvertAll(Path.GetInvalidPathChars(), x => Regex.Escape(x.ToString()))) + "]+$";
    private static Regex pathValidator = new Regex(pathValidatorExpression, RegexOptions.Compiled);

    private static string fileNameValidatorExpression = "^[^" + string.Join("", Array.ConvertAll(Path.GetInvalidFileNameChars(), x => Regex.Escape(x.ToString()))) + "]+$";
    private static Regex fileNameValidator = new Regex(fileNameValidatorExpression, RegexOptions.Compiled);

    private static string pathCleanerExpression = "[" + string.Join("", Array.ConvertAll(Path.GetInvalidPathChars(), x => Regex.Escape(x.ToString()))) + "]";
    private static Regex pathCleaner = new Regex(pathCleanerExpression, RegexOptions.Compiled);

    private static string fileNameCleanerExpression = "[" + string.Join("", Array.ConvertAll(Path.GetInvalidFileNameChars(), x => Regex.Escape(x.ToString()))) + "]";
    private static Regex fileNameCleaner = new Regex(fileNameCleanerExpression, RegexOptions.Compiled);

    public static bool ValidatePath(string path)
    {
        return pathValidator.IsMatch(path);
    }

    public static bool ValidateFileName(string fileName)
    {
        return fileNameValidator.IsMatch(fileName);
    }

    public static string CleanPath(string path)
    {
        return pathCleaner.Replace(path, "");
    }

    public static string CleanFileName(string fileName)
    {
        return fileNameCleaner.Replace(fileName, "");
    }
}

#11


11  

The best way to remove illegal character from user input is to replace illegal character using Regex class, create method in code behind or also it validate at client side using RegularExpression control.

从用户输入中删除非法字符的最好方法是使用Regex类替换非法字符,在后面的代码中创建方法,或者在客户端使用正则表达式控件进行验证。

public string RemoveSpecialCharacters(string str)
{
    return Regex.Replace(str, "[^a-zA-Z0-9_]+", "_", RegexOptions.Compiled);
}

OR

<asp:RegularExpressionValidator ID="regxFolderName" 
                                runat="server" 
                                ErrorMessage="Enter folder name with  a-z A-Z0-9_" 
                                ControlToValidate="txtFolderName" 
                                Display="Dynamic" 
                                ValidationExpression="^[a-zA-Z0-9_]*$" 
                                ForeColor="Red">

#12


8  

Most solutions above combine illegal chars for both path and filename which is wrong (even when both calls currently return the same set of chars). I would first split the path+filename in path and filename, then apply the appropriate set to either if them and then combine the two again.

上面的大多数解决方案都将非法字符合并到路径和文件名中,这是错误的(即使两个调用当前都返回相同的字符集)。我将首先在路径和文件名中分割路径+文件名,然后将适当的设置应用到它们,然后再将它们组合在一起。

wvd_vegt

wvd_vegt

#13


6  

If you remove or replace with a single character the invalid characters, you can have collisions:

如果您删除或替换一个字符,无效字符,您可以有冲突:

<abc -> abc
>abc -> abc

Here is a simple method to avoid this:

这里有一个简单的方法来避免:

public static string ReplaceInvalidFileNameChars(string s)
{
    char[] invalidFileNameChars = System.IO.Path.GetInvalidFileNameChars();
    foreach (char c in invalidFileNameChars)
        s = s.Replace(c.ToString(), "[" + Array.IndexOf(invalidFileNameChars, c) + "]");
    return s;
}

The result:

结果:

 <abc -> [1]abc
 >abc -> [2]abc

#14


5  

Throw an exception.

抛出异常。

if ( fileName.IndexOfAny(Path.GetInvalidFileNameChars()) > -1 )
            {
                throw new ArgumentException();
            }

#15


3  

I think it is much easier to validate using a regex and specifiing which characters are allowed, instead of trying to check for all bad characters. See these links: http://www.c-sharpcorner.com/UploadFile/prasad_1/RegExpPSD12062005021717AM/RegExpPSD.aspx http://www.windowsdevcenter.com/pub/a/oreilly/windows/news/csharp_0101.html

我认为使用正则表达式和指定字符来进行验证要容易得多,而不是检查所有的坏字符。查看这些链接:http://www.c- sharp.com/uploadfile/prasad_1/regexppsd12062005021717am/regexppsd.aspx http://www.windowsdevcenter.com/pub/a/oreilly/windows/news/csharp_0101.html。

Also, do a search for "regular expression editor"s, they help a lot. There are some around which even output the code in c# for you.

另外,搜索“正则表达式编辑器”,它们会有很大帮助,甚至有一些代码会为您输出c#中的代码。

#16


3  

I wrote this monster for fun, it lets you roundtrip:

我写这个怪物是为了好玩,它让你往返:

public static class FileUtility
{
    private const char PrefixChar = '%';
    private static readonly int MaxLength;
    private static readonly Dictionary<char,char[]> Illegals;
    static FileUtility()
    {
        List<char> illegal = new List<char> { PrefixChar };
        illegal.AddRange(Path.GetInvalidFileNameChars());
        MaxLength = illegal.Select(x => ((int)x).ToString().Length).Max();
        Illegals = illegal.ToDictionary(x => x, x => ((int)x).ToString("D" + MaxLength).ToCharArray());
    }

    public static string FilenameEncode(string s)
    {
        var builder = new StringBuilder();
        char[] replacement;
        using (var reader = new StringReader(s))
        {
            while (true)
            {
                int read = reader.Read();
                if (read == -1)
                    break;
                char c = (char)read;
                if(Illegals.TryGetValue(c,out replacement))
                {
                    builder.Append(PrefixChar);
                    builder.Append(replacement);
                }
                else
                {
                    builder.Append(c);
                }
            }
        }
        return builder.ToString();
    }

    public static string FilenameDecode(string s)
    {
        var builder = new StringBuilder();
        char[] buffer = new char[MaxLength];
        using (var reader = new StringReader(s))
        {
            while (true)
            {
                int read = reader.Read();
                if (read == -1)
                    break;
                char c = (char)read;
                if (c == PrefixChar)
                {
                    reader.Read(buffer, 0, MaxLength);
                    var encoded =(char) ParseCharArray(buffer);
                    builder.Append(encoded);
                }
                else
                {
                    builder.Append(c);
                }
            }
        }
        return builder.ToString();
    }

    public static int ParseCharArray(char[] buffer)
    {
        int result = 0;
        foreach (char t in buffer)
        {
            int digit = t - '0';
            if ((digit < 0) || (digit > 9))
            {
                throw new ArgumentException("Input string was not in the correct format");
            }
            result *= 10;
            result += digit;
        }
        return result;
    }
}
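
For comparison, the same round-trip idea can be written more compactly with fixed-width hexadecimal escapes. This is only a sketch, not the code from the answer above: the class name RoundTripSketch and its Encode/Decode methods are made up for illustration, and it assumes that four hex digits per escaped character are acceptable in the resulting file name.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Text;

// Hypothetical helper (not part of the answer above): escapes each invalid
// character as '%' followed by its UTF-16 code in exactly four hex digits.
public static class RoundTripSketch
{
    private const char Prefix = '%';

    // The prefix itself must be escaped too, otherwise decoding is ambiguous.
    private static readonly HashSet<char> Escaped =
        new HashSet<char>(Path.GetInvalidFileNameChars()) { Prefix };

    public static string Encode(string s)
    {
        var sb = new StringBuilder(s.Length);
        foreach (char c in s)
        {
            if (Escaped.Contains(c))
                sb.Append(Prefix).Append(((int)c).ToString("X4", CultureInfo.InvariantCulture));
            else
                sb.Append(c);
        }
        return sb.ToString();
    }

    public static string Decode(string s)
    {
        var sb = new StringBuilder(s.Length);
        for (int i = 0; i < s.Length; i++)
        {
            if (s[i] != Prefix)
            {
                sb.Append(s[i]);
                continue;
            }
            // An escape needs four more characters after the prefix.
            if (i + 4 >= s.Length)
                throw new FormatException("Truncated escape sequence.");
            string hex = s.Substring(i + 1, 4);
            sb.Append((char)int.Parse(hex, NumberStyles.AllowHexSpecifier, CultureInfo.InvariantCulture));
            i += 4; // skip the four hex digits just consumed
        }
        return sb.ToString();
    }
}
```

For example, RoundTripSketch.Encode("a/b%c") gives "a%002Fb%0025c", and Decode turns that back into the original string.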

#17


2  

This seems to be O(n) and does not spend too much memory on strings:

    private static readonly HashSet<char> invalidFileNameChars = new HashSet<char>(Path.GetInvalidFileNameChars());

    public static string RemoveInvalidFileNameChars(string name)
    {
        if (!name.Any(c => invalidFileNameChars.Contains(c))) {
            return name;
        }

        return new string(name.Where(c => !invalidFileNameChars.Contains(c)).ToArray());
    }

#18


1  

public static bool IsValidFilename(string testName)
{
    return !new Regex("[" + Regex.Escape(new String(System.IO.Path.GetInvalidFileNameChars())) + "]").IsMatch(testName);
}

#19


1  

using System.IO;
using System.Text.RegularExpressions;

public static class StringExtensions
{
    // Removes every character that is invalid in a file name or a path.
    public static string RemoveUnnecessary(this string source)
    {
        string invalidChars = new string(Path.GetInvalidFileNameChars()) + new string(Path.GetInvalidPathChars());
        Regex reg = new Regex(string.Format("[{0}]", Regex.Escape(invalidChars)));
        return reg.Replace(source, "");
    }
}

You can then call it as an extension method on any string, e.g. fileName.RemoveUnnecessary().

#20


0  

This will do what you want, and avoid collisions:

static string SanitiseFilename(string key)
{
    var invalidChars = Path.GetInvalidFileNameChars();
    var sb = new StringBuilder();
    foreach (var c in key)
    {
        if (Array.IndexOf(invalidChars, c) >= 0)
        {
            // Escape as '_' plus a fixed-width (4 hex digit) character code.
            // A variable-width number could collide with a digit that follows it.
            sb.Append('_').Append(((int)c).ToString("X4"));
            continue;
        }

        if (c == '_')
        {
            // Double the escape character so a literal underscore stays distinguishable.
            sb.Append("__");
            continue;
        }

        sb.Append(c);
    }
    return sb.ToString();
}

#21


0  

I don't think the question has been fully answered yet. The answers only clean a filename OR a path, not both. Here is my solution:

private static string CleanPath(string path)
{
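    // Combine both sets so that every path segment is cleaned like a file name.
    string regexSearch = new string(Path.GetInvalidFileNameChars()) + new string(Path.GetInvalidPathChars());
    Regex r = new Regex(string.Format("[{0}]", Regex.Escape(regexSearch)));
    // Clean each backslash-separated segment, then rejoin the segments.
    // Note: ':' is an invalid file-name character on Windows, so a drive prefix such as "C:" becomes "C".
    List<string> split = path.Split('\\').ToList();
    string returnValue = split.Aggregate(string.Empty, (current, s) => current + (r.Replace(s, "") + @"\"));
    returnValue = returnValue.TrimEnd('\\');
    return returnValue;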
}
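
Because ':' is in the invalid file-name set on Windows, cleaning every segment with that set also strips the colon from a drive prefix. A minimal sketch of one way around that, assuming a leading segment of the form "C:" should be kept untouched; the class and method names here are hypothetical and not from the answer above:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper: cleans each backslash-separated segment but keeps
// a leading drive specifier such as "C:" exactly as it was.
public static class PathCleanerSketch
{
    private static readonly HashSet<char> InvalidNameChars =
        new HashSet<char>(Path.GetInvalidFileNameChars());

    // Assumption: a first segment made of one letter plus ':' is a drive specifier.
    private static bool IsDriveSpecifier(string segment)
    {
        return segment.Length == 2 && char.IsLetter(segment[0]) && segment[1] == ':';
    }

    public static string CleanPathKeepingDrive(string path)
    {
        string[] segments = path.Split('\\');
        for (int i = 0; i < segments.Length; i++)
        {
            if (i == 0 && IsDriveSpecifier(segments[i]))
                continue; // leave "C:" alone

            // Drop every character that is not allowed in a file name.
            segments[i] = new string(segments[i].Where(c => !InvalidNameChars.Contains(c)).ToArray());
        }
        return string.Join("\\", segments);
    }
}
```

Unlike the version above, this keeps empty segments and trailing backslashes as they were, so decide which behaviour you want before using either one.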

#22


0  

Scanning over the answers here, they all** seem to involve using a char array of invalid filename characters.

Granted, this may be micro-optimising - but for the benefit of anyone who might be looking to check a large number of values for being valid filenames, it's worth noting that building a hashset of invalid chars will bring about notably better performance.

I have been very surprised (shocked) in the past by just how quickly a hashset (or dictionary) starts to outperform iterating over a list. With strings, the crossover is ridiculously low (about 5-7 items, if I remember correctly). With most other simple data (object references, numbers etc.) the magic crossover seems to be around 20 items.

There are around 40 invalid characters in the Path.GetInvalidFileNameChars() "list" on Windows. I did a search today and there's quite a good benchmark here on Stack Overflow that shows the hashset will take a little over half the time of an array/list for 40 items: https://stackoverflow.com/a/10762995/949129

Here's the helper class I use for sanitising paths. I forget now why I had the fancy replacement option in it, but it's there as a cute bonus.

Additional bonus method "IsValidLocalPath" too :)

(** those which don't use regular expressions)

public static class PathExtensions
{
    private static HashSet<char> _invalidFilenameChars;
    private static HashSet<char> InvalidFilenameChars
    {
        get { return _invalidFilenameChars ?? (_invalidFilenameChars = new HashSet<char>(Path.GetInvalidFileNameChars())); }
    }


    /// <summary>Replaces characters in <c>text</c> that are not allowed in file names with the 
    /// specified replacement character.</summary>
    /// <param name="text">Text to make into a valid filename. The same string is returned if 
    /// it is valid already.</param>
    /// <param name="replacement">Replacement character, or NULL to remove bad characters.</param>
    /// <param name="fancyReplacements">TRUE to replace quotes and slashes with the non-ASCII characters ” and ⁄.</param>
    /// <returns>A string that can be used as a filename. If the output string would otherwise be empty, "_" is returned.</returns>
    public static string ToValidFilename(this string text, char? replacement = '_', bool fancyReplacements = false)
    {
        StringBuilder sb = new StringBuilder(text.Length);
        HashSet<char> invalids = InvalidFilenameChars;
        bool changed = false;

        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (invalids.Contains(c))
            {
                changed = true;
                char repl = replacement ?? '\0';
                if (fancyReplacements)
                {
                    if (c == '"') repl = '”'; // U+201D right double quotation mark
                    else if (c == '\'') repl = '’'; // U+2019 right single quotation mark
                    else if (c == '/') repl = '⁄'; // U+2044 fraction slash
                }
                if (repl != '\0')
                    sb.Append(repl);
            }
            else
                sb.Append(c);
        }

        if (sb.Length == 0)
            return "_";

        return changed ? sb.ToString() : text;
    }


    /// <summary>
    /// Returns TRUE if the specified path is a valid, local filesystem path.
    /// </summary>
    /// <param name="pathString"></param>
    /// <returns></returns>
    public static bool IsValidLocalPath(this string pathString)
    {
        // From solution at https://stackoverflow.com/a/11636052/949129
        Uri pathUri;
        Boolean isValidUri = Uri.TryCreate(pathString, UriKind.Absolute, out pathUri);
        return isValidUri && pathUri != null && pathUri.IsLoopback;
    }
}
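
The array-versus-hashset claim in this answer is easy to check on your own machine. Below is a rough timing sketch, assuming a single representative file name and a simple Stopwatch loop; the class name, method names, sample name and iteration count are all made up for illustration. Results depend heavily on the runtime and hardware, so treat the numbers as indicative only.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;

// Hypothetical micro-benchmark: compares a linear array scan with a HashSet lookup.
public static class LookupBenchmarkSketch
{
    private static readonly char[] InvalidArray = Path.GetInvalidFileNameChars();
    private static readonly HashSet<char> InvalidSet = new HashSet<char>(InvalidArray);

    // Checks each character with a linear search through the array.
    public static bool IsValidUsingArray(string name)
    {
        foreach (char c in name)
            if (Array.IndexOf(InvalidArray, c) >= 0) return false;
        return true;
    }

    // Checks each character with a hash lookup.
    public static bool IsValidUsingSet(string name)
    {
        foreach (char c in name)
            if (InvalidSet.Contains(c)) return false;
        return true;
    }

    // Runs the given check repeatedly and returns the elapsed wall-clock time.
    public static TimeSpan Time(Func<string, bool> check, string name, int iterations)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) check(name);
        sw.Stop();
        return sw.Elapsed;
    }

    public static void PrintTimings()
    {
        const string name = "quarterly-report-2021-final-v2.xlsx";
        const int iterations = 1000000;
        Console.WriteLine("array:   " + Time(IsValidUsingArray, name, iterations));
        Console.WriteLine("hashset: " + Time(IsValidUsingSet, name, iterations));
    }
}
```

Call LookupBenchmarkSketch.PrintTimings() a few times and ignore the first run, since it includes JIT warm-up.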

#23


-5  

Or you can just do:

[YOUR STRING].Replace('\\', ' ').Replace('/', ' ').Replace('"', ' ').Replace('*', ' ').Replace(':', ' ').Replace('?', ' ').Replace('<', ' ').Replace('>', ' ').Replace('|', ' ').Trim();