Should I store spaces in URLs in the database? If so, how do I encode them when outputting them?

Date: 2022-12-17 07:24:29

In my blog, I store URIs on entities to allow them to be customised (and friendly). Originally, they could contain spaces (e.g. "/tags/ASP.NET MVC"), but the W3C validator says spaces are not valid.

The System.Uri class accepts spaces and seems to encode them as I want (e.g. /tags/ASP.NET MVC becomes /tags/ASP.NET%20MVC), but I don't want to create a Uri just to throw it away. That feels dirty!

Note: None of Html.Encode, Html.AttributeEncode and Url.Encode will encode "/tags/ASP.NET MVC" as "/tags/ASP.NET%20MVC".


Edit: I edited the DataType part out of my question, as it turns out DataType does not directly provide any validation and there's no built-in URI validation. I found some extra validators at dataannotationsextensions.org, but they only support absolute URIs, and it looks like spaces may be valid there too.

4 Answers

#1


2  

It seems that the only sensible thing to do is to not allow spaces in URLs. Support for encoding them correctly seems flaky in .NET :(

Instead, I'm going to replace spaces with a dash when I auto-generate the URLs, and validate that they only contain certain characters (alphanumerics, dots, dashes, slashes).

If you do want spaces, I think the best way would be to store them as %20 in the DB, as the space is "unsafe" and it seems non-trivial in .NET to then encode them in a way that will pass the W3C validator.
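The dash replacement and character whitelist described above could be sketched roughly like this. The class and method names (FriendlyPath, ReplaceSpaces, IsValid) are made up for illustration, and the whitelist is only my reading of "alphanumeric, dots, dashes, slashes", assumed here to mean ASCII letters and digits:

```csharp
using System.Text.RegularExpressions;

public static class FriendlyPath
{
    // Assumed whitelist: ASCII letters, digits, dots, dashes and slashes.
    // \A and \z anchor the whole string (unlike $, \z does not tolerate a trailing newline).
    private static readonly Regex Allowed = new Regex(@"\A[A-Za-z0-9./-]+\z");

    // Hypothetical helper: trim the path, then swap each run of whitespace for one dash.
    public static string ReplaceSpaces(string path)
    {
        return Regex.Replace(path.Trim(), @"\s+", "-");
    }

    // Hypothetical helper: true only if the path is non-empty and fully whitelisted.
    public static bool IsValid(string path)
    {
        return !string.IsNullOrEmpty(path) && Allowed.IsMatch(path);
    }
}
```

With this sketch, FriendlyPath.ReplaceSpaces("/tags/ASP.NET MVC") gives "/tags/ASP.NET-MVC", which passes IsValid, while the original string with the space does not.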

#2


0  

URIs and URLs are two different things, URLs being a subset of URIs. As such, a URL has different restrictions from a URI.

To encode a path string according to the W3C URL encoding rules, use HttpUtility.UrlPathEncode(string). It'll add the encoded spaces you're after.

You should store your URLs in whatever form is most useful for you to work with. It can be useful to refer to them as URIs until the point at which you encode them into a URL-compliant format, but that's just semantics to help your design be a little clearer.

EDIT:

If you don't like the slashes being encoded, it's pretty simple to "decode" them by replacing the encoded %2f with a plain /:

var path = "/tags/ASP.NET MVC";
var url = HttpUtility.UrlPathEncode(path).Replace("%2f", "/");
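If you'd rather not depend on System.Web, a different approach (not the UrlPathEncode call above) is to percent-encode each path segment yourself with Uri.EscapeDataString and leave the separators alone. A minimal sketch, where PathEncoder and EncodePath are hypothetical names:

```csharp
using System;
using System.Linq;

public static class PathEncoder
{
    // Hypothetical helper: percent-encode every segment but keep the "/" separators.
    public static string EncodePath(string path)
    {
        var segments = path.Split('/').Select(s => Uri.EscapeDataString(s));
        return string.Join("/", segments);
    }
}
```

For "/tags/ASP.NET MVC" this should produce "/tags/ASP.NET%20MVC": the dot is an unreserved character and is left as-is, and the slashes are never passed to the encoder in the first place.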

#3


0  

I haven't used it, but UrlPathEncode sounds like it may give you what you want.

You can encode a URL using the UrlEncode() method or the UrlPathEncode() method. However, the methods return different results. The UrlEncode() method converts each space character to a plus character (+). The UrlPathEncode() method converts each space character into the string "%20", which represents a space in hexadecimal notation.
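To see the difference described in that quote side by side, something like the following should do. It assumes a reference to System.Web (System.Web.dll on .NET Framework, or the System.Web.HttpUtility assembly on newer .NET); the wrapper names are made up for the example:

```csharp
using System.Web;

public static class EncodeComparison
{
    // Query-string style encoding: a space becomes "+".
    public static string AsQueryValue(string value)
    {
        return HttpUtility.UrlEncode(value);
    }

    // Path style encoding: a space becomes "%20".
    public static string AsPath(string path)
    {
        return HttpUtility.UrlPathEncode(path);
    }
}
```

EncodeComparison.AsQueryValue("ASP.NET MVC") gives "ASP.NET+MVC", whereas EncodeComparison.AsPath("/tags/ASP.NET MVC") gives "/tags/ASP.NET%20MVC". That would also explain why Url.Encode did not produce %20 in the question.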

EDIT: The JavaScript function encodeURI will use %20 instead of +. Add a reference to Microsoft.JScript and call GlobalObject.encodeURI. I tried the method here and you get the result you're looking for:

#4


0  

I asked a similar question a while ago. The short answer was to replace spaces with "-" on the way out and convert them back on the way in. This is the source I used:

// Requires: using System.Text.RegularExpressions;
private static string EncodeTitleInternal(string title)
{
        if (string.IsNullOrEmpty(title))
                return title;

        // Search engine friendly slug routine with help from http://www.intrepidstudios.com/blog/2009/2/10/function-to-generate-a-url-friendly-string.aspx

        // remove invalid characters
        title = Regex.Replace(title, @"[^\w\d\s-]", "");  // this is unicode safe, but may need to revert back to 'a-zA-Z0-9', need to check spec

        // convert multiple spaces/hyphens into one space       
        title = Regex.Replace(title, @"[\s-]+", " ").Trim(); 

        // If it's over 75 chars, take the first 75.
        title = title.Substring(0, title.Length <= 75 ? title.Length : 75).Trim(); 

        // hyphenate spaces
        title = Regex.Replace(title, @"\s", "-");

        return title;
}
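The "back out again" half isn't shown above. A minimal sketch of it, assuming it is nothing more than turning dashes back into spaces (TitleSlug and DecodeTitle are hypothetical names, not part of the original source):

```csharp
public static class TitleSlug
{
    // Hypothetical inverse of the hyphenation step: dashes become spaces again.
    // Null or empty input is returned unchanged.
    public static string DecodeTitle(string slug)
    {
        return string.IsNullOrEmpty(slug) ? slug : slug.Replace('-', ' ');
    }
}
```

Note that the round trip is lossy. EncodeTitleInternal("ASP.NET MVC") strips the dot and returns "ASPNET-MVC", so decoding gives "ASPNET MVC" rather than the original title, and any hyphen that was in the title comes back as a space. If you need the exact title, look the entity up by its stored slug instead of decoding.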
