How to encode the filename parameter of the Content-Disposition header in HTTP?

Date: 2022-10-19 17:59:51

Web applications that want to force a resource to be downloaded rather than directly rendered in a Web browser issue a Content-Disposition header in the HTTP response of the form:

Content-Disposition: attachment; filename=FILENAME

The filename parameter can be used to suggest a name for the file into which the resource is downloaded by the browser. RFC 2183 (Content-Disposition), however, states in section 2.3 (The Filename Parameter) that the file name can only use US-ASCII characters:

Current [RFC 2045] grammar restricts parameter values (and hence Content-Disposition filenames) to US-ASCII. We recognize the great desirability of allowing arbitrary character sets in filenames, but it is beyond the scope of this document to define the necessary mechanisms.

There is empirical evidence, nevertheless, that most popular Web browsers today seem to permit non-US-ASCII characters yet (for the lack of a standard) disagree on the encoding scheme and character set specification of the file name. The question, then, is: which schemes and encodings do the popular browsers employ if the file name “naïvefile” (without quotes, where the third letter is U+00EF) needs to be encoded into the Content-Disposition header?

For the purposes of this question, the popular browsers are:

  • Firefox
  • Internet Explorer
  • Safari
  • Google Chrome
  • Opera

17 answers

#1


79  

There is discussion of this, including links to browser testing and backwards compatibility, in the proposed RFC 5987, "Character Set and Language Encoding for Hypertext Transfer Protocol (HTTP) Header Field Parameters."

RFC 2183 indicates that such headers should be encoded according to RFC 2184, which was obsoleted by RFC 2231, covered by the draft RFC above.

#2


297  

I know this is an old post but it is still very relevant. I have found that modern browsers support RFC 5987, which allows UTF-8 encoded, percent-encoded (URL-encoded) values. With that, Naïve file.txt becomes:

Content-Disposition: attachment; filename*=UTF-8''Na%C3%AFve%20file.txt
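As a rough illustration of how such a value can be produced, the sketch below percent-encodes the UTF-8 bytes of a name and leaves only the characters that RFC 5987 calls attr-char unescaped. The function name encodeExtValue is made up for this example and is not part of any library.

```javascript
// Hypothetical helper: percent-encode every UTF-8 byte that is not an
// RFC 5987 "attr-char" (ALPHA / DIGIT / ! # $ & + - . ^ _ ` | ~).
function encodeExtValue(name) {
  const attrChar = /^[A-Za-z0-9!#$&+\-.^_`|~]$/;
  let out = '';
  for (const byte of new TextEncoder().encode(name)) {
    const ch = String.fromCharCode(byte);
    out += attrChar.test(ch)
      ? ch
      : '%' + byte.toString(16).toUpperCase().padStart(2, '0');
  }
  return out;
}

// \u00EF is the precomposed letter "ï" from the example above.
const header =
  "Content-Disposition: attachment; filename*=UTF-8''" +
  encodeExtValue('Na\u00EFve file.txt');
console.log(header);
```

Working on bytes rather than characters keeps the helper independent of how the runtime's own URL-encoding functions treat characters such as the apostrophe or the asterisk.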

Safari (5) does not support this. Instead you should use the Safari standard of writing the file name directly in your utf-8 encoded header:

Content-Disposition: attachment; filename=Naïve file.txt

IE8 and older don't support it either; for those you need to use the IE convention of UTF-8 encoding, percent-encoded:

Content-Disposition: attachment; filename=Na%C3%AFve%20file.txt

In ASP.Net I use the following code:

string contentDisposition;
if (Request.Browser.Browser == "IE" && (Request.Browser.Version == "7.0" || Request.Browser.Version == "8.0"))
    contentDisposition = "attachment; filename=" + Uri.EscapeDataString(fileName);
else if (Request.Browser.Browser == "Safari")
    contentDisposition = "attachment; filename=" + fileName;
else
    contentDisposition = "attachment; filename*=UTF-8''" + Uri.EscapeDataString(fileName);
Response.AddHeader("Content-Disposition", contentDisposition);

I tested the above using IE7, IE8, IE9, Chrome 13, Opera 11, FF5, Safari 5.

Update November 2013:

Here is the code I currently use. I still have to support IE8, so I cannot get rid of the first part. It turns out that browsers on Android use the built-in Android download manager, which cannot reliably parse file names in the standard way.

string contentDisposition;
if (Request.Browser.Browser == "IE" && (Request.Browser.Version == "7.0" || Request.Browser.Version == "8.0"))
    contentDisposition = "attachment; filename=" + Uri.EscapeDataString(fileName);
else if (Request.UserAgent != null && Request.UserAgent.ToLowerInvariant().Contains("android")) // android built-in download manager (all browsers on android)
    contentDisposition = "attachment; filename=\"" + MakeAndroidSafeFileName(fileName) + "\"";
else
    contentDisposition = "attachment; filename=\"" + fileName + "\"; filename*=UTF-8''" + Uri.EscapeDataString(fileName);
Response.AddHeader("Content-Disposition", contentDisposition);

The above is now tested in IE7-11, Chrome 32, Opera 12, FF25 and Safari 6, using this filename for download: 你好abcABCæøåÆØÅäöüïëêîâéíáóúýñ½§!#¤%&()=`@£$€{[]}+´¨^~'-_,;.txt

On IE7 it works for some characters but not all. But who cares about IE7 nowadays?

This is the function I use to generate safe file names for Android. Note that I don't know the full set of characters supported on Android, but I have tested that these ones work:

private static readonly Dictionary<char, char> AndroidAllowedChars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ._-+,@£$€!½§~'=()[]{}0123456789".ToDictionary(c => c);
private string MakeAndroidSafeFileName(string fileName)
{
    char[] newFileName = fileName.ToCharArray();
    for (int i = 0; i < newFileName.Length; i++)
    {
        if (!AndroidAllowedChars.ContainsKey(newFileName[i]))
            newFileName[i] = '_';
    }
    return new string(newFileName);
}

@TomZ: I tested in IE7 and IE8 and it turned out that I did not need to escape apostrophe ('). Do you have an example where it fails?

@Dave Van den Eynde: Combining the two file names on one line according to RFC 6266 works except for Android and IE7+8, and I have updated the code to reflect this. Thank you for the suggestion.

@Thilo: No idea about GoodReader or any other non-browser. You might have some luck using the Android approach.

@Alex Zhukovskiy: I don't know why but as discussed on Connect it doesn't seem to work terribly well.

#3


145  

  • There is no interoperable way to encode non-ASCII names in Content-Disposition. Browser compatibility is a mess.

  • The theoretically correct syntax for use of UTF-8 in Content-Disposition is very weird: filename*=UTF-8''foo%c3%a4 (yes, that's an asterisk, and no quotes except an empty single quote in the middle)

  • This header is kinda-not-quite-standard (HTTP/1.1 spec acknowledges its existence, but doesn't require clients to support it).

There is a simple and very robust alternative: use a URL that contains the filename you want.

When the name after the last slash is the one you want, you don't need any extra headers!

This trick works:

/real_script.php/fake_filename.doc

And if your server supports URL rewriting (e.g. mod_rewrite in Apache) then you can fully hide the script part.

Characters in URLs should be in UTF-8, urlencoded byte-by-byte:

/mot%C3%B6rhead   # motörhead
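A minimal sketch of building such a URL, assuming the script is reachable at /real_script.php as in the example above. The helper name downloadUrl is hypothetical.

```javascript
// Hypothetical helper: append the desired download name as the last path
// segment so that browsers derive the file name from the URL itself.
function downloadUrl(scriptPath, fileName) {
  // encodeURIComponent emits the UTF-8 bytes as %XX and also escapes "/",
  // so the name cannot introduce extra path segments.
  return scriptPath + '/' + encodeURIComponent(fileName);
}

// \u00F6 is the letter "ö".
console.log(downloadUrl('/real_script.php', 'mot\u00F6rhead.doc'));
```

Escaping the slash matters here because the browser takes everything after the last slash as the suggested name.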

#4


54  

RFC 6266 describes the “Use of the Content-Disposition Header Field in the Hypertext Transfer Protocol (HTTP)”. Quoting from that:

6. Internationalization Considerations

The “filename*” parameter (Section 4.3), using the encoding defined in [RFC5987], allows the server to transmit characters outside the ISO-8859-1 character set, and also to optionally specify the language in use.

And in their examples section:

This example is the same as the one above, but adding the "filename" parameter for compatibility with user agents not implementing RFC 5987:

Content-Disposition: attachment;
                     filename="EURO rates";
                     filename*=utf-8''%e2%82%ac%20rates

Note: Those user agents that do not support the RFC 5987 encoding ignore “filename*” when it occurs after “filename”.
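The two-parameter form quoted above could be generated along these lines. This is only a sketch: the function name contentDisposition is invented here, and the fallback simply replaces unsupported characters with underscores, whereas the RFC example uses a hand-written fallback ("EURO rates").

```javascript
// Hypothetical helper combining "filename" and "filename*" in the order
// shown in the RFC 6266 example.
function contentDisposition(fileName) {
  // ASCII-only fallback: anything outside printable ASCII, plus the
  // double quote, backslash and percent sign, becomes "_".
  const fallback = fileName.replace(/[^\x20-\x7E]|["\\%]/g, '_');
  // encodeURIComponent leaves ' ( ) * unescaped, but RFC 5987 does not
  // allow them unencoded in the value, so escape them explicitly.
  const extended = encodeURIComponent(fileName).replace(
    /['()*]/g,
    (c) => '%' + c.charCodeAt(0).toString(16).toUpperCase()
  );
  return `attachment; filename="${fallback}"; filename*=UTF-8''${extended}`;
}

// \u20AC is the euro sign.
console.log(contentDisposition('\u20AC rates'));
```

The percent sign is dropped from the fallback because, as the test-case notes in this thread mention, some browsers try to decode percent sequences found in the plain filename parameter.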

In Appendix D there is also a long list of suggestions to increase interoperability. It also points at a site which compares implementations. Current all-pass tests suitable for common file names include:

  • attwithisofnplain: plain ISO-8859-1 file name with double quotes and without encoding. This requires a file name which is all ISO-8859-1 and does not contain percent signs, at least not in front of hex digits.
  • attfnboth: two parameters in the order described above. Should work for most file names on most browsers, although IE8 will use the “filename” parameter.

That RFC 5987 in turn references RFC 2231, which describes the actual format. 2231 is primarily for mail, and 5987 tells us what parts may be used for HTTP headers as well. Don't confuse this with MIME headers used inside a multipart/form-data HTTP body, which is governed by RFC 2388 (section 4.4 in particular) and the HTML 5 draft.

#5


16  

The following document, linked from the draft RFC mentioned by Jim in his answer, further addresses the question and is definitely worth a direct note here:

Test Cases for HTTP Content-Disposition header and RFC 2231/2047 Encoding

#6


10  

In ASP.NET MVC 2 I use something like this:

return File(
    tempFile
    , "application/octet-stream"
    , HttpUtility.UrlPathEncode(fileName)
    );

I guess if you don't use MVC (2) you could just encode the filename using:

HttpUtility.UrlPathEncode(fileName)

#7


8  

I use the following code snippets for encoding (assuming fileName contains the name and extension of the file, e.g. test.txt):


PHP:

if ( strpos ( $_SERVER [ 'HTTP_USER_AGENT' ], "MSIE" ) !== false )
{
     header ( 'Content-Disposition: attachment; filename="' . rawurlencode ( $fileName ) . '"' );
}
else
{
     header( 'Content-Disposition: attachment; filename*=UTF-8\'\'' . rawurlencode ( $fileName ) );
}

Java:

fileName = request.getHeader ( "user-agent" ).contains ( "MSIE" ) ? URLEncoder.encode ( fileName, "utf-8") : MimeUtility.encodeWord ( fileName );
response.setHeader ( "Content-disposition", "attachment; filename=\"" + fileName + "\"");

#8


8  

In ASP.NET Web API, I url encode the filename:

public static class HttpRequestMessageExtensions
{
    public static HttpResponseMessage CreateFileResponse(this HttpRequestMessage request, byte[] data, string filename, string mediaType)
    {
        HttpResponseMessage response = new HttpResponseMessage(HttpStatusCode.OK);
        var stream = new MemoryStream(data);
        stream.Position = 0;

        response.Content = new StreamContent(stream);

        response.Content.Headers.ContentType = 
            new MediaTypeHeaderValue(mediaType);

        // URL-Encode filename
        // Fixes behavior in IE, that filenames with non US-ASCII characters
        // stay correct (not "_utf-8_.......=_=").
        var encodedFilename = HttpUtility.UrlEncode(filename, Encoding.UTF8);

        response.Content.Headers.ContentDisposition =
            new ContentDispositionHeaderValue("attachment") { FileName = encodedFilename };
        return response;
    }
}

#9


8  

Put your file name in double quotes. That solved the problem for me. Like this:

Content-Disposition: attachment; filename="My Report.doc"

http://kb.mozillazine.org/Filenames_with_spaces_are_truncated_upon_download
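For names that themselves contain a double quote or a backslash, the quoted form needs escaping. A small sketch follows, with quoteFileName as a made-up helper name. Browsers are not guaranteed to honour backslash escapes consistently, so treat this as an illustration of the quoted-string syntax rather than a tested recipe.

```javascript
// Hypothetical helper: wrap a name in a quoted-string, backslash-escaping
// the two characters that are special inside the quotes (" and \).
function quoteFileName(name) {
  return '"' + name.replace(/(["\\])/g, '\\$1') + '"';
}

console.log(
  'Content-Disposition: attachment; filename=' + quoteFileName('My Report.doc')
);
```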

#10


5  

I tested the following code in all major browsers, including older Explorers (via the compatibility mode), and it works well everywhere:

$filename = $_GET['file']; //this string from $_GET is already decoded
if (strstr($_SERVER['HTTP_USER_AGENT'],"MSIE"))
  $filename = rawurlencode($filename);
header('Content-Disposition: attachment; filename="'.$filename.'"');

#11


5  

If you are using a nodejs backend you can use the following code I found here

var fileName = 'my file(2).txt';
var header = "Content-Disposition: attachment; filename*=UTF-8''" 
             + encodeRFC5987ValueChars(fileName);

function encodeRFC5987ValueChars (str) {
    return encodeURIComponent(str).
        // Note that although RFC3986 reserves "!", RFC5987 does not,
        // so we do not need to escape it
        replace(/['()]/g, escape). // i.e., %27 %28 %29
        replace(/\*/g, '%2A').
            // The following are not required for percent-encoding per RFC5987, 
            // so we can allow for a little better readability over the wire: |`^
            replace(/%(?:7C|60|5E)/g, unescape);
}

#12


4  

I ended up with the following code in my "download.php" script (based on this blogpost and these test cases).

$il1_filename = utf8_decode($filename);
$to_underscore = "\"\\#*;:|<>/?";
$safe_filename = strtr($il1_filename, $to_underscore, str_repeat("_", strlen($to_underscore)));

header("Content-Disposition: attachment; filename=\"$safe_filename\""
.( $safe_filename === $filename ? "" : "; filename*=UTF-8''".rawurlencode($filename) ));

This uses the standard filename="..." form as long as only ISO Latin-1 and "safe" characters are used; if not, it also adds the URL-encoded filename*=UTF-8'' form. According to this specific test case, it should work from MSIE9 up and on recent FF, Chrome and Safari; on older MSIE versions, it should offer the ISO-8859-1 version of the filename, with underscores replacing characters not in that encoding.

Final note: the maximum size of each header field is 8190 bytes on Apache. UTF-8 can take up to four bytes per character; after rawurlencode, each byte becomes three characters, so that is up to 4 × 3 = 12 bytes per character. Pretty inefficient, but it should still be theoretically possible to have more than 600 "smiles" (%F0%9F%98%81) in the filename.
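The arithmetic can be checked with a few lines. This sketch takes the 8190-byte figure from the note above as given and, as a simplifying assumption, counts only the parameter prefix against it, ignoring the header name and any fallback filename parameter.

```javascript
const HEADER_LIMIT = 8190; // per-field limit quoted in the note above
const emoji = '\u{1F601}'; // U+1F601, bytes F0 9F 98 81 in UTF-8
const encoded = encodeURIComponent(emoji); // "%F0%9F%98%81"
const prefix = "attachment; filename*=UTF-8''";

// Each of the 4 UTF-8 bytes becomes "%XX", i.e. 3 characters.
const perChar = encoded.length;
const maxEmoji = Math.floor((HEADER_LIMIT - prefix.length) / perChar);

console.log(encoded, perChar, maxEmoji);
```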

#13


3  

In PHP this did it for me (assuming the filename is UTF8 encoded):

header('Content-Disposition: attachment;'
    . 'filename="' . addslashes(utf8_decode($filename)) . '";'
    . 'filename*=utf-8\'\'' . rawurlencode($filename));

Tested against IE8-11, Firefox and Chrome.
If the browser can interpret filename*=utf-8 it will use the UTF8 version of the filename, else it will use the decoded filename. If your filename contains characters that can't be represented in ISO-8859-1 you might want to consider using iconv instead.

#14


1  

Classic ASP Solution

Most modern browsers now support passing the filename as UTF-8. However, the file upload solution I use, which was based on FreeASPUpload.Net (the site no longer exists; the link points to archive.org), did not work with it: its parsing of the binary data relied on reading single-byte ASCII-encoded strings, which worked fine with UTF-8 encoded data until it reached characters that ASCII doesn't support.

However I was able to find a solution to get the code to read and parse the binary as UTF-8.

Public Function BytesToString(bytes)    'UTF-8..
  Dim bslen
  Dim i, k , N 
  Dim b , count 
  Dim str

  bslen = LenB(bytes)
  str=""

  i = 0
  Do While i < bslen
    b = AscB(MidB(bytes,i+1,1))

    If (b And &HFC) = &HFC Then
      count = 6
      N = b And &H1
    ElseIf (b And &HF8) = &HF8 Then
      count = 5
      N = b And &H3
    ElseIf (b And &HF0) = &HF0 Then
      count = 4
      N = b And &H7
    ElseIf (b And &HE0) = &HE0 Then
      count = 3
      N = b And &HF
    ElseIf (b And &HC0) = &HC0 Then
      count = 2
      N = b And &H1F
    Else
      count = 1
      str = str & Chr(b)
    End If

    If i + count > bslen Then   'sequence would run past the end of the data
      str = str&"?"
      Exit Do
    End If

    If count>1 then
      For k = 1 To count - 1
        b = AscB(MidB(bytes,i+k+1,1))
        N = N * &H40 + (b And &H3F)
      Next
      str = str & ChrW(N)
    End If
    i = i + count
  Loop

  BytesToString = str
End Function

Credit goes to Pure ASP File Upload: by implementing the BytesToString() function from include_aspuploader.asp in my own code, I was able to get UTF-8 filenames working.
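For readers who find the bit arithmetic hard to follow in VBScript, here is the same idea as a JavaScript sketch limited to sequences of one to four bytes. It is an illustration rather than a port, and the name bytesToString is only borrowed for readability. In current JavaScript runtimes the built-in TextDecoder does this job.

```javascript
// Sketch: decode UTF-8 bytes by hand, mirroring the lead-byte tests and
// 6-bit continuation arithmetic of the function above.
function bytesToString(bytes) {
  let out = '';
  let i = 0;
  while (i < bytes.length) {
    const b = bytes[i];
    let count;
    let cp;
    if (b >= 0xf0) { count = 4; cp = b & 0x07; }
    else if (b >= 0xe0) { count = 3; cp = b & 0x0f; }
    else if (b >= 0xc0) { count = 2; cp = b & 0x1f; }
    else { count = 1; cp = b; }

    // Truncated sequence: emit a placeholder and stop.
    if (i + count > bytes.length) { out += '?'; break; }

    for (let k = 1; k < count; k++) {
      cp = cp * 0x40 + (bytes[i + k] & 0x3f); // append 6 payload bits
    }
    // Malformed input can yield values above the Unicode range.
    out += cp <= 0x10ffff ? String.fromCodePoint(cp) : '?';
    i += count;
  }
  return out;
}

console.log(bytesToString([0x4e, 0x61, 0xc3, 0xaf, 0x76, 0x65]));
```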


#15


-1  

We had a similar problem in a web application, and ended up reading the filename from the HTML <input type="file"> and setting it, in URL-encoded form, in a new HTML <input type="hidden">. Of course we had to remove the path like "C:\fakepath\" that is returned by some browsers.

Of course this does not directly answer the OP's question, but it may be a solution for others.

#16


-2  

I normally URL-encode (with %xx) the filenames, and it seems to work in all browsers. You might want to do some tests anyway.

#17


-3  

I found a solution that works for all my browsers (i.e. all browsers I have installed: IE8, FF16, Opera 12, Chrome 22).

My solution is described in another thread: Java servlet download filename special characters

My solution is based on how browsers try to read the value of the filename parameter. If no charset is specified for the parameter (as it would be in, for example, filename*=utf-8''test.xml), browsers expect the value to be encoded in the browser's native encoding.

Different browsers expect different native encodings. Usually the browser's native encoding is UTF-8 (Firefox, Opera, Chrome), but IE's native encoding is Win-1250. (I don't know anything about other browsers.)

Hence, if we put a value into the filename parameter that is encoded in UTF-8 or Win-1250 according to the user's browser, it should work. At least, it works for me.

In short, if we have a file named omáčka.xml, for Firefox, Opera and Chrome I respond with this header (encoded in UTF-8):

Content-Disposition: attachment; filename="omáčka.xml"

and for IE I respond with this header (encoded in Win-1250):

Content-Disposition: attachment; filename="omáèka.xml"

A Java example is in my post mentioned above.

#1


79  

There is discussion of this, including links to browser testing and backwards compatibility, in the proposed RFC 5987, "Character Set and Language Encoding for Hypertext Transfer Protocol (HTTP) Header Field Parameters."

在RFC 5987“超文本传输协议(HTTP)头字段参数的字符集和语言编码”中,讨论了这个问题,包括浏览器测试和向后兼容性的链接。

RFC 2183 indicates that such headers should be encoded according to RFC 2184, which was obsoleted by RFC 2231, covered by the draft RFC above.

RFC 2183表示,这些报头应该按照RFC 2184进行编码,RFC 2231将其淘汰,上面的RFC草案将对此进行介绍。

#2


297  

I know this is an old post but it is still very relevant. I have found that modern browsers support rfc5987, which allows utf-8 encoding, percentage encoded (url-encoded). Then Naïve file.txt becomes:

我知道这是一个古老的帖子,但它仍然非常相关。我发现现代浏览器支持rfc5987,它允许utf-8编码,百分比编码(url编码)。然后天真的文件。txt就变成:

Content-Disposition: attachment; filename*=UTF-8''Na%C3%AFve%20file.txt

Safari (5) does not support this. Instead you should use the Safari standard of writing the file name directly in your utf-8 encoded header:

Safari(5)不支持这一点。相反,您应该使用Safari标准,将文件名直接写入您的utf-8编码头中:

Content-Disposition: attachment; filename=Naïve file.txt

IE8 and older don't support it either and you need to use the IE standard of utf-8 encoding, percentage encoded:

IE8及以上版本也不支持,需要使用utf-8编码的IE标准,百分比编码:

Content-Disposition: attachment; filename=Na%C3%AFve%20file.txt

In ASP.Net I use the following code:

在ASP。我使用以下代码:

string contentDisposition;
if (Request.Browser.Browser == "IE" && (Request.Browser.Version == "7.0" || Request.Browser.Version == "8.0"))
    contentDisposition = "attachment; filename=" + Uri.EscapeDataString(fileName);
else if (Request.Browser.Browser == "Safari")
    contentDisposition = "attachment; filename=" + fileName;
else
    contentDisposition = "attachment; filename*=UTF-8''" + Uri.EscapeDataString(fileName);
Response.AddHeader("Content-Disposition", contentDisposition);

I tested the above using IE7, IE8, IE9, Chrome 13, Opera 11, FF5, Safari 5.

我使用IE7、IE8、IE9、Chrome 13、Opera 11、FF5、Safari 5测试了上述功能。

Update November 2013:

2013年11月更新:

Here is the code I currently use. I still have to support IE8, so I cannot get rid of the first part. It turns out that browsers on Android use the built in Android download manager and it cannot reliably parse file names in the standard way.

这是我目前使用的代码。我仍然需要支持IE8,所以我不能去掉第一部分。事实证明,Android上的浏览器使用内置的Android下载管理器,无法以标准方式可靠地解析文件名。

string contentDisposition;
if (Request.Browser.Browser == "IE" && (Request.Browser.Version == "7.0" || Request.Browser.Version == "8.0"))
    contentDisposition = "attachment; filename=" + Uri.EscapeDataString(fileName);
else if (Request.UserAgent != null && Request.UserAgent.ToLowerInvariant().Contains("android")) // android built-in download manager (all browsers on android)
    contentDisposition = "attachment; filename=\"" + MakeAndroidSafeFileName(fileName) + "\"";
else
    contentDisposition = "attachment; filename=\"" + fileName + "\"; filename*=UTF-8''" + Uri.EscapeDataString(fileName);
Response.AddHeader("Content-Disposition", contentDisposition);

The above now tested in IE7-11, Chrome 32, Opera 12, FF25, Safari 6, using this filename for download: 你好abcABCæøåÆØÅäöüïëêîâéíáóúýñ½§!#¤%&()=`@£$€{[]}+´¨^~'-_,;.txt

上面现在测试在IE7-11,Chrome 32,歌剧12日FF25,Safari 6,使用该文件名下载:你好abcABCæøaÆØAaouieeiaeiaouyn½§! #¤% &()= @£€美元{[]} +´¨^ ~”_,,. txt

On IE7 it works for some characters but not all. But who cares about IE7 nowadays?

在IE7上,它适用于一些字符,但不是所有字符。但是现在谁会关心IE7呢?

This is the function I use to generate safe file names for Android. Note that I don't know which characters are supported on Android but that I have tested that these work for sure:

这是我用来为Android生成安全文件名的函数。请注意,我不知道Android支持哪些字符,但我已经测试过这些功能:

private static readonly Dictionary<char, char> AndroidAllowedChars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ._-+,@£$€!½§~'=()[]{}0123456789".ToDictionary(c => c);
private string MakeAndroidSafeFileName(string fileName)
{
    char[] newFileName = fileName.ToCharArray();
    for (int i = 0; i < newFileName.Length; i++)
    {
        if (!AndroidAllowedChars.ContainsKey(newFileName[i]))
            newFileName[i] = '_';
    }
    return new string(newFileName);
}

@TomZ: I tested in IE7 and IE8 and it turned out that I did not need to escape apostrophe ('). Do you have an example where it fails?

@TomZ:我在IE7和IE8上做了测试,结果发现我不需要逃离撇号(')。你有失败的例子吗?

@Dave Van den Eynde: Combining the two file names on one line as according to RFC6266 works except for Android and IE7+8 and I have updated the code to reflect this. Thank you for the suggestion.

@Dave Van den Eynde:根据RFC6266在一行中合并两个文件名,除了Android和IE7+8,我已经更新了代码以反映这一点。谢谢你的建议。

@Thilo: No idea about GoodReader or any other non-browser. You might have some luck using the Android approach.

@Thilo:不知道GoodReader或其他非浏览器。使用Android方法可能会有一些运气。

@Alex Zhukovskiy: I don't know why but as discussed on Connect it doesn't seem to work terribly well.

@Alex Zhukovskiy:我不知道为什么,但正如我们在Connect中讨论的,它似乎并不是很有效。

#3


145  

  • There is no interoperable way to encode non-ASCII names in Content-Disposition. Browser compatibility is a mess.

    在内容配置中,没有可互操作的方式来编码非ascii名称。浏览器兼容性是一团糟。

  • The theoretically correct syntax for use of UTF-8 in Content-Disposition is very weird: filename*=UTF-8''foo%c3%a4 (yes, that's an asterisk, and no quotes except an empty single quote in the middle)

    在内容配置中使用UTF-8的理论上正确的语法非常奇怪:filename*=UTF-8 " foo%c3%a4(是的,这是一个星号,中间没有引号,只有空的单引号)

  • This header is kinda-not-quite-standard (HTTP/1.1 spec acknowledges its existence, but doesn't require clients to support it).

    这个头是kinda-not-quite-standard (HTTP/1.1规范承认它的存在,但不要求客户端支持它)。

There is a simple and very robust alternative: use a URL that contains the filename you want.

有一个简单且非常健壮的替代方法:使用包含您想要的文件名的URL。

When the name after the last slash is the one you want, you don't need any extra headers!

当最后一个斜杠后面的名字是你想要的,你不需要任何额外的标题!

This trick works:

这种方法工作原理:

/real_script.php/fake_filename.doc

And if your server supports URL rewriting (e.g. mod_rewrite in Apache) then you can fully hide the script part.

如果您的服务器支持URL重写(例如Apache中的mod_rewrite),那么可以完全隐藏脚本部分。

Characters in URLs should be in UTF-8, urlencoded byte-by-byte:

url中的字符应该采用UTF-8,以字节为单位进行编码:

/mot%C3%B6rhead   # motörhead

#4


54  

RFC 6266 describes the “Use of the Content-Disposition Header Field in the Hypertext Transfer Protocol (HTTP)”. Quoting from that:

RFC 6266描述了“在超文本传输协议(HTTP)中使用内容配置头字段”。引用:

6. Internationalization Considerations

6。国际化的考虑

The “filename*” parameter (Section 4.3), using the encoding defined in [RFC5987], allows the server to transmit characters outside the ISO-8859-1 character set, and also to optionally specify the language in use.

“filename*”参数(第4.3节)使用[RFC5987]中定义的编码,允许服务器在ISO-8859-1字符集之外传输字符,还可以选择指定使用的语言。

And in their examples section:

在他们的例子部分:

This example is the same as the one above, but adding the "filename" parameter for compatibility with user agents not implementing RFC 5987:

此示例与上面的示例相同,但是添加了“文件名”参数,以便与不实现RFC 5987的用户代理兼容:

Content-Disposition: attachment;
                     filename="EURO rates";
                     filename*=utf-8''%e2%82%ac%20rates

Note: Those user agents that do not support the RFC 5987 encoding ignore “filename*” when it occurs after “filename”.

注意:那些不支持RFC 5987编码的用户代理在“filename”后面出现时忽略“filename*”。

In Appendix D there is also a long list of suggestions to increase interoperability. It also points at a site which compares implementations. Current all-pass tests suitable for common file names include:

在附录D中,还有一长串的建议来增加互操作性。它还指向一个比较实现的站点。目前适用于通用文件名的全通测试包括:

  • attwithisofnplain: plain ISO-8859-1 file name with double quotes and without encoding. This requires a file name which is all ISO-8859-1 and does not contain percent signs, at least not in front of hex digits.
  • attwithisofnplain:纯ISO-8859-1文件名,双引号,无编码。这要求文件名全部为ISO-8859-1,并且不包含百分号,至少在十六进制数字前不包含百分号。
  • attfnboth: two parameters in the order described above. Should work for most file names on most browsers, although IE8 will use the “filename” parameter.
  • attfnboth:两个参数按照上面描述的顺序。虽然IE8将使用“filename”参数,但在大多数浏览器上应该适用于大多数文件名。

That RFC 5987 in turn references RFC 2231, which describes the actual format. 2231 is primarily for mail, and 5987 tells us what parts may be used for HTTP headers as well. Don't confuse this with MIME headers used inside a multipart/form-data HTTP body, which is governed by RFC 2388 (section 4.4 in particular) and the HTML 5 draft.

RFC 5987反过来引用了RFC 2231,它描述了实际的格式。2231主要用于邮件,5987也告诉我们哪些部分可以用于HTTP头。不要将其与多部分/表单-数据HTTP主体中使用的MIME头部混淆,后者由RFC 2388(特别是4.4节)和HTML 5草稿控制。

#5


16  

The following document linked from the draft RFC mentioned by Jim in his answer further addresses the question and definitely worth a direct note here:

以下与吉姆在回答中提到的RFC草稿相关的文件进一步解决了这个问题,绝对值得在此直接指出:

Test Cases for HTTP Content-Disposition header and RFC 2231/2047 Encoding

HTTP内容处理头和RFC 2231/2047编码的测试用例

#6


10  

in asp.net mvc2 i use something like this:

在asp.net mvc2中,我使用如下内容:

return File(
    tempFile
    , "application/octet-stream"
    , HttpUtility.UrlPathEncode(fileName)
    );

I guess if you don't use mvc(2) you could just encode the filename using

我猜如果你不使用mvc(2)你可以用它来编码文件名

HttpUtility.UrlPathEncode(fileName)

#7


8  

I use the following code snippets for encoding (assuming fileName contains the filename and extension of the file, i.e.: test.txt):

我使用以下代码片段进行编码(假设文件名包含文件的文件名和扩展名,即::用法):


PHP:

PHP:

if ( strpos ( $_SERVER [ 'HTTP_USER_AGENT' ], "MSIE" ) > 0 )
{
     header ( 'Content-Disposition: attachment; filename="' . rawurlencode ( $fileName ) . '"' );
}
else
{
     header( 'Content-Disposition: attachment; filename*=UTF-8\'\'' . rawurlencode ( $fileName ) );
}

Java:

Java:

fileName = request.getHeader ( "user-agent" ).contains ( "MSIE" ) ? URLEncoder.encode ( fileName, "utf-8") : MimeUtility.encodeWord ( fileName );
response.setHeader ( "Content-disposition", "attachment; filename=\"" + fileName + "\"");

#8


8  

In ASP.NET Web API, I url encode the filename:

在ASP。NET Web API,我url编码文件名:

public static class HttpRequestMessageExtensions
{
    public static HttpResponseMessage CreateFileResponse(this HttpRequestMessage request, byte[] data, string filename, string mediaType)
    {
        HttpResponseMessage response = new HttpResponseMessage(HttpStatusCode.OK);
        var stream = new MemoryStream(data);
        stream.Position = 0;

        response.Content = new StreamContent(stream);

        response.Content.Headers.ContentType = 
            new MediaTypeHeaderValue(mediaType);

        // URL-Encode filename
        // Fixes behavior in IE, that filenames with non US-ASCII characters
        // stay correct (not "_utf-8_.......=_=").
        var encodedFilename = HttpUtility.UrlEncode(filename, Encoding.UTF8);

        response.Content.Headers.ContentDisposition =
            new ContentDispositionHeaderValue("attachment") { FileName = encodedFilename };
        return response;
    }
}

如何在HTTP中对内容配置头的文件名参数进行编码?
如何在HTTP中对内容配置头的文件名参数进行编码?

#9


8  

Put you file name in double quotes. Solved the problem for me. Like this:

Content-Disposition: attachment; filename="My Report.doc"

http://kb.mozillazine.org/Filenames_with_spaces_are_truncated_upon_download
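
One thing to watch when quoting: a double quote or backslash inside the name itself has to be backslash-escaped, or it ends the quoted string early. A minimal sketch in JavaScript, where quotedFilenameHeader is a made-up helper name and the name is assumed to be plain ASCII:

```javascript
// Hypothetical helper (not from any library): wraps an ASCII file name
// in a quoted-string for the Content-Disposition header.
function quotedFilenameHeader(fileName) {
  // Inside a quoted-string, backslash and double quote are backslash-escaped.
  const escaped = fileName.replace(/[\\"]/g, '\\$&');
  return 'Content-Disposition: attachment; filename="' + escaped + '"';
}

console.log(quotedFilenameHeader('My Report.doc'));
```

For 'My Report.doc' this prints the same header line as shown above.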

#10


5  

I tested the following code in all major browsers, including older versions of Internet Explorer (via compatibility mode), and it works well everywhere:

$filename = $_GET['file']; //this string from $_GET is already decoded
if (strstr($_SERVER['HTTP_USER_AGENT'],"MSIE"))
  $filename = rawurlencode($filename);
header('Content-Disposition: attachment; filename="'.$filename.'"');

#11


5  

If you are using a Node.js backend, you can use the following code I found here

var fileName = 'my file(2).txt';
var header = "Content-Disposition: attachment; filename*=UTF-8''" 
             + encodeRFC5987ValueChars(fileName);

function encodeRFC5987ValueChars (str) {
    return encodeURIComponent(str).
        // Note that although RFC3986 reserves "!", RFC5987 does not,
        // so we do not need to escape it
        replace(/['()]/g, escape). // i.e., %27 %28 %29
        replace(/\*/g, '%2A').
            // The following are not required for percent-encoding per RFC5987, 
            // so we can allow for a little better readability over the wire: |`^
            replace(/%(?:7C|60|5E)/g, unescape);
}

#12


4  

I ended up with the following code in my "download.php" script (based on this blogpost and these test cases).

$il1_filename = utf8_decode($filename);
$to_underscore = "\"\\#*;:|<>/?";
$safe_filename = strtr($il1_filename, $to_underscore, str_repeat("_", strlen($to_underscore)));

header("Content-Disposition: attachment; filename=\"$safe_filename\""
.( $safe_filename === $filename ? "" : "; filename*=UTF-8''".rawurlencode($filename) ));

This uses the standard filename="..." form as long as only ISO-Latin-1 and "safe" characters are used; if not, it adds the URL-encoded filename*=UTF-8'' form. According to this specific test case, it should work from MSIE9 up and on recent FF, Chrome and Safari; on lower MSIE versions, it should offer a file name containing the ISO8859-1 version of the name, with underscores for characters not in this encoding.

Final note: the maximum size of each header field is 8190 bytes on Apache. UTF-8 can take up to four bytes per character; after rawurlencode, that triples to 12 bytes per character. Pretty inefficient, but it should still be theoretically possible to have more than 600 "smiles" (%F0%9F%98%81) in the filename.
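
The arithmetic can be checked with a few lines of JavaScript. This is only a rough upper bound: it uses the 8190-byte figure quoted above and ignores the bytes taken by the rest of the header line. The variable names are my own.

```javascript
// Limit quoted in the answer above (bytes per header field).
const headerLimit = 8190;

// U+1F601, a four-byte character in UTF-8.
const smile = '\u{1F601}';

// encodeURIComponent percent-encodes each UTF-8 byte as three characters.
const encodedSmile = encodeURIComponent(smile); // "%F0%9F%98%81"
const bytesPerSmile = encodedSmile.length;      // 12

// Upper bound on how many such characters fit in one header field.
const maxSmiles = Math.floor(headerLimit / bytesPerSmile);
console.log(encodedSmile, bytesPerSmile, maxSmiles);
```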

#13


3  

In PHP this did it for me (assuming the filename is UTF8 encoded):

header('Content-Disposition: attachment;'
    . 'filename="' . addslashes(utf8_decode($filename)) . '";'
    . 'filename*=utf-8\'\'' . rawurlencode($filename));

Tested against IE8-11, Firefox and Chrome.
If the browser can interpret filename*=utf-8 it will use the UTF8 version of the filename, else it will use the decoded filename. If your filename contains characters that can't be represented in ISO-8859-1 you might want to consider using iconv instead.
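
For anyone not on PHP, the same two-parameter idea can be sketched in JavaScript. Note the differences: contentDisposition is a made-up name, and the fallback here is plain ASCII with underscores rather than an ISO-8859-1 conversion, which is my simplification and not what the PHP above does.

```javascript
// Hypothetical helper: builds a header value carrying both a plain
// fallback name and an RFC 5987 style filename* parameter.
function contentDisposition(fileName) {
  // Fallback: anything outside printable ASCII, plus quote and backslash,
  // becomes an underscore (a simplification, see the note above).
  const fallback = fileName.replace(/[^\x20-\x7E]|["\\]/g, '_');

  // encodeURIComponent leaves ' ( ) * unescaped, so encode those as well.
  const extended = encodeURIComponent(fileName).replace(
    /['()*]/g,
    (c) => '%' + c.charCodeAt(0).toString(16).toUpperCase()
  );

  return 'attachment; filename="' + fallback + '"; filename*=UTF-8\'\'' + extended;
}

console.log(contentDisposition('na\u00EFvefile.txt'));
```

A browser that understands filename* should pick the UTF-8 name; one that does not falls back to the underscore version.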

#14


1  

Classic ASP Solution

Most modern browsers now support passing the filename as UTF-8. However, the file upload solution I use, which was based on FreeASPUpload.Net (site no longer exists, link points to archive.org), wouldn't work with it: its parsing of the binary relied on reading single-byte ASCII-encoded strings, which worked fine with UTF-8 encoded data until you got to characters ASCII doesn't support.

However I was able to find a solution to get the code to read and parse the binary as UTF-8.

Public Function BytesToString(bytes)    'UTF-8..
  Dim bslen
  Dim i, k , N 
  Dim b , count 
  Dim str

  bslen = LenB(bytes)
  str=""

  i = 0
  Do While i < bslen
    b = AscB(MidB(bytes,i+1,1))

    If (b And &HFC) = &HFC Then
      count = 6
      N = b And &H1
    ElseIf (b And &HF8) = &HF8 Then
      count = 5
      N = b And &H3
    ElseIf (b And &HF0) = &HF0 Then
      count = 4
      N = b And &H7
    ElseIf (b And &HE0) = &HE0 Then
      count = 3
      N = b And &HF
    ElseIf (b And &HC0) = &HC0 Then
      count = 2
      N = b And &H1F
    Else
      count = 1
      str = str & Chr(b)
    End If

    If i + count > bslen Then
      str = str&"?"
      Exit Do
    End If

    If count>1 then
      For k = 1 To count - 1
        b = AscB(MidB(bytes,i+k+1,1))
        N = N * &H40 + (b And &H3F)
      Next
      str = str & ChrW(N)
    End If
    i = i + count
  Loop

  BytesToString = str
End Function

Credit goes to Pure ASP File Upload: by implementing the BytesToString() function from include_aspuploader.asp in my own code, I was able to get UTF-8 filenames working.
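
For readers who don't know VBScript, here is roughly the same decoding loop sketched in JavaScript. It is an illustration only, limited to sequences of up to four bytes, and like the original it does not validate continuation bytes. The function name simply mirrors the one above.

```javascript
// Decodes an array of UTF-8 byte values into a string.
function bytesToString(bytes) {
  let out = '';
  let i = 0;
  while (i < bytes.length) {
    const b = bytes[i];
    let count;
    let cp;
    // The lead byte tells how many bytes the sequence has and
    // carries the top bits of the code point.
    if (b >= 0xF0) { count = 4; cp = b & 0x07; }
    else if (b >= 0xE0) { count = 3; cp = b & 0x0F; }
    else if (b >= 0xC0) { count = 2; cp = b & 0x1F; }
    else { count = 1; cp = b; }

    // Truncated sequence at the end of the input.
    if (i + count > bytes.length) { out += '?'; break; }

    // Each continuation byte contributes its low six bits.
    for (let k = 1; k < count; k++) {
      cp = cp * 0x40 + (bytes[i + k] & 0x3F);
    }
    out += cp <= 0x10FFFF ? String.fromCodePoint(cp) : '?';
    i += count;
  }
  return out;
}

console.log(bytesToString([0x6E, 0x61, 0xC3, 0xAF, 0x76, 0x65]));
```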


#15


-1  

We had a similar problem in a web application, and ended up reading the filename from the HTML <input type="file"> and setting it, in URL-encoded form, in a new HTML <input type="hidden">. Of course we had to remove the path like "C:\fakepath\" that is returned by some browsers.

Of course this does not directly answer the OP's question, but it may be a solution for others.
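
The path-stripping and encoding step could look something like this in JavaScript. It is a sketch: baseName is a made-up helper, and the input string stands in for whatever the file input's value happens to be in a given browser.

```javascript
// Hypothetical helper: drops everything up to the last slash or backslash,
// so "C:\fakepath\name.txt" becomes "name.txt".
function baseName(inputValue) {
  return inputValue.replace(/^.*[\\\/]/, '');
}

// Example value as some browsers report it for <input type="file">.
const reported = 'C:\\fakepath\\na\u00EFvefile.txt';

// This is the value that would go into the hidden field.
const hiddenValue = encodeURIComponent(baseName(reported));
console.log(hiddenValue);
```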

#16


-2  

I normally URL-encode (with %xx) the filenames, and it seems to work in all browsers. You might want to do some tests anyway.
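
To make the %xx form concrete, here is a small JavaScript sketch that percent-encodes the UTF-8 bytes of a name by hand. The function name and the choice of which characters to leave unescaped are my own assumptions, and whether a given browser decodes the result is exactly what still needs testing.

```javascript
// Hypothetical helper: percent-encodes every UTF-8 byte except
// ASCII letters, digits, dot, underscore and hyphen.
function percentEncodeUtf8(str) {
  const bytes = new TextEncoder().encode(str);
  let out = '';
  for (const b of bytes) {
    const ch = String.fromCharCode(b);
    out += /[A-Za-z0-9._-]/.test(ch)
      ? ch
      : '%' + b.toString(16).toUpperCase().padStart(2, '0');
  }
  return out;
}

// The file name from the question, with U+00EF as the third letter.
console.log('Content-Disposition: attachment; filename=' + percentEncodeUtf8('na\u00EFvefile'));
```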

#17


-3  

I found a solution that works in all the browsers I have installed (IE8, FF16, Opera 12, Chrome 22).

My solution is described in another thread: Java servlet download filename special characters

My solution is based on how browsers read the value of the filename parameter. If no charset is specified (as it would be in, for example, filename*=utf-8''test.xml), browsers expect the value to be encoded in the browser's native encoding.

Different browsers expect different native encodings. Usually the browser's native encoding is utf-8 (Firefox, Opera, Chrome), but IE's native encoding is Win-1250. (I don't know anything about other browsers.)

Hence, if we put into the filename parameter a value encoded in utf-8 or win-1250, according to the user's browser, it should work. At least, it works for me.

In short, if we have a file named omáčka.xml,
for Firefox, Opera and Chrome I respond with this header (encoded in utf-8):

Content-Disposition: attachment; filename="omáčka.xml"

and for IE I respond with this header (encoded in win-1250):

Content-Disposition: attachment; filename="omáèka.xml"

A Java example is in my post mentioned above.
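
In case it is unclear why the IE header above shows "è" where the file name has "č": in win-1250 the letter č is the single byte 0xE8, and that byte displays as è when read as Latin-1. A small JavaScript illustration, with a hand-written two-entry excerpt of the win-1250 table (my own, covering only the letters in this example):

```javascript
// Tiny hand-written excerpt of the windows-1250 table: only the two
// non-ASCII letters used in the example file name.
const WIN1250 = { '\u00E1': 0xE1, '\u010D': 0xE8 }; // á, č

// Encodes a string to win-1250 byte values using the excerpt above.
function encodeWin1250(str) {
  return Array.from(str, (ch) => {
    const code = ch.charCodeAt(0);
    if (code < 0x80) return code;           // ASCII maps to itself
    if (ch in WIN1250) return WIN1250[ch];
    throw new Error('character not in this excerpt: ' + ch);
  });
}

const bytes = encodeWin1250('om\u00E1\u010Dka.xml');

// Latin-1 maps each byte to the code point with the same number.
const shownAsLatin1 = String.fromCharCode(...bytes);
console.log(shownAsLatin1);
```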
