Encoding scheme for cookie storage in browsers

Time: 2021-12-17 23:45:11

As per the ECMA-262 5th Edition:

A conforming implementation of this International standard shall interpret characters in conformance with the Unicode Standard, Version 3.0 or later and ISO/IEC 10646-1 with either UCS-2 or UTF-16 as the adopted encoding form, implementation level 3. If the adopted ISO/IEC 10646-1 subset is not otherwise specified, it is presumed to be the BMP subset, collection 300. If the adopted encoding form is not otherwise specified, it is presumed to be the UTF-16 encoding form.

This brings me to the following questions:

  1. Does the UTF-16 or UCS-2 recommended by the ECMAScript standard refer to the encoding form to be used for storage purposes or for computation purposes?
  2. What character encoding (for storage purposes) is used to store cookies on the client?
  3. Also, since HTTP header values don't allow non-US-ASCII characters, does the browser change the character encoding to ASCII before sending cookies to a server?

I'm particularly interested in the character encoding browsers use for storing cookies since that would let me calculate the actual number of bytes I could use per cookie.

1 solution

#1


1. Does the UTF-16 or UCS-2 recommended by the ECMAScript standard refer to the encoding form to be used for storage purposes or for computation purposes?

Computation, inasmuch as ECMAScript only specifies the interface presented to your scripts, not how that interface is implemented behind the scenes. An implementation could use any form of string storage (for example, it could conceivably optimise ASCII-only strings to take only one byte per ECMAScript char/UTF-16 code unit).
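
To make "the interface presented to your scripts" concrete, here is a minimal TypeScript sketch showing that string length and indexing are defined in UTF-16 code units, whatever the engine stores internally. The helper name `countCodePoints` is made up for this illustration.

```typescript
// Count Unicode code points by walking UTF-16 code units and
// treating each valid surrogate pair as a single code point.
function countCodePoints(s: string): number {
  let count = 0;
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i);
    // A high surrogate followed by a low surrogate encodes one code point.
    if (unit >= 0xd800 && unit <= 0xdbff && i + 1 < s.length) {
      const next = s.charCodeAt(i + 1);
      if (next >= 0xdc00 && next <= 0xdfff) {
        i++; // skip the low surrogate
      }
    }
    count++;
  }
  return count;
}

const ascii = "abc";
const emoji = "\uD83D\uDE00"; // U+1F600, outside the BMP

console.log(ascii.length, countCodePoints(ascii)); // 3 3
console.log(emoji.length, countCodePoints(emoji)); // 2 1
```

The script sees two code units for the emoji either way; how many bytes the engine actually uses for it is not observable from script.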

2. What character encoding (for storage purposes) is used to store cookies on the client?

Not specified by ECMAScript or any other web standard. IE stores cookie files in the locale-specific default code page (aka ANSI). Some other browsers use SQLite databases, typically with UTF-8.

3. Also, since HTTP header values don't allow non-US-ASCII characters, does the browser change the character encoding to ASCII before sending cookies to a server?

It varies across browsers. Last time I checked: IE encodes to ANSI. Chrome uses UTF-8. Firefox uses the low byte of each UTF-16 code unit (compatible with ISO-8859-1 for the characters that encoding supports, otherwise irretrievably mangled). Safari blocks non-ASCII entirely.
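
The "low byte of each UTF-16 code unit" behaviour described above can be simulated with a short sketch. It only models the description given in this answer and is not a test of any current browser; `lowBytes` and `decodeLatin1` are hypothetical helper names.

```typescript
// Keep only the low byte of each UTF-16 code unit, as the answer describes.
function lowBytes(s: string): number[] {
  const bytes: number[] = [];
  for (let i = 0; i < s.length; i++) {
    bytes.push(s.charCodeAt(i) & 0xff);
  }
  return bytes;
}

// Read the bytes back as ISO-8859-1, where each byte value equals its code point.
function decodeLatin1(bytes: number[]): string {
  return bytes.map((b) => String.fromCharCode(b)).join("");
}

const latin = "caf\u00E9"; // "é" is U+00E9, inside ISO-8859-1
const euro = "\u20AC";     // "€" is U+20AC, outside ISO-8859-1

console.log(decodeLatin1(lowBytes(latin)) === latin); // true: survives the round trip
console.log(decodeLatin1(lowBytes(euro)) === euro);   // false: high byte 0x20 is lost
console.log(decodeLatin1(lowBytes(euro)) === "\u00AC"); // true: "€" comes back as "¬"
```

Once the high byte is dropped, nothing in the remaining data says what it was, which is why the answer calls the result irretrievably mangled.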

Upshot: in practice non-ASCII characters are not usable in cookies at all. If you need Unicode safety and/or larger capacity, use DOM Storage.

I'm particularly interested in the character encoding browsers use for storing cookies since that would let me calculate the actual number of bytes I could use per cookie.

Browser limits vary widely in any case.
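
If you still want a byte estimate, one common workaround (not part of the answer above) is to percent-encode the value so the cookie is pure ASCII before the browser ever sees it. Each character then takes one byte in any ASCII-compatible encoding, so the string length approximates the byte count. This is a rough sketch: `encodeCookiePair` and `isAscii` are hypothetical helpers, and it makes no claim about how a given browser counts names, attributes, or separators against its limit.

```typescript
// Build a "name=value" pair with the value percent-encoded to pure ASCII.
// Note: encodeURIComponent throws a URIError on lone surrogates.
function encodeCookiePair(cookieName: string, value: string): string {
  return cookieName + "=" + encodeURIComponent(value);
}

// True when every UTF-16 code unit is in the US-ASCII range.
function isAscii(s: string): boolean {
  for (let i = 0; i < s.length; i++) {
    if (s.charCodeAt(i) > 0x7f) return false;
  }
  return true;
}

const pair = encodeCookiePair("greeting", "h\u00E9llo"); // value is "héllo"

console.log(pair);          // greeting=h%C3%A9llo
console.log(isAscii(pair)); // true
console.log(pair.length);   // 19 characters, so 19 bytes as ASCII

// Reading the value back reverses the encoding.
const decoded = decodeURIComponent(pair.slice("greeting=".length));
console.log(decoded === "h\u00E9llo"); // true
```

The cost is size: the single character "é" becomes six ASCII characters, so non-ASCII-heavy values use up whatever limit applies much faster.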
