JSON Unicode escape sequence - lowercase or not?

Time: 2021-01-25 00:07:59

I was reading RFC 4627 and I can't figure out if the following is valid JSON or not. Consider this minimalistic JSON text:

["\u005c"]

The problem is the lowercase c.

According to the text of the RFC it is allowed:

Any character may be escaped. If the character is in the Basic Multilingual Plane (U+0000 through U+FFFF), then it may be represented as a six-character sequence: a reverse solidus, followed by the lowercase letter u, followed by four hexadecimal digits that encode the character's code point. The hexadecimal letters A though F can be upper or lowercase. So, for example, a string containing only a single reverse solidus character may be represented as "\u005C".

(Emphasis mine)

The problem is that the RFC also contains the grammar for this:

char = unescaped /
       escape (
           %x22 /          ; "    quotation mark  U+0022
           %x5C /          ; \    reverse solidus U+005C
           %x2F /          ; /    solidus         U+002F
           %x62 /          ; b    backspace       U+0008
           %x66 /          ; f    form feed       U+000C
           %x6E /          ; n    line feed       U+000A
           %x72 /          ; r    carriage return U+000D
           %x74 /          ; t    tab             U+0009
           %x75 4HEXDIG )  ; uXXXX                U+XXXX

where HEXDIG is defined in referenced RFC 4234 as

HEXDIG         =  DIGIT / "A" / "B" / "C" / "D" / "E" / "F"

which includes only uppercase letters.

FWIW, from what I have researched, most JSON parsers accept both uppercase and lowercase letters.
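
As a quick check of that observation, here is a minimal sketch using Python's standard `json` module (chosen purely for illustration; the question does not name a specific parser). It decodes the same JSON text once with a lowercase and once with an uppercase hex digit and compares the results.

```python
import json

# The JSON text from the question, with a lowercase and an uppercase hex digit.
# Raw strings keep the backslash so that the JSON parser sees the \u escape.
lower = json.loads(r'["\u005c"]')
upper = json.loads(r'["\u005C"]')

# Both forms decode to a one-element list holding a single backslash.
assert lower == upper == ["\\"]
```

This only shows how one parser behaves; it does not by itself settle what the RFC grammar requires.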

Question(s): What is actually correct? Is there a contradiction, and should the grammar in the RFC be fixed?

1 Answer

#1


10  

I think it's explained by this part of RFC 4234:

ABNF strings are case-insensitive and the character set for these strings is us-ascii.

Hence:

    rulename = "abc"

and:

    rulename = "aBc"

will match "abc", "Abc", "aBc", "abC", "ABc", "aBC", "AbC", and "ABC".

On the other hand, the follow-on part is not terribly clear:

To specify a rule that IS case SENSITIVE, specify the characters individually.

For example:

    rulename    =  %d97 %d98 %d99

or

    rulename    =  %d97.98.99

In the case of the HEXDIG rule, they're individual characters to start with, but they're specified literally as "A" etc. rather than as numeric values such as %x41, so I suspect that means they're case-insensitive. It's not the clearest spec I've read :(
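
To make the distinction concrete, here is a minimal Python sketch of the two matching rules as read above. The helper names (`matches_literal`, `matches_numeric`, `is_hexdig`) are hypothetical and only illustrate this interpretation of RFC 4234; they are not part of any ABNF library.

```python
def matches_literal(literal, text):
    """Quoted ABNF string such as "abc": US-ASCII, compared ignoring case."""
    return text.isascii() and literal.lower() == text.lower()


def matches_numeric(values, text):
    """Numeric ABNF value such as %d97.98.99: every code point must match exactly."""
    return [ord(ch) for ch in text] == list(values)


def is_hexdig(ch):
    """HEXDIG = DIGIT / "A" / ... / "F", assuming the quoted letters are case-insensitive."""
    if len(ch) != 1:
        return False
    return ch in "0123456789" or any(matches_literal(letter, ch) for letter in "ABCDEF")


# All eight case variants listed in the RFC match the quoted literal.
variants = ["abc", "Abc", "aBc", "abC", "ABc", "aBC", "AbC", "ABC"]
assert all(matches_literal("aBc", v) for v in variants)

# The numeric form only matches the exact lowercase characters.
assert matches_numeric([97, 98, 99], "abc")
assert not matches_numeric([97, 98, 99], "ABC")

# Under this reading, both "c" and "C" are valid hex digits.
assert is_hexdig("c") and is_hexdig("C")
```

Under this reading, both `\u005c` and `\u005C` satisfy the `%x75 4HEXDIG` rule, which agrees with the RFC 4627 prose quoted in the question.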
