json.Unmarshal with a key containing \u0000 (\x00)

Time: 2022-02-08 22:45:57

Here is the Go playground link.

Basically there are some special characters ('\u0000') in my JSON string key:

var j = []byte(`{"Page":1,"Fruits":["5","6"],"\u0000*\u0000_errorMessages":{"x":"123"},"*_successMessages":{"ok":"hi"}}`)

I want to Unmarshal it into a struct:

type Response1 struct {
    Page   int
    Fruits []string
    Msg    interface{} `json:"*_errorMessages"`
    Msg1   interface{} `json:"\\u0000*\\u0000_errorMessages"`
    Msg2   interface{} `json:"\u0000*\u0000_errorMessages"`
    Msg3   interface{} `json:"\0*\0_errorMessages"`
    Msg4   interface{} `json:"\\0*\\0_errorMessages"`
    Msg5   interface{} `json:"\x00*\x00_errorMessages"`
    Msg6   interface{} `json:"\\x00*\\x00_errorMessages"`
    SMsg   interface{} `json:"*_successMessages"`
}

I tried many variants, but none of them works. This link might help: golang.org/src/encoding/json/encode_test.go.

3 solutions

#1


6  

Short answer: With the current json implementation it is not possible using only struct tags.

Note: It's an implementation restriction, not a specification restriction. (It's the restriction of the json package implementation, not the restriction of the struct tags specification.)


Some background: you specified your tags with a raw string literal:

The value of a raw string literal is the string composed of the uninterpreted (implicitly UTF-8-encoded) characters between the quotes...

So the compiler does no unescaping or unquoting of the content of a raw string literal.
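To make the difference concrete, here is a minimal sketch comparing the same escape sequence written as a raw and as an interpreted string literal. The helper name rawAndInterpreted is made up for this illustration.

```go
package main

import "fmt"

// rawAndInterpreted (a hypothetical helper) returns the same escape sequence
// written once as a raw string literal and once as an interpreted one.
func rawAndInterpreted() (raw, interpreted string) {
	raw = `\u0000`         // six bytes: a backslash, 'u' and four '0' digits
	interpreted = "\u0000" // one byte: the zero character U+0000
	return raw, interpreted
}

func main() {
	raw, interpreted := rawAndInterpreted()
	fmt.Println(len(raw), len(interpreted)) // 6 1
}
```

So inside a backquoted struct tag the compiler stores the backslash sequence literally; any unescaping has to happen later, at runtime.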

The convention for struct tag values quoted from reflect.StructTag:

By convention, tag strings are a concatenation of optionally space-separated key:"value" pairs. Each key is a non-empty string consisting of non-control characters other than space (U+0020 ' '), quote (U+0022 '"'), and colon (U+003A ':'). Each value is quoted using U+0022 '"' characters and Go string literal syntax.

What this means is that, by convention, a tag is a list of key:"value" pairs separated by spaces. Keys are quite restricted, but values may be anything, and values should use "Go string literal syntax". That means these values are unquoted at runtime (by a call to strconv.Unquote(), called from StructTag.Get(), in source file reflect/type.go, currently line #809).
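The unquoting step can be tried in isolation. The sketch below calls strconv.Unquote() directly on the quoted value and then compares the result with what reflect.StructTag.Get() returns for the same tag. The helper tagValue is made up for this illustration.

```go
package main

import (
	"fmt"
	"reflect"
	"strconv"
)

// tagValue (a hypothetical helper) looks up key in a struct tag that is
// given as a plain string instead of being attached to a struct field.
func tagValue(tag, key string) string {
	return reflect.StructTag(tag).Get(key)
}

func main() {
	// The quoted value exactly as it appears inside the backquoted tag.
	s, err := strconv.Unquote(`"\u0000_abc"`)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(s), s[0] == 0) // 5 true

	// StructTag.Get() yields the same unquoted string.
	fmt.Println(tagValue(`json:"\u0000_abc"`, "json") == s) // true
}
```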

So there is no need to escape the backslash twice (as in \\u0000). See this simplified version of your example:

type Response1 struct {
    Page   int
    Fruits []string
    Msg    interface{} `json:"\u0000_abc"`
}

Now the following code:

t := reflect.TypeOf(Response1{})
fmt.Printf("%#v\n", t.Field(2).Tag)
fmt.Printf("%#v\n", t.Field(2).Tag.Get("json"))

Prints:

"json:\"\\u0000_abc\""
"\x00_abc"

As you can see, the value part for the json key is "\x00_abc" so it properly contains the zero character.

But how will the json package use this?

The json package uses the value returned by StructTag.Get() (from the reflect package), exactly as we did above. You can see it in the json/encode.go source file, typeFields() function, currently line #1032. So far so good.

Then it calls the unexported json.parseTag() function, in json/tags.go source file, currently line #17. This cuts the part after the comma (which becomes the "tag options").
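Since parseTag() is unexported, it cannot be called from your own code. The following is only a rough sketch of the behaviour described above, not the package's actual source; splitTag is a made-up stand-in, and it uses strings.Cut (Go 1.18 or later).

```go
package main

import (
	"fmt"
	"strings"
)

// splitTag is a hypothetical stand-in for the unexported parseTag(): it
// returns the name before the first comma and the options after it.
func splitTag(tag string) (name, opts string) {
	name, opts, _ = strings.Cut(tag, ",")
	return name, opts
}

func main() {
	name, opts := splitTag("page,omitempty")
	fmt.Printf("%q %q\n", name, opts) // "page" "omitempty"
}
```

Note that this step does not touch the zero character: a name such as "\x00_abc" passes through unchanged, so the rejection has to happen somewhere else.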

And finally the unexported isValidTag() function is called with the resulting name, in source file json/encode.go, currently line #731. This function checks the runes of the passed string and, apart from a set of pre-defined allowed characters "!#$%&()*+-./:<=>?@[]^_{|}~ ", rejects everything that is not a Unicode letter or digit (as defined by unicode.IsLetter() and unicode.IsDigit()):

if !unicode.IsLetter(c) && !unicode.IsDigit(c) {
    return false
} 

'\u0000' is not part of the pre-defined allowed characters, and as you can guess now, it is neither a letter nor a digit:

// Following code prints "INVALID":
c := '\u0000'
if !unicode.IsLetter(c) && !unicode.IsDigit(c) {
    fmt.Println("INVALID")
}

And since isValidTag() returns false, the name (which is the value for the json key, without the "tag options" part) will be discarded (name = "") and not used. So no match will be found for the struct field containing a unicode zero.
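The effect is easy to observe from the outside. In the sketch below (the type tagged and the helper decode are made up for this illustration), the tagged field stays nil for the key with the zero character. As far as I can tell from the source, once the tag name is discarded the package falls back to the Go field name, so the same field is filled by a plain "Msg" key.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tagged is a hypothetical one-field version of the simplified Response1.
type tagged struct {
	Msg interface{} `json:"\u0000_abc"`
}

// decode (a hypothetical helper) unmarshals data into a tagged value.
func decode(data string) (tagged, error) {
	var t tagged
	err := json.Unmarshal([]byte(data), &t)
	return t, err
}

func main() {
	// The key with the zero character is not matched by the tag.
	a, err := decode(`{"\u0000_abc":"x"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(a.Msg) // <nil>

	// The field name itself is matched, because the invalid tag is ignored.
	b, err := decode(`{"Msg":"x"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(b.Msg) // x
}
```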

As an alternative, unmarshal into a map, implement a custom json.Unmarshaler, or use json.RawMessage.
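One way to combine these is a custom UnmarshalJSON() that first decodes the object into a map of json.RawMessage values and then picks out the keys by name. This is only a sketch under my own assumptions: the type Response is a trimmed-down, hypothetical version of Response1, and errKey and decodeResponse are names I made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// errKey is the JSON key from the question, written as an interpreted string
// so that \x00 really is the zero character.
const errKey = "\x00*\x00_errorMessages"

// Response is a hypothetical, trimmed-down version of Response1.
type Response struct {
	Page int
	Msg  interface{}
}

// UnmarshalJSON decodes the object into raw fields first and then picks the
// wanted keys by name, which sidesteps struct tags entirely.
func (r *Response) UnmarshalJSON(data []byte) error {
	var fields map[string]json.RawMessage
	if err := json.Unmarshal(data, &fields); err != nil {
		return err
	}
	if raw, ok := fields["Page"]; ok {
		if err := json.Unmarshal(raw, &r.Page); err != nil {
			return err
		}
	}
	if raw, ok := fields[errKey]; ok {
		if err := json.Unmarshal(raw, &r.Msg); err != nil {
			return err
		}
	}
	return nil
}

// decodeResponse (a hypothetical helper) wraps json.Unmarshal for Response.
func decodeResponse(data string) (Response, error) {
	var r Response
	err := json.Unmarshal([]byte(data), &r)
	return r, err
}

func main() {
	r, err := decodeResponse(`{"Page":1,"\u0000*\u0000_errorMessages":{"x":"123"}}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(r.Page, r.Msg) // 1 map[x:123]
}
```

The cost is that every field has to be handled by hand; for a struct with many ordinary fields it may be simpler to decode those with regular tags and only fetch the odd key from a map.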

But I would strongly discourage using such ugly JSON keys. I understand that you are probably just parsing a JSON response whose format is outside your control, but you should push back against these keys, as they will only cause more problems later on (e.g. if they are stored in a database, it will be very hard to spot the '\u0000' characters when inspecting records, because they may be displayed as nothing).

#2


0  

You cannot do it this way, due to: http://golang.org/ref/spec#Struct_types

But you can unmarshal into a map[string]interface{} and then check the keys of that map with a regexp.
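A minimal sketch of that idea follows. The pattern and the names errKeyPattern and findByPattern are my own assumptions; the pattern simply accepts any key that ends in "_errorMessages", whatever precedes it.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// errKeyPattern is an assumed pattern: any key ending in "_errorMessages".
var errKeyPattern = regexp.MustCompile(`_errorMessages$`)

// findByPattern (a hypothetical helper) decodes data into a map and returns
// a value whose key matches re. Map iteration order is unspecified, so if
// several keys match, any one of them may be returned.
func findByPattern(data []byte, re *regexp.Regexp) (interface{}, bool, error) {
	var m map[string]interface{}
	if err := json.Unmarshal(data, &m); err != nil {
		return nil, false, err
	}
	for k, v := range m {
		if re.MatchString(k) {
			return v, true, nil
		}
	}
	return nil, false, nil
}

func main() {
	j := []byte(`{"Page":1,"\u0000*\u0000_errorMessages":{"x":"123"},"*_successMessages":{"ok":"hi"}}`)
	v, ok, err := findByPattern(j, errKeyPattern)
	if err != nil {
		panic(err)
	}
	fmt.Println(v, ok) // map[x:123] true
}
```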

#3


0  

I don't think this is possible with struct tags. The best thing you can do is unmarshal it into map[string]interface{} and then get the values manually:

var b = []byte(`{"\u0000abc":42}`)
var m map[string]interface{}
err := json.Unmarshal(b, &m)
if err != nil {
    panic(err)
}
fmt.Println(m, m["\x00abc"])

Playground: http://play.golang.org/p/RtS7Nst0d7.
