Note: This question is very close to "Embedding JSON objects in script tags", but the responses to that question provide what I already know (that in JSON, / == \/). I want to know how to do that escaping.
The HTML spec prohibits closing HTML tags anywhere within a <script> element. So, this causes parse errors:
<script>
var assets = [{
"asset_created": null,
"asset_id": "575155948f7d4c4ebccb02d4e8f84d2f",
"body": "<script></script>"
}];
</script>
In my case, I'm generating the invalid situation by rendering a JSON string inside a Django template, i.e.:
<script>
var assets = {{ json_string }};
</script>
I know that JSON parses \/ the same as /, so if I can just escape the closing HTML tags in my JSON string, I'll be good. But I'm not sure of the best way to do this.
My naive approach would just be this:
json_string = '[{"asset_created": null, "asset_id": "575155948f7d4c4ebccb02d4e8f84d2f", "body": "<script></script>"}]'
escaped_json_string = json_string.replace('</', r'<\/')
Is there a better way? Or any gotchas that I'm overlooking?
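As a quick sanity check of the naive approach, here is a minimal sketch using only Python's standard json module. The variable names mirror the snippet above, and the round-trip check is an illustrative addition rather than part of the original code. It confirms that no "</" sequence remains and that a JSON parser reads the escaped string back as the same data:

```python
import json

# Same data as in the question, built as a Python object first.
assets = [{
    "asset_created": None,
    "asset_id": "575155948f7d4c4ebccb02d4e8f84d2f",
    "body": "<script></script>",
}]

json_string = json.dumps(assets)

# Naive escaping: turn every "</" into "<\/" so that the HTML parser
# never sees a closing tag inside the <script> element.
escaped_json_string = json_string.replace("</", "<\\/")

# No closing-tag sequence is left in the output ...
assert "</" not in escaped_json_string
# ... and JSON still decodes "\/" to "/", so the data is unchanged.
assert json.loads(escaped_json_string) == assets
```

This only covers the "</" case; other sequences such as "<!--" are left untouched, which is one reason a more general escaping scheme may be preferable.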
1 Answer
Updated Answer
Okay, I assumed a few things incorrectly. For escaping the JSON, the simplejson library has an encoder class, JSONEncoderForHTML, that can be used. You may need to install simplejson via pip or easy_install if the code doesn't work. Then you can do something like this:
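import simplejson
# Decode the existing JSON string back into Python objects first.
assets_json = simplejson.loads(json_string)
# Re-encode with the HTML-safe encoder.
encoded = simplejson.encoder.JSONEncoderForHTML().encode(assets_json)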
The value of encoded will look something like this (shown as a Python repr, hence the doubled backslashes):
'{"asset_id": "575155948f7d4c4ebccb02d4e8f84d2f", "body": "\\u003cscript\\u003e\\u003c/script\\u003e", "asset_created": null}'
This is a more general solution than the slash replacement because it also escapes the other characters that are unsafe inside HTML (<, > and & are all written as \uXXXX escapes).
The loads call is only needed because the JSON has already been encoded. You can avoid it by not using Django to generate the JSON, if possible, and using simplejson directly instead:
simplejson.dumps(your_object_to_encode, cls=simplejson.encoder.JSONEncoderForHTML)
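If installing simplejson is not an option, the same idea can be approximated with the standard library. The following is a minimal sketch, not simplejson's actual implementation: dumps_for_html is a hypothetical helper name, and it assumes that <, > and & can only occur inside string values in the output of json.dumps, so replacing them globally with \uXXXX escapes does not change the decoded data.

```python
import json

# Characters that are risky inside an HTML <script> element, mapped to
# their JSON \uXXXX escapes (the same three that the answer's output shows
# being escaped, plus "&").
_HTML_UNSAFE = {"<": "\\u003c", ">": "\\u003e", "&": "\\u0026"}


def dumps_for_html(obj):
    """Hypothetical helper: JSON-encode obj, then escape <, > and &."""
    text = json.dumps(obj)
    for char, escape in _HTML_UNSAFE.items():
        text = text.replace(char, escape)
    return text


encoded = dumps_for_html({"body": "<script></script>"})

# The output contains no raw angle brackets, so it cannot close the
# surrounding <script> element, and it still decodes to the same data.
assert "<" not in encoded and ">" not in encoded
assert json.loads(encoded) == {"body": "<script></script>"}
```

Because the escapes are ordinary JSON string escapes, both a JSON parser and a JavaScript engine evaluating the literal should see the original characters.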
Old Answer
Try wrapping your script in CDATA:
<script>
//<![CDATA[
var assets = [{
"asset_created": null,
"asset_id": "575155948f7d4c4ebccb02d4e8f84d2f",
"body": "<script></script>"
}];
//]]>
</script>
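A CDATA section is meant to tell the parser to treat its contents as plain character data rather than markup. Note that this only has an effect when the page is parsed as XHTML/XML; in a regular HTML document the parser still ends the script at the first </script>. Otherwise you'll need to use the character escapes that have been mentioned.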