Escaping double quotes within double quotes

Time: 2022-09-15 15:35:24

I have a string [{"Id":"1","msg":""Lorem Ipsum""}] in which I need to escape only the quotes that sit inside a quoted value, like this: [{"Id":"1","msg":"\"Lorem Ipsum\""}]. I can't modify the code that generates the string, so I'm looking for a regex solution or an efficient Java solution.

I tried selecting matches with \"[^\"]*?(\"*)[^\"]*?\" which is of no use. Any help is really appreciated. Thanks in advance.

Note that the pattern is not guaranteed to be two adjacent double quotes. It can also look like "Lorem "Ipsum" test", which should become "Lorem \"Ipsum\" test".

PS: I've already looked at Regular expression to escape double quotes within double quotes

3 solutions

#1


3  

The problem

A finite automaton - the theoretical equivalent of a regex - can't parse recursive structures. Since you can have inner quotes, and possibly inner-inner quotes, your problem can't be solved using a regex.

Although modern regex engines can overcome this problem with several extensions, don't waste your time hunting for quotes within quotes. You'll soon find out that you're actually building a full-blown JSON parser.

As @johnchen902 stated, even a parser with the power of a Turing machine cannot resolve ambiguities, so you had better not try to guess a fix for the broken JSON.

Solutions

Create the JSON using a dedicated utility

The given string is not valid JSON. It was probably created using string concatenation, which is generally a bad idea because nothing gets escaped along the way. You should use a JSON library that can build JSON from a Java data structure, like gson. Create a list of objects, add an object-to-object dictionary (a map) to it, and let the library do the escaping and conversions.
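To make the difference from plain concatenation concrete, here is a minimal stdlib-only sketch of the escaping step that a library such as gson performs for you. It is an illustration, not a replacement for a real library: the class `JsonBuildSketch` and the helpers `quote` and `toJson` are hypothetical names, and the sketch assumes a flat list of string-to-string maps.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class JsonBuildSketch {
    // Hypothetical helper: wraps a value in quotes and escapes what JSON requires.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    // Remaining control characters become 4-digit hex escapes.
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    // Hypothetical helper: serializes a list of flat string maps as a JSON array.
    static String toJson(List<Map<String, String>> rows) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < rows.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append('{');
            int n = 0;
            for (Map.Entry<String, String> e : rows.get(i).entrySet()) {
                if (n++ > 0) sb.append(',');
                sb.append(quote(e.getKey())).append(':').append(quote(e.getValue()));
            }
            sb.append('}');
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("Id", "1");
        row.put("msg", "\"Lorem Ipsum\""); // the value itself contains quotes
        System.out.println(toJson(List.of(row)));
    }
}
```

Because every key and value passes through `quote`, the inner quotes are escaped at the moment the JSON is produced, which is the only point where their meaning is still known.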

Ask the creator to use a validator

If you received the string from an external source, it's perfectly legitimate to ask for valid JSON you can work with. I guess that the creator stitched strings together, which is the wrong way to produce a structured format. Ask the original creator to use a standard library for creating JSON, or at least to run the output through a validator. All modern programming languages offer these mechanisms.

#2


2  

No, you can't, because a string may have several meanings.

For example:

[{"Id":"1","msg":""Lorem Ipsum""}]

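could also mean a single "Id" whose value is everything between the first quote after the colon and the last quote before the closing brace: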

[{"Id":"1","msg":""Lorem Ipsum""}]

That is, it can be escaped (parsed) as

[{"Id":"1\",\"msg\":\"\"Lorem Ipsum\""}]

There's no way for a program to determine its meaning unless more rules are given.
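As a sketch of what "more rules" could look like, suppose (and this is purely an assumption, not something the question guarantees) that a quote closes a string only when the next non-whitespace character is one of `,`, `:`, `}`, `]`, or the end of the input. Under that rule a single pass with one boolean of state is enough. The class `QuoteRepair` and its methods are hypothetical names.

```java
class QuoteRepair {
    // Escapes every quote that, under the assumed rule, cannot be a string delimiter.
    static String escapeInnerQuotes(String s) {
        StringBuilder out = new StringBuilder(s.length() + 8);
        boolean inString = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (inString && c == '\\' && i + 1 < s.length()) {
                // Leave escape sequences that are already present untouched.
                out.append(c).append(s.charAt(++i));
            } else if (c != '"') {
                out.append(c);
            } else if (!inString) {
                inString = true;      // opening delimiter
                out.append(c);
            } else if (closesString(s, i + 1)) {
                inString = false;     // closing delimiter
                out.append(c);
            } else {
                out.append("\\\"");   // stray inner quote: escape it
            }
        }
        return out.toString();
    }

    // Assumed rule: a closing quote is followed (ignoring whitespace) by a
    // structural character or by the end of the input.
    static boolean closesString(String s, int from) {
        for (int i = from; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!Character.isWhitespace(c)) {
                return c == ',' || c == ':' || c == '}' || c == ']';
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(escapeInnerQuotes("[{\"Id\":\"1\",\"msg\":\"\"Lorem Ipsum\"\"}]"));
        System.out.println(escapeInnerQuotes("\"Lorem \"Ipsum\" test\""));
    }
}
```

The rule picks the first of the two readings shown above, and it silently produces the wrong result whenever the real text contains a quote followed by a comma, colon, brace, or bracket. That is exactly the ambiguity this answer describes, so treat the output as a guess.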

#3


0  

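// Handles only values wrapped in doubled quotes, such as "msg":""Lorem Ipsum""; other inner quotes are left unchanged.
String escaped = str.replaceAll(":\"\"(.+?)\"\"([,}])", ":\"\\\\\"$1\\\\\"\"$2");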
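To see what the one-liner above does and does not cover, here is a small runnable sketch that wraps the same `replaceAll` call in a hypothetical helper, `escapeDoubled`, and feeds it both shapes of input from the question.

```java
class RegexLimit {
    // Same replaceAll call as in the answer, wrapped in a hypothetical helper.
    static String escapeDoubled(String str) {
        return str.replaceAll(":\"\"(.+?)\"\"([,}])", ":\"\\\\\"$1\\\\\"\"$2");
    }

    public static void main(String[] args) {
        // Doubled quotes around the whole value: the inner pair gets escaped.
        System.out.println(escapeDoubled("[{\"Id\":\"1\",\"msg\":\"\"Lorem Ipsum\"\"}]"));
        // Quotes in the middle of the value: the pattern never matches,
        // so the input comes back unchanged and is still invalid JSON.
        System.out.println(escapeDoubled("[{\"msg\":\"Lorem \"Ipsum\" test\"}]"));
    }
}
```

The pattern requires a colon followed immediately by two quotes, so it fits the first example in the question but not the "Lorem "Ipsum" test" case mentioned in the note.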
