Find and replace all occurrences of a phrase in a JSON string using capture groups

Time: 2021-09-15 19:26:09

I have a stringified JSON which looks like this:

...

"message":null,"elementId":["xyz1","l9ie","xyz1"]}}]}], "startIndex":"1",
"transitionTime":"3","sourceId":"xyz1","isLocked":false,"autoplay":false
,"mutevideo":false,"loopvideo":false,"soundonhover":false,"videoCntrlVisibility":0,
...,"elementId":["dgff","xyz1","jkh9"]}}]}]

... it goes on.

The part I need to work on is the value of the elementId key. (The 2nd key in the first line, and the last key).

This key is present in multiple places in the JSON string. The value of this key is an array containing 4-character ids.

I need to replace one of these ids with a new one.

The kernel of the idea is something like:

var elemId = 'xyz1' // for instance
var regex = new RegExp(elemId, 'g');
var newString = jsonString.replace(regex, newRandomId);
jsonString = newString;

There are a couple of problems with this approach. The regex will match the id anywhere in the JSON. I need a regex which matches it only inside the elementId array, and nowhere else.

I'm trying to use a capturing group to match just the occurrences I need, but I can't quite crack it. I have:

/.*elementId":\[".*(xyz1).*"\]}}]/

But this doesn't match the 1st occurrence of 'xyz1' in the array.

So, firstly, I need a regex which can match all the 'xyz1's inside elementId; but nowhere else. The sequence of square and curly brackets after elementId ends doesn't change anywhere in the string, if that helps.

Secondly, even if I have a capturing group that works, string.replace doesn't act as expected. Instead of replacing just the match inside the capturing group, it replaces the whole match.

So, my second requirement is replacing only the captured groups, not the whole match.

What I need is a piece of JS code which will replace my 'xyz1's where needed and return the following string (assuming the newRandomId is 'abcd'):

"message":null,"elementId":["abcd","l9ie","abcd"]}}]}], "startIndex":"1",
"transitionTime":"3","sourceId":"xyz1","isLocked":false,"autoplay":false
,"mutevideo":false,"loopvideo":false,"soundonhover":false,"videoCntrlVisibility":0,
...,"elementId":["dgff","abcd","jkh9"]}}]}]

Note that the value of 'sourceId' is unaffected.

EDIT: I have to work with the JSON string. I can't parse it and work with the object, since I don't know all the places the old id might appear in the object, and looping through it multiple times (for multiple elements) would be time-consuming.

1 solution

#1


Assuming you can't just parse and change the JS object, you could use 2 regexes: one to extract the array and one to change the desired ids inside it:

var output = input.replace(/("elementId"\s*:\s*\[)((?:".{4}",?)*)(\])/g, function(_,start,content,end){
  return start + content.replace(/"xyz1"/g, '"rand"') + end;
});

The arguments _, start, content and end are supplied by the regex match (see the documentation for String.prototype.replace with a function as the replacement):

  • _ is the whole matched string (from "elementId":[ to ]). I chose this name because it's an old convention for arguments you don't use
  • start is the first group ("elementId":[)
  • content is the second captured group, that is, the internal part of the array
  • end is the third group, ]
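To make the argument list concrete, here is a minimal sketch that records what the callback receives. The input string `sample`, the array `seen` and the replacement id 'abcd' are made up for illustration; the pattern is the one from the answer.

```javascript
// Hypothetical input: only the "elementId" array should be touched.
const sample = '{"elementId":["xyz1","l9ie"],"sourceId":"xyz1"}';
const pattern = /("elementId"\s*:\s*\[)((?:".{4}",?)*)(\])/g;

const seen = [];
const result = sample.replace(pattern, (whole, start, content, end) => {
  // Record the pieces so they can be inspected afterwards.
  seen.push({ whole, start, content, end });
  // Only the middle group is rewritten; start and end are returned as matched.
  return start + content.replace(/"xyz1"/g, '"abcd"') + end;
});

console.log(seen[0]);
console.log(result);
```

The value returned by the callback replaces the whole match, which is why the unchanged `start` and `end` groups have to be concatenated back around the modified `content`.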

Using the groups instead of hardcoding the start and end parts in the returned string serves two purposes:

  • avoid duplication (DRY principle)
  • make it possible to have variable strings (for example, in my regex I accept optional spaces around the :)
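The second point can be checked with a small sketch. The helper name `replaceId` and the three input strings are hypothetical; the pattern is the one from the answer. Note that, as written, the pattern tolerates whitespace around the colon but not between array items.

```javascript
const arrayRe = /("elementId"\s*:\s*\[)((?:".{4}",?)*)(\])/g;

// Hypothetical helper wrapping the two-step replacement from the answer.
function replaceId(json) {
  return json.replace(arrayRe, (_, start, content, end) =>
    start + content.replace(/"xyz1"/g, '"abcd"') + end);
}

const compact = replaceId('"elementId":["xyz1"]');
const spaced = replaceId('"elementId" : ["xyz1"]');
// A space after the comma is not covered by (?:".{4}",?)*, so nothing matches here.
const spacedItems = replaceId('"elementId":["xyz1", "l9ie"]');

console.log(compact);
console.log(spaced);
console.log(spacedItems);
```

Because `start` is returned exactly as it was matched, the original spacing around the colon survives the replacement.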

var input = document.getElementById("input").innerHTML.trim();
var output = input.replace(/("elementId":\s*\[)((?:".{4}",?)*)(\])/g, function(_,start,content,end){
  return start + content.replace(/"xyz1"/g, '"rand"') + end;
});

document.getElementById("output").innerHTML = output;
Input:
<pre id=input>
"message":null,"elementId":["xyz1","l9ie","xyz1"]}}]}], "startIndex":"1", 
"transitionTime":"3","sourceId":"xyz1","isLocked":false,"autoplay":false
,"mutevideo":false,"loopvideo":false,"soundonhover":false,"videoCntrlVisibility":0,
...,"elementId":["dgff","xyz1","jkh9"]}}]}]
</pre>
Output:
<pre id=output>
</pre>
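For comparison only: if parsing turns out to be acceptable after all, a reviver function passed to JSON.parse visits every key in a single pass, so the positions of the elementId arrays do not need to be known in advance. This is a sketch under that assumption, not part of the regex approach above; the input `text` is a hypothetical, well-formed document, since the fragment in the question is not complete JSON.

```javascript
// Hypothetical well-formed input with the same keys as the question.
const text = '{"slides":[{"elementId":["xyz1","l9ie","xyz1"],"sourceId":"xyz1"}]}';

const parsed = JSON.parse(text, (key, value) =>
  // Rewrite ids only when the current key is "elementId" and holds an array.
  key === 'elementId' && Array.isArray(value)
    ? value.map(id => (id === 'xyz1' ? 'abcd' : id))
    : value
);

const outputText = JSON.stringify(parsed);
console.log(outputText);
```

The "sourceId" value is left alone because the reviver only rewrites arrays stored under the "elementId" key.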

Notes:

  • it would be easy to do the whole operation in one regex if the searched id could not be repeated within one array. But the present structure makes it easy to handle several ids to replace at once.
  • I use non-capturing groups (?:...) in order to unclutter the arguments passed to the outer replacing callback
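As a sketch of the "several ids at once" remark in the first note, the inner replacement can look each id up in a table instead of hardcoding one. The names `idMap`, `source` and `updated`, and the replacement ids, are assumptions made up for this example.

```javascript
// Hypothetical mapping from old ids to new ids.
const idMap = new Map([['xyz1', 'abcd'], ['dgff', 'wxyz']]);
const outerRe = /("elementId"\s*:\s*\[)((?:".{4}",?)*)(\])/g;

const source = '"elementId":["dgff","xyz1","jkh9"],"sourceId":"xyz1"';
const updated = source.replace(outerRe, (_, start, content, end) =>
  start +
  // Inner pass: visit every quoted 4-character id and swap it if it is in the map.
  content.replace(/"([^"]{4})"/g, (quoted, id) =>
    idMap.has(id) ? '"' + idMap.get(id) + '"' : quoted) +
  end);

console.log(updated);
```

Ids that are not in the map ('jkh9' here) and values outside the array ('sourceId') are returned unchanged.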
