Need help using a regex to extract strings from a JSON string in text (Python)

Time: 2022-09-13 11:45:42

Hi, I'm trying to extract the moderator names from a simple piece of JSON like this:

{
  "_links": {},
  "chatter_count": 2,
  "chatters": {
    "moderators": [
      "nightbot",
      "vivbot"
    ],
    "staff": [],
    "admins": [],
    "global_mods": [],
    "viewers": []
  }
}

I've been trying to grab the moderators using \"moderators\":\s*[(\s*\"\w*\"\,)\s*] but with no success. I'm using regex rather than JSON parsing mostly for the challenge.
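A likely reason the pattern above fails: the unescaped [ ... ] is read as a character class, so it matches a single character instead of the array. The sketch below shows that behaviour and one possible single-regex fix with the brackets escaped. The names SAMPLE, broken, fixed and names are made up for this illustration, and the fix assumes moderator names contain only \w characters.

```python
import re

# Sample payload copied from the question.
SAMPLE = '''{
  "_links": {},
  "chatter_count": 2,
  "chatters": {
    "moderators": [
      "nightbot",
      "vivbot"
    ],
    "staff": [],
    "admins": [],
    "global_mods": [],
    "viewers": []
  }
}'''

# The pattern from the question: the unescaped [...] is a character class,
# so the match ends one character after the colon and never reaches the names.
broken = re.search(r'\"moderators\":\s*[(\s*\"\w*\"\,)\s*]', SAMPLE)

# Hypothetical fix: escape the brackets and repeat the quoted-name group.
# Assumes names are made of word characters only.
fixed = re.search(r'"moderators":\s*\[((?:\s*"\w*",?)*)\s*\]', SAMPLE)

# Pull the individual names out of the captured array body.
names = re.findall(r'"(\w*)"', fixed.group(1)) if fixed else []
print(names)
```

A repeated capture group only keeps its last repetition, which is why the sketch captures the whole array body once and then runs a second findall over it.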

1 solution

#1

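import re

# `string` holds the raw JSON text shown in the question.
# Capture everything between "moderators" and the closing bracket of its array.
first = re.compile(r'moderators.*?\[([^\]]*)', re.I)
# Pull each double-quoted name out of that captured span.
second = re.compile(r'"(.*?)"')

moderators = []
for array_body in first.findall(string):
    moderators.extend(second.findall(array_body))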

This should do the trick

The first regular expression captures everything between the square brackets that follow "moderators". The second regular expression extracts each quoted string from that capture.

I split it into two regular expressions for readability and ease of writing.
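For anyone who wants to run the two-step approach end to end, here is a minimal self-contained sketch. It uses the same two patterns as above; the SAMPLE constant and the extract_moderators helper are names invented for this example.

```python
import re

# Sample payload copied from the question.
SAMPLE = '''{
  "_links": {},
  "chatter_count": 2,
  "chatters": {
    "moderators": [
      "nightbot",
      "vivbot"
    ],
    "staff": [],
    "admins": [],
    "global_mods": [],
    "viewers": []
  }
}'''

# Step 1: capture the text between "[" and the first "]" after "moderators".
ARRAY_RE = re.compile(r'moderators.*?\[([^\]]*)', re.I)
# Step 2: capture whatever sits between each pair of double quotes.
NAME_RE = re.compile(r'"(.*?)"')


def extract_moderators(text):
    """Return every quoted string found inside the moderators array."""
    names = []
    for array_body in ARRAY_RE.findall(text):
        names.extend(NAME_RE.findall(array_body))
    return names


print(extract_moderators(SAMPLE))  # ['nightbot', 'vivbot']
```

Keep in mind this is only as robust as the patterns: a name containing an escaped quote or a "]" would be split or cut short, which is one reason a real parser is usually the safer choice.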

Now, using the json module, you could do this much more easily:

import json
a = json.loads(string)
moderators = a['chatters']['moderators']
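A runnable version of the json approach might look like the sketch below. SAMPLE and get_moderators are names made up for the example, and falling back to an empty list when a key is missing is an assumption on my part, not something the original asks for.

```python
import json

# Sample payload copied from the question.
SAMPLE = '''{
  "_links": {},
  "chatter_count": 2,
  "chatters": {
    "moderators": ["nightbot", "vivbot"],
    "staff": [],
    "admins": [],
    "global_mods": [],
    "viewers": []
  }
}'''


def get_moderators(text):
    """Parse the JSON text and return the moderators list (empty if absent)."""
    data = json.loads(text)
    # .get() avoids a KeyError when "chatters" or "moderators" is missing.
    return data.get("chatters", {}).get("moderators", [])


print(get_moderators(SAMPLE))  # ['nightbot', 'vivbot']
```

Malformed input raises json.JSONDecodeError, so wrap the call in try/except if the text comes from an unreliable source.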
