I'm looking for a regex to extract a JSON string from text. The text below contains JSON objects with the keys mTitle, mPoster, mYear and mDate, like this:
{"999999999":"138138138","020202020202":{"846":{"mTitle":"\u0430","mPoster":{"
small":"\/upload\/ms\/b_248.jpg","middle":"600.jpg","big":"400.jpg"},"mYear"
:"2013","mDate":"2014-01-01"},"847":{"mTitle":"\u043a","mPoster":"small":"\/upload\/ms\/241.jpg","middle":"600.jpg","big":"
138.jpg"},"mYear":"2013","mDate":"2013-12-26"},"848":{"mTitle":"\u041f","mPoster":{"small":"\/upload\/movies\/2
40.jpg","middle":"138.jpg","big":"131.jpg"},"mYear":"2013","mDate":"2013-12-19"}}}
To parse the JSON, I first need to extract it from the surrounding text. Could you help me get only the JSON string out of the text?
I've tried this regular expression with no success:
{"mTitle":(\w|\W)*"mDate":(\w|\W)*}
1 Answer
The following regex should work:
\{\s*"mTitle"\s*:\s*(.+?)\s*,\s*"mPoster":\s*(.+?)\s*,\s*"mYear"\s*:\s*(.+?)\s*,\s*"mDate"\s*:\s*(.+?)\s*\}
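As a rough illustration of how the pattern above could be applied, here is a minimal Python sketch. The sample text is a shortened, one-line copy of two entries from the question, and the helper name extract_movies is hypothetical. It assumes the four keys always appear in this order and that the mDate value contains no closing brace; if the whole text is itself valid JSON, parsing it directly is usually simpler than using a regex.

```python
import json
import re

# Pattern from the answer, unchanged (split over two lines for readability).
PATTERN = re.compile(
    r'\{\s*"mTitle"\s*:\s*(.+?)\s*,\s*"mPoster":\s*(.+?)\s*,'
    r'\s*"mYear"\s*:\s*(.+?)\s*,\s*"mDate"\s*:\s*(.+?)\s*\}'
)

# Shortened sample modeled on the question's text (entries 846 and 848).
TEXT = (
    '{"999999999":"138138138","020202020202":{'
    '"846":{"mTitle":"\\u0430","mPoster":{"small":"\\/upload\\/ms\\/b_248.jpg",'
    '"middle":"600.jpg","big":"400.jpg"},"mYear":"2013","mDate":"2014-01-01"},'
    '"848":{"mTitle":"\\u041f","mPoster":{"small":"\\/upload\\/movies\\/240.jpg",'
    '"middle":"138.jpg","big":"131.jpg"},"mYear":"2013","mDate":"2013-12-19"}}}'
)


def extract_movies(text):
    """Return every matched {"mTitle": ..., "mDate": ...} object, parsed as a dict."""
    # group(0) is the whole matched object, which is itself valid JSON here.
    return [json.loads(match.group(0)) for match in PATTERN.finditer(text)]


movies = extract_movies(TEXT)
for movie in movies:
    print(movie["mYear"], movie["mDate"], movie["mPoster"]["small"])
```

Each match is handed to json.loads, so escapes such as \u0430 and \/ are decoded by the JSON parser rather than by the regex.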
The main difference from your regex is the .+? part, which, broken down, means:
- Match any character (.)
- One or more times (+)
- As few times as possible (?)
The ? operator after the + is very important here: if you removed it, the first .+ (in \{\s*"mTitle"\s*:\s*(.+?)) would become greedy and match as much text as it can, running past the first "mPoster" to the last one on the line instead of stopping at the first "mPoster", which is what you want.
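The difference between the lazy and the greedy quantifier can be seen in a small Python sketch. The two-record text here is a hypothetical, much shorter stand-in for the text in the question, and the patterns are cut down to the first capture group only.

```python
import re

# Hypothetical two-record sample, much shorter than the text in the question.
text = '{"mTitle":"A","mPoster":"a.jpg"},{"mTitle":"B","mPoster":"b.jpg"}'

# Lazy: stop at the first ,"mPoster" that follows.
lazy = re.search(r'\{"mTitle":(.+?),"mPoster"', text)

# Greedy: take as much as possible, then back off to the last ,"mPoster".
greedy = re.search(r'\{"mTitle":(.+),"mPoster"', text)

print(lazy.group(1))    # "A"
print(greedy.group(1))  # "A","mPoster":"a.jpg"},{"mTitle":"B"
```

The greedy version swallows the whole first record and the start of the second one, so the capture group no longer holds just the title.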
Notice it is just a more complicated version of \{"mTitle":(.+?),"mPoster":(.+?),"mYear":(.+?),"mDate":(.+?)\} (with \s* added to match the spaces that JSON notation allows).
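To see what the added \s* buys, the following Python sketch runs both versions against the same hypothetical record, written once without spaces and once with optional spaces. One detail of the pattern as written: it has no \s* between "mPoster" and its colon, so the spaced sample keeps those two together.

```python
import re

# Simplified version, with no whitespace handling.
COMPACT = re.compile(
    r'\{"mTitle":(.+?),"mPoster":(.+?),"mYear":(.+?),"mDate":(.+?)\}'
)

# Version from the answer, with \s* around the punctuation.
TOLERANT = re.compile(
    r'\{\s*"mTitle"\s*:\s*(.+?)\s*,\s*"mPoster":\s*(.+?)\s*,'
    r'\s*"mYear"\s*:\s*(.+?)\s*,\s*"mDate"\s*:\s*(.+?)\s*\}'
)

# Hypothetical records: the same data without and with optional spaces.
tight = '{"mTitle":"A","mPoster":"a.jpg","mYear":"2013","mDate":"2014-01-01"}'
spaced = '{ "mTitle" : "A" , "mPoster": "a.jpg" , "mYear" : "2013" , "mDate" : "2014-01-01" }'

print(COMPACT.search(tight).groups())    # four captured values
print(TOLERANT.search(tight).groups())   # the same four values
print(COMPACT.search(spaced))            # None: the spaces break the match
print(TOLERANT.search(spaced).groups())  # the same four values again
```

On compact input the two patterns capture the same values; only the version with \s* still matches once spaces appear around the braces, colons and commas.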