Let's say I have a string that can be formatted a few different ways, for example:
- "languages:(ruby AND python) role:(software engineer or data scientist)"
- "role:(software engineer or data scientist) languages:(ruby AND python)"
- "languages:'python' role:'software engineer'"
- "languages:(ruby AND python)role:(software engineer or data scientist)"
- "languages:'python'role:'software engineer'"
- "languages:'python'"
I want to parse this string, check whether role: is present, and if so capture the word(s) that belong to role, i.e. the text wrapped in parentheses ( ) or single quotes ', without the delimiters themselves. For example, "languages:'python'role:'software engineer'" would return "software engineer", and "role:(software engineer or data scientist) languages:(ruby AND python)" would return "software engineer or data scientist".
Is there a way to do this with something like a word boundary? Specifically, the region after the match on role: would be delimited by either quotes or ().
1 solution
#1
3
You may use
s.scan(/role:(?:\(\K[^()]+(?=\))|'\K[^']+(?='))/)
See the regex demo
Details
- role: - a literal substring
- (?: - start of a non-capturing group with two alternatives:
  - \( - a literal ( char
  - \K - match reset operator that discards the text matched so far
  - [^()]+ - 1+ chars other than ( and )
  - (?=\)) - a ) must follow the current position
- | - or
  - ' - a literal ' char
  - \K - match reset operator that discards the text matched so far
  - [^']+ - 1+ chars other than '
  - (?=') - there must be a ' char immediately to the right
- ) - end of the alternation group.
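The same pattern can also be written in Ruby's extended (/x) mode, where whitespace and # comments inside the regex are ignored, so each piece described above sits on its own line. This is only a sketch for readability; the constant name ROLE_VALUE is made up for illustration and is not part of the original answer.

```ruby
# ROLE_VALUE is a hypothetical name; the pattern is the one from the answer,
# spread over several lines with the /x (extended) flag.
ROLE_VALUE = /
  role:                   # literal prefix
  (?:
    \( \K [^()]+ (?=\))   # value in parentheses, closing paren required
    |
    ' \K [^']+ (?=')      # value in single quotes, closing quote required
  )
/x

p "languages:(ruby AND python)role:(software engineer or data scientist)".scan(ROLE_VALUE)
# => ["software engineer or data scientist"]
p "languages:'python'".scan(ROLE_VALUE)
# => []
```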
NOTE: if you do not care whether the closing ) or trailing ' is present, remove the lookaheads to simplify the regex.
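As a rough illustration of that note, the sketch below compares the pattern with and without the lookaheads. The variable names and the unterminated sample input are assumptions made up for this example.

```ruby
# Strict pattern from the answer: the closing delimiter must be present.
strict = /role:(?:\(\K[^()]+(?=\))|'\K[^']+(?='))/
# Simplified pattern: lookaheads dropped, so the closing delimiter is optional.
relaxed = /role:(?:\(\K[^()]+|'\K[^']+)/

complete   = "languages:'python'role:'software engineer'"
# Hypothetical input with no closing paren after the role value.
incomplete = "languages:(ruby AND python) role:(software engineer"

p complete.scan(strict)     # => ["software engineer"]
p complete.scan(relaxed)    # => ["software engineer"]
p incomplete.scan(strict)   # => []
p incomplete.scan(relaxed)  # => ["software engineer"]
```

Both patterns agree on well-formed input; they differ only when the closing ) or ' is missing.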
Ruby demo:
s = "languages:(ruby AND python) role:(software engineer or data scientist) role:(software engineer or data scientist) languages:(ruby AND python) languages:'python' role:'software engineer' languages:(ruby AND python)role:(software engineer or data scientist) languages:'python'role:'software engineer' languages:'python'"
puts s.scan(/role:(?:\(\K[^()]+(?=\))|'\K[^']+(?='))/)
Output:
software engineer or data scientist
software engineer or data scientist
software engineer
software engineer or data scientist
software engineer
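If each string holds at most one role: term and you also want to know whether it is present at all, one possible variant is to use capture groups with Regexp#match instead of \K and scan. This is a sketch, not the answer's method; the names ROLE_PATTERN and extract_role are hypothetical.

```ruby
# Hypothetical helper: returns the first role value, or nil when no
# well-formed role:(...) or role:'...' term is found in the query.
ROLE_PATTERN = /role:(?:\(([^()]+)\)|'([^']+)')/

def extract_role(query)
  match = ROLE_PATTERN.match(query)
  return nil unless match

  # Group 1 is set for the parenthesized form, group 2 for the quoted form.
  match[1] || match[2]
end

p extract_role("languages:'python'role:'software engineer'")
# => "software engineer"
p extract_role("role:(software engineer or data scientist) languages:(ruby AND python)")
# => "software engineer or data scientist"
p extract_role("languages:'python'")
# => nil
```

Returning nil for a missing role: keeps the presence check and the extraction in a single call.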