I need to find all pairs of words joined by the word "and".
So far I have tried the following:
val salute = """.*?(\w+\W+)and(\W+\w+).*""".r
val salute(a,b) = "hello ladies and gentlemen, mesdames and messieurs, how are you?"
a: String = "ladies "
b: String = " gentlemen"
Now I would like something like this:
salute.findAllMatches("hello ladies and gentlemen, mesdames and messieurs, how are you?")
List[(java.lang.String, java.lang.String)] = List((ladies,gentlemen), (mesdames,messieurs))
I tried this:
salute.findAllIn("hello ladies and gentlemen, mesdames and messieurs, how are you?").toList
res14: List[String] = List(hello ladies and gentlemen, mesdames and messieurs, how are you?)
But, as you can see, without success: the whole string comes back as a single match.
2 Answers
#1
3
Your regex
.*?(\w+\W+)and(\W+\w+).*
already matches the whole string because of the leading .*? and the trailing .*, so findAllIn can only return one match. Change it to the following (or something similar, depending on your requirements):
(\w+\W+)and(\W+\w+)
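For illustration, here is a minimal sketch of how that shortened pattern could be turned into the list of tuples the question asks for. The names AndPairs and findPairs are made up for this example, and the trim calls are my own addition, needed because both groups also capture the separator characters next to "and".

```scala
import scala.util.matching.Regex

object AndPairs {
  // Shortened pattern: no leading or trailing .*, so each match spans one pair.
  val pair: Regex = """(\w+\W+)and(\W+\w+)""".r

  // Hypothetical helper: one (left, right) tuple per non-overlapping match.
  // trim removes only the whitespace that the groups pick up via \W+.
  def findPairs(text: String): List[(String, String)] =
    pair.findAllMatchIn(text).map(m => (m.group(1).trim, m.group(2).trim)).toList
}
```

Calling AndPairs.findPairs on the sentence from the question should give List((ladies,gentlemen), (mesdames,messieurs)). Note that trim would leave punctuation such as a comma in place, so a stricter pattern may be preferable if the input is messier.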
#2
0
To get the result as a list of tuples, as you describe above, you could do these two things:
First, change your regex so that it is less greedy, i.e. so that it does not consume the whole string at once. For example:
"""(\w+) and (\w+)""".r
Second, use findAllIn and apply the regex as an extractor to each match to get the parts in the capturing parentheses.
Putting everything together, a solution that produces the desired result might look like this:
val salute = """(\w+) and (\w+)""".r
val string = "hello ladies and gentlemen, mesdames and messieurs, how are you?"
val results = for {
  salute(left, right) <- salute.findAllIn(string)
} yield (left, right)
println(results.toList)
which results in
List((ladies,gentlemen), (mesdames,messieurs))
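One caveat with this approach: findAllIn returns non-overlapping matches, so in a chain such as "a and b and c" the word b is consumed by the first match and the pair (b, c) is never reported. If such chains matter (that is my assumption, the question does not say so), a possible variation captures the right-hand word inside a lookahead, so it is read but not consumed and can serve as the left-hand word of the next match. ChainedAndPairs and findPairs are hypothetical names used only for this sketch.

```scala
import scala.util.matching.Regex

object ChainedAndPairs {
  // The right-hand word sits inside a lookahead (?=...): it is captured as
  // group 2 but not consumed, so the next search can start at that word.
  val pair: Regex = """(\w+) and (?=(\w+))""".r

  // Hypothetical helper: returns every adjacent pair, including chained ones.
  def findPairs(text: String): List[(String, String)] =
    pair.findAllMatchIn(text).map(m => (m.group(1), m.group(2))).toList
}
```

For the sentence in the question this should return the same two pairs as before; for "a and b and c" it should return both (a,b) and (b,c).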