Scala: splitting a multiline string on lines that consist entirely of hyphens

Time: 2022-02-02 21:44:38

I have a multiline string that contains:

this is line 1

------------------------------

this is line 2

+++++++++++++++++++++++++

this is line 3 

---------------

this is line 4

I want to divide this string into chunks by splitting on lines that contain only hyphens or only plus signs. I tried the regular expression (^\++$)|(^-+$), which worked fine in online regex validators, but it does not work in Scala.

1 solution

#1


By default, ^ and $ match only at the boundaries of the whole input. You need the multiline modifier (?m) to make ^ match at the start of each line and $ at the end of each line. Also, enclosing the pattern in \s* (zero or more whitespace characters) trims the items in the resulting list:

// text holds the multiline string from the question
val rx = """(?m)\s*^(\++|-+)$\s*"""
val res = text.split(rx)
print(res.toList)
// => List(this is line 1, this is line 2, this is line 3, this is line 4)
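
To see the effect of the modifier in isolation, here is a minimal script-style sketch (for the REPL or a worksheet) that runs the same pattern with and without (?m). The names text, withoutFlag and withFlag are illustrative, and the sample input is assumed to be the string from the question with one separator line between the text lines.

```scala
// Sample input, rebuilt line by line so the trailing space on line 3 stays visible.
val text = Seq(
  "this is line 1",
  "------------------------------",
  "this is line 2",
  "+++++++++++++++++++++++++",
  "this is line 3 ", // trailing space, as in the question
  "---------------",
  "this is line 4"
).mkString("\n")

// Without (?m), ^ only matches at the very start of the input, so no
// separator line is found and split returns the input as a single element.
val withoutFlag = text.split("""\s*^(\++|-+)$\s*""").toList

// With (?m), ^ and $ match at every line boundary, so each separator line
// (plus the surrounding whitespace) becomes a split point.
val withFlag = text.split("""(?m)\s*^(\++|-+)$\s*""").toList

println(withoutFlag.size) // 1
println(withFlag)         // List(this is line 1, this is line 2, this is line 3, this is line 4)
```

The trailing space after "this is line 3" disappears in the second result because the leading \s* consumes it together with the line break.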

Note that I also shortened the pattern by using a single group, ^(\++|-+)$. It matches the start of a line, then either one or more plus signs or one or more hyphens, and then the end of the line, so there is no need to repeat ^ and $ in each alternative.
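
A quick way to check what the group accepts is to test it against single lines. The sketch below uses String.matches, which requires the whole string to match, so the anchors are implicit. The val names are illustrative, and the character class [-+]+ is shown only as a contrast, not as part of the original answer.

```scala
// The group from the answer: a run of plus signs OR a run of hyphens.
val oneSymbolRun = """(\++|-+)"""

// A looser variant for comparison: any mix of plus signs and hyphens.
val anyMix = """[-+]+"""

val hyphensOnly = "-----".matches(oneSymbolRun) // true
val plusesOnly  = "+++".matches(oneSymbolRun)   // true
val mixedStrict = "-+-+".matches(oneSymbolRun)  // false: the line mixes both symbols
val mixedLoose  = "-+-+".matches(anyMix)        // true
val emptyLine   = "".matches(oneSymbolRun)      // false: at least one symbol is required

println(List(hyphensOnly, plusesOnly, mixedStrict, mixedLoose, emptyLine))
// List(true, true, false, true, false)
```

If separator lines in your data can mix both symbols, the character class is the form to use. For the input in the question, both behave the same.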

Another solution is to split the string on line breaks and then filter out the empty lines and the lines that consist only of plus signs or only of hyphens:

print(text.split("\\r?\\n").filterNot(_.matches("""(\++|-+)?""")).toList)
// => List(this is line 1, this is line 2, this is line 3 , this is line 4)
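
The filter above returns a flat list of lines, which is enough when every chunk is a single line, as in the question. If a chunk may span several lines, the lines between separators can be grouped instead. The following is a rough sketch under that assumption. The helper chunks and the sample input are hypothetical and not part of the original answer.

```scala
// A separator line is a run of plus signs or a run of hyphens.
val separator = """\++|-+"""

// Hypothetical helper: groups consecutive non-separator lines into chunks,
// starting a new chunk at every separator line.
def chunks(text: String): List[List[String]] =
  text.split("\\r?\\n").toList
    .foldLeft(List(List.empty[String])) { (acc, line) =>
      if (line.matches(separator)) List.empty[String] :: acc // open a new chunk
      else if (line.trim.isEmpty) acc                        // skip blank lines
      else (line :: acc.head) :: acc.tail                    // prepend to the current chunk
    }
    .map(_.reverse)     // restore line order inside each chunk
    .filter(_.nonEmpty) // drop empty chunks left by leading or adjacent separators
    .reverse            // restore chunk order

val sample = "a\nb\n-----\nc\n+++\n\nd\ne"
println(chunks(sample)) // List(List(a, b), List(c), List(d, e))
```

Lines are prepended and reversed at the end because prepending to a List is constant time, while appending is linear.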
