How to remove a substring between two specific characters in Scala

Date: 2022-08-11 17:06:41

I have this List in Scala:

List[String] = List([[aaa|bbb]], [[ccc|ddd]], [[ooo|sss]])

I want to obtain the same List with the substring between | and ]] removed, along with the | itself.

So the result would be:

List[String] = List([[aaa]], [[ccc]], [[ooo]])

I tried building a single String from the List and calling replaceAll on it, but I want to keep the result as a List.

Thanks.

3 Solutions

#1


3  

You can use a simple \|.*?]] regex to match the substrings you need to remove.

Here is a way to perform the replacement in Scala code:

val l = List[String]("[[aaa|bbb]]", "[[ccc|ddd]]", "[[ooo|sss]]")
println(l.map(x => x.replaceAll("""\|.*?(]])""", "$1"))) 

I added a capturing group around ]] and used a $1 backreference in the replacement pattern to insert the ]] back into the result.

Details:

  • \| - a literal | pipe symbol (since it is a special char outside of a character class, it must be escaped)
  • .*? - any zero or more symbols other than line break symbols, as few as possible (lazy matching)
  • (]]) - Group 1 capturing the ]] substring (note that ] outside of a character class does not need escaping, which is just the opposite of the case with |).
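
To see why the lazy .*? matters, here is a minimal sketch that compares it with a greedy .* on a string containing two links. The object name LazyVsGreedy, the helper names, and the two-link input are assumptions made for illustration; they are not part of the original question.

```scala
object LazyVsGreedy {
  // Lazy quantifier: each match stops at the first "]]" after a pipe.
  def stripLazy(s: String): String =
    s.replaceAll("""\|.*?(]])""", "$1")

  // Greedy quantifier, shown only for contrast: the match runs from the
  // first pipe to the last "]]" in the string.
  def stripGreedy(s: String): String =
    s.replaceAll("""\|.*(]])""", "$1")

  def main(args: Array[String]): Unit = {
    // Hypothetical input with two links in one string.
    val line = "[[aaa|bbb]] and [[ccc|ddd]]"
    println(stripLazy(line))   // [[aaa]] and [[ccc]]
    println(stripGreedy(line)) // [[aaa]]
  }
}
```

With one link per list element, as in the question, both variants give the same result; the difference only shows up when a single string holds several links.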

#2


4  

Here is a simple solution that should also perform quite well:

val list = List("[[aaa|bbb]]", "[[ccc|ddd]]", "[[ooo|sss]]")
list.map(str => str.takeWhile(_ != '|') + "]]" )

It assumes that the format of the strings is:

  • Two left square brackets [ at the beginning,
  • then the word we want to extract,
  • and then a pipe |.
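
If some elements might not contain a pipe, the takeWhile version would append an extra ]] to them. The following is a minimal sketch of a guarded variant, assuming that strings without a pipe should be returned unchanged; the object name PipeGuard, the helper stripAfterPipe, and the sample input are hypothetical.

```scala
object PipeGuard {
  // Cut the string at the first '|' and re-append the closing "]]".
  // Assumption: a string with no pipe is already in the desired form.
  def stripAfterPipe(s: String): String = {
    val i = s.indexOf('|')
    if (i < 0) s else s.substring(0, i) + "]]"
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical input: the second element has no pipe.
    val list = List("[[aaa|bbb]]", "[[ccc]]")
    println(list.map(stripAfterPipe)) // List([[aaa]], [[ccc]])
  }
}
```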

#3


0  

Replace the |, the 3 characters that follow it, and the first ] with a single ].

regex is "\\|(.{3})\\]" (do not forget to escape |; escaping ] is optional outside a character class, but harmless)

scala> val list = List("[[aaa|bbb]]", "[[ccc|ddd]]", "[[ooo|sss]]")
list: List[String] = List([[aaa|bbb]], [[ccc|ddd]], [[ooo|sss]])

scala> list.map(_.replaceAll("\\|(.{3})\\]", "]"))
res16: List[String] = List([[aaa]], [[ccc]], [[ooo]])
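
Note that .{3} only works when the part after the pipe is exactly three characters long. A minimal sketch of a length-independent variant follows, assuming the text after the pipe never contains a ] of its own; the object name AnyLength, the helper strip, and the sample inputs are hypothetical.

```scala
object AnyLength {
  // Remove the pipe and every following character that is not ']'.
  // The closing "]]" is never matched, so nothing has to be re-inserted.
  def strip(s: String): String =
    s.replaceAll("""\|[^\]]*""", "")

  def main(args: Array[String]): Unit = {
    // Hypothetical inputs with parts of different lengths after the pipe.
    val list = List("[[a|bbbbb]]", "[[ccc|d]]", "[[ooo|sss]]")
    println(list.map(strip)) // List([[a]], [[ccc]], [[ooo]])
  }
}
```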
