Trying to split a string in Java that contains newlines, !, spaces, ?, commas, etc.

Date: 2021-05-31 21:43:48

I know there are many threads about this, but I tried the regular expression //W+ and it doesn't work the way I expect it to.

I'm taking a Java course, and I have a long string of text that is actually an excerpt from a Shakespeare play, so it has many punctuation marks, spaces, newline characters, etc. The explanation for the exercise tells me to use message.split("//W+") to split it and get back an array in which each element contains one of the words.

But it's not working for me. The exercise does seem to work with another regex: if I use message.split(" "), for example, I get elements containing the words that are separated by spaces, but many words are still joined by \n or have a ! at the end.

This is my code, with a short text:

public void testSplit(){
    String message = ("This is the message to split!");
    String[] splitMsg= message.split("//W+");
    for (int k=0; k<splitMsg.length;k++){
        System.out.println(splitMsg[k]);
    }
 }

The output is the whole string, unsplit: This is the message to split!

Thanks!

2 Answers

#1


2  

That's because escaping in a Java string is done with \\ (backslashes), not with // (forward slashes). The argument should be "\\W+".

Update: Try testing your sample with the Regexr tool. You'll see that it works with \w+, but since that expression matches only word characters, the exclamation mark will not be included.
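
To make the \w+ remark concrete, here is a minimal sketch that collects the matches of \w+ instead of splitting on \W+. The class name WordMatchDemo and the helper findWords are made up for illustration; they are not part of the exercise.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordMatchDemo {
    // Hypothetical helper: returns every run of word characters (\w+) in the text.
    static List<String> findWords(String text) {
        List<String> words = new ArrayList<>();
        // In a Java string literal the backslash must be doubled: "\\w+" is the regex \w+.
        Matcher m = Pattern.compile("\\w+").matcher(text);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        // Prints [This, is, the, message, to, split]
        System.out.println(findWords("This is the message to split!"));
    }
}
```

The exclamation mark is not a word character, so it never shows up in the result.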

#2


1  

You're passing the wrong regex argument into the split function. "//W+" should be "\\W+" (in a Java string literal the backslash itself has to be escaped):

public void testSplit(){
    String message = ("This is the message to split!");
    // "\\W+" is the regex \W+ : one or more non-word characters
    String[] splitMsg= message.split("\\W+");
    for (int k=0; k<splitMsg.length;k++){
        System.out.println(splitMsg[k]);
    }
 }
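
For text closer to the Shakespeare excerpt, with newlines and mixed punctuation, a small sketch along these lines may help. The class name SplitDemo, the helper splitWords and the sample sentence are assumptions for illustration, not the text from the course.

```java
import java.util.Arrays;

public class SplitDemo {
    // Hypothetical helper: splits on runs of non-word characters (\W+).
    static String[] splitWords(String text) {
        return text.split("\\W+");
    }

    public static void main(String[] args) {
        String message = "To be, or not to be?\nThat is the question!";

        // Prints [To, be, or, not, to, be, That, is, the, question]
        System.out.println(Arrays.toString(splitWords(message)));

        // The pattern from the question matches two literal slashes followed by
        // one or more 'W', which never occurs here, so nothing is split.
        // Prints 1
        System.out.println(message.split("//W+").length);
    }
}
```

Two things to keep in mind: \W also treats apostrophes as separators, so a word like o'er would come out as two elements, and a text that starts with punctuation or whitespace produces an empty first element.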
