Split a string into individual words in Java

Time: 2022-02-17 21:35:43

I would like to know how to split up a large string into a series of smaller strings or words. For example:

I want to walk my dog.

I want to end up with one string "I", another string "want", and so on.

How would I do this?

9 answers

#1


60  

Use the split() method, for example:

String s = "I want to walk my dog";
String[] arr = s.split(" ");

// Print each word on its own line.
for (String word : arr) {
    System.out.println(word);
}
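One caveat with a single-space delimiter is that consecutive spaces or tabs produce empty or merged tokens. The following is a minimal sketch of a whitespace-tolerant variant, not part of the original answer; the class name WhitespaceSplit and the helper words() are hypothetical names chosen for illustration.

```java
import java.util.Arrays;

public class WhitespaceSplit {
    // Splits on runs of whitespace; trim() avoids a leading empty string.
    static String[] words(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            // "".split(...) would return one empty string, so handle it here.
            return new String[0];
        }
        return trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        String s = "I  want to\twalk my dog";
        // A single-space delimiter leaves an empty string between the two
        // spaces and does not split on the tab.
        System.out.println(Arrays.toString(s.split(" ")));
        // Prints [I, want, to, walk, my, dog]
        System.out.println(Arrays.toString(words(s)));
    }
}
```

If the input is known to contain exactly one space between words, the plain split(" ") above is sufficient.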

#2


46  

As a more general solution (but ASCII only!) that also treats other separators between words, such as commas and semicolons, as delimiters, I suggest:

String s = "I want to walk my dog, cat, and tarantula; maybe even my tortoise.";
String[] words = s.split("\\W+");

The regex means that a delimiter is any run of one or more [+] characters that are not word characters [\W]. Because [+] is greedy, it will take, for instance, ';' and ' ' together as one delimiter.
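One detail worth checking when splitting on \W+ is what happens when the text starts with punctuation: split() then returns an empty first element, while trailing empty strings are dropped. The sketch below illustrates this and filters the empty element out; the class name NonWordSplit and the helper wordsOnly() are hypothetical.

```java
import java.util.Arrays;

public class NonWordSplit {
    // Splits on runs of non-word characters and drops empty tokens.
    static String[] wordsOnly(String text) {
        return Arrays.stream(text.split("\\W+"))
                .filter(word -> !word.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String s = "...and then, my dog!";
        // The leading "..." is a delimiter at index 0, so the first element
        // is empty. Prints [, and, then, my, dog]
        System.out.println(Arrays.toString(s.split("\\W+")));
        // Prints [and, then, my, dog]
        System.out.println(Arrays.toString(wordsOnly(s)));
    }
}
```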

#3


23  

A regex can also be used to split words.

\w can be used to match word characters ([A-Za-z0-9_]), so that punctuation is removed from the results:

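// Requires java.util.regex.Pattern and java.util.regex.Matcher.
String s = "I want to walk my dog, and why not?";
Pattern pattern = Pattern.compile("\\w+");
Matcher matcher = pattern.matcher(s);
// find() advances to each successive run of word characters.
while (matcher.find()) {
    System.out.println(matcher.group());
}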

Outputs:

I
want
to
walk
my
dog
and
why
not

See the Java API documentation for Pattern.

#4


6  

If your phrase contains accented characters, see my other answer:

String[] listeMots = phrase.split("\\P{L}+");
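To make the difference concrete, here is a small sketch comparing \W+ with \P{L}+ on an accented French phrase. The sample phrase and the names AccentedSplit and letterWords() are assumptions added for illustration; only the regex itself comes from the answer.

```java
import java.util.Arrays;

public class AccentedSplit {
    // Treats any run of characters that are not Unicode letters as a delimiter.
    static String[] letterWords(String phrase) {
        return phrase.split("\\P{L}+");
    }

    public static void main(String[] args) {
        // An accented French phrase, written with Unicode escapes so that
        // this source file stays pure ASCII.
        String phrase = "\u00c7a co\u00fbte tr\u00e8s cher";
        // \W is ASCII-only by default, so accented letters act as delimiters
        // and the words are cut into fragments.
        System.out.println(Arrays.toString(phrase.split("\\W+")));
        // \P{L} excludes every Unicode letter, so the four words stay whole.
        System.out.println(Arrays.toString(letterWords(phrase)));
    }
}
```

Note that \P{L}+ also treats digits and underscores as delimiters, which \W+ does not.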

#5


3  

Yet another method, using StringTokenizer:

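// Requires java.util.StringTokenizer.
String s = "I want to walk my dog";
// With no delimiter argument, tokens are separated by whitespace.
StringTokenizer tokenizer = new StringTokenizer(s);

while (tokenizer.hasMoreTokens()) {
    System.out.println(tokenizer.nextToken());
}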
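StringTokenizer also accepts a second argument listing the delimiter characters, which lets it skip punctuation as well as spaces. The sketch below shows that form; the class name TokenizerWords and the helper words() are hypothetical. Keep in mind that the Javadoc describes StringTokenizer as a legacy class and recommends split() or java.util.regex for new code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerWords {
    // Collects tokens, treating every character in `delimiters` as a separator.
    static List<String> words(String text, String delimiters) {
        StringTokenizer tokenizer = new StringTokenizer(text, delimiters);
        List<String> result = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            result.add(tokenizer.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "I want to walk my dog, cat; maybe my tortoise.";
        // Consecutive delimiters are skipped, so no empty tokens appear.
        // Prints [I, want, to, walk, my, dog, cat, maybe, my, tortoise]
        System.out.println(words(s, " ,;."));
    }
}
```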

#6


2  

You can use the split(" ") method of the String class to get each word, as in the code below:

String s = "I want to walk my dog";
String[] strArray = s.split(" ");
for (int i = 0; i < strArray.length; i++) {
    System.out.println(strArray[i]);
}

#7


1  

Use split():

String[] words = stringInstance.split(" ");

#8


1  

To treat everything except upper- and lower-case letters as a separator between words, we can do:

String mystring = "hi, there,hi Leo";
String[] arr = mystring.split("[^a-zA-Z]+");
for(int i = 0; i < arr.length; i += 1)
{
     System.out.println(arr[i]);
}

Here the regex means that a separator is any run of one or more [+] characters that are not upper- or lower-case ASCII letters [^a-zA-Z].

#9


0  

You can use the StringUtils class from Apache Commons Lang:

    // Requires org.apache.commons.lang3.StringUtils (Apache Commons Lang 3).
    String[] partsOfString = StringUtils.split("I want to walk my dog", StringUtils.SPACE);
