Is it possible to truncate a Java string at the closest word boundary after a given number of characters, similar to the PHP wordwrap() function, shown in this example?
4 solutions
#1
Use a java.text.BreakIterator, something like this:
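String s = ...;
int number_chars = ...;
BreakIterator bi = BreakIterator.getWordInstance();
bi.setText(s);
// following() needs an offset inside the text, so only cut longer strings
if (number_chars < s.length()) {
    int first_after = bi.following(number_chars);
    // to truncate:
    s = s.substring(0, first_after);
}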
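As a rough, runnable illustration of the snippet above, the sketch below wraps it in a helper. The class name FollowingDemo and the method truncateAfter are made up for this example, and the guard for short or null input is an assumption about the desired behaviour, not part of the original answer.

```java
import java.text.BreakIterator;

class FollowingDemo {
    // Hypothetical helper: cut at the first word boundary after numberChars.
    static String truncateAfter(String s, int numberChars) {
        // Assumption: null, negative limits and short strings are returned as-is.
        if (s == null || numberChars < 0 || numberChars >= s.length()) {
            return s;
        }
        BreakIterator bi = BreakIterator.getWordInstance();
        bi.setText(s);
        // First boundary strictly after the given offset
        int firstAfter = bi.following(numberChars);
        return s.substring(0, firstAfter);
    }

    public static void main(String[] args) {
        // The cut lands after "quick", so the result is longer than 6 characters.
        System.out.println(truncateAfter("The quick brown fox", 6));
    }
}
```

Because following() looks forward, the result can be longer than the requested number of characters.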
#2
You can use a regular expression:
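// Pattern.matcher(...) returns the Matcher; Pattern has no instance method matches(String)
Matcher m = Pattern.compile("^.{0,10}\\b").matcher(str);
// find() fails when there is no word boundary in the first 10 characters
String first10char = m.find() ? m.group(0) : "";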
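To see how the pattern behaves on concrete input, here is a small sketch with a variable limit. RegexTruncateDemo and truncateBefore are names invented for this example, and it assumes max is not negative. Note that "." does not match line breaks by default, so this only makes sense for single-line text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexTruncateDemo {
    // Hypothetical helper: longest prefix of at most max characters ending at \b.
    // Assumes max >= 0.
    static String truncateBefore(String s, int max) {
        Matcher m = Pattern.compile("^.{0," + max + "}\\b").matcher(s);
        // No word boundary within the first max characters: return an empty string.
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        System.out.println("[" + truncateBefore("The quick brown fox", 12) + "]");
        System.out.println("[" + truncateBefore("foo-bar baz", 5) + "]");
    }
}
```

The match never exceeds the limit, but it can end in a space or a punctuation character, since \b matches at any transition between word and non-word characters.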
#3
With the first approach you will end up with a length greater than number_chars. If you need an exact maximum or less, as for a Twitter message, see my implementation below.
Note that the regexp approach relies on \b, which only separates word characters from non-word characters, while BreakIterator applies locale-aware word-boundary rules and treats commas and other punctuation as separate tokens. This is more desirable.
Here is my full function:
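/**
 * Truncate text to the nearest word boundary, without exceeding the maximum length specified.
 *
 * @param text the text to truncate; may be null
 * @param maxLength the maximum length of the result
 * @return the truncated text, or the original text if it already fits
 */
private String truncateText(String text, int maxLength) {
    if (text != null && text.length() > maxLength) {
        BreakIterator bi = BreakIterator.getWordInstance();
        bi.setText(text);
        // Cut at maxLength if it is a boundary, otherwise at the last boundary before it
        int end = bi.isBoundary(maxLength) ? maxLength : bi.preceding(maxLength);
        // Step back over whitespace left at the cut point
        while (end > 0 && Character.isWhitespace(text.charAt(end - 1))) {
            end--;
        }
        return text.substring(0, end);
    } else {
        return text;
    }
}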
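To check where BreakIterator actually places word boundaries around punctuation, a small sketch like the following lists every boundary offset in a string. BoundaryDemo and wordBoundaries are hypothetical names, and Locale.ENGLISH is pinned here only to keep the result independent of the machine's default locale.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class BoundaryDemo {
    // Hypothetical helper: collect all word-boundary offsets reported for text.
    static List<Integer> wordBoundaries(String text) {
        BreakIterator bi = BreakIterator.getWordInstance(Locale.ENGLISH);
        bi.setText(text);
        List<Integer> offsets = new ArrayList<>();
        // first() and next() walk the boundaries until DONE is returned
        for (int b = bi.first(); b != BreakIterator.DONE; b = bi.next()) {
            offsets.add(b);
        }
        return offsets;
    }

    public static void main(String[] args) {
        System.out.println(wordBoundaries("Hello, world!"));
    }
}
```

For "Hello, world!" the comma, the space and the exclamation mark each sit between their own pair of boundaries, which is why a cut point may need trailing whitespace or punctuation trimmed afterwards.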
#4
The BreakIterator solution is not really straightforward when the text contains a URL, because it splits the URL in awkward places. I used my own solution instead:
// Requires Guava: com.google.common.base.Splitter and com.google.common.base.Joiner
public static String truncateText(String text, int maxLength) {
    // Return null and already-short text unchanged
    if (text == null || text.length() <= maxLength) {
        return text;
    }
    List<String> words = Splitter.on(" ").splitToList(text);
    List<String> truncated = new ArrayList<>();
    int totalCount = 0;
    for (String word : words) {
        int wordLength = word.length();
        int space = truncated.isEmpty() ? 0 : 1; // no space before the first word
        if (totalCount + space + wordLength > maxLength) {
            break;
        }
        totalCount += space + wordLength;
        truncated.add(word);
    }
    String truncResult = Joiner.on(" ").join(truncated);
    return truncResult + " ..."; // the suffix is not counted against maxLength
}
Splitter/Joiner are from Guava. I am also adding ... at the end in my use case (this can be omitted).
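If Guava is not available, the same space-based idea can be sketched with the JDK alone. This is only an approximation of the answer above: SpaceTruncateDemo and truncateOnSpaces are invented names, the " ..." suffix is left out, and String.split drops trailing empty strings, which Guava's Splitter does not. The URL in the example is a placeholder.

```java
import java.util.ArrayList;
import java.util.List;

class SpaceTruncateDemo {
    // Hypothetical helper: keep whole space-separated words up to maxLength.
    static String truncateOnSpaces(String text, int maxLength) {
        if (text == null || text.length() <= maxLength) {
            return text;
        }
        List<String> kept = new ArrayList<>();
        int total = 0;
        for (String word : text.split(" ")) {
            // A joining space is only needed from the second word onwards
            int extra = kept.isEmpty() ? word.length() : word.length() + 1;
            if (total + extra > maxLength) {
                break;
            }
            total += extra;
            kept.add(word);
        }
        return String.join(" ", kept);
    }

    public static void main(String[] args) {
        // The URL is kept whole or dropped whole, never split in the middle.
        System.out.println(truncateOnSpaces("see http://example.com/a/b now", 26));
        System.out.println(truncateOnSpaces("see http://example.com/a/b now", 25));
    }
}
```

Splitting only on spaces means a URL is treated as a single word, which is the behaviour this answer is after.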