Split a string into equal-length substrings in Java

How can I split the string "Thequickbrownfoxjumps" into substrings of equal size in Java? For example, splitting "Thequickbrownfoxjumps" into chunks of size 4 should give the output:

["Theq","uick","brow","nfox","jump","s"]

Similar Question:

Split string into equal-length substrings in Scala

20 solutions

#1

Here's the regex one-liner version:

System.out.println(Arrays.toString(
    "Thequickbrownfoxjumps".split("(?<=\\G.{4})")
));

\G is a zero-width assertion that matches the position where the previous match ended. If there was no previous match, it matches the beginning of the input, the same as \A. The enclosing lookbehind matches the position that's four characters along from the end of the last match.
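
As a rough illustration of the explanation above, the same pattern can be built for an arbitrary chunk size. This is only a sketch: the class name RegexChunker and its methods are made up for this example, and the DOTALL flag is an addition so that `.` also matches line terminators.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
final class RegexChunker {

    // Builds (?<=\G.{size}): a zero-width match located exactly `size`
    // characters after the end of the previous match (or after the start
    // of the input when there is no previous match).
    static Pattern chunkPattern(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        // DOTALL is an assumption of this sketch; without it '.' stops at newlines.
        return Pattern.compile("(?<=\\G.{" + size + "})", Pattern.DOTALL);
    }

    // Splits the text at every position matched by the pattern above.
    static String[] split(String text, int size) {
        return chunkPattern(size).split(text);
    }

    public static void main(String[] args) {
        // prints [Theq, uick, brow, nfox, jump, s]
        System.out.println(Arrays.toString(split("Thequickbrownfoxjumps", 4)));
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the regex on every call when the chunk size is fixed.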

Both lookbehind and \G are advanced regex features, not supported by all flavors. Furthermore, \G is not implemented consistently across the flavors that do support it. This trick will work (for example) in Java, Perl, .NET and JGSoft, but not in PHP (PCRE), Ruby 1.9+ or TextMate (both Oniguruma). JavaScript's /y (sticky flag) isn't as flexible as \G, and couldn't be used this way even if JS did support lookbehind.

I should mention that I don't necessarily recommend this solution if you have other options. The non-regex solutions in the other answers may be longer, but they're also self-documenting; this one's just about the opposite of that. ;)

Also, this doesn't work in Android, which doesn't support the use of \G in lookbehinds.

#2

Well, it's fairly easy to do this by brute force:

public static List<String> splitEqually(String text, int size) {
    // Give the list the right capacity to start with. You could use an array
    // instead if you wanted.
    List<String> ret = new ArrayList<String>((text.length() + size - 1) / size);

    for (int start = 0; start < text.length(); start += size) {
        ret.add(text.substring(start, Math.min(text.length(), start + size)));
    }
    return ret;
}

I don't think it's really worth using a regex for this.

EDIT: My reasoning for not using a regex:

This doesn't use any of the real pattern matching of regexes. It's just counting.

I suspect the above will be more efficient, although in most cases it won't matter

If you need to use variable sizes in different places, you've either got repetition or a helper function to build the regex itself based on a parameter - ick.

The regex provided in another answer firstly didn't compile (invalid escaping), and then didn't work. My code worked first time. That's more a testament to the usability of regexes vs plain code, IMO.
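
To make the point about variable sizes concrete, here is a small sketch that wraps the same loop in a class and calls it with several chunk sizes. The class name ChunkDemo and the argument check are additions for illustration only; the loop itself follows the code above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; only the loop is taken from the answer above.
final class ChunkDemo {

    static List<String> splitEqually(String text, int size) {
        // Added guard (an assumption of this sketch): a non-positive size
        // would otherwise make the loop below run forever.
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<String> ret = new ArrayList<>((text.length() + size - 1) / size);
        for (int start = 0; start < text.length(); start += size) {
            // Clamp the end index so the last chunk may be shorter.
            ret.add(text.substring(start, Math.min(text.length(), start + size)));
        }
        return ret;
    }

    public static void main(String[] args) {
        String text = "Thequickbrownfoxjumps";
        System.out.println(splitEqually(text, 4)); // [Theq, uick, brow, nfox, jump, s]
        System.out.println(splitEqually(text, 7)); // [Thequic, kbrownf, oxjumps]
        System.out.println(splitEqually("", 4));   // []
    }
}
```

The chunk size is an ordinary parameter here, so no pattern has to be rebuilt when it changes.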

#3

This is very easy with Google Guava:

for (final String token :
    Splitter
        .fixedLength(4)
        .split("Thequickbrownfoxjumps")) {
    System.out.println(token);
}

Output:

Theq
uick
brow
nfox
jump
s

Or if you need the result as an array, you can use this code:

String[] tokens =
    Iterables.toArray(
        Splitter
            .fixedLength(4)
            .split("Thequickbrownfoxjumps"),
        String.class
    );

Note: Splitter construction is shown inline above, but since Splitters are immutable and reusable, it's a good practice to store them in constants:

private static final Splitter FOUR_LETTERS = Splitter.fixedLength(4);

// more code

for (final String token : FOUR_LETTERS.split("Thequickbrownfoxjumps")) {
    System.out.println(token);
}

#4

If you're using Google's Guava general-purpose libraries (and quite honestly, any new Java project probably should be), this is insanely trivial with the Splitter class:

for (String substring : Splitter.fixedLength(4).split(inputString)) {
    doSomethingWith(substring);
}

and that's it. Easy as!

#5

public static String[] split(String src, int len) {
    String[] result = new String[(int) Math.ceil((double) src.length() / (double) len)];
    for (int i = 0; i < result.length; i++)
        result[i] = src.substring(i * len, Math.min(src.length(), (i + 1) * len));
    return result;
}

#6

public String[] splitInParts(String s, int partLength) {
    int len = s.length();

    // Number of parts
    int nparts = (len + partLength - 1) / partLength;
    String parts[] = new String[nparts];

    // Break into parts
    int offset = 0;
    int i = 0;
    while (i < nparts) {
        parts[i] = s.substring(offset, Math.min(offset + partLength, len));
        offset += partLength;
        i++;
    }

    return parts;
}

#7

You can use substring from the String class (handling the exceptions yourself) or from Apache Commons Lang (which handles them for you):

static String   substring(String str, int start, int end)

Put it inside a loop and you are good to go.
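
The loop this answer has in mind might look roughly like the sketch below. It uses only the JDK: clampedSubstring is a hypothetical stand-in that imitates the lenient bounds handling the answer attributes to the Commons Lang method, and it is not that library's implementation. The names SubstringLoop and chunks are likewise made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from the original answer.
final class SubstringLoop {

    // Clamps both indexes into [0, length] instead of throwing
    // StringIndexOutOfBoundsException for out-of-range values.
    static String clampedSubstring(String str, int start, int end) {
        int from = Math.max(0, Math.min(start, str.length()));
        int to = Math.max(from, Math.min(end, str.length()));
        return str.substring(from, to);
    }

    // Walks the string in steps of `size` and collects one chunk per step.
    static List<String> chunks(String str, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<String> out = new ArrayList<>();
        for (int start = 0; start < str.length(); start += size) {
            // The end index may run past the string; the helper clamps it.
            out.add(clampedSubstring(str, start, start + size));
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [Theq, uick, brow, nfox, jump, s]
        System.out.println(chunks("Thequickbrownfoxjumps", 4));
    }
}
```

With a lenient substring the loop body needs no explicit Math.min on the end index, which is the convenience the answer is pointing at.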

#8

Here is a one-liner implementation using Java 8 streams:

String input = "Thequickbrownfoxjumps";
final AtomicInteger atomicInteger = new AtomicInteger(0);
Collection<String> result = input.chars()
        .mapToObj(c -> String.valueOf((char) c))
        .collect(Collectors.groupingBy(c -> atomicInteger.getAndIncrement() / 4,
                                       Collectors.joining()))
        .values();

It gives the following output:

[Theq, uick, brow, nfox, jump, s]

#9

Here's a one-liner version which uses Java 8 IntStream to determine the indexes of the slice beginnings:

String x = "Thequickbrownfoxjumps";
String[] result = IntStream
        .iterate(0, i -> i + 4)                      // slice start indexes: 0, 4, 8, ...
        .limit((int) Math.ceil(x.length() / 4.0))    // number of slices, rounded up
        .mapToObj(i -> x.substring(i, Math.min(i + 4, x.length())))
        .toArray(String[]::new);

#10

I'd prefer this simple solution:

String content = "Thequickbrownfoxjumps";
while (content.length() > 4) {
    System.out.println(content.substring(0, 4));
    content = content.substring(4);
}
System.out.println(content);

#11

In case you want to split the string into equal-length parts backwards, i.e. from right to left (for example, to split 1010001111 into [10, 1000, 1111]), here's the code:

/**
 * @param s         the string to be split
 * @param subLen    length of the equal-length substrings.
 * @param backwards true if the splitting is from right to left, false otherwise
 * @return an array of equal-length substrings
 * @throws ArithmeticException: / by zero when subLen == 0
 */
public static String[] split(String s, int subLen, boolean backwards) {
    assert s != null;
    int groups = s.length() % subLen == 0 ? s.length() / subLen : s.length() / subLen + 1;
    String[] strs = new String[groups];
    if (backwards) {
        for (int i = 0; i < groups; i++) {
            int beginIndex = s.length() - subLen * (i + 1);
            int endIndex = beginIndex + subLen;
            if (beginIndex < 0)
                beginIndex = 0;
            strs[groups - i - 1] = s.substring(beginIndex, endIndex);
        }
    } else {
        for (int i = 0; i < groups; i++) {
            int beginIndex = subLen * i;
            int endIndex = beginIndex + subLen;
            if (endIndex > s.length())
                endIndex = s.length();
            strs[i] = s.substring(beginIndex, endIndex);
        }
    }
    return strs;
}

#12

I use the following Java 8 solution:

// Requires: import static java.util.stream.Collectors.toList;
public static List<String> splitString(final String string, final int chunkSize) {
    final int numberOfChunks = (string.length() + chunkSize - 1) / chunkSize;
    return IntStream.range(0, numberOfChunks)
            .mapToObj(index -> string.substring(index * chunkSize,
                    Math.min((index + 1) * chunkSize, string.length())))
            .collect(toList());
}

#13

Java 8 solution (like this but a bit simpler):

public static List<String> partition(String string, int partSize) {
    List<String> parts = IntStream.range(0, string.length() / partSize)
            .mapToObj(i -> string.substring(i * partSize, (i + 1) * partSize))
            .collect(toList());
    if ((string.length() % partSize) != 0)
        parts.add(string.substring(string.length() / partSize * partSize));
    return parts;
}

#14

I asked @Alan Moore in a comment to the accepted solution how strings with newlines could be handled. He suggested using DOTALL.

Using his suggestion I created a small sample of how that works:

public void regexDotAllExample() throws UnsupportedEncodingException {
    final String input = "The\nquick\nbrown\r\nfox\rjumps";
    final String regex = "(?<=\\G.{4})";

    Pattern splitByLengthPattern;
    String[] split;

    splitByLengthPattern = Pattern.compile(regex);
    split = splitByLengthPattern.split(input);
    System.out.println("---- Without DOTALL ----");
    for (int i = 0; i < split.length; i++) {
        byte[] s = split[i].getBytes("utf-8");
        System.out.println("[Idx: "+i+", length: "+s.length+"] - " + s);
    }
    /* Output is a single entry longer than the desired split size:
    ---- Without DOTALL ----
    [Idx: 0, length: 26] - [B@17cdc4a5
     */

    //DOTALL suggested in Alan Moores comment on SO: https://*.com/a/3761521/1237974
    splitByLengthPattern = Pattern.compile(regex, Pattern.DOTALL);
    split = splitByLengthPattern.split(input);
    System.out.println("---- With DOTALL ----");
    for (int i = 0; i < split.length; i++) {
        byte[] s = split[i].getBytes("utf-8");
        System.out.println("[Idx: "+i+", length: "+s.length+"] - " + s);
    }
    /* Output is as desired 7 entries with each entry having a max length of 4:
    ---- With DOTALL ----
    [Idx: 0, length: 4] - [B@77b22abc
    [Idx: 1, length: 4] - [B@5213da08
    [Idx: 2, length: 4] - [B@154f6d51
    [Idx: 3, length: 4] - [B@1191ebc5
    [Idx: 4, length: 4] - [B@30ddb86
    [Idx: 5, length: 4] - [B@2c73bfb
    [Idx: 6, length: 2] - [B@6632dd29
     */
}

But I also like @Jon Skeet's solution in https://*.com/a/3760193/1237974. For maintainability in larger projects, where not everyone is equally experienced with regular expressions, I would probably use Jon's solution.

#15

Another brute-force solution could be:

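String input = "thequickbrownfoxjumps";
int n = (input.length() + 3) / 4;   // round up so the trailing partial chunk is kept
String[] num = new String[n];
for (int i = 0, x = 0; i < n; i++, x += 4) {
    // clamp the end index so the last chunk may be shorter than 4
    num[i] = input.substring(x, Math.min(x + 4, input.length()));
    System.out.println(num[i]);
}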

The code simply steps through the string, taking one substring at a time.

#16

import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class string123 {
    public static void main(String[] args) {
        Scanner sc = new Scanner(System.in);
        System.out.println("Enter String");
        String r = sc.nextLine();
        List<String> s = new ArrayList<>();   // grows as needed, unlike a fixed-size array
        int len = r.length();
        System.out.println("Enter length Of Sub-string");
        int l = sc.nextInt();
        int f = 0;                            // start index of the current substring
        while (true) {
            int last = Math.min(f + l, len);  // end index, clamped to the string length
            s.add(r.substring(f, last));
            if (last == len) break;
            f += l;
        }
        System.out.print(s);
    }
}

Result

Enter String
Thequickbrownfoxjumps
Enter length Of Sub-string
4
[Theq, uick, brow, nfox, jump, s]

#17

@Test
public void regexSplit() {
    String source = "Thequickbrownfoxjumps";
    // define matcher, any char, min length 1, max length 4
    Matcher matcher = Pattern.compile(".{1,4}").matcher(source);
    List<String> result = new ArrayList<>();
    while (matcher.find()) {
        result.add(source.substring(matcher.start(), matcher.end()));
    }
    String[] expected = {"Theq", "uick", "brow", "nfox", "jump", "s"};
    // JUnit expects the expected value first, then the actual value
    assertArrayEquals(expected, result.toArray());
}

#18

Here is my version based on regex and Java 8 streams. It's worth mentioning that the Matcher.results() method is only available since Java 9.

Test included.

public static List<String> splitString(String input, int splitSize) {
    Matcher matcher = Pattern.compile("(?:(.{" + splitSize + "}))+?").matcher(input);
    return matcher.results().map(MatchResult::group).collect(Collectors.toList());
}

@Test
public void shouldSplitStringToEqualLengthParts() {
    String anyValidString = "Split me equally!";
    String[] expectedTokens2 = {"Sp", "li", "t ", "me", " e", "qu", "al", "ly"};
    String[] expectedTokens3 = {"Spl", "it ", "me ", "equ", "all"};

    Assert.assertArrayEquals(expectedTokens2, splitString(anyValidString, 2).toArray());
    Assert.assertArrayEquals(expectedTokens3, splitString(anyValidString, 3).toArray());
}

#19

public static String[] split(String input, int length) throws IllegalArgumentException {
    if (length == 0 || input == null)
        return new String[0];

    int lengthD = length * 2;

    int size = input.length();
    if (size == 0)
        return new String[0];

    int rep = (int) Math.ceil(size * 1d / length);

    ByteArrayInputStream stream = new ByteArrayInputStream(input.getBytes(StandardCharsets.UTF_16LE));

    String[] out = new String[rep];
    byte[] buf = new byte[lengthD];

    int d = 0;
    for (int i = 0; i < rep; i++) {
        try {
            d = stream.read(buf);
        } catch (IOException e) {
            e.printStackTrace();
        }

        if (d != lengthD) {
            out[i] = new String(buf, 0, d, StandardCharsets.UTF_16LE);
            continue;
        }

        out[i] = new String(buf, StandardCharsets.UTF_16LE);
    }
    return out;
}

#20

public static List<String> getSplittedString(String stringtoSplit, int length) {
    List<String> returnStringList = new ArrayList<String>(
            (stringtoSplit.length() + length - 1) / length);

    for (int start = 0; start < stringtoSplit.length(); start += length) {
        returnStringList.add(stringtoSplit.substring(start,
                Math.min(stringtoSplit.length(), start + length)));
    }

    return returnStringList;
}
