What is the most efficient way to make the first character of a String lower case?

Time: 2022-09-23 17:13:11

What is the most efficient way to make the first character of a String lower case?

I can think of a number of ways to do this:

Using charAt() with substring()

String input   = "SomeInputString";
String output  = Character.toLowerCase(input.charAt(0)) +
                   (input.length() > 1 ? input.substring(1) : "");

Or using a char array

String input  = "SomeInputString";
char c[]      = input.toCharArray();
c[0]          = Character.toLowerCase(c[0]);
String output = new String(c);

I am sure there are many other great ways to achieve this. What do you recommend?

10 answers

#1


95  

I tested the promising approaches using JMH. Full benchmark code.

Assumption during the tests (to avoid checking the corner cases every time): the input String length is always greater than 1.

Results

Benchmark           Mode  Cnt         Score        Error  Units
MyBenchmark.test1  thrpt   20  10463220.493 ± 288805.068  ops/s
MyBenchmark.test2  thrpt   20  14730158.709 ± 530444.444  ops/s
MyBenchmark.test3  thrpt   20  16079551.751 ±  56884.357  ops/s
MyBenchmark.test4  thrpt   20   9762578.446 ± 584316.582  ops/s
MyBenchmark.test5  thrpt   20   6093216.066 ± 180062.872  ops/s
MyBenchmark.test6  thrpt   20   2104102.578 ±  18705.805  ops/s

The score is in operations per second; higher is better.

Tests

  1. test1 was Andy's first approach, also suggested by Hllink:

    string = Character.toLowerCase(string.charAt(0)) + string.substring(1);
    
  2. test2 was Andy's second approach. It is also what Introspector.decapitalize(), suggested by Daniel, does, but without its two if statements. The first if was removed because of the testing assumption. The second was removed because it violated correctness here (for example, input "HI" would come back unchanged as "HI"). This was almost the fastest.

    char c[] = string.toCharArray();
    c[0] = Character.toLowerCase(c[0]);
    string = new String(c);
    
  3. test3 was a modification of test2, but instead of calling Character.toLowerCase(), I added 32, which works correctly only if the first character is an ASCII uppercase letter. This was the fastest. c[0] |= ' ' from Mike's comment gave the same performance.

    char c[] = string.toCharArray();
    c[0] += 32;
    string = new String(c);
    
  4. test4 used StringBuilder.

    StringBuilder sb = new StringBuilder(string);
    sb.setCharAt(0, Character.toLowerCase(sb.charAt(0)));
    string = sb.toString();
    
  5. test5 used two substring() calls.

    string = string.substring(0, 1).toLowerCase() + string.substring(1);
    
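  6. test6 used reflection to change the char value[] array directly inside the String. This was the slowest. It also mutates an object that is meant to be immutable and depends on String internals: since JDK 9 the value field is a byte[], so the cast below no longer works there.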

    try {
        Field field = String.class.getDeclaredField("value");
        field.setAccessible(true);
        char[] value = (char[]) field.get(string);
        value[0] = Character.toLowerCase(value[0]);
    } catch (IllegalAccessException e) {
        e.printStackTrace();
    } catch (NoSuchFieldException e) {
        e.printStackTrace();
    }
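
The ASCII shortcut from test3 has a catch that a small sketch makes visible. The class and method names below are hypothetical and not part of the benchmark; the point is only that += 32 and |= ' ' give the expected result when the first character is an ASCII uppercase letter and silently corrupt other input, while a guarded variant leaves everything else untouched.

```java
// Illustrative sketch; AsciiLowerFirst and its method names are hypothetical.
class AsciiLowerFirst {

    // test3 as benchmarked: blindly adds 32 to the first char.
    static String plus32(String s) {
        char[] c = s.toCharArray();
        c[0] += 32;
        return new String(c);
    }

    // Variant from the comment: sets bit 0x20 instead of adding.
    static String orSpace(String s) {
        char[] c = s.toCharArray();
        c[0] |= ' ';
        return new String(c);
    }

    // Guarded variant: only changes 'A'..'Z', returns other input as-is.
    static String guarded(String s) {
        if (s == null || s.isEmpty()) {
            return s;
        }
        char first = s.charAt(0);
        if (first < 'A' || first > 'Z') {
            return s;
        }
        char[] c = s.toCharArray();
        c[0] = (char) (first + 32);
        return new String(c);
    }

    public static void main(String[] args) {
        System.out.println(plus32("Hello"));   // hello
        System.out.println(orSpace("hello"));  // hello (already lowercase, bit already set)
        System.out.println(orSpace("@home"));  // `home ('@' is 0x40, becomes 0x60)
        System.out.println(guarded("@home"));  // @home
        // plus32("hello") turns 'h' (0x68) into the control character U+0088.
        System.out.println((int) plus32("hello").charAt(0)); // 136
    }
}
```

The guard costs two comparisons, so it would need its own measurement before assuming it keeps the test3 numbers.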
    

Conclusions

If the String length is always greater than 0, use test2.

If not, we have to check the corner cases:

public static String decapitalize(String string) {
    if (string == null || string.length() == 0) {
        return string;
    }
    char c[] = string.toCharArray();
    c[0] = Character.toLowerCase(c[0]);
    return new String(c);
}
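
As a quick check of the corner cases, here is a minimal sketch that wraps the method above in a class (the class name DecapitalizeDemo is made up for the example) and runs it on a normal string, a single character, the empty string, and null.

```java
// Illustrative sketch; DecapitalizeDemo is a hypothetical wrapper class.
class DecapitalizeDemo {

    static String decapitalize(String string) {
        // null and "" have no first character, so return them unchanged.
        if (string == null || string.length() == 0) {
            return string;
        }
        char[] c = string.toCharArray();
        c[0] = Character.toLowerCase(c[0]);
        return new String(c);
    }

    public static void main(String[] args) {
        System.out.println(decapitalize("SomeInputString")); // someInputString
        System.out.println(decapitalize("X"));               // x
        System.out.println(decapitalize("HI"));              // hI
        System.out.println(decapitalize("").isEmpty());      // true
        System.out.println(decapitalize(null));              // null
    }
}
```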

If you are sure that your text will always be ASCII and you need extreme performance because profiling showed this code to be a bottleneck, use test3.

#2


84  

I came across a nice alternative if you don't want to use a third-party library:

import java.beans.Introspector;

Assert.assertEquals("someInputString", Introspector.decapitalize("SomeInputString"));
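
One behaviour worth knowing before relying on this: Introspector.decapitalize() follows the JavaBeans naming rule, so it leaves a string unchanged when its first two characters are both upper case. A small sketch (the class name IntrospectorDemo is made up for the example):

```java
import java.beans.Introspector;

// Illustrative sketch; IntrospectorDemo is a hypothetical class name.
class IntrospectorDemo {
    public static void main(String[] args) {
        // Ordinary case: only the first letter is lowered.
        System.out.println(Introspector.decapitalize("SomeInputString")); // someInputString
        System.out.println(Introspector.decapitalize("X"));               // x
        // First two characters upper case: returned unchanged.
        System.out.println(Introspector.decapitalize("URL"));             // URL
        System.out.println(Introspector.decapitalize("HI"));              // HI
    }
}
```

If "HI" has to become "hI", the plain char-array version is the safer choice.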

#3


20  

When it comes to string manipulation, take a look at StringUtils in Jakarta Commons Lang (now Apache Commons Lang); its uncapitalize method does this.

#4


11  

If you want to use Apache Commons you can do the following:

import org.apache.commons.lang3.text.WordUtils;
[...] 
String s = "SomeString"; 
String firstLower = WordUtils.uncapitalize(s);

Result: someString

#5


10  

Instead of a char-oriented approach, I would suggest a String-oriented solution. String.toLowerCase is locale-specific, so I would take the locale into account. According to the documentation of Character.toLowerCase, String.toLowerCase should be preferred for lowercasing. A char-oriented solution is also not fully Unicode compatible, because Character.toLowerCase(char) cannot handle supplementary characters.

import java.util.Locale;

public static final String uncapitalize(final String originalStr,
        final Locale locale) {
    final int splitIndex = 1;
    if (originalStr.isEmpty()) {
        return originalStr;
    }
    // Lowercase the first char as a String so that the locale rules apply.
    final String first = originalStr.substring(0, splitIndex).toLowerCase(locale);
    final String rest = originalStr.substring(splitIndex);
    return first + rest;
}

UPDATE: As an example of how important the locale setting is, let us lowercase I in Turkish and German:

// Locale takes (language, country): lower-case language code first.
System.out.println(uncapitalize("I", new Locale("tr", "TR")));
System.out.println(uncapitalize("I", new Locale("de", "DE")));

will output two different results:

ı

i
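
The point about supplementary characters also applies to substring(0, 1), which would cut a surrogate pair in half. A possible variant, sketched here under the assumption that splitting at the first code point is what is wanted (the class name LocaleAwareUncapitalize is made up), uses offsetByCodePoints to find where the first character really ends:

```java
import java.util.Locale;

// Illustrative sketch; LocaleAwareUncapitalize is a hypothetical class name.
class LocaleAwareUncapitalize {

    static String uncapitalize(String s, Locale locale) {
        if (s == null || s.isEmpty()) {
            return s;
        }
        // End index of the first code point: 1 for BMP chars, 2 for surrogate pairs.
        int end = s.offsetByCodePoints(0, 1);
        return s.substring(0, end).toLowerCase(locale) + s.substring(end);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");
        // Turkish maps 'I' to dotless 'ı' (U+0131); the root locale maps it to 'i'.
        System.out.println(uncapitalize("ISTANBUL", turkish).charAt(0) == '\u0131');  // true
        System.out.println(uncapitalize("ISTANBUL", Locale.ROOT));                    // iSTANBUL

        // U+10400 (DESERET CAPITAL LETTER LONG I) lies outside the BMP.
        String deseret = new String(Character.toChars(0x10400)) + "x";
        System.out.println(uncapitalize(deseret, Locale.ROOT).codePointAt(0) == 0x10428); // true
    }
}
```

Passing Locale.ROOT gives locale-neutral behaviour, which is usually what identifiers and other machine-readable strings need.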

#6


7  

Strings in Java are immutable, so either way a new string will be created.

Your first example will probably be slightly more efficient because it only needs to create a new string and not a temporary character array.

#7


3  

A very short and simple static method to achieve what you want:

public static String decapitalizeString(String string) {
    return string == null || string.isEmpty() ? "" : Character.toLowerCase(string.charAt(0)) + string.substring(1);
}

#8


2  

If what you need is very simple (e.g. Java class names, no locales), you can also use the CaseFormat class in the Google Guava library.

String converted = CaseFormat.UPPER_CAMEL.to(CaseFormat.LOWER_CAMEL, "FooBar");
assertEquals("fooBar", converted);

Or you can prepare and reuse a converter object, which could be more efficient.

Converter<String, String> converter=
    CaseFormat.UPPER_CAMEL.converterTo(CaseFormat.LOWER_CAMEL);

assertEquals("fooBar", converter.convert("FooBar"));

To better understand the philosophy behind Google Guava's string manipulation, check out this wiki page.

#9


1  

String testString = "SomeInputString";
String firstLetter = testString.substring(0,1).toLowerCase();
String restLetters = testString.substring(1);
String resultString = firstLetter + restLetters;

#10


0  

I came across this only today and tried to do it myself in the most pedestrian way. It took one line, though a longish one. Here goes:

String str = "TaxoRanks";
System.out.println(" Before str = " + str);
str = str.replaceFirst(str.substring(0,1), str.substring(0,1).toLowerCase());
System.out.println(" After str = " + str);

Gives:

Before str = TaxoRanks

After str = taxoRanks
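
A caveat with this one-liner: replaceFirst() treats its first argument as a regular expression, so a first character such as ( makes it throw. The sketch below (class and method names are hypothetical) shows the failure and one way around it using Pattern.quote and Matcher.quoteReplacement:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Illustrative sketch; ReplaceFirstDemo and its method names are hypothetical.
class ReplaceFirstDemo {

    // As in the answer: the first character is used directly as a regex.
    static String naive(String str) {
        return str.replaceFirst(str.substring(0, 1), str.substring(0, 1).toLowerCase());
    }

    // Same idea, with the pattern and the replacement both taken literally.
    static String quoted(String str) {
        if (str == null || str.isEmpty()) {
            return str;
        }
        String first = str.substring(0, 1);
        return str.replaceFirst(Pattern.quote(first),
                Matcher.quoteReplacement(first.toLowerCase()));
    }

    public static void main(String[] args) {
        System.out.println(quoted("TaxoRanks")); // taxoRanks
        System.out.println(quoted("(Draft)"));   // (Draft)
        try {
            naive("(Draft)"); // "(" alone is not a valid regex
        } catch (PatternSyntaxException e) {
            System.out.println("naive version failed: " + e.getDescription());
        }
    }
}
```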
