How do I go from this string: "ThisIsMyCapsDelimitedString"
...to this string: "This Is My Caps Delimited String"
Fewest lines of code in VB.NET is preferred, but C# is also welcome.
Cheers!
16 Solutions
#1
160
I made this a while ago. It matches each component of a CamelCase name.
/([A-Z]+(?=$|[A-Z][a-z])|[A-Z]?[a-z]+)/g
For example:
"SimpleHTTPServer" => ["Simple", "HTTP", "Server"]
"camelCase" => ["camel", "Case"]
To insert spaces between the words instead of extracting them:
Regex.Replace(s, "([a-z](?=[A-Z])|[A-Z](?=[A-Z][a-z]))", "$1 ")
If you need to handle digits:
/([A-Z]+(?=$|[A-Z][a-z]|[0-9])|[A-Z]?[a-z]+|[0-9]+)/g
Regex.Replace(s,"([a-z](?=[A-Z]|[0-9])|[A-Z](?=[A-Z][a-z]|[0-9])|[0-9](?=[^0-9]))","$1 ")
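For anyone who wants to try the two replace patterns above, here is a minimal sketch of a console program. The class and method names (CamelCaseDemo, Split, SplitWithDigits) are made up for illustration; only the patterns come from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CamelCaseDemo
{
    // Patterns copied from the answer; the names below are hypothetical.
    private const string LettersOnly =
        "([a-z](?=[A-Z])|[A-Z](?=[A-Z][a-z]))";
    private const string WithDigits =
        "([a-z](?=[A-Z]|[0-9])|[A-Z](?=[A-Z][a-z]|[0-9])|[0-9](?=[^0-9]))";

    // Appends a space after every character that ends a word.
    public static string Split(string s) =>
        Regex.Replace(s, LettersOnly, "$1 ");

    // Same idea, but runs of digits are treated as words too.
    public static string SplitWithDigits(string s) =>
        Regex.Replace(s, WithDigits, "$1 ");

    public static void Main()
    {
        Console.WriteLine(Split("ThisIsMyCapsDelimitedString")); // This Is My Caps Delimited String
        Console.WriteLine(Split("SimpleHTTPServer"));            // Simple HTTP Server
        Console.WriteLine(SplitWithDigits("AddressLine1"));      // Address Line 1
    }
}
```

Because the space is appended after the last character of each word, nothing is ever inserted at the start of the string.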
#2
35
Regex.Replace("ThisIsMyCapsDelimitedString", "(\\B[A-Z])", " $1")
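A small sketch of how this one-liner behaves, in case it helps. \B matches only where there is no word boundary, so the capital at the very start of the string is skipped and no leading space is produced. The class and method names here are assumptions for the example; note that every capital in an acronym is split off separately.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NonBoundaryDemo
{
    // Hypothetical helper wrapping the one-liner from the answer.
    public static string Split(string s) =>
        Regex.Replace(s, @"(\B[A-Z])", " $1");

    public static void Main()
    {
        Console.WriteLine(Split("ThisIsMyCapsDelimitedString")); // This Is My Caps Delimited String
        Console.WriteLine(Split("SimpleHTTPServer"));            // Simple H T T P Server
    }
}
```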
#3
19
Great answer, MizardX! I tweaked it slightly to treat numerals as separate words, so that "AddressLine1" would become "Address Line 1" instead of "Address Line1":
Regex.Replace(s, "([a-z](?=[A-Z0-9])|[A-Z](?=[A-Z][a-z]))", "$1 ")
#4
16
Just for a little variety... Here's an extension method that doesn't use a regex.
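using System.Collections.Generic;
using System.Linq;

public static class CamelSpaceExtensions
{
    public static string SpaceCamelCase(this string input)
    {
        return new string(InsertSpacesBeforeCaps(input).ToArray());
    }

    private static IEnumerable<char> InsertSpacesBeforeCaps(IEnumerable<char> input)
    {
        bool first = true;
        foreach (char c in input)
        {
            // Skip the space before the first character so the result has no leading space.
            if (char.IsUpper(c) && !first)
            {
                yield return ' ';
            }
            yield return c;
            first = false;
        }
    }
}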
#5
11
Grant Wagner's excellent comment aside:
Dim s As String = RegularExpressions.Regex.Replace("ThisIsMyCapsDelimitedString", "([A-Z])", " $1")
#6
8
I needed a solution that supports acronyms and numbers. This Regex-based solution treats the following patterns as individual "words":
- A capital letter followed by lowercase letters
- A sequence of consecutive numbers
- Consecutive capital letters (interpreted as acronyms) - a new word can begin using the last capital, e.g. HTMLGuide => "HTML Guide", "TheATeam" => "The A Team"
You could do it as a one-liner:
Regex.Replace(value, @"(?<!^)((?<!\d)\d|(?(?<=[A-Z])[A-Z](?=[a-z])|[A-Z]))", " $1")
A more readable approach might be better:
using System.Text.RegularExpressions;

namespace Demo
{
    public class IntercappedStringHelper
    {
        private static readonly Regex SeparatorRegex;

        static IntercappedStringHelper()
        {
            const string pattern = @"
                (?<!^) # Not start
                (
                    # Digit, not preceded by another digit
                    (?<!\d)\d
                    |
                    # Upper-case letter, followed by lower-case letter if
                    # preceded by another upper-case letter, e.g. 'G' in HTMLGuide
                    (?(?<=[A-Z])[A-Z](?=[a-z])|[A-Z])
                )";
            var options = RegexOptions.IgnorePatternWhitespace | RegexOptions.Compiled;
            SeparatorRegex = new Regex(pattern, options);
        }

        public static string SeparateWords(string value, string separator = " ")
        {
            return SeparatorRegex.Replace(value, separator + "$1");
        }
    }
}
Here's an extract from the (XUnit) tests:
[Theory]
[InlineData("PurchaseOrders", "Purchase-Orders")]
[InlineData("purchaseOrders", "purchase-Orders")]
[InlineData("2Unlimited", "2-Unlimited")]
[InlineData("The2Unlimited", "The-2-Unlimited")]
[InlineData("Unlimited2", "Unlimited-2")]
[InlineData("222Unlimited", "222-Unlimited")]
[InlineData("The222Unlimited", "The-222-Unlimited")]
[InlineData("Unlimited222", "Unlimited-222")]
[InlineData("ATeam", "A-Team")]
[InlineData("TheATeam", "The-A-Team")]
[InlineData("TeamA", "Team-A")]
[InlineData("HTMLGuide", "HTML-Guide")]
[InlineData("TheHTMLGuide", "The-HTML-Guide")]
[InlineData("TheGuideToHTML", "The-Guide-To-HTML")]
[InlineData("HTMLGuide5", "HTML-Guide-5")]
[InlineData("TheHTML5Guide", "The-HTML-5-Guide")]
[InlineData("TheGuideToHTML5", "The-Guide-To-HTML-5")]
[InlineData("TheUKAllStars", "The-UK-All-Stars")]
[InlineData("AllStarsUK", "All-Stars-UK")]
[InlineData("UKAllStars", "UK-All-Stars")]
#7
4
For more variety, using plain old C# objects, the following produces the same output as @MizardX's excellent regular expression.
public string FromCamelCase(string camel)
{
    // omitted checking camel for null
    StringBuilder sb = new StringBuilder();
    int upperCaseRun = 0;
    foreach (char c in camel)
    {
        // append a space only if we're not at the start
        // and we're not already in an all caps string.
        if (char.IsUpper(c))
        {
            if (upperCaseRun == 0 && sb.Length != 0)
            {
                sb.Append(' ');
            }
            upperCaseRun++;
        }
        else if (char.IsLower(c))
        {
            if (upperCaseRun > 1) // The first new word will also be capitalized.
            {
                sb.Insert(sb.Length - 1, ' ');
            }
            upperCaseRun = 0;
        }
        else
        {
            upperCaseRun = 0;
        }
        sb.Append(c);
    }
    return sb.ToString();
}
#8
3
Below is a prototype that converts the following to Title Case:
- snake_case
- camelCase
- PascalCase
- sentence case
- Title Case (keep current formatting)
Obviously you would only need the "ToTitleCase" method yourself.
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main()
    {
        var examples = new List<string> {
            "THEQuickBrownFox",
            "theQUICKBrownFox",
            "TheQuickBrownFOX",
            "TheQuickBrownFox",
            "the_quick_brown_fox",
            "theFOX",
            "FOX",
            "QUICK"
        };
        foreach (var example in examples)
        {
            Console.WriteLine(ToTitleCase(example));
        }
    }

    private static string ToTitleCase(string example)
    {
        var fromSnakeCase = example.Replace("_", " ");
        var lowerToUpper = Regex.Replace(fromSnakeCase, @"(\p{Ll})(\p{Lu})", "$1 $2");
        var sentenceCase = Regex.Replace(lowerToUpper, @"(\p{Lu}+)(\p{Lu}\p{Ll})", "$1 $2");
        return new CultureInfo("en-US", false).TextInfo.ToTitleCase(sentenceCase);
    }
}
The console output would be as follows:
THE Quick Brown Fox
The QUICK Brown Fox
The Quick Brown FOX
The Quick Brown Fox
The Quick Brown Fox
The FOX
FOX
QUICK
#9
2
string s = "ThisIsMyCapsDelimitedString";
string t = Regex.Replace(s, "([A-Z])", " $1").Substring(1);
#10
2
Regex is about 10-12 times slower than a simple loop:
public static string CamelCaseToSpaceSeparated(this string str)
{
    if (string.IsNullOrEmpty(str))
    {
        return str;
    }
    var res = new StringBuilder();
    res.Append(str[0]);
    for (var i = 1; i < str.Length; i++)
    {
        if (char.IsUpper(str[i]))
        {
            res.Append(' ');
        }
        res.Append(str[i]);
    }
    return res.ToString();
}
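If you want to check the speed claim on your own machine, a rough timing harness might look like the sketch below. Everything in it (SplitBenchmark, WithRegex, WithLoop, Time, the iteration count) is made up for illustration, and the ratio you get will depend on the runtime, the regex options and the input, so treat the numbers as indicative only.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

public static class SplitBenchmark
{
    // Compiled once and reused, so construction cost is not measured.
    private static readonly Regex UpperRegex =
        new Regex(@"\B[A-Z]", RegexOptions.Compiled);

    public static string WithRegex(string s) => UpperRegex.Replace(s, " $0");

    public static string WithLoop(string s)
    {
        if (string.IsNullOrEmpty(s)) return s;
        var sb = new StringBuilder(s.Length * 2);
        sb.Append(s[0]);
        for (int i = 1; i < s.Length; i++)
        {
            if (char.IsUpper(s[i])) sb.Append(' ');
            sb.Append(s[i]);
        }
        return sb.ToString();
    }

    // Runs f repeatedly and returns the elapsed wall-clock time.
    public static TimeSpan Time(Func<string, string> f, string input, int iterations)
    {
        f(input); // warm-up call so JIT compilation is not timed
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) f(input);
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Main()
    {
        const string input = "ThisIsMyCapsDelimitedString";
        const int iterations = 1000000;
        Console.WriteLine("regex: " + Time(WithRegex, input, iterations));
        Console.WriteLine("loop:  " + Time(WithLoop, input, iterations));
    }
}
```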
#11
1
Naive regex solution. Will not handle O'Conner, and adds a space at the start of the string as well.
string s = "ThisIsMyCapsDelimitedString";
string split = Regex.Replace(s, "[A-Z0-9]", " $&");
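One way to deal with the leading space mentioned above is simply to trim it afterwards. This is only a sketch; the NaiveSplit class and Split method are hypothetical names, and the pattern is the one from the answer. Since the pattern matches single characters, each digit gets its own space.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NaiveSplit
{
    // "$&" is the whole match; TrimStart removes the space added before the first capital.
    public static string Split(string s) =>
        Regex.Replace(s, "[A-Z0-9]", " $&").TrimStart();

    public static void Main()
    {
        Console.WriteLine(Split("ThisIsMyCapsDelimitedString")); // This Is My Caps Delimited String
        Console.WriteLine(Split("AddressLine12"));               // Address Line 1 2
    }
}
```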
#12
0
There's probably a more elegant solution, but this is what I came up with off the top of my head:
string myString = "ThisIsMyCapsDelimitedString";
for (int i = 1; i < myString.Length; i++)
{
    // Note: digits and punctuation also equal their own ToUpper() form, so they get a space too.
    if (myString[i].ToString().ToUpper() == myString[i].ToString())
    {
        myString = myString.Insert(i, " ");
        i++; // skip past the character that was just shifted right
    }
}
#13
0
Try using:
"([A-Z]*[^A-Z]*)"
This also works for letters mixed with digits:
Regex.Replace("AbcDefGH123Weh", "([A-Z]*[^A-Z]*)", "$1 ");
Abc Def GH123 Weh
Regex.Replace("camelCase", "([A-Z]*[^A-Z]*)", "$1 ");
camel Case
#14
0
Implementing the pseudocode from: https://*.com/a/5796394/4279201
private static StringBuilder camelCaseToRegular(string i_String)
{
    StringBuilder output = new StringBuilder();
    int i = 0;
    foreach (char character in i_String)
    {
        if (character <= 'Z' && character >= 'A' && i > 0)
        {
            output.Append(" ");
        }
        output.Append(character);
        i++;
    }
    return output;
}
#15
0
To match the position between a non-uppercase character and an uppercase letter (Unicode category Lu): (?<=\P{Lu})(?=\p{Lu})
Dim s = Regex.Replace("CorrectHorseBatteryStaple", "(?<=\P{Lu})(?=\p{Lu})", " ")
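For C# readers, a rough equivalent of the VB line might look like this. The UnicodeSplit class and Split method are names invented for the example. Because the pattern uses Unicode categories rather than [A-Z], it also splits on accented capitals; a run of capitals such as an acronym is left unsplit, since nothing in the run is preceded by a non-uppercase character.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnicodeSplit
{
    // Zero-width match: behind us a non-uppercase character, ahead an uppercase letter.
    public static string Split(string s) =>
        Regex.Replace(s, @"(?<=\P{Lu})(?=\p{Lu})", " ");

    public static void Main()
    {
        Console.WriteLine(Split("CorrectHorseBatteryStaple"));
        // Accented letters written as escapes: U+00C9 is an uppercase E acute, U+00E9 its lowercase form.
        Console.WriteLine(Split("\u00C9coleNormaleSup\u00E9rieure"));
    }
}
```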
#16
0
A procedural, fast implementation:
using System.Collections.Generic;
using System.Diagnostics;
using System.Diagnostics.Contracts;
using System.Text;

/// <summary>
/// Get the words in a code <paramref name="identifier"/>.
/// </summary>
/// <param name="identifier">The code identifier to extract words from.</param>
public static string[] GetWords(this string identifier) {
    Contract.Ensures(Contract.Result<string[]>() != null, "returned array of string is not null but can be empty");
    if (identifier == null) { return new string[0]; }
    if (identifier.Length == 0) { return new string[0]; }

    const int MIN_WORD_LENGTH = 2; // Ignore one letter or one digit words
    var length = identifier.Length;
    var list = new List<string>(1 + length/2); // Set capacity; more words are not possible since we discard one-char words
    var sb = new StringBuilder();
    CharKind cKindCurrent = GetCharKind(identifier[0]); // length is not zero here
    CharKind cKindNext = length == 1 ? CharKind.End : GetCharKind(identifier[1]);

    for (var i = 0; i < length; i++) {
        var c = identifier[i];
        CharKind cKindNextNext = (i >= length - 2) ? CharKind.End : GetCharKind(identifier[i + 2]);

        // Process cKindCurrent
        switch (cKindCurrent) {
            case CharKind.Digit:
            case CharKind.LowerCaseLetter:
                sb.Append(c); // Append digit or lowerCaseLetter to sb
                if (cKindNext == CharKind.UpperCaseLetter) {
                    goto TURN_SB_INTO_WORD; // Finish word if next char is upper
                }
                goto CHAR_PROCESSED;
            case CharKind.Other:
                goto TURN_SB_INTO_WORD;
            default: // cKindCurrent is never End here
                Debug.Assert(cKindCurrent == CharKind.UpperCaseLetter);
                break;
        }

        // Here cKindCurrent is UpperCaseLetter
        // Append UpperCaseLetter to sb anyway
        sb.Append(c);
        switch (cKindNext) {
            default:
                goto CHAR_PROCESSED;
            case CharKind.UpperCaseLetter:
                // "SimpleHTTPServer" when we are at 'P' we need to see that NextNext is 'e' to get the word!
                if (cKindNextNext == CharKind.LowerCaseLetter) {
                    goto TURN_SB_INTO_WORD;
                }
                goto CHAR_PROCESSED;
            case CharKind.End:
            case CharKind.Other:
                break; // goto TURN_SB_INTO_WORD;
        }

        //------------------------------------------------
        TURN_SB_INTO_WORD:
        string word = sb.ToString();
        sb.Length = 0;
        if (word.Length >= MIN_WORD_LENGTH) {
            list.Add(word);
        }

        CHAR_PROCESSED:
        // Shift left for next iteration!
        cKindCurrent = cKindNext;
        cKindNext = cKindNextNext;
    }

    string lastWord = sb.ToString();
    if (lastWord.Length >= MIN_WORD_LENGTH) {
        list.Add(lastWord);
    }
    return list.ToArray();
}

private static CharKind GetCharKind(char c) {
    if (char.IsDigit(c)) { return CharKind.Digit; }
    if (char.IsLetter(c)) {
        if (char.IsUpper(c)) { return CharKind.UpperCaseLetter; }
        Debug.Assert(char.IsLower(c));
        return CharKind.LowerCaseLetter;
    }
    return CharKind.Other;
}

enum CharKind {
    End, // For end of string
    Digit,
    UpperCaseLetter,
    LowerCaseLetter,
    Other
}
Tests:
[TestCase((string)null, "")]
[TestCase("", "")]
// Ignore one letter or one digit words
[TestCase("A", "")]
[TestCase("4", "")]
[TestCase("_", "")]
[TestCase("Word_m_Field", "Word Field")]
[TestCase("Word_4_Field", "Word Field")]
[TestCase("a4", "a4")]
[TestCase("ABC", "ABC")]
[TestCase("abc", "abc")]
[TestCase("AbCd", "Ab Cd")]
[TestCase("AbcCde", "Abc Cde")]
[TestCase("ABCCde", "ABC Cde")]
[TestCase("Abc42Cde", "Abc42 Cde")]
[TestCase("Abc42cde", "Abc42cde")]
[TestCase("ABC42Cde", "ABC42 Cde")]
[TestCase("42ABC", "42 ABC")]
[TestCase("42abc", "42abc")]
[TestCase("abc_cde", "abc cde")]
[TestCase("Abc_Cde", "Abc Cde")]
[TestCase("_Abc__Cde_", "Abc Cde")]
[TestCase("ABC_CDE_FGH", "ABC CDE FGH")]
[TestCase("ABC CDE FGH", "ABC CDE FGH")] // Should not happen (whitespace); anything that is not a letter/digit/'_' is considered a separator
[TestCase("ABC,CDE;FGH", "ABC CDE FGH")] // Should not happen (,;); anything that is not a letter/digit/'_' is considered a separator
[TestCase("abc<cde", "abc cde")]
[TestCase("abc<>cde", "abc cde")]
[TestCase("abc<D>cde", "abc cde")] // Ignore one letter or one digit words
[TestCase("abc<Da>cde", "abc Da cde")]
[TestCase("abc<cde>", "abc cde")]
[TestCase("SimpleHTTPServer", "Simple HTTP Server")]
[TestCase("SimpleHTTPS2erver", "Simple HTTPS2erver")]
[TestCase("camelCase", "camel Case")]
[TestCase("m_Field", "Field")]
[TestCase("mm_Field", "mm Field")]
public void Test_GetWords(string identifier, string expectedWordsStr) {
    var expectedWords = expectedWordsStr.Split(' ');
    if (identifier == null || identifier.Length <= 1) {
        expectedWords = new string[0];
    }
    var words = identifier.GetWords();
    Assert.IsTrue(words.SequenceEqual(expectedWords));
}