I'm trying to come up with a regular expression that will split full names.
The first part is validation: I want to make sure the name matches the pattern "Name Name" or "Name MI Name", where MI is a single character optionally followed by a period. This weeds out complex names like "Jose Jacinto De La Pena", and that's fine. The expression I came up with is ^([a-zA-Z]+\s)([a-zA-Z](\.?)\s){0,1}([a-zA-Z'-]+)$ and it seems to do the job.
But how do I modify it to split the name into two parts only? If a middle initial is present, I want it to be part of the first "name"; in other words, "James T. Kirk" should be split into "James T." and "Kirk". TIA.
4 Solutions
#1
3
Just add some parentheses:
^(([a-z]+\s)([a-z](\.?))\s){0,1}([a-z'-]+)$
Your match will be in group 1 now
string resultString = null;
try {
resultString = Regex.Match(subjectString, @"^(([a-z]+\s)([a-z](\.?))\s){0,1}([a-z'-]+)$", RegexOptions.IgnoreCase).Groups[1].Value;
} catch (ArgumentException ex) {
// Syntax error in the regular expression
}
Also, I made the regex case-insensitive so that it can be shorter (a-z instead of a-zA-Z).
Update 1
The numbered groups don't work well when there is no initial, so I rewrote the regex from scratch:
^(\w+\s(\w\.\s)?)(\w+)$
\w stands for any word character (letters, digits, and the underscore), which may be what you need; you can replace it with a-z if that works better.
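As a rough sketch of how the numbered groups of the Update 1 pattern could be read in C#: because the inner (\w\.\s)? is itself a capturing group, it becomes group 2 and the last name lands in group 3. The class name NumberedGroupsDemo and the helper TrySplit are made up for illustration. Note that this pattern requires the period after the initial, so "James T Kirk" would not match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NumberedGroupsDemo
{
    // Pattern from Update 1. Group 1 = first name plus optional initial,
    // group 2 = the optional initial on its own, group 3 = last name.
    private static readonly Regex NamePattern =
        new Regex(@"^(\w+\s(\w\.\s)?)(\w+)$");

    // Hypothetical helper: returns false (and empty strings) when the input does not match.
    public static bool TrySplit(string fullName, out string first, out string last)
    {
        Match m = NamePattern.Match(fullName);
        // Group 1 ends with the separating whitespace, so trim it off.
        first = m.Success ? m.Groups[1].Value.TrimEnd() : string.Empty;
        last = m.Success ? m.Groups[3].Value : string.Empty;
        return m.Success;
    }
}
```

Under these assumptions, "James T. Kirk" splits into "James T." and "Kirk", "James Kirk" into "James" and "Kirk", and "Jose Jacinto De La Pena" is rejected.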
Update 2
There is a nice feature in C# where you can name your captures
^(?<First>\w+\s(?:\w\.\s)?)(?<Last>\w+)$
Now you can refer to the groups by name instead of by number (I think it's a bit more readable):
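using System.Text.RegularExpressions;

var subjectString = "James T. Kirk";
Regex regexObj = new Regex(@"^(?<First>\w+\s(?:\w\.\s)?)(?<Last>\w+)$", RegexOptions.IgnoreCase);
var groups = regexObj.Match(subjectString).Groups;
// The First group ends with the separating whitespace, so trim it off
var firstName = groups["First"].Value.TrimEnd(); // "James T."
var lastName = groups["Last"].Value;             // "Kirk"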
#2
0
You can accomplish this by turning what is currently your second capturing group into a non-capturing group (add ?: just after its opening parenthesis) and then moving that entire group to the end of the first group, so it becomes the following:
^([a-zA-Z]+\s(?:[a-zA-Z]\.?\s)?)([a-zA-Z'-]+)$
Note that I also replaced the {0,1} with ?, because they are equivalent.
This will result in two capturing groups, one for the first name and middle initial (if it exists), and one for the last name.
#3
0
I'm not sure if you want to do it this way, but there is a way to do it without regular expressions.
If the name is in the form Name Name, then you could do this:
// fullName is a string that has the full name, in the form of 'Name Name'
string firstName = fullName.Split(' ')[0];
string lastName = fullName.Split(' ')[1];
And if the name is in the form Name MI. Name (with a period after the initial), then you can do this:
string firstName = fullName.Split('.')[0] + ".";
string lastName = fullName.Split('.')[1].Trim();
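The two cases above could also be covered with one code path. The following is a minimal sketch of a different technique, not the one used in this answer: it splits at the last space, so everything before it (including a middle initial, with or without a period) becomes the first part. The class name LastSpaceSplitter and the helper TrySplit are hypothetical. It does no validation, so "Jose Jacinto De La Pena" would be split into "Jose Jacinto De La" and "Pena" rather than rejected.

```csharp
using System;

public static class LastSpaceSplitter
{
    // Hypothetical helper: splits at the last space in the trimmed input.
    // Returns false when there is no space to split on.
    public static bool TrySplit(string fullName, out string first, out string last)
    {
        string trimmed = fullName.Trim();
        int i = trimmed.LastIndexOf(' ');
        if (i < 0)
        {
            first = trimmed;
            last = string.Empty;
            return false;
        }
        // Everything before the last space, minus any extra trailing spaces.
        first = trimmed.Substring(0, i).TrimEnd();
        last = trimmed.Substring(i + 1);
        return true;
    }
}
```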
Hope this helps!
#4
0
Just put the optional part in the first capturing group:
(?i)^([a-z]+(?:\s[a-z]\.?)?)\s([a-z'-]+)$
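Here is one way this pattern might be used from C#, as a sketch rather than a definitive implementation. The inline (?i) makes the match case-insensitive without passing RegexOptions.IgnoreCase, and because the separating \s sits outside both groups, group 1 carries no trailing whitespace. The class name NameSplitter and the helper TrySplit are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NameSplitter
{
    // Pattern from the answer above: group 1 = first name plus optional initial
    // (the period is optional), group 2 = last name.
    private static readonly Regex Pattern =
        new Regex(@"(?i)^([a-z]+(?:\s[a-z]\.?)?)\s([a-z'-]+)$");

    // Hypothetical helper: returns false (and empty strings) when the name does not fit.
    public static bool TrySplit(string fullName, out string first, out string last)
    {
        Match m = Pattern.Match(fullName);
        first = m.Success ? m.Groups[1].Value : string.Empty;
        last = m.Success ? m.Groups[2].Value : string.Empty;
        return m.Success;
    }
}
```

With this pattern, "James T. Kirk" and "James T Kirk" both match, a last name such as "O'Brien" is accepted, and "Jose Jacinto De La Pena" is rejected, which fits the validation described in the question.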