Sorting a cell array of strings alphabetically by last name

Date: 2022-05-10 07:20:57

In Matlab, I have a cell array like:

names = {
    'John Doe',
    'Jane Watkins',
    'Jeremy Jason Taylor',
    'Roger Adrian'
    }

I would like to sort these such that the last names appear in alphabetical order. In my example, it would come out being:

names_sorted = {
    'Roger Adrian',
    'John Doe',
    'Jeremy Jason Taylor',
    'Jane Watkins'
    }

I know of inelegant ways of doing this. For instance, I could tokenize at space, make a separate last_names cell array, sort that, and apply the indexing to my original array.
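
For reference, the tokenize-and-index approach described above might look roughly like the following. This is only a sketch: the use of strsplit and cellfun, and the helper variable parts, are assumptions chosen for illustration, not something stated in the question.

```matlab
names = {'John Doe'; 'Jane Watkins'; 'Jeremy Jason Taylor'; 'Roger Adrian'};

% Split each name at spaces; parts{k} is a cell array of the words in names{k}.
% (parts is a hypothetical helper variable used only in this sketch.)
parts = cellfun(@(s) strsplit(s, ' '), names, 'UniformOutput', false);

% Keep only the final word of each name as the sort key.
last_names = cellfun(@(p) p{end}, parts, 'UniformOutput', false);

% Sort the keys and reuse the resulting permutation on the original array.
[~, idx] = sort(last_names);
names_sorted = names(idx);
```

Because only the last word is used as the key, two people sharing a last name stay in their original relative order rather than being ordered by first name.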

My question is, is there a better way?

Because someone is sure to come up with that list of assumptions you can't make with regard to people's names in a database, let me assure you that all my names are either "FIRST MIDDLE LAST" or "FIRST LAST". I checked.

1 solution

#1


If all first names had the same length, you could use sortrows. In your case, though, that would require padding and modifying your array anyway, so you're better off converting each name to "LAST FIRST MIDDLE" before applying sort. Fortunately, there's a simple regular expression for that:

names = {'John Doe';'Roger Adrian';'John Fitzgerald Kennedy'};
names_rearranged = regexprep(names,'(.*) (\w*)$','$2 $1')
names_rearranged = 
    'Doe John'
    'Adrian Roger'
    'Kennedy John Fitzgerald'

[names_rearranged_sorted, idx_sorted] = sort(names_rearranged);

names_sorted = names(idx_sorted)
names_sorted = 
    'Roger Adrian'
    'John Doe'
    'John Fitzgerald Kennedy'
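
To see why the pattern splits each name at its last space, it can help to look at the captured groups directly. The snippet below is a small sketch using regexp with the 'tokens' and 'once' options; the variable names tok, given and family are made up for illustration.

```matlab
names = {'John Doe'; 'Roger Adrian'; 'John Fitzgerald Kennedy'};

% 'tokens' returns the text captured by each parenthesized group;
% 'once' keeps only the first match for each name.
tok = regexp(names, '(.*) (\w*)$', 'tokens', 'once');

% (.*) is greedy, so it takes everything up to the LAST space,
% leaving only the final word for (\w*).
given  = tok{3}{1};   % first and middle names of the third entry
family = tok{3}{2};   % last name of the third entry
```

Note that \w matches only letters, digits and underscores, so a last name containing a hyphen or an apostrophe would not match, and regexprep would leave that name unchanged. If such names can occur, using (\S*) for the last group is one possible adjustment; that is an assumption on top of the answer, not part of it.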
