R: sort by the numbers inside file names

Time: 2021-12-26 20:10:43

I have a list of file names to open. The format looks like this:

'xxxxx_xxxxxx 00.02.xls'

The first number (00) is the year and the second (02) is the month.

Is there any way to sort this list first by year, then by month?

2 solutions

#1


If the number of characters in the file name may vary, a regex can locate the year and month for you. I like str_match from the stringr package.

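library(stringr)
# vec is the character vector of file names (defined in the example below)
# column 1 = full match, column 2 = year, column 3 = month
extract <- str_match(vec, "([0-9]{2})\\.([0-9]{2})\\.xls")
# sort the file names by year
vec[order(rank(extract[,2]))]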

That way, if you one day decide to order by month instead, you can change the column index in the last line from 2 to 3.

If you want the years in descending order, wrap the result of order in rev, like this: vec[rev(order(rank(extract[,2])))]

The great thing about str_match is that it tells you what it matched and creates a column for each token that you put in parentheses. It returns a character matrix, so you can subset those columns like any other matrix.

extract
     [,1]        [,2] [,3]
[1,] "07.02.xls" "07" "02"
[2,] "15.12.xls" "15" "12"
[3,] "01.02.xls" "01" "02"
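If you would rather not depend on stringr, a similar matrix can be built with base R. This is only a sketch: it swaps str_match for regexec and regmatches, the name extract_base is made up for illustration, and it assumes every file name matches the pattern (a non-matching name would silently drop out of the rbind step).

```r
# Hypothetical sample data, same shape as the file names in the question.
vec <- c("xxxxxxxx_xxxxxx 07.02.xls", "xxxxx_xxx 15.12.xls", "xxxxx_xxxxxx 01.02.xls")

# regexec() locates the full match plus each parenthesised group;
# regmatches() pulls the matched text out as one character vector per file name.
m <- regmatches(vec, regexec("([0-9]{2})\\.([0-9]{2})\\.xls$", vec))

# Stack the per-name vectors into a character matrix:
# column 1 = full match, column 2 = year, column 3 = month.
extract_base <- do.call(rbind, m)

# Convert the captured strings to integers if numeric sort keys are preferred.
year  <- as.integer(extract_base[, 2])
month <- as.integer(extract_base[, 3])
```

The columns line up with the str_match output shown above, so the same order calls should work on extract_base.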

Example

vec <- c("xxxxxxxx_xxxxxx 07.02.xls", "xxxxx_xxx 15.12.xls", "xxxxx_xxxxxx 01.02.xls")
extract <- str_match(vec, "([0-9]{2})\\.([0-9]{2})\\.xls")

# ascending by year
vec[order(rank(extract[,2]))]
# [1] "xxxxx_xxxxxx 01.02.xls"    "xxxxxxxx_xxxxxx 07.02.xls" "xxxxx_xxx 15.12.xls"

# or reversed
vec[rev(order(rank(extract[,2])))]
# [1] "xxxxx_xxx 15.12.xls"       "xxxxxxxx_xxxxxx 07.02.xls" "xxxxx_xxxxxx 01.02.xls"
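The example above orders by year only, while the question asks for year first and month second. One possible way to get that is to pass both keys to order, which uses later arguments to break ties in earlier ones. The sketch below uses made-up file names (two of them share a year so the tie-break is visible) and extracts the fields with base R's sub rather than str_match, purely to stay free of package dependencies.

```r
# Hypothetical file names; "a" and "c" share the year 07.
vec <- c("a 07.11.xls", "b 15.12.xls", "c 07.02.xls", "d 01.02.xls")

# Capture the two-digit year (group 1) and month (group 2) at the end of each name.
pattern <- ".* ([0-9]{2})\\.([0-9]{2})\\.xls$"
year  <- as.integer(sub(pattern, "\\1", vec))
month <- as.integer(sub(pattern, "\\2", vec))

# order() takes several keys: month only decides between equal years.
sorted_asc <- vec[order(year, month)]

# decreasing = TRUE reverses the ordering on both keys.
sorted_desc <- vec[order(year, month, decreasing = TRUE)]
```

With the str_match result from the answer, the equivalent call would be vec[order(extract[,2], extract[,3])].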

#2


If there are always 13 characters before the two-digit year, you can do this (assuming your vector of file names is called x):

x[order(substr(x,14,18))]
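A small worked example of the fixed-width approach, as a sketch only. The file names are invented, and the grepl guard is an addition that is not part of the answer: it stops early if characters 14 to 18 do not look like "YY.MM". Because both fields are zero-padded and the year comes first, sorting these strings orders by year and then by month.

```r
# Hypothetical file names, each with exactly 13 characters before the year.
x <- c("xxxxx_xxxxxx 07.11.xls", "xxxxx_xxxxxx 15.12.xls",
       "xxxxx_xxxxxx 07.02.xls", "xxxxx_xxxxxx 01.02.xls")

# Characters 14 to 18 hold the "YY.MM" part of each name.
key <- substr(x, 14, 18)

# Guard the fixed-width assumption before relying on it.
stopifnot(all(grepl("^[0-9]{2}\\.[0-9]{2}$", key)))

# Sorting the zero-padded "YY.MM" strings gives year-then-month order.
sorted <- x[order(key)]
```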
