This is the dataset
df1 <- data.frame("id" = c("ebi.ac.uk:MIAMExpress:Reporter:A-MEXP-503.100044",
"ebi.ac.uk:MIAMExpress:Reporter:A-MEXP-783.100435",
"ebi.ac.uk:MIAMExpress:Reporter:C-DEA-783.100435"),
"Name" = c("ABC", "DEF", ""))
Printed, the dataset looks like this:
id Name
1 ebi.ac.uk:MIAMExpress:Reporter:A-MEXP-503.100044 ABC
2 ebi.ac.uk:MIAMExpress:Reporter:A-MEXP-503.100435 DEF
3 ebi.ac.uk:MIAMExpress:Reporter:A-MEXP-503.100488
I want to make the dataframe look like this
id Name
1 100044 ABC
2 100435 DEF
3 100488 NA
Can anyone show me how to approach this problem?
1 Solution
#1
Regex way to find the position of the last dot:
df1$id <- as.character(df1$id)
regexpr("\\.[^\\.]*$", df1$id) # the \\ inside [ ] is optional
or
sapply(gregexpr("\\.", df1$id), tail, 1)
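Both calls above return the position of the last dot, not the text that follows it. A minimal sketch of turning that position into the trailing number, assuming every id contains at least one dot; the vector `ids` is a hypothetical stand-in copied from the question's data, and the `sub()` line is an alternative I am adding, not part of the answer:

```r
# Example ids copied from the question's data frame
ids <- c("ebi.ac.uk:MIAMExpress:Reporter:A-MEXP-503.100044",
         "ebi.ac.uk:MIAMExpress:Reporter:A-MEXP-783.100435",
         "ebi.ac.uk:MIAMExpress:Reporter:C-DEA-783.100435")

# Position of the last dot in each string
last_dot <- as.integer(regexpr("\\.[^.]*$", ids))

# Keep everything after that position
after_dot <- substring(ids, last_dot + 1L)

# Alternative: delete everything up to and including the last dot
after_dot_sub <- sub("^.*\\.", "", ids)

print(after_dot)
```

The `sub()` variant relies on `.*` being greedy, so the match runs up to the last dot rather than the first.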
Easier to remember: split on the dots and keep the last piece (note that split="\\." is still a regular expression):
df1$id <- as.character(df1$id)
df1$id <- sapply(strsplit(df1$id,split="\\."),tail,1)
df1$Name[df1$Name == ""] <- NA
df1
      id Name
1 100044  ABC
2 100435  DEF
3 100435 <NA>
sapply(strsplit(df1$id,split="\\."),tail,1) is from here.
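For reference, the strsplit approach above can be put together as one end-to-end sketch. It makes a few assumptions that are mine rather than the answer's: the data frame is built with stringsAsFactors = FALSE (so no as.character() step is needed), the split uses fixed = TRUE so the dot is treated literally, and vapply replaces sapply to guarantee a character result. `last_piece` is a hypothetical helper name.

```r
# Data copied from the question; stringsAsFactors = FALSE keeps id as character
df1 <- data.frame(
  id = c("ebi.ac.uk:MIAMExpress:Reporter:A-MEXP-503.100044",
         "ebi.ac.uk:MIAMExpress:Reporter:A-MEXP-783.100435",
         "ebi.ac.uk:MIAMExpress:Reporter:C-DEA-783.100435"),
  Name = c("ABC", "DEF", ""),
  stringsAsFactors = FALSE
)

# Hypothetical helper: last element of one split id
last_piece <- function(parts) tail(parts, 1)

# Split each id on literal dots and keep only the final piece
df1$id <- vapply(strsplit(df1$id, split = ".", fixed = TRUE),
                 last_piece, character(1))

# Turn empty names into missing values
df1$Name[df1$Name == ""] <- NA

print(df1)
```

With fixed = TRUE there is no escaping to remember, which may be closer to the "non-regex" spirit of the answer.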