Sorry if this is a duplicate, but the solutions I have seen do not solve my issue.
I have a data frame (df). One of its variables (df$Year) contains years and year ranges, such as:
> df$Year
Year
2001–
2013–
2016–
2003–
2012–2013
2013–
1993–2007, 2010–
When an entry contains multiple years, I just want to keep the last one (i.e. only '2010' rather than '1993–2007, 2010–') and get rid of the '–'. So far I have tried:
unlist(str_extract_all(df$Year, "[[:digit:]]4$"))
but this does not seem to work.
Any hints?
1 solution
#1
We can use sub for a one-liner:
df$Year <- sub(".*(\\d{4})\\–?", "\\1", df$Year)
df$Year
[1] "2001" "2013" "2016" "2003" "2013" "2013" "2010"
Demo
Note that the dashes in your year ranges appear to be en dashes (or possibly em dashes), not the regular ASCII hyphen, so a pattern written with a plain "-" will not match them.
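If the kind of dash is not known in advance, one option is to accept any of them in a character class. The following is a minimal sketch, not part of the original answer: the vector name years and the sample values are assumptions that mirror the data shown in the question, and the pattern assumes each entry ends with a four-digit year optionally followed by a single dash.

```r
# Hypothetical sample data mirroring the question; \u2013 is the en dash.
years <- c("2001\u2013", "2013\u2013", "2016\u2013", "2003\u2013",
           "2012\u20132013", "2013\u2013", "1993\u20132007, 2010\u2013")

# The greedy ".*" consumes everything before the last four-digit run.
# The character class then matches an optional trailing ASCII hyphen,
# en dash (\u2013), or em dash (\u2014) before the end of the string.
last_year <- sub(".*(\\d{4})[-\u2013\u2014]?$", "\\1", years)

print(last_year)
```

On a real data frame the same call would be applied to df$Year instead of years. Entries that do not fit the assumed shape are returned unchanged by sub, so it may be worth checking the result for leftover non-digit characters.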