I have a list of txt files stored in A.path that I would like to use grep on to find the year associated with each file, and save these years to a vector. However, as some of these txt files have multiple years in their text, I would only like to store the first year. How can I do this?
I've done similar things using lapply, and this is how I began approaching this problem:
lapply(A.path, function(i){
  j <- paste0(scan(i, what = character(), comment.char='', quote=NULL), collapse = " ")
  year <- vector()
  year[i] <- grep('[0-9][0-9][0-9][0-9]', j)
})
grep probably isn't the right function to use, as this returns the entirety of j for each i. What is the right function to use here?
1 Answer
#1
5
Converting comment to answer: you can use gsub with \\1 to extract the value of the first match (i.e. the text matched between () in the regex).
gsub(".*?([0-9]{4}).*", "\\1", j)
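To fill the whole vector in one pass, something along these lines might work. This is only a sketch: first_year is a hypothetical helper name, readLines is swapped in for scan, and regexpr with regmatches is used instead of gsub so that a file with no four-digit run gives NA rather than its full text. The temporary file simply stands in for the real A.path so the example runs on its own.

```r
# Hypothetical helper (not from the original post): return the first
# run of four digits in a string, or NA if there is none.
first_year <- function(txt) {
  m <- regexpr("[0-9]{4}", txt)   # position of the first match, -1 if none
  if (m[[1]] == -1L) NA_character_ else regmatches(txt, m)
}

# Stand-in for the real A.path: one temporary file containing two years.
tmp <- tempfile(fileext = ".txt")
writeLines(c("Published 1998,", "revised 2004."), tmp)
A.path <- c(tmp)

# Read each file, collapse it to one string, and keep only the first year.
years <- vapply(A.path, function(p) {
  j <- paste(readLines(p, warn = FALSE), collapse = " ")
  first_year(j)
}, character(1), USE.NAMES = FALSE)
```

vapply returns a plain character vector, so there is no need to create and index year inside the function. Note that the pattern takes the first four digits of any longer number as well, so it may need word boundaries if the files contain such numbers.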