I am importing some columns from multiple CSV files in R. I want to delete all the data after row 1472.
temp = list.files(pattern="*.csv") # list the csv files in the working directory
Normalyears<-c(temp[1],temp[2],temp[3],temp[5],temp[6],temp[7],temp[9],temp[10],temp[11],temp[13],temp[14],temp[15],temp[17],temp[18],temp[19],temp[21],temp[22],temp[23])
leapyears<-c(temp[4],temp[8],temp[12],temp[16],temp[20]) # separate the csv files into leap years and normal years
I import only the second column of each CSV file:
myfiles_Normalyears = lapply(Normalyears, read.delim,colClasses=c('NULL','numeric'),sep =",")
myfiles_leapyears = lapply(leapyears, read.delim,colClasses=c('NULL','numeric'),sep =",")
new.data.leapyears <- NULL
for(i in 1:length(myfiles_leapyears)) {
in.data <- read.table(myfiles_leapyears[i],skip=c(1472:4399),sep=",")
new.data.leapyears <- rbind(new.data.leapyears, in.data)}
The loop is supposed to delete all the rows from 1472 to 4399.
Error: Error in read.table(myfiles_leapyears[i], skip = c(1472:4399), sep = ",") :
'file' must be a character string or connection
3 Solutions
#1
0
Your myfiles_leapyears is a list. When subsetting a list, you need double brackets to access a single element; otherwise you just get a sublist of length 1.
So replace
myfiles_leapyears[i]
with
myfiles_leapyears[[i]]
That will at least take care of invalid subscript type 'list' errors. I'd second Josh W. that the nrows argument seems smarter than the skip argument.
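The difference between single and double brackets can be checked on a small list. This is only a minimal sketch: the list lst is a made-up stand-in for myfiles_leapyears, not the asker's actual data.

```r
# Hypothetical stand-in for myfiles_leapyears: a list of two data frames
lst <- list(data.frame(x = 1:3), data.frame(x = 4:6))

single <- lst[1]    # single brackets: still a list, of length 1
double <- lst[[1]]  # double brackets: the data frame itself

class(single)          # "list"
class(double)          # "data.frame"
is.data.frame(single)  # FALSE
```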
Alternatively, if you define it using sapply ("s" for simplify) instead of lapply ("l" for list), you'll probably be fine using [i]:
myfiles_leapyears = sapply(leapyears, read.delim,colClasses=c('NULL','numeric'),sep =",")
#2
1
There is an nrows parameter to read.table, so why not try
read.table(myfiles_leapyears[i], nrows = 1471,sep=",")
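As a rough illustration of nrows, the sketch below writes a small temporary CSV and reads only its first rows. The file and its contents are invented for the example. With the real data, the first argument would presumably need to be a file name from leapyears rather than an element of the already-imported list. Note also that skip takes a single count of lines to skip at the top of the file, not a range of row numbers.

```r
# Made-up demo file: 10 data rows, two columns, with a header line
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(day = 1:10, value = (1:10) * 1.5), tmp, row.names = FALSE)

# Keep only the second column, and stop after the first 4 data rows
first_rows <- read.table(tmp, header = TRUE, sep = ",",
                         colClasses = c("NULL", "numeric"), nrows = 4)

dim(first_rows)  # 4 rows, 1 column
unlink(tmp)      # remove the temporary file
```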
#3
0
It is fine. I just turned the data from a list into a dataframe.
df <- as.data.frame(myfiles_leapyears,byrow=T)
leap_df<-head(df,-2928)
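The same trimming can probably also be done per file, without first converting the whole list into one data frame. A minimal sketch, assuming each element of the list is a data frame as produced by read.delim. The list demo and the cutoff keep are made-up stand-ins for myfiles_leapyears and 1471.

```r
# Made-up stand-in for myfiles_leapyears: two single-column data frames
demo <- list(data.frame(value = 1:6), data.frame(value = 11:16))
keep <- 4  # stand-in for 1471, the number of rows to keep per file

# Keep the first `keep` rows of every data frame in the list
trimmed <- lapply(demo, head, n = keep)

# Stack the trimmed pieces into one data frame, as the rbind loop intended
combined <- do.call(rbind, trimmed)
nrow(combined)  # 8
```

Because the elements are already data frames, no second call to read.table is needed in this variant.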