In R, "duplicate 'row.names' are not allowed"

Time: 2021-03-16 07:22:29

I am trying to load a csv file that has 14 columns like this:

StartDate, var1, var2, var3, ...., var14

When I issue this command:

systems <- read.table("http://getfile.pl?test.csv", header=TRUE, sep=",")

I get the error message "duplicate 'row.names' are not allowed".

It seems to me that the first column name is causing the issue. When I manually download the file and remove the StartDate name from it, R reads the file successfully and replaces the first column name with X. Can someone tell me what is going on? The file is a comma-separated CSV file.

10 Answers

#1


71  

Then tell read.table not to use row.names:

systems <- read.table("http://getfile.pl?test.csv", 
                      header=TRUE, sep=",", row.names=NULL)

and now your rows will simply be numbered.

Also look at read.csv, a wrapper for read.table that already sets the sep=',' and header=TRUE arguments, so that your call simplifies to

systems <- read.csv("http://getfile.pl?test.csv", row.names=NULL)
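
As a rough illustration of what row.names=NULL changes, here is a minimal sketch using a small made-up CSV (the contents of csv_text are an assumption, not the asker's file) whose data rows have one more field than the header:

```r
# Hypothetical sample: 3 header fields, but each data row ends with a trailing
# comma and therefore has 4 fields. The first field repeats ("a", "a").
csv_text <- "v1,v2,v3\na,1,2,\na,3,4,\n"

# With the defaults, read.csv uses the first column as row names and fails
# because those values are not unique.
err <- tryCatch({
  read.csv(text = csv_text)
  NULL
}, error = function(e) conditionMessage(e))
print(err)

# row.names = NULL forces plain row numbering instead.
fixed <- read.csv(text = csv_text, row.names = NULL)
print(names(fixed))
print(fixed)
```

In this sketch the first column comes back labelled row.names and the header names sit one position to the right of their data, so the underlying mismatch between header and data rows may still be worth fixing in the file itself.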

#2


26  

See this related post.

Your header row likely has one fewer column than the rest of the file. You can fix this by either

  1. adding a delimiter to the end of your header row in the source file, or
  2. removing any trailing delimiters in your data.

For example, the header has one fewer column:

v1,v2,v3
a,a,a,
b,b,b,

For example, add a trailing delimiter to the header:

v1,v2,v3,
a,a,a,
b,b,b,
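
One way to check for this mismatch before reading the file is to count the fields on each line. The sketch below writes the sample above to a temporary file (the file and variable names are placeholders) and uses count.fields from base R. It then applies the first fix, a trailing delimiter on the header line:

```r
# Write the sample from this answer to a temporary file (placeholder path).
tmp <- tempfile(fileext = ".csv")
writeLines(c("v1,v2,v3", "a,a,a,", "b,b,b,"), tmp)

# Count comma-separated fields per line: the header has one fewer
# field than the data rows.
n_fields <- count.fields(tmp, sep = ",")
print(n_fields)

# Fix 1: add a trailing delimiter to the header line only, then re-read.
lines <- readLines(tmp)
lines[1] <- paste0(lines[1], ",")
fixed <- read.csv(text = lines)
print(fixed)
```

The re-read data frame has a fourth, empty column that comes from the trailing delimiters; it can be dropped afterwards if it is not needed.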

#3


0  

It seems the problem can arise for more than one reason. The following two steps worked when I had the same error:

  1. I saved my file as MS-DOS CSV (earlier it had been saved as plain CSV from Excel Starter 2010), then opened the CSV in Notepad++. The commas were consistent (in the sense @Brian describes above).
  2. I noticed I was not using the argument sep=",". I added it and it worked (even though that is the default for read.csv; for read.table the default separator is whitespace).

#4


0  

I couldn't post this as a comment on the discussion above because I don't have enough points, but I found that discussion hard to read and the solution tough to implement. However, the answer here (https://*.com/a/22408965/2236315) by @adrianoesch should help. For example, it addresses "If you know of a solution that does not require the awkward workaround mentioned in your comment (shift the column names, copy the data), that would be great." and "...requiring that the data be copied", both raised by @Frank.

Note that if you open the file in a text editor, you should see that the number of header fields is less than the number of columns below the header row. In my case, the data set had a "," missing at the end of the last header field.

#5


0  

Another possible reason for this error is that you have entire rows duplicated. If that is the case, the problem is solved by removing the duplicate rows.

#6


0  

You can open the file in Excel and save it again from there. Excel will rewrite the CSV so that it works.

#7


0  

I have a similar problem. Applying the function below to a data frame of factor variables (nominal or ordinal) builds a proportion table for each variable and binds them all together. The result is a table whose first column (the row labels) contains duplicated names.

Can these names not be converted into a factor in order to keep them? How can this happen? That may be the solution to this problem! :)

tblFun <- function(x){
  tbl <- table(x)
  res <- cbind(tbl,round(prop.table(tbl)*100,2))
  colnames(res) <- c('Count','Percentage')
  res
}

do.call(rbind,lapply(df,tblFun))

Example of the resulting table:

Agree           413      77.34
Disagree         27       5.06
Dont know        16       3.00
Agree           505      94.57
Disagree         13       2.43
Dont know         0       0.00
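
One possible way to keep the repeated labels, sketched under the assumption that a plain data frame is the desired result, is to store them in ordinary columns and let the rows be numbered. The survey data and the column names Variable and Level below are made up for illustration:

```r
# Same helper as in this answer: counts and percentages for one variable.
tblFun <- function(x) {
  tbl <- table(x)
  res <- cbind(tbl, round(prop.table(tbl) * 100, 2))
  colnames(res) <- c("Count", "Percentage")
  res
}

# Hypothetical survey data with two factor variables.
df <- data.frame(
  q1 = factor(c("Agree", "Agree", "Disagree")),
  q2 = factor(c("Agree", "Disagree", "Disagree"))
)

tabs <- lapply(df, tblFun)
res <- do.call(rbind, tabs)  # a matrix, where repeated row names are allowed

# Keep the labels as regular columns so the data frame can use plain
# row numbers instead of the non-unique labels.
out <- data.frame(
  Variable   = rep(names(tabs), times = sapply(tabs, nrow)),
  Level      = factor(rownames(res)),
  Count      = unname(res[, "Count"]),
  Percentage = unname(res[, "Percentage"]),
  row.names  = NULL
)
print(out)
```

The Variable column records which question each row came from, which the stacked table alone does not show.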

Sincerely, Elias "Estatistics" Tsolis

#8


0  

Whether you use read.csv or read.table, set row.names = NULL when reading the file. It should work; it worked for me that way.

#9


0  

I had this error when opening a CSV file in which one of the fields had commas embedded in it. The field had quotes around it, and I had copied and pasted a read.table call with quote="" in it. Once I took quote="" out, the default behavior of read.table took over and the problem went away. So I went from this:

systems <- read.table("http://getfile.pl?test.csv", header=TRUE, sep=",", quote="")

to this:

systems <- read.table("http://getfile.pl?test.csv", header=TRUE, sep=",")
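
To see why quote="" can trigger the error, here is a minimal sketch on a made-up file (the contents and the temporary path are assumptions). It counts the fields per line with and without quote handling, then reads the file both ways:

```r
# Hypothetical sample: the quoted "name" field contains a comma, and the
# first column repeats the value "A".
tmp <- tempfile(fileext = ".csv")
writeLines(c('group,name,score',
             'A,"Smith, John",90',
             'A,"Doe, Jane",85'), tmp)

# With quoting disabled, the embedded comma splits the field, so the data
# rows appear to have one more field than the header.
n_noquote <- count.fields(tmp, sep = ",", quote = "")
# With the double quote recognised, every line has the same field count.
n_quoted <- count.fields(tmp, sep = ",", quote = "\"")
print(n_noquote)
print(n_quoted)

# quote = "" makes read.table treat the first column as row names, which
# are duplicated here, so the call fails.
err <- tryCatch({
  read.table(tmp, header = TRUE, sep = ",", quote = "")
  NULL
}, error = function(e) conditionMessage(e))
print(err)

# Leaving quote at its default lets read.table parse the file as intended.
systems <- read.table(tmp, header = TRUE, sep = ",")
print(systems)
```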

#10


0  

In my case there was a comma at the end of every line. Removing it fixed the problem.
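
If editing the file by hand is impractical, the trailing commas can also be stripped in R before parsing. This is only a sketch on made-up lines (raw_lines stands in for something like the output of readLines on the real file), and it assumes the last column is never legitimately empty:

```r
# Hypothetical sample in which every data line ends with a stray comma.
raw_lines <- c("v1,v2,v3", "a,1,2,", "a,3,4,")

# Drop a single trailing comma from each line, then parse the cleaned text.
clean_lines <- sub(",$", "", raw_lines)
systems <- read.csv(text = clean_lines)
print(systems)
```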
