I have searched for a long time and still can't figure out how to run Principal Component Analysis in R on my CSV file. I keep getting this error:
Error in cov.wt(z) : 'x' must contain finite values only
All I have so far is:
data <- read.csv("2014 07 24 Pct Chg Variables.csv")
pca <- princomp(data3, cor=T)
Error in cov.wt(z) : 'x' must contain finite values only
I have some blank values ("") in my CSV file, and have tried:
data2 <- apply(data, 1, f1)
data3 <- as.numeric(data2)
where f1 is a function that substitutes the mean wherever the value is blank.
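For illustration, mean substitution could be sketched as below. This is only a guess at what f1 does: `impute_mean` and the toy data frame `df` are hypothetical names and made-up values, not the asker's code. It works column by column and keeps the data frame structure, whereas `apply(data, 1, f1)` followed by `as.numeric()` works row by row and flattens the result into a single vector.

```r
# Hypothetical stand-in for f1: replace NA with the mean of the column
impute_mean <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}

# Toy data standing in for the CSV (values are made up)
df <- data.frame(a = c(1, NA, 3), b = c(4, 5, NA))

# Apply per column; df[] <- keeps df a data frame
df[] <- lapply(df, impute_mean)

pca <- princomp(df, cor = TRUE)
```

Blank fields in numeric columns are normally read as NA by read.csv, so a function like this would see NA rather than "".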
3 Answers
#1
12
princomp.default cannot deal with NA values:
USArrests[3,2] <- NA
princomp(USArrests, cor = TRUE)
#Error in cov.wt(z) : 'x' must contain finite values only
You need to handle the NAs:
princomp(na.omit(USArrests), cor = TRUE)
#works
Or use princomp.formula:
princomp(~ ., data = USArrests, cor = TRUE)
#works too (calls na.omit by default)
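Before choosing how to handle them, it can help to see where the non-finite values are. A minimal sketch, using a copy of USArrests named `x` (a name chosen here for illustration) so the built-in dataset is left untouched:

```r
x <- USArrests
x[3, 2] <- NA

# Count non-finite entries (NA, NaN, Inf, -Inf) in each column
bad <- colSums(!is.finite(as.matrix(x)))
bad

# Rows that contain at least one non-finite entry
bad_rows <- which(rowSums(!is.finite(as.matrix(x))) > 0)
```

Here only the second column (Assault) should report a non-zero count.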
#2
5
The first column was a date. Once I tried
pca <- princomp(data[2:21], cor=T)
it worked.
#3
2
Make sure you only send the numeric part of the matrix.
data <- read.csv("file.csv", header = TRUE)  # add sep = "..." if the file is not comma-separated
# Keep only the numeric columns
data <- data[sapply(data, is.numeric)]
# Replace missing values with 0
data[is.na(data)] <- 0
pca_a <- princomp(data, cor = TRUE, scores = TRUE)
# Scree plot
require(graphics)
plot(pca_a)
# Biplot
biplot(pca_a)
# Plot the loadings on the first two components, labelled by variable name
plot(pca_a$loadings[, 1:2], type = "n", main = "Title", sub = "A subtitle")
text(pca_a$loadings[, 1], pca_a$loadings[, 2], labels = rownames(pca_a$loadings))
That should help. This way you can change every NA (or any other unwanted value) to 0. If only a few rows contain strings, you could also remove those rows instead.