When trying to plot my data I get these error messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
This reproducible code works:
# some dummy data
library(mvtnorm)
dat <- rmvnorm(10000, mean=c(x=4,y=4), sigma=matrix(c(1,0.5,0.5,1), ncol=2))
dat <- as.data.frame(dat)
# 2d density plot
library(MASS)
kdexy <- kde2d(dat$x,dat$y, n=50)
image(kdexy, col=grey(seq(1,0.2,length=10)))
But using my real data doesn't:
kdexy <- kde2d(temp$V1, temp$V2, n=50)
image(kdexy, col=grey(seq(1,0.2,length=10)))
Yet the structure of the two data sets is the same (the dummy data [dat] and the real data [temp]):
> str(dat$x)
num [1:80000] 0.669 0.609 -0.633 0.565 0.559 ...
> str(temp$V1)
num [1:823180] 0 0 0 0 0.0146 ...
> summary(dat$x)
Min. 1st Qu. Median Mean 3rd Qu. Max.
-4.2270 -0.1902 0.4841 0.4900 1.1600 5.3570
> summary(temp$V1)
Min. 1st Qu. Median Mean 3rd Qu. Max.
-1.0000 0.0000 0.0000 -0.0289 0.0000 0.9844
> range(dat$x)
[1] -4.227400 5.357184
> range(temp$V1)
[1] -1.000000 0.984375
> str(temp$V2)
num [1:823180] 1 1 15.5 15.5 18.5 ...
> summary(temp$V2)
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.00 1.00 18.45 23.55 35.96 116.10
> range(temp$V2)
[1] 1.0000 116.0829
They are both stored in data frames, and the only differences I'm aware of are the length and that temp$V1 is bounded at -1 and 1.
The output of kdexy <- kde2d() differs between the two datasets. In the example data, the z component is populated with very small numbers; in the real dataset, every entry is NaN.
1 answer
#1
My data are heavily concentrated around 0 (total n=850,000, n==0 ~ 650,000). Using a subset of data, where x<0 (n=150,000), the plots work.
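A likely explanation, offered as a sketch and not a confirmed diagnosis: when no h is given, kde2d() takes its bandwidth from bandwidth.nrd(), which uses the smaller of the standard deviation and the interquartile range divided by 1.34. The summary of temp$V1 in the question shows the 1st and 3rd quartiles both equal to 0, so that bandwidth would be 0, and dividing by it gives NaN (newer MASS versions, as far as I can tell, stop with an error instead). The vectors v1 and v2 below are made-up stand-ins for temp$V1 and temp$V2, and the sd-only bandwidth h_x is an assumed workaround, not something taken from the original post.

```r
library(MASS)

# Hypothetical stand-in for the real data: 80% of v1 is exactly 0
set.seed(1)
n  <- 10000
v1 <- c(rep(0, 8000), runif(2000, -1, 1))
v2 <- runif(n, 1, 116)

# Default bandwidth rule used by kde2d(); both quartiles of v1 are 0,
# so the IQR term is 0 and the bandwidth collapses to 0
h_default <- bandwidth.nrd(v1)
print(quantile(v1, c(0.25, 0.75)))
print(h_default)

# Assumed workaround: same normal-reference rule, but using only the
# standard deviation, which stays positive here
h_x <- 4 * 1.06 * sd(v1) * length(v1)^(-1/5)
h_y <- bandwidth.nrd(v2)

# Pass the bandwidths explicitly instead of relying on the default
kdexy <- kde2d(v1, v2, h = c(h_x, h_y), n = 50)
image(kdexy, col = grey(seq(1, 0.2, length = 10)))
```

With the real data this would read kde2d(temp$V1, temp$V2, h = c(h_x, h_y), n = 50). The resulting surface will still be dominated by the spike at 0, so subsetting as described above may remain the more readable plot.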