Overlay a normal curve on a histogram in R

Time: 2021-07-03 14:58:26

I have found online how to overlay a normal curve on a histogram in R, but I would like to retain the usual "frequency" y-axis of a histogram. See the two code segments below, and notice how in the second the y-axis is replaced with "density". How can I keep that y-axis as "frequency", as it is in the first plot?

AS A BONUS: I'd like to mark the SD regions (up to 3 SD) on the density curve as well. How can I do this? I tried abline, but the line extends to the top of the graph and looks ugly.

g = d$mydata
hist(g)

g = d$mydata
m <- mean(g)
std <- sqrt(var(g))
hist(g, density = 20, breaks = 20, prob = TRUE,
     xlab = "x-variable", ylim = c(0, 2),
     main = "normal curve over histogram")
curve(dnorm(x, mean = m, sd = std),
      col = "darkblue", lwd = 2, add = TRUE, yaxt = "n")

See how in the image above, the y-axis is "density". I'd like to get that to be "frequency".

2 Answers

#1


39  

Here's a nice easy way I found:

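h <- hist(g, breaks = 10, density = 10,
          col = "lightgray", xlab = "Accuracy", main = "Overall")
# Evaluate the fitted normal density on a grid spanning the data
xfit <- seq(min(g), max(g), length.out = 40)
yfit <- dnorm(xfit, mean = mean(g), sd = sd(g))
# Rescale from density to counts: bin width times number of observations
yfit <- yfit * diff(h$mids[1:2]) * length(g)

lines(xfit, yfit, col = "black", lwd = 2)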
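The scaling factor works because a density histogram draws each bar at height count / (n × bin width), so multiplying a density by n × bin width puts it back on the count scale. A minimal check of that relationship is sketched below; it assumes equal-width bins and uses the built-in `mtcars$mpg` as stand-in data, since the question's `d$mydata` is not available.

```r
# Stand-in data (assumption): the question's d$mydata is not available here.
g <- mtcars$mpg

# Compute the histogram without drawing it.
h <- hist(g, breaks = 10, plot = FALSE)

n <- length(g)
bin_width <- diff(h$breaks)[1]   # assumes all bins have the same width

# The distance between neighbouring bin midpoints is the same bin width.
same_width <- isTRUE(all.equal(diff(h$mids[1:2]), bin_width))

# Density bar heights rescaled by n * bin_width reproduce the counts.
rescaled <- h$density * n * bin_width
same_counts <- isTRUE(all.equal(rescaled, as.numeric(h$counts)))
```

Both `same_width` and `same_counts` should come out `TRUE` whenever `hist` uses equal-width breaks, which is its default behaviour.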

#2


22  

You just need to find the right multiplier, which can be easily calculated from the hist object.

myhist <- hist(mtcars$mpg)
multiplier <- myhist$counts / myhist$density
mydensity <- density(mtcars$mpg)
mydensity$y <- mydensity$y * multiplier[1]

plot(myhist)
lines(mydensity)
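One caveat with `counts / density`: a bin with no observations gives 0/0, which is `NaN` in R, so `multiplier[1]` relies on the first bin being non-empty. A possible alternative, sketched here under the assumption of equal-width bins, is to compute the multiplier directly as the number of observations times the bin width; `ratio` and `ok` are illustrative names, not part of the answer above.

```r
# Compute the histogram without drawing it.
myhist <- hist(mtcars$mpg, plot = FALSE)

# Multiplier computed directly: observations times bin width (equal-width bins assumed).
multiplier <- length(mtcars$mpg) * diff(myhist$breaks)[1]

# Per-bin ratio as in the answer; an empty bin would give NaN here.
ratio <- myhist$counts / myhist$density
ok <- is.finite(ratio)

# Every non-empty bin yields the same value as the direct computation.
agrees <- isTRUE(all.equal(ratio[ok], rep(multiplier, sum(ok))))
```

The direct form is a single number rather than a vector, so no indexing with `[1]` is needed when rescaling the density.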

A more complete version, with a normal density and lines at each standard deviation away from the mean (including the mean):

myhist <- hist(mtcars$mpg)
multiplier <- myhist$counts / myhist$density
mydensity <- density(mtcars$mpg)
mydensity$y <- mydensity$y * multiplier[1]

plot(myhist)
lines(mydensity)

myx <- seq(min(mtcars$mpg), max(mtcars$mpg), length.out= 100)
mymean <- mean(mtcars$mpg)
mysd <- sd(mtcars$mpg)

normal <- dnorm(x = myx, mean = mymean, sd = mysd)
lines(myx, normal * multiplier[1], col = "blue", lwd = 2)

sd_x <- seq(mymean - 3 * mysd, mymean + 3 * mysd, by = mysd)
sd_y <- dnorm(x = sd_x, mean = mymean, sd = mysd) * multiplier[1]

segments(x0 = sd_x, y0= 0, x1 = sd_x, y1 = sd_y, col = "firebrick4", lwd = 2)
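With `mtcars$mpg`, the markers at ±3 SD fall outside the default x-axis range of the histogram and are clipped. The steps above could be wrapped in a small helper that widens the x-axis to fit every marker. This is only a sketch: `hist_with_normal` and its `n_sd` argument are hypothetical names, equal-width bins are assumed, and `xlim` must not also be passed through `...`.

```r
# Hypothetical helper, not part of the answers above.
hist_with_normal <- function(x, n_sd = 3, ...) {
  m <- mean(x)
  s <- sd(x)

  # Widen the x-axis so every SD marker fits inside the plot region.
  xlim <- range(c(x, m - n_sd * s, m + n_sd * s))
  h <- hist(x, xlim = xlim, ...)             # frequency y-axis by default
  scale <- length(x) * diff(h$breaks)[1]     # assumes equal-width bins

  # Normal curve rescaled from density to counts.
  grid <- seq(xlim[1], xlim[2], length.out = 200)
  lines(grid, dnorm(grid, mean = m, sd = s) * scale, col = "blue", lwd = 2)

  # Vertical markers at the mean and at each SD, stopping at the curve.
  sd_x <- m + (-n_sd:n_sd) * s
  sd_y <- dnorm(sd_x, mean = m, sd = s) * scale
  segments(x0 = sd_x, y0 = 0, x1 = sd_x, y1 = sd_y, col = "firebrick4", lwd = 2)

  # Return the computed values invisibly for inspection.
  invisible(list(hist = h, scale = scale, sd_x = sd_x, sd_y = sd_y))
}
```

Calling `hist_with_normal(mtcars$mpg)` draws the frequency histogram with the curve and seven markers; extra arguments such as `breaks` or `main` are passed on to `hist`.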
