I am making a scatter plot in R with ggplot2. I am comparing the fraction of votes Hillary and Bernie received in the primary against education level. There is a lot of overlap and there are way too many points. I tried using transparency so I could see the overlap, but it still looks bad.
Code:
# Assumes dplyr (for filter) and ggplot2 are loaded, and that
# mydata and infolookup exist in the calling environment.
demanalyze <- function(infocode, n = 1) {
  # Human-readable description of the selected column
  infoname <- filter(infolookup, column_name == infocode)$description
  infocolumn <- as.vector(as.matrix(mydata[infocode]))
  ggplot(mydata) +
    aes(x = infocolumn) +
    ggtitle(infoname) +
    xlab(infoname) +
    ylab("Fraction of votes each candidate received") +
    geom_point(aes(y = sanders_vote_fraction, colour = "Bernie Sanders")) +
    stat_smooth(aes(y = sanders_vote_fraction), method = "lm", formula = y ~ poly(x, n), size = 1, color = "darkblue", se = FALSE) +
    geom_point(aes(y = clinton_vote_fraction, colour = "Hillary Clinton")) +
    stat_smooth(aes(y = clinton_vote_fraction), method = "lm", formula = y ~ poly(x, n), size = 1, color = "darkred", se = FALSE) +
    # Nearly transparent points; the legend keys are forced back to opaque below
    scale_colour_manual("",
      values = c("Bernie Sanders" = alpha("blue", 0.02), "Hillary Clinton" = alpha("red", 0.02))
    ) +
    guides(colour = guide_legend(override.aes = list(alpha = 1)))
}
What could I change to make the overlap areas look less messy?
1 Answer
The standard way to plot a large number of points in two dimensions is to use a 2D density plot.
Here is a reproducible example:
x1 <- rnorm(1000, mean = 10)
x2 <- rnorm(1000, mean = 10)
y1 <- rnorm(1000, mean = 5)
y2 <- rnorm(1000, mean = 7)
mydat <- data.frame(xaxis = c(x1, x2), yaxis = c(y1, y2), lab = rep(c("H", "B"), each = 1000))
head(mydat)
library(ggplot2)
## Dots and density plots (kinda messy, but can play with alpha)
p1 <- ggplot(mydat) +
  geom_point(aes(x = xaxis, y = yaxis, color = lab), alpha = 0.4) +
  stat_density2d(aes(x = xaxis, y = yaxis, color = lab))
p1
## Just density
p2 <- ggplot(mydat) +
  stat_density2d(aes(x = xaxis, y = yaxis, color = lab))
p2
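If the overlapping contour lines are still hard to read, one possible variation is to fill the contours and give each group its own panel. This is only a sketch on simulated data of the same shape as above; the names `dat` and `p3` are made up for illustration, and `after_stat()` assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

set.seed(42)  # fixed seed so the simulated data is repeatable

# Simulated stand-in data: two groups sharing x, shifted in y
dat <- data.frame(
  xaxis = rnorm(2000, mean = 10),
  yaxis = c(rnorm(1000, mean = 5), rnorm(1000, mean = 7)),
  lab   = rep(c("H", "B"), each = 1000)
)

# Filled density contours, one panel per group, so the groups never overlap
p3 <- ggplot(dat, aes(x = xaxis, y = yaxis)) +
  stat_density_2d(aes(fill = after_stat(level)), geom = "polygon", alpha = 0.5) +
  facet_wrap(~ lab)
p3
```

Faceting trades direct overlay for readability: the two groups share axes by default, so their positions can still be compared across panels.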
There are many parameters to play with, so see the ggplot2 documentation for stat_density2d for the full details on this plot type.
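The example above uses long-format data with one grouping column, while the data in the question has one column per candidate. A rough sketch of bridging the two might look like this. The data frame `wide`, its `education` column, and the simulated values are all hypothetical stand-ins for `mydata`; only the two vote-fraction column names come from the question.

```r
library(ggplot2)

set.seed(1)
n_rows <- 500

# Hypothetical stand-in for mydata: an invented education measure plus
# the two vote-fraction columns named in the question
wide <- data.frame(
  education = runif(n_rows, 5, 60),
  sanders_vote_fraction = runif(n_rows, 0.2, 0.8)
)
wide$clinton_vote_fraction <- 1 - wide$sanders_vote_fraction

# Stack the two candidate columns into long format with a label column
long <- rbind(
  data.frame(education = wide$education,
             vote_fraction = wide$sanders_vote_fraction,
             candidate = "Bernie Sanders"),
  data.frame(education = wide$education,
             vote_fraction = wide$clinton_vote_fraction,
             candidate = "Hillary Clinton")
)

# Density contours replace the individual points; trend lines are kept
p <- ggplot(long, aes(x = education, y = vote_fraction, colour = candidate)) +
  stat_density_2d() +
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  scale_colour_manual("", values = c("Bernie Sanders" = "blue",
                                     "Hillary Clinton" = "red"))
p
```

With the data in long format, a single set of aesthetics drives both layers, so the per-candidate `geom_point` and `stat_smooth` calls no longer need to be written twice.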