How to separate the two leftmost bins of a histogram in R

Suppose I need to plot a dataset like the one below:

set.seed(1)
dataset <- sample(1:7, 1000, replace=T)
hist(dataset)

As you can see in the plot below, the two leftmost bins do not have any space between them unlike the rest of the bins.

[Figure: histogram of dataset produced by hist(dataset); the two leftmost bars touch]

I tried changing xlim, but it didn't work. Basically, I would like each number (1 to 7) to be represented as its own bin, and I would also like any two adjacent bins to have space between them. Thanks!

2 answers

#1

The best way is to set the breaks argument manually. Using the data from your code,

hist(dataset,breaks=rep(1:7,each=2)+c(-.4,.4))

gives the following plot:

[Figure: histogram of dataset with seven separated bars centered on 1 to 7]

The first part, rep(1:7,each=2), sets the numbers you want the bars centered around. The second part controls how wide the bars are: if you change it to c(-.49,.49) they'll almost touch, and if you change it to c(-.3,.3) you get narrower bars. If you set it to c(-.5,.5), R yells at you because you aren't allowed to have the same number in your breaks vector twice.
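To make the roles of the two parts concrete, here is a minimal sketch. The helper name make_breaks is hypothetical, introduced only for illustration; it is not part of base R or of the original answer.

```r
set.seed(1)
dataset <- sample(1:7, 1000, replace = TRUE)

# Hypothetical helper (not in base R): build break points so that each bar
# is centered on a value in `centers` and is 2 * half_width wide.
make_breaks <- function(centers, half_width) {
  rep(centers, each = 2) + c(-half_width, half_width)
}

wide   <- make_breaks(1:7, 0.49)  # bars almost touch
narrow <- make_breaks(1:7, 0.30)  # narrower bars, wider gaps

# Odd-numbered differences are bar widths, even-numbered ones are gaps.
diff(wide)[1:2]    # bar width 0.98, gap 0.02
diff(narrow)[1:2]  # bar width 0.60, gap 0.40

hist(dataset, breaks = narrow)
```

Wrapping the expression in a function only names the two knobs (centers and half-width); the breaks it returns are the same as those from the one-liner in the answer.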

Why does this work?

If you split up the breaks vector, you get one part that looks like this:

> rep(1:7,each=2)
 [1] 1 1 2 2 3 3 4 4 5 5 6 6 7 7

and a second part that looks like this:

> c(-.4,.4)
 [1] -0.4  0.4

When you add them together, R loops through the second vector as many times as needed to make it as long as the first vector. So you end up with

  1-0.4  1+0.4  2-0.4  2+0.4  3-0.4  3+0.4 [etc.]
=   0.6    1.4    1.6    2.4    2.6    3.4 [etc.]
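The recycling step can be checked directly in R. The variable names below (centers, offsets, explicit) are illustrative only:

```r
centers <- rep(1:7, each = 2)  # 1 1 2 2 ... 7 7 (length 14)
offsets <- c(-0.4, 0.4)        # length 2, recycled seven times
breaks  <- centers + offsets

print(breaks)
# 0.6 1.4 1.6 2.4 2.6 3.4 3.6 4.4 4.6 5.4 5.6 6.4 6.6 7.4

# The same result with the recycling written out explicitly
explicit <- centers + rep(offsets, times = 7)
stopifnot(isTRUE(all.equal(breaks, explicit)))
```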

Thus, you have one bar from 0.6 to 1.4 (centered around 1, with width 2*.4), another bar from 1.6 to 2.4 (centered around 2, with width 2*.4), and so on. If you had data in between (e.g. 2.5), the histogram would look kind of silly: it would create a bar from 2.4 to 2.6, and the bar widths would not be even, since that bar would be only .2 wide while all the others are .8. But with only integer values that's not a problem.
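For comparison, the bins can be inspected without drawing anything by passing plot = FALSE. This is a sketch assuming hist() is called with its default settings on the question's data; h_default and h_custom are illustrative names.

```r
set.seed(1)
dataset <- sample(1:7, 1000, replace = TRUE)

# Default call: look at the bins hist() chooses on its own
h_default <- hist(dataset, plot = FALSE)
h_default$breaks       # for this data: 1.0, 1.5, 2.0, ..., 7.0
h_default$counts[1:3]  # bins [1,1.5], (1.5,2], (2,2.5]: the third is empty

# Custom breaks from the answer
h_custom <- hist(dataset, breaks = rep(1:7, each = 2) + c(-0.4, 0.4),
                 plot = FALSE)
h_custom$counts[c(2, 4, 6)]  # the 0.2-wide gap bins hold no data
h_custom$equidist            # FALSE: the bins are not all the same width
```

With the default breaks, the value 1 lands in the first bin and 2 in the second, so those two bars sit side by side, while every later integer is followed by an empty half-unit bin. Because the custom breaks are not equidistant, hist() plots densities rather than counts by default.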

#2

You need six bars NOT seven bars; that is what your histogram has space for. But then you end up generating seven bars. That is the bug.

Do sample(1:6, 1000, replace=T) instead of sample(1:7, 1000, replace=T).

If you do need seven bars, then seed with 0.
