How do I create a (100%) stacked histogram in R?

Time: 2022-05-30 14:56:52

My dataset:

I have data in the following format (here, imported from a CSV file). You can find an example dataset as CSV here.

PAIR   PREFERENCE
1      5
1      3
1      2
2      4
2      1
2      3

… and so on. In total, there are 19 pairs, and the PREFERENCE ranges from 1 to 5, as discrete values.
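
To make the goal concrete, here is a minimal base R sketch of the per-pair distribution that each 100% column is meant to show. It assumes only the six rows quoted above (the full dataset has 19 pairs), and the names `d`, `counts`, and `shares` are illustrative.

```r
# Only the six rows shown above; the real data would come from the CSV file.
d <- data.frame(
  PAIR       = c(1, 1, 1, 2, 2, 2),
  PREFERENCE = c(5, 3, 2, 4, 1, 3)
)

# Count each PREFERENCE value (1 to 5) within each PAIR.
counts <- table(d$PAIR, factor(d$PREFERENCE, levels = 1:5))

# Convert every row (one row per pair) into proportions that sum to 1.
shares <- prop.table(counts, margin = 1)
print(round(shares, 2))
```

Each row of `shares` corresponds to one column of the desired chart, and each cell to the height of one stacked segment.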


What I'm trying to achieve:

What I need is a stacked histogram, i.e. one 100%-high column per pair, showing the distribution of the PREFERENCE values.

Something similar to the "100% stacked columns" in Excel, or (although not quite the same) a so-called "mosaic plot":

[Image: example of a 100% stacked column chart]


What I tried:

I figured it'd be easiest using ggplot2, but I don't even know where to start. I know I can create a simple bar chart with something like:

ggplot(d, aes(x=factor(PAIR), y=factor(PREFERENCE))) + geom_bar(position="fill")

… that however doesn't get me very far. So I tried this, and it gets me somewhat closer to what I'm trying to achieve, but it still uses the count of PREFERENCE, I suppose? Note the ylab being "count" here, and the values ranging to 19.

qplot(factor(PAIR), data=d, geom="bar", fill=factor(PREFERENCE_FIXED))

Results in:

[Image: resulting stacked bar chart, with the y-axis labeled "count"]

  • So, what do I have to do to get the stacked bars to represent a histogram?
  • Or do they actually do this already?
  • If so, what do I have to change to get the labels right (e.g. have percentages instead of the "count")?

By the way, this is not really related to this question, and only marginally related to this (i.e. probably same idea, but not continuous values, instead grouped into bars).

1 solution

#1


8  

Maybe you want something like this:

ggplot() + 
    geom_bar(data = dat,
             aes(x = factor(PAIR),fill = factor(PREFERENCE)),
             position = "fill")

where I've read your data into dat. This outputs something like this:

[Image: 100% stacked bar chart, one bar per pair, filled by PREFERENCE]

The y label is still "count", but you can change that manually by adding:

+ scale_x_discrete("Pairs") + scale_y_continuous("Votes")
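
If percentages are wanted on the axis rather than fractions, one possible variation is sketched below. It assumes the `scales` package (installed alongside ggplot2) and uses a hypothetical six-row stand-in for `dat`; the axis and legend titles are arbitrary choices, not something from the question.

```r
library(ggplot2)

# Hypothetical stand-in for the CSV: only the six rows quoted in the question.
dat <- data.frame(
  PAIR       = c(1, 1, 1, 2, 2, 2),
  PREFERENCE = c(5, 3, 2, 4, 1, 3)
)

p <- ggplot(dat, aes(x = factor(PAIR), fill = factor(PREFERENCE))) +
  # position = "fill" rescales every bar to a total height of 1
  geom_bar(position = "fill") +
  scale_x_discrete("Pairs") +
  # scales::percent formats 0.25 as "25%"; the axis title is an arbitrary choice
  scale_y_continuous("Share of votes", labels = scales::percent) +
  labs(fill = "Preference")

print(p)
```

The bars themselves are unchanged from the plot above; only the y-axis labels and titles differ.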
