Use a variable to colour a ggpairs plot without plotting that variable

Time: 2021-12-14 03:41:08

I have a dataset from the world bank with some continuous and categorical variables.

> head(nationsCombImputed)
  iso3c iso2c              country year.x life_expect population birth_rate neonat_mortal_rate                     region
1   ABW    AW                Aruba   2014       75.45     103441       10.1                2.4  Latin America & Caribbean
2   AFG    AF          Afghanistan   2014       60.37   31627506       34.2               36.1                 South Asia
3   AGO    AO               Angola   2014       52.27   24227524       45.5               49.6         Sub-Saharan Africa
4   ALB    AL              Albania   2014       77.83    2893654       13.4                6.5      Europe & Central Asia
5   AND    AD              Andorra   2014       70.07      72786       20.9                1.5      Europe & Central Asia
6   ARE    AE United Arab Emirates   2014       77.37    9086139       10.8                3.6 Middle East & North Africa
               income gdp_percap.x  log_pop
1         High income     47008.83 5.014693
2          Low income      1942.48 7.500065
3 Lower middle income      7327.38 7.384309
4 Upper middle income     11307.55 6.461447
5         High income     30482.64 4.862048
6         High income     67239.00 6.958379

I wish to use ggpairs to plot some of the continuous variables (life_expect, birth_rate, neonat_mortal_rate, gdp_percap.x) in a scatter plot but I would like to colour them using the region categorical variable from the data. I have tried a number of different ways but I cannot colour the continuous variables without including the categorical variable.

ggpairs(nationsCombImputed[,c(2,5,7,8,9,11)],
        title="Scatterplot of Variables",
        mapping = ggplot2::aes(color = region),
        labeller = "iso2c")

But I get this error:

Error in stop_if_high_cardinality(data, columns, cardinality_threshold) : Column 'iso2c' has more levels (211) than the threshold (15) allowed. Please remove the column or increase the 'cardinality_threshold' parameter. Increasing the cardinality_threshold may produce long processing times

Ultimately, I would just like a 4x4 scatter plot matrix of the continuous variables, coloured by region, with the data points labelled using the iso2c code in column 2.

Is this possible in ggpairs?

Well, yes, it is possible! As per @Robin Gertenbach's suggestion, I added the columns argument to my code and this worked great; please see below.

ggpairs(nationsCombImputed,
        title="Scatterplot of Variables",
        columns = c(5,7,8,11),
        mapping=ggplot2::aes(colour = region))

I still wish to add data point labels to the scatter plot using the iso2c column, but I am struggling with this. Any pointers would be greatly appreciated.

1 Solution

As mentioned in the comment, you can get ggpairs to colour by a variable without plotting it by passing the numeric indices of the columns you do want to plot, e.g. columns = c(5,7,8,11).

To get a text scatter plot, define a function, e.g. textscatter, and supply it via lower = list(continuous = textscatter) in the ggpairs call. The labels are specified through the label aesthetic in the mapping.

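library(ggplot2)
library(GGally)

# Custom panel: draw each point as a text label instead of a dot
textscatter <- function(data, mapping, ...) {
  ggplot(data, mapping, ...) + geom_text()
}

ggpairs(
  nationsCombImputed,
  title = "Scatterplot of Variables",
  columns = c(5, 7, 8, 11),
  # colour and label are used by the panels but are not plotted as columns
  mapping = ggplot2::aes(colour = region, label = iso2c),
  lower = list(continuous = textscatter)
)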

Of course, you can also put the label aesthetic definition into textscatter.
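A minimal sketch of that variant, in case it helps. The name textscatter_iso is made up for illustration, and it assumes the data frame handed to the panel still contains the iso2c column (it should, since columns only selects what is plotted, as the colouring by region above relies on). The label is added as a layer-level aesthetic on geom_text, which is combined with the plot-level mapping that ggpairs passes in.

```r
library(ggplot2)

# Hypothetical helper (not part of GGally): same idea as textscatter,
# but the label aesthetic is set inside the panel function.
# Assumption: `data` has an iso2c column.
textscatter_iso <- function(data, mapping, ...) {
  ggplot(data = data, mapping = mapping) +
    # aes(label = iso2c) is merged with the x/y/colour mapping from ggpairs
    geom_text(aes(label = iso2c), ...)
}
```

With this version you would pass lower = list(continuous = textscatter_iso) and could drop label = iso2c from the top-level mapping, so only the lower panels carry the label aesthetic.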
