Making a histogram clearer by assigning names to different parts of each bar in R

Time: 2021-10-09 00:44:48

Assume that I have a data frame with two columns and 19 rows (see below). The left column holds the cell line names and the right column holds the expression of the gene ZEB1 in the corresponding cell line.

    CellLines   ZEB1
    600MPE  2.8186
    AU565   2.783
    BT20    2.7817
    BT474   2.6433
    BT483   2.4994
    BT549   3.035
    CAMA1   2.718
    DU4475  2.8005
    HBL100  2.6745
    HCC38   3.2884
    HCC70   2.597
    HCC202  2.8557
    HCC1007 2.7794
    HCC1008 2.4513
    HCC1143 2.8159
    HCC1187 2.6372
    HCC1428 2.7327
    HCC1500 2.7564
    HCC1569 2.8093
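
For anyone who wants to reproduce the plots in this thread, here is one way to rebuild the table above as a data frame. This is only a sketch: the name `Heiser` is borrowed from the code used in the question, and the original object may well contain more rows or columns than the 19 shown here.

```r
# Rebuild the 19-row table from the question as a data frame.
# Assumption: the real `Heiser` object may be larger; only these rows are known.
Heiser <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
CellLines ZEB1
600MPE  2.8186
AU565   2.783
BT20    2.7817
BT474   2.6433
BT483   2.4994
BT549   3.035
CAMA1   2.718
DU4475  2.8005
HBL100  2.6745
HCC38   3.2884
HCC70   2.597
HCC202  2.8557
HCC1007 2.7794
HCC1008 2.4513
HCC1143 2.8159
HCC1187 2.6372
HCC1428 2.7327
HCC1500 2.7564
HCC1569 2.8093
")

# Quick check of the structure: one character column, one numeric column
str(Heiser)
```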

I've drawn a histogram of this data using the simple code below:

hist(Heiser$ZEB1[1:19], breaks=50, col="grey")

It gives me a histogram whose x axis is the gene expression level and whose y axis is the frequency of that expression level among the cell lines. However, I would like to add the cell line names at their corresponding positions on the histogram. How can I do that?

Thanks in advance for your time on answering this :-) Best.

2 answers

#1


2  

One alternative is to use text() to insert labels into the plot:

hist(Heiser$ZEB1[1:19], breaks=50, col="grey")
text(Heiser$ZEB1, 2, labels= Heiser$CellLines, srt=90)

Edit:

Positioning the labels that fall in the same bin one above another:

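Heiser_hist <- hist(Heiser$ZEB1[1:19], breaks=50, col="grey")
# Assign each value to its histogram bin (include.lowest=TRUE matches hist()'s default)
Heiser$cut <- cut(Heiser$ZEB1, breaks=Heiser_hist$breaks, include.lowest=TRUE)
library(dplyr)
# Within each bin, spread the label heights evenly between y=1 and y=2
Heiser <- Heiser %>% group_by(cut) %>% mutate(pos = seq(from=1, to=2, length.out=n()))
with(Heiser, text(ZEB1, pos, labels=CellLines, srt=45, cex=0.9))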

You could try drawing the text without rotation by changing srt, but the overplotting is worse in that case. You could also adjust the x axis to reduce overplotting.
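
A variation on the same idea that needs only base R, offered as a sketch rather than a definitive solution: give every observation its own unit-high slot inside its bar, so that labels in the same bin stack instead of colliding. The names `h`, `bin` and `slot` are my own, and the data frame is rebuilt from the question's table so that the snippet runs by itself.

```r
# Assumption: `Heiser` holds exactly the 19 rows listed in the question.
Heiser <- data.frame(
  CellLines = c("600MPE", "AU565", "BT20", "BT474", "BT483", "BT549", "CAMA1",
                "DU4475", "HBL100", "HCC38", "HCC70", "HCC202", "HCC1007",
                "HCC1008", "HCC1143", "HCC1187", "HCC1428", "HCC1500", "HCC1569"),
  ZEB1 = c(2.8186, 2.783, 2.7817, 2.6433, 2.4994, 3.035, 2.718, 2.8005, 2.6745,
           3.2884, 2.597, 2.8557, 2.7794, 2.4513, 2.8159, 2.6372, 2.7327,
           2.7564, 2.8093),
  stringsAsFactors = FALSE
)

h <- hist(Heiser$ZEB1, breaks = 50, col = "grey")

# Index of the bin each value falls into, using the same breaks as the plot
bin <- as.integer(cut(Heiser$ZEB1, breaks = h$breaks, include.lowest = TRUE))

# Running count within each bin: 1 for the first value, 2 for the second, ...
slot <- ave(Heiser$ZEB1, bin, FUN = seq_along)

# Centre each label on its bar, in the middle of its own unit of bar height
text(h$mids[bin], slot - 0.5, labels = Heiser$CellLines, srt = 90, cex = 0.7)
```

Because each bar's height equals the number of values in its bin, every label lands inside the bar it belongs to. Labels in neighbouring bars may still touch, so `cex` or the size of the plotting device may need adjusting.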

#2


0  

You are going to have a problem with overlapping labels (I'm not sure how you want to handle those), but I think this gives you what you're after, based on your description:

hist(Heiser$ZEB1[1:19], breaks=50, col="grey", xaxt="n")
axis(1, at=Heiser$ZEB1, labels=Heiser$CellLines)

Are you sure you don't want a bar plot instead? In a histogram, one bar does not represent one observation; a histogram is an attempt to estimate the underlying probability density function of a continuous variable.
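
If one bar per cell line is what is wanted, a bar plot along these lines may be closer to the goal. This is a sketch under assumptions: sorting by expression and the margin settings are my own choices, and the data frame is rebuilt from the question's table so that the snippet runs by itself.

```r
# Assumption: `Heiser` holds exactly the 19 rows listed in the question.
Heiser <- data.frame(
  CellLines = c("600MPE", "AU565", "BT20", "BT474", "BT483", "BT549", "CAMA1",
                "DU4475", "HBL100", "HCC38", "HCC70", "HCC202", "HCC1007",
                "HCC1008", "HCC1143", "HCC1187", "HCC1428", "HCC1500", "HCC1569"),
  ZEB1 = c(2.8186, 2.783, 2.7817, 2.6433, 2.4994, 3.035, 2.718, 2.8005, 2.6745,
           3.2884, 2.597, 2.8557, 2.7794, 2.4513, 2.8159, 2.6372, 2.7327,
           2.7564, 2.8093),
  stringsAsFactors = FALSE
)

# Order the cell lines from lowest to highest expression
ord <- order(Heiser$ZEB1)

# Enlarge the bottom margin so the rotated names are not clipped
op <- par(mar = c(7, 4, 2, 1))

# One bar per cell line; las = 2 draws the names perpendicular to the axis
mids <- barplot(Heiser$ZEB1[ord], names.arg = Heiser$CellLines[ord],
                las = 2, cex.names = 0.8, col = "grey",
                ylab = "ZEB1 expression")

# Restore the previous graphical parameters
par(op)
```

Since all values lie between roughly 2.45 and 3.29, the bars will look similar in height; `dotchart()` is another base R option if the differences need to stand out more.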
