I'd like to use R to make a series of boxplots sorted by median value. Suppose I execute:
boxplot(cost ~ type)
This would give me some boxplots where cost is shown on the y-axis and the type category is visible on the x-axis:
    -----     -----
      |         |
     [ ]        |
      |        [ ]
      |         |
    -----     -----
      A         B
However, what I'd like is the boxplots sorted from highest to lowest median value. My suspicion is that I need to change the labels of the type (A or B) to indicate numerically which has the lowest and which the highest median value, but I wonder if there is a more clever way to solve the problem.
3 solutions
#1
44
Check out ?reorder. The example there seems to be what you want, but sorted in the opposite order. I changed count to -count in the first line below to sort in the order you want.
bymedian <- with(InsectSprays, reorder(spray, -count, median))
boxplot(count ~ bymedian, data = InsectSprays,
        xlab = "Type of spray", ylab = "Insect count",
        main = "InsectSprays data", varwidth = TRUE,
        col = "lightgray")
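
For the cost ~ type case in the question, the same idea might look like the sketch below. The data frame df and its values are made up for illustration; only reorder() and boxplot() come from the answer above.

```r
# Hypothetical data standing in for the asker's 'cost' and 'type' columns
df <- data.frame(
  type = factor(rep(c("A", "B", "C"), each = 3)),
  cost = c(4, 5, 6,  9, 10, 11,  0, 1, 2)
)

# reorder() sorts levels by increasing score, so negating cost
# puts the type with the highest median first
df$type <- with(df, reorder(type, -cost, FUN = median))

levels(df$type)  # level order now follows decreasing median cost
boxplot(cost ~ type, data = df, xlab = "Type", ylab = "Cost")
```

Because only the factor levels change, the tick labels stay A, B, C and no numeric relabelling is needed.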
#2
12
Yes, that is the idea:
> set.seed(42) # fix seed
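> # factor() is needed in R >= 4.0.0, where data.frame() no longer converts strings to factors
> DF <- data.frame(type=factor(sample(LETTERS[1:5], 100, replace=TRUE)),
+                  cost=rnorm(100))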
>
> boxplot(cost ~ type, data=DF) # not ordered by median
>
> # compute index of ordered 'cost factor' and reassign
> oind <- order(as.numeric(by(DF$cost, DF$type, median)))
> DF$type <- ordered(DF$type, levels=levels(DF$type)[oind])
>
> boxplot(cost ~ type, data=DF) # now it is ordered by median
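
Note that order() sorts ascending, so the plot above runs from lowest to highest median. A minimal sketch of the highest-to-lowest variant the question asks for, assuming the same simulated DF and swapping in tapply() and sort(decreasing = TRUE) (my substitution, not part of the answer's code):

```r
set.seed(42)  # same simulated data as above
DF <- data.frame(type = factor(sample(LETTERS[1:5], 100, replace = TRUE)),
                 cost = rnorm(100))

# Named vector of per-type medians
med <- tapply(DF$cost, DF$type, median)

# Rebuild the factor with levels sorted from highest to lowest median
DF$type <- factor(DF$type, levels = names(sort(med, decreasing = TRUE)))

boxplot(cost ~ type, data = DF)  # boxes now run from highest to lowest median
```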
#3
0
Beware of missing values: you have to add na.rm = TRUE for it to work. If you don't, the code simply doesn't work. It took me hours to find that out.
bymedian <- with(InsectSprays, reorder(spray, -count, median, na.rm = TRUE))
boxplot(count ~ bymedian, data = InsectSprays,
        xlab = "Type of spray", ylab = "Insect count",
        main = "InsectSprays data", varwidth = TRUE,
        col = "lightgray")
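
To see where the problem comes from, here is a small sketch. The copy named sprays and the injected NA are purely illustrative, since the built-in InsectSprays data has no missing counts.

```r
# Copy of the built-in data with one count set to NA (illustrative only)
sprays <- InsectSprays
sprays$count[1] <- NA  # first row belongs to spray "A"

# Without na.rm, the median for spray "A" comes back as NA
with(sprays, tapply(count, spray, median))

# Extra arguments to reorder() are passed on to FUN, here median()
bymedian <- with(sprays, reorder(spray, -count, median, na.rm = TRUE))
attr(bymedian, "scores")  # per-spray scores used for the ordering, none NA

boxplot(count ~ bymedian, data = sprays)
```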