
Time: 2022-03-31 13:02:03

I'd like to use R to make a series of boxplots which are sorted by median value. Suppose then I execute:


boxplot(cost ~ type)

This would give me some boxplots where cost is shown on the y-axis and the type category is visible on the x-axis:


-----     -----
  |         |
 [ ]        |
  |        [ ]
  |         |
-----     -----
  A         B

However, what I'd like is the boxplot figures sorted from highest to lowest median value. My suspicion is that what I need to do is change the labels of the type (A or B) to numerically indicate which is the lowest and highest median value, but I wonder if there is a more clever way to solve the problem.


3 solutions



Check out ?reorder. The example there seems to be what you want, but sorted in the opposite order. I changed count to -count in the first line below to sort in the order you want.


  bymedian <- with(InsectSprays, reorder(spray, -count, median))
  boxplot(count ~ bymedian, data = InsectSprays,
          xlab = "Type of spray", ylab = "Insect count",
          main = "InsectSprays data", varwidth = TRUE,
          col = "lightgray")
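
To check that the levels really come out from highest to lowest median, the per-level medians can be listed in the new level order. This is a minimal sketch using only base R and the built-in InsectSprays data; the name med is just an illustrative variable, not something from the answer above.

```r
# Reorder the spray levels by descending median count (same call as above)
bymedian <- with(InsectSprays, reorder(spray, -count, median))

# Median count per spray, listed in the new level order
med <- tapply(InsectSprays$count, bymedian, median)
print(med)

# The "scores" attribute holds what reorder() sorted on (here: median of -count)
print(attr(bymedian, "scores"))
```

If the medians printed by med never increase from left to right, the boxplot will be drawn from highest to lowest median.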



Yes, that is the idea:


> set.seed(42)                     # fix seed
> # factor() is needed in R >= 4.0, where data.frame() no longer converts strings
> DF <- data.frame(type=factor(sample(LETTERS[1:5], 100, replace=TRUE)),
+                  cost=rnorm(100))
> boxplot(cost ~ type, data=DF)    # not ordered by median
> # compute index of ordered 'cost factor' and reassign
> # (ascending; use order(..., decreasing=TRUE) for highest median first)
> oind <- order(as.numeric(by(DF$cost, DF$type, median)))
> DF$type <- ordered(DF$type, levels=levels(DF$type)[oind])
> boxplot(cost ~ type, data=DF)    # now it is ordered by median
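
The same idea can also be written with tapply() and sorted from highest to lowest median, which is what the question asks for. This is only a sketch on the same simulated data; med is an illustrative name, and type is wrapped in factor() explicitly because that is an assumption the rest of the code relies on.

```r
set.seed(42)
DF <- data.frame(type = factor(sample(LETTERS[1:5], 100, replace = TRUE)),
                 cost = rnorm(100))

# Median cost per type, as a named vector
med <- tapply(DF$cost, DF$type, median)

# Rebuild the factor with its levels in order of decreasing median
DF$type <- factor(DF$type, levels = names(sort(med, decreasing = TRUE)))

boxplot(cost ~ type, data = DF)    # highest median on the left
```

Only the level order changes here; the values in DF$type and DF$cost are left as they were.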



Beware of missing values: you have to add na.rm = TRUE for it to work. Without it, the code simply doesn't work. It took me hours to find that out.

  bymedian <- with(InsectSprays, reorder(spray, -count, median, na.rm = TRUE))
  boxplot(count ~ bymedian, data = InsectSprays,
          xlab = "Type of spray", ylab = "Insect count",
          main = "InsectSprays data", varwidth = TRUE,
          col = "lightgray")
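
To see what the missing values do, here is a small sketch that puts a single NA into a copy of InsectSprays and compares the two calls. The data frame d and the names plain and fixed are hypothetical, made up for this illustration. As far as I can tell, reorder() does not raise an error without na.rm = TRUE; the affected group just gets an NA score and is moved to the end, so the plot is silently mis-ordered.

```r
# Hypothetical copy of InsectSprays with one missing count in spray "B"
d <- InsectSprays
d$count[which(d$spray == "B")[1]] <- NA

plain <- with(d, reorder(spray, -count, median))                # NA not handled
fixed <- with(d, reorder(spray, -count, median, na.rm = TRUE))  # NA dropped per group

# Scores that reorder() sorted on: B is NA in the first, a number in the second
print(attr(plain, "scores"))
print(attr(fixed, "scores"))

# Resulting level order for each call
print(levels(plain))
print(levels(fixed))
```

The extra na.rm = TRUE is simply passed through reorder() to median(), so the same pattern applies to any other summary function that accepts na.rm.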


