Omitting and finding the average in R

Time: 2022-09-08 14:57:54

I am given store IDs and the amount each store earned. What I would like to do is omit all but one store (that is, drop store IDs 333333 and 222222 in this case) and then find the average amount for store 111111.


YEAR       STORE ID       AMOUNT
2012       111111         11
2012       222222         12
2012       111111         4 
2012       222222         4 
2012       111111         45
2012       333333         7

All help is appreciated!


1 solution

#1

While mean(df$AMOUNT[df$STORE.ID == 111111]) will work for your specific example, you might also want to check out the dplyr package, which provides advanced table manipulation and grouping functions.
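As a minimal sketch of the base R approach, the snippet below rebuilds the sample data from the question and averages only the rows for store 111111. The data frame name df and the column name STORE.ID are assumptions (see the list at the end of this answer); the intermediate name store_rows is purely illustrative.

```r
# Rebuild the sample data from the question.
# Assumption: the data frame is called df and the ID column is STORE.ID.
df <- data.frame(
  YEAR     = rep(2012, 6),
  STORE.ID = c(111111, 222222, 111111, 222222, 111111, 333333),
  AMOUNT   = c(11, 12, 4, 4, 45, 7)
)

# Keep only the rows for store 111111, which drops 222222 and 333333.
store_rows <- df[df$STORE.ID == 111111, ]

# Average amount for that store: (11 + 4 + 45) / 3
mean(store_rows$AMOUNT)
# [1] 20
```

Subsetting first and averaging second keeps the two steps the question describes (omit, then average) visible separately.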


For example, to get the mean for all stores at once, you could do the following:


library(dplyr)
summarize(group_by(df, STORE.ID), Average = mean(AMOUNT))

Or the same code using the pipe operator (%>%), as is typical with dplyr:


df %>%
  group_by(STORE.ID) %>%
  summarise(Average = mean(AMOUNT))
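If only one store is of interest, as in the question, the grouped pipeline can probably be narrowed with filter(). This is a sketch under the same assumptions (a data frame called df with a STORE.ID column); the result name store_avg is illustrative only.

```r
library(dplyr)

# Sample data from the question (assumed names: df, STORE.ID, AMOUNT).
df <- data.frame(
  YEAR     = rep(2012, 6),
  STORE.ID = c(111111, 222222, 111111, 222222, 111111, 333333),
  AMOUNT   = c(11, 12, 4, 4, 45, 7)
)

# Drop every store except 111111, then average what is left.
store_avg <- df %>%
  filter(STORE.ID == 111111) %>%
  summarise(Average = mean(AMOUNT))

store_avg
# A one-row data frame whose Average column is 20
```

The grouped version is more convenient when every store's average is needed; the filtered version avoids computing averages that will be thrown away.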

Assumptions:


  1. Your data is in a data frame called df
  2. The STORE ID column has been converted to a valid R name, with a dot in place of the space (STORE.ID)
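To illustrate the column-name assumption: read.csv() runs headers through make.names() by default (check.names = TRUE), which is what turns "STORE ID" into STORE.ID. The sketch below assumes the data arrives as comma-separated text; the variable names csv_text, df and df_raw are hypothetical.

```r
# Assumed input: the question's table as CSV text with a space in the header.
csv_text <- "YEAR,STORE ID,AMOUNT
2012,111111,11
2012,222222,12
2012,111111,4
2012,222222,4
2012,111111,45
2012,333333,7"

# Default behaviour: the space in the header is replaced with a dot.
df <- read.csv(text = csv_text)
names(df)
# "YEAR" "STORE.ID" "AMOUNT"

# With check.names = FALSE the original header is kept,
# so the column must be selected by its quoted name.
df_raw <- read.csv(text = csv_text, check.names = FALSE)
mean(df_raw$AMOUNT[df_raw[, "STORE ID"] == 111111])
# [1] 20
```

Which form applies depends on how the data was loaded, so checking names(df) first is a cheap way to avoid an "undefined columns selected" error.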
