Split-apply-combine w/ function or purrr package pmap?

Time: 2020-12-26 21:42:26

This is a big problem for me to solve. If I had enough reputation to award a bounty I would!

I'm looking to balance sales reps' account territories. I have the process broken into steps for a single group, but I don't know how to apply it across every region.

In this example there are 1000 accounts across 4 regions, each region split into 2 Leagues, with various owners of the accounts. Some accounts are unowned. Each account has a random value between 1,000 and 100,000.

Reproducible example:

Account List:

library(dplyr)

set.seed(1)
Accounts <- paste0("Acc", 1:1000)
# Length 4; data.frame() recycles it across the 1000 rows
Region <- c("NorthEast", "SouthEast", "MidWest", "West")
League <- sample(c("Majors", "Minors"), 1000, replace = TRUE)
AccValue <- sample(1000:100000, 1000, replace = TRUE)
Owner <- sample(c("Chad", NA, "Jimmy", "Adrian", NA, NA, "Steph", "Matt", "Jared", "Eric"), 1000, replace = TRUE)
AccDF <- data.frame(Accounts, Region, League, AccValue, Owner)
AccDF$Accounts <- as.character(AccDF$Accounts)
AccDF$Region <- as.character(AccDF$Region)
AccDF$League <- as.character(AccDF$League)
AccDF$Owner <- as.character(AccDF$Owner)

Summary of Ownership in region:

Summary <- AccDF %>%
  group_by(Region, League, Owner) %>%
  summarise(Count = n(),
            TotalValue = sum(AccValue))

Summary by Region, League:

# 7 is the number of distinct reps in the Owner pool
Summary2 <- AccDF %>%
  group_by(Region, League) %>%
  summarise(Count = n(),
            TotalValue = sum(AccValue),
            AccountsPerRep = round(Count / 7, 0),
            ValuePerRep = TotalValue / 7)

That is all of the starting data, and I would like to do the following process to each grouping of the Summary2 table.

West Minors Example:

Total West Minors Accounts: 120

#break out into owned and unowned

WestMinorsOwned <- AccDF %>%
  filter(Region == "West",
         League == "Minors",
         !is.na(Owner))

WestMinorsUnowned <- AccDF %>%
  filter(Region == "West",
         League == "Minors",
         is.na(Owner))

#unassign accounts until threshold is hit

New.WestMinors <- WestMinorsOwned %>% 
  mutate(r = runif(n())) %>% 
  arrange(r) %>% 
  group_by(Owner) %>% 
  mutate(NewOwner = replace(Owner, cumsum(AccValue) > 600000 | row_number() > 14, NA)) %>% 
  ungroup(Owner) %>%
  mutate(Owner = NewOwner) %>%
  select(-r, -NewOwner)

After the Owner has been updated, we bind the originally unowned accounts together with the newly unassigned ones to get the pool of assignable West Minors accounts, sorted from most to least valuable.

AssignableWestMinors <- bind_rows(filter(AccDF, Region == "West" & League == "Minors" & is.na(Owner)), 
                                  filter(New.WestMinors, is.na(Owner))) %>%
  arrange(desc(AccValue))

#check work
OwnerSummary <- New.WestMinors %>%
  filter(!is.na(Owner)) %>%
  group_by(Region, League, Owner) %>%
  summarise(Count = n(), TotalValue = sum(AccValue))

No one has more than 14 accounts or more than 600,000 in value, so we're in a good place to start reassigning the unowned accounts to balance everyone. The following for loop finds the owner in OwnerSummary with the smallest $$ assigned to them, gives them the most valuable remaining account, and then repeats for each account in turn, attempting to balance each owner's share.

#Balance Unassigned

# seq_len() avoids iterating over c(1, 0) when there are no assignable accounts
for (i in seq_len(nrow(AssignableWestMinors))) {
  idx <- which.min(OwnerSummary$TotalValue)
  OwnerSummary$TotalValue[idx] <- OwnerSummary$TotalValue[idx] + AssignableWestMinors$AccValue[i]
  OwnerSummary$Count[idx] <- OwnerSummary$Count[idx] + 1
  AssignableWestMinors$Owner[i] <- as.character(OwnerSummary$Owner[idx])
}
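
To make the greedy step easier to follow, here is a minimal sketch of the same loop on toy data. The two owners, their starting totals, and the five account values are made up for illustration and are not taken from AccDF.

```r
# Toy data (hypothetical): two owners with existing totals, five unassigned accounts
owner_totals <- data.frame(Owner = c("A", "B"),
                           TotalValue = c(100, 40),
                           stringsAsFactors = FALSE)
unassigned <- data.frame(Account = paste0("Acc", 1:5),
                         AccValue = c(50, 30, 20, 10, 5),  # sorted most valuable first
                         Owner = NA_character_,
                         stringsAsFactors = FALSE)

for (i in seq_len(nrow(unassigned))) {
  # Owner with the smallest running total; ties go to the first such owner
  idx <- which.min(owner_totals$TotalValue)
  owner_totals$TotalValue[idx] <- owner_totals$TotalValue[idx] + unassigned$AccValue[i]
  unassigned$Owner[i] <- owner_totals$Owner[idx]
}

unassigned$Owner         # "B" "B" "A" "A" "B"
owner_totals$TotalValue  # 130 125
```

Because the accounts are handed out largest first, the small accounts at the end act as fine adjustments, which is why the totals tend to end up close together.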

Now we just bind together the previously owned, and the newly assigned, and we have our finished balanced West Minors territory.

WestMinors.Final <- bind_rows(filter(New.WestMinors, !is.na(Owner)), AssignableWestMinors)

WM.Summary <- WestMinors.Final %>%
  group_by(Region, League, Owner) %>%
  summarise(Count = n(),
            TotalValue = sum(AccValue))

Everyone has a similar number of accounts, and the total $$ territory is all within reason.

Now I'm trying to do that for each grouping of the original 4 regions and 2 leagues, which means doing this 8 times and then stitching it all together. Each subgroup has a different threshold to aim for, both for $$ value and for number of accounts. How can I break apart the original account base into 8 sections, apply all of this, and then combine it back together?

1 Answer

You can use dplyr::do (see ?dplyr::do) to perform the split-apply-combine operation you want on each Region-League subset. First, wrap your logic in a function so that it can operate on a data frame dta, which represents a subset of the master data frame AccDF.

reAssign <- function(dta) {
  # Shuffle the owned accounts, then unassign any that push an owner past
  # 600,000 in cumulative value or past 14 accounts
  other_acct <- dta %>%
    filter(!is.na(Owner)) %>%
    mutate(r = runif(n())) %>%
    arrange(r) %>%
    group_by(Owner) %>%
    mutate(NewOwner = replace(Owner, cumsum(AccValue) > 600000 | row_number() > 14, NA)) %>%
    ungroup() %>%
    mutate(Owner = NewOwner) %>%
    select(-r, -NewOwner)

  # Pool the newly unassigned and the originally unowned accounts, most valuable first
  assignable_acct <- other_acct %>%
    filter(is.na(Owner)) %>%
    bind_rows( filter(dta, is.na(Owner)) ) %>%
    arrange(desc(AccValue))

  # Account count and total value for each owner after unassigning
  acct_summary <- other_acct %>%
    filter(!is.na(Owner)) %>%
    group_by(Owner) %>%
    summarise(Count = n(), TotalValue = sum(AccValue))

  # I have a feeling there's a much better way of doing this, but oh well...
  # Greedy pass: each account goes to the owner with the smallest running total.
  # seq_len() skips the loop cleanly when there is nothing to assign.
  for (i in seq_len(nrow(assignable_acct))) {
    idx <- which.min(acct_summary$TotalValue)
    acct_summary$TotalValue[idx] <- acct_summary$TotalValue[idx] + assignable_acct$AccValue[i]
    acct_summary$Count[idx] <- acct_summary$Count[idx] + 1
    assignable_acct$Owner[i] <- as.character(acct_summary$Owner[idx])
  }

  # Recombine the retained accounts with the newly assigned ones
  final <- other_acct %>%
    filter(!is.na(Owner)) %>%
    bind_rows(assignable_acct)

  return(final)
}
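
The question mentions that each subgroup has its own thresholds, and its title asks about purrr's pmap, while the function above hardcodes 600000 and 14. As a rough sketch of one way to handle that, the capping step can take its limits as arguments and be applied to the split pieces with purrr::pmap. The helper name unassign_over, the arguments max_value and max_count, and the toy data are all assumptions made for illustration.

```r
library(dplyr)
library(purrr)

# Hypothetical helper: the same capping rule, with the limits passed in
unassign_over <- function(dta, max_value, max_count) {
  dta %>%
    group_by(Owner) %>%
    mutate(NewOwner = replace(Owner,
                              cumsum(AccValue) > max_value | row_number() > max_count,
                              NA)) %>%
    ungroup() %>%
    mutate(Owner = NewOwner) %>%
    select(-NewOwner)
}

# Toy data (made up): one owner per league
toy <- data.frame(League = c("Majors", "Majors", "Majors", "Minors", "Minors"),
                  Owner = c("A", "A", "A", "B", "B"),
                  AccValue = c(60, 50, 10, 30, 80),
                  stringsAsFactors = FALSE)

pieces <- split(toy, toy$League)  # named list: Majors, Minors

# Hypothetical per-group limits, looked up by group name so they line up with pieces
max_values <- c(Majors = 100, Minors = 100)
max_counts <- c(Majors = 2, Minors = 1)

# pmap() walks the three inputs in parallel and passes them positionally
capped <- pmap(list(pieces, max_values[names(pieces)], max_counts[names(pieces)]),
               unassign_over) %>%
  bind_rows()

capped$Owner  # "A" NA NA "B" NA
```

Indexing the limit vectors by names(pieces) guards against the thresholds being listed in a different order than split() returns the groups.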

Then apply it to AccDF after grouping by Region and League.

new_master <- AccDF %>% 
  group_by(Region, League) %>% 
  do( reAssign(.) ) %>% 
  ungroup() 
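
Note that do() is marked as superseded in dplyr 1.0.0 and later. A possible alternative, sketched here under assumptions, is dplyr::group_modify, which calls a function once per group with the group's rows (.x) and a one-row tibble of its keys (.y). The keys make it straightforward to look up per-group limits. The limits table, the cap_group stand-in for reAssign, and the toy data are hypothetical.

```r
library(dplyr)

# Hypothetical per-group limits; in practice they might be derived from Summary2
limits <- data.frame(Region = c("West", "West"),
                     League = c("Majors", "Minors"),
                     MaxCount = c(1L, 2L),
                     stringsAsFactors = FALSE)

# Toy data (made up)
toy <- data.frame(Region = "West",
                  League = c("Majors", "Majors", "Minors", "Minors", "Minors"),
                  Owner = c("A", "A", "B", "B", "B"),
                  AccValue = c(10, 20, 30, 40, 50),
                  stringsAsFactors = FALSE)

# Simplified stand-in for reAssign(): keep at most max_count accounts per owner
cap_group <- function(dta, max_count) {
  dta %>%
    group_by(Owner) %>%
    mutate(NewOwner = replace(Owner, row_number() > max_count, NA)) %>%
    ungroup() %>%
    mutate(Owner = NewOwner) %>%
    select(-NewOwner)
}

result <- toy %>%
  group_by(Region, League) %>%
  group_modify(function(.x, .y) {
    # .y holds this group's Region and League; use it to fetch the group's limit
    lim <- inner_join(.y, limits, by = c("Region", "League"))
    cap_group(.x, max_count = lim$MaxCount)
  }) %>%
  ungroup()

result$Owner  # "A" NA "B" "B" NA
```

group_modify() adds the grouping columns back to the combined result, so the function only has to return the per-group rows.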

Checking to make sure it's done its job...

new_master %>%
  group_by(Region, League, Owner) %>%
  summarise(Count = n(),
            TotalValue = sum(AccValue)) %>%
  as.data.frame()
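
Reading the per-owner table by eye gets tedious with 8 groups. One possible shortcut, sketched on made-up toy data, is to reduce each Region-League group to a single number: the gap between its largest and smallest owner total. The column name Spread is an assumption for illustration.

```r
library(dplyr)

# Toy stand-in for new_master (values are made up)
toy_final <- data.frame(Region = "West",
                        League = c("Majors", "Majors", "Minors", "Minors"),
                        Owner = c("A", "B", "A", "B"),
                        AccValue = c(100, 90, 50, 80),
                        stringsAsFactors = FALSE)

spread <- toy_final %>%
  group_by(Region, League, Owner) %>%
  summarise(TotalValue = sum(AccValue), .groups = "drop") %>%
  group_by(Region, League) %>%
  # Gap between the best- and worst-off owner in each group
  summarise(Spread = max(TotalValue) - min(TotalValue), .groups = "drop")

spread$Spread  # 10 30
```

A small Spread relative to the group's typical owner total suggests the territories in that group are well balanced.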
