How to make the interaction of two columns of a data frame in R without permutations?

Time: 2022-10-03 20:08:54

I have a data.frame in R which has values for pairs of regions. The first columns can be constructed with the code:

region.1 <- c("SE", "SE", "SW", "S", "SW")
region.2 <- c("SW",  "S", "SE", "SE", "SE")
x <- c(1,2,3,4,5)
y <- c(1,3,2,4,1)

df <- data.frame(x,y,region.1,region.2)

I would like to make a plot with a different color for each pair of regions, so I tried:

library(ggplot2)
ggplot(data=df, aes(x=x, y=y))+
    geom_point(size=5,aes(color=interaction(region.1,region.2)))

However, the result wasn't what I was expecting, because permutations of the same pair were treated as different interactions.

[Image: the resulting plot, colored by interaction(region.1, region.2)]

As the image shows, SW.SE and SE.SW appear as separate groups, for instance.
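
The duplication can also be checked without the plot. This is a minimal sketch using the example vectors from above; `pair_levels` is a name introduced here only for illustration:

```r
region.1 <- c("SE", "SE", "SW", "S", "SW")
region.2 <- c("SW", "S", "SE", "SE", "SE")

# interaction() builds each label in column order (region.1 first),
# so (SE, SW) and (SW, SE) end up as two different labels.
pair_levels <- unique(as.character(interaction(region.1, region.2)))
pair_levels
```

The five rows contain only two unordered pairs, yet `pair_levels` holds four distinct labels.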

I would like to ask how I could group the pairs, in an intelligent way, without the permutations.

3 solutions

#1


2  

Here are two dplyr-based options. Both order the two region values within each (x,y) pair so that a pair always gets the same label. The first uses rowwise and mutate to sort the two values of each row and paste them together. The second uses gather to stack the regions into a single column, groups by (x,y), and then summarises each group by pasting its smallest and largest region together.

library(tidyverse)
region.1 <- c("SE", "SE", "SW", "S", "SW")
region.2 <- c("SW",  "S", "SE", "SE", "SE")
x <- c(1,2,3,4,5)
y <- c(1,3,2,4,1)

df <- data.frame(x,y,region.1,region.2)

df_interact1 <- df %>%
  mutate_if(is.factor, as.character) %>%
  rowwise() %>% 
  mutate(interact = sort(c(region.1, region.2)) %>% paste(., collapse = ".")) %>%
  ungroup()


df_interact1
#> # A tibble: 5 x 5
#>       x     y region.1 region.2 interact
#>   <dbl> <dbl> <chr>    <chr>    <chr>   
#> 1     1     1 SE       SW       SE.SW   
#> 2     2     3 SE       S        S.SE    
#> 3     3     2 SW       SE       SE.SW   
#> 4     4     4 S        SE       S.SE    
#> 5     5     1 SW       SE       SE.SW

ggplot(df_interact1, aes(x = x, y = y, color = interact)) +
  geom_point(size = 5)

df_interact2 <- df %>%
  gather(key = region, value = value, region.1, region.2) %>%
  group_by(x, y) %>%
  arrange(value) %>%
  summarise(interact = paste(min(value), max(value), sep = ".")) %>%
  ungroup()

df_interact2
#> # A tibble: 5 x 3
#>       x     y interact
#>   <dbl> <dbl> <chr>   
#> 1     1     1 SE.SW   
#> 2     2     3 S.SE    
#> 3     3     2 SE.SW   
#> 4     4     4 S.SE    
#> 5     5     1 SE.SW

ggplot(df_interact2, aes(x = x, y = y, color = interact)) +
  geom_point(size = 5)

Created on 2018-05-22 by the reprex package (v0.2.0).
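
For larger data frames, a vectorised variant may be worth trying. The following is only a sketch, not part of the reprex above: it assumes the region columns are character vectors (hence `stringsAsFactors = FALSE`) and relies on base R's `pmin()` and `pmax()`, which accept character arguments.

```r
region.1 <- c("SE", "SE", "SW", "S", "SW")
region.2 <- c("SW", "S", "SE", "SE", "SE")
df <- data.frame(x = c(1, 2, 3, 4, 5), y = c(1, 3, 2, 4, 1),
                 region.1, region.2, stringsAsFactors = FALSE)

# pmin()/pmax() compare the two columns element by element, so the
# alphabetically smaller region always comes first in the label.
df$interact <- paste(pmin(df$region.1, df$region.2),
                     pmax(df$region.1, df$region.2),
                     sep = ".")
df$interact
```

This avoids both `rowwise()` and the reshape, at the cost of only handling exactly two columns.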

#2


2  

Using your example data, you can apply a function over the rows that sorts the two regions and collapses them into a single interaction term, as follows:

df$interaction <- apply(df, 1, function(x){paste(sort(c(x[3],x[4])), collapse = ".")})
ggplot(data=df, aes(x=x, y=y))+
  geom_point(size=5,aes(color=interaction))

Resulting in:

[Image: the resulting plot, colored by the combined interaction term]
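
The same idea can be written so that it does not depend on column positions. This is a sketch under the assumption that the region columns are named `region.1` and `region.2` as in the question; `region_cols` and `pair` are names made up for this example.

```r
df <- data.frame(x = c(1, 2, 3, 4, 5), y = c(1, 3, 2, 4, 1),
                 region.1 = c("SE", "SE", "SW", "S", "SW"),
                 region.2 = c("SW", "S", "SE", "SE", "SE"),
                 stringsAsFactors = FALSE)

# Select the region columns by name, then sort and collapse each row.
region_cols <- c("region.1", "region.2")
df$pair <- apply(df[, region_cols], 1,
                 function(r) paste(sort(r), collapse = "."))
df$pair
```

Passing only the region columns to `apply()` also keeps the numeric columns from being converted to text along the way.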

#3


0  

The solutions above work properly. Based on them, I found a faster solution that does not require knowing the column indexes of the data frame, built on mutate and paste. An ifelse is used to choose the order:

library(dplyr)
df<-df %>% mutate (
           region.1=as.character(region.1),
           region.2=as.character(region.2),
           interact = ifelse(region.1<region.2,
                                paste(region.1,region.2,sep="."),
                                paste(region.2,region.1,sep=".")))
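
To confirm that the permutations were merged, one can count the rows per label. The sketch below rebuilds the label in base R with the same ifelse logic so that it runs on its own; `counts` is a name introduced here for illustration.

```r
region.1 <- c("SE", "SE", "SW", "S", "SW")
region.2 <- c("SW", "S", "SE", "SE", "SE")

# Same ordering rule as above: the alphabetically smaller region goes first.
interact <- ifelse(region.1 < region.2,
                   paste(region.1, region.2, sep = "."),
                   paste(region.2, region.1, sep = "."))

# One entry per unordered pair, with the number of rows in each.
counts <- table(interact)
counts
```

With the example data this leaves two groups, S.SE and SE.SW, instead of the four produced by interaction().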
