I have a data.frame in R that holds values for pairs of regions. It can be constructed with the following code:
region.1 <- c("SE", "SE", "SW", "S", "SW")
region.2 <- c("SW", "S", "SE", "SE", "SE")
x <- c(1,2,3,4,5)
y <- c(1,3,2,4,1)
df <- data.frame(x,y,region.1,region.2)
I would like to make a plot with a different color for each pair of regions, so I tried:
library(ggplot2)
ggplot(data = df, aes(x = x, y = y)) +
  geom_point(size = 5, aes(color = interaction(region.1, region.2)))
However, the result wasn't what I expected, because permutations of the same pair were treated as separate groups.
As shown in the image, both SW.SE and SE.SW appear as separate groups, for instance.
How can I group the pairs in a clean way so that permutations of the same pair fall into one group?
3 solutions
#1
2
Here are two dplyr-based options. Both involve sorting the values of the two regions for each (x,y) pair. The first uses mutate to paste the sorted values, done with rowwise. The second uses gather to make a single column of regions, groups by pairs of (x,y), and then summarises the regions by pasting them together.
library(tidyverse)
region.1 <- c("SE", "SE", "SW", "S", "SW")
region.2 <- c("SW", "S", "SE", "SE", "SE")
x <- c(1,2,3,4,5)
y <- c(1,3,2,4,1)
df <- data.frame(x,y,region.1,region.2)
df_interact1 <- df %>%
  mutate_if(is.factor, as.character) %>%
  rowwise() %>%
  mutate(interact = sort(c(region.1, region.2)) %>% paste(., collapse = ".")) %>%
  ungroup()
df_interact1
#> # A tibble: 5 x 5
#>       x     y region.1 region.2 interact
#>   <dbl> <dbl> <chr>    <chr>    <chr>
#> 1     1     1 SE       SW       SE.SW
#> 2     2     3 SE       S        S.SE
#> 3     3     2 SW       SE       SE.SW
#> 4     4     4 S        SE       S.SE
#> 5     5     1 SW       SE       SE.SW
ggplot(df_interact1, aes(x = x, y = y, color = interact)) +
  geom_point(size = 5)
df_interact2 <- df %>%
  gather(key = region, value = value, region.1, region.2) %>%
  group_by(x, y) %>%
  arrange(value) %>%
  summarise(interact = paste(min(value), max(value), sep = ".")) %>%
  ungroup()
df_interact2
#> # A tibble: 5 x 3
#>       x     y interact
#>   <dbl> <dbl> <chr>
#> 1     1     1 SE.SW
#> 2     2     3 S.SE
#> 3     3     2 SE.SW
#> 4     4     4 S.SE
#> 5     5     1 SE.SW
ggplot(df_interact2, aes(x = x, y = y, color = interact)) +
  geom_point(size = 5)
Created on 2018-05-22 by the reprex package (v0.2.0).
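In tidyr 1.0.0 and later, gather has been superseded by pivot_longer. The second option could be sketched in that style as follows. This is only a sketch: it assumes dplyr >= 1.0.0 for the .groups argument, and the helper column row_id and the name df_interact3 are illustrative and not part of the original answer. Grouping by row_id rather than by (x, y) alone keeps rows apart even if two rows happen to share the same coordinates.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
df <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(1, 3, 2, 4, 1),
  region.1 = c("SE", "SE", "SW", "S", "SW"),
  region.2 = c("SW", "S", "SE", "SE", "SE"),
  stringsAsFactors = FALSE
)

df_interact3 <- df %>%
  mutate(row_id = row_number()) %>%              # hypothetical helper: one id per original row
  pivot_longer(c(region.1, region.2),
               names_to = "region", values_to = "value") %>%
  group_by(row_id, x, y) %>%
  # sort the two region labels within each row, then paste them together
  summarise(interact = paste(sort(value), collapse = "."), .groups = "drop")

df_interact3
```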
#2
2
Using your example data, you can apply a function over the rows that sorts the two regions and then collapses them into an interaction term, as follows:
df$interaction <- apply(df, 1, function(x){paste(sort(c(x[3],x[4])), collapse = ".")})
ggplot(data = df, aes(x = x, y = y)) +
  geom_point(size = 5, aes(color = interaction))
Resulting in:
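The row-wise apply approach can also be written against column names instead of positions, which may be more robust if the column order changes. A minimal sketch, assuming the same example data; the variable region_cols is an illustrative name, not something from the original answer:

```r
# Same example data as in the question
df <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(1, 3, 2, 4, 1),
  region.1 = c("SE", "SE", "SW", "S", "SW"),
  region.2 = c("SW", "S", "SE", "SE", "SE"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the two columns that hold the region labels
region_cols <- c("region.1", "region.2")

# For each row, sort the two labels and join them with "."
df$interaction <- apply(df[, region_cols], 1, function(r) {
  paste(sort(r), collapse = ".")
})

df$interaction
```

Restricting apply to the two region columns also avoids converting the numeric columns to character, which happens when the whole data frame is passed.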
#3
0
The solutions above work properly. Based on them, I found a faster solution that does not require knowing the column indexes of the data frame. It is based on mutate and paste, and an ifelse is used to choose the order:
library(dplyr)
df <- df %>% mutate(
  region.1 = as.character(region.1),
  region.2 = as.character(region.2),
  interact = ifelse(region.1 < region.2,
                    paste(region.1, region.2, sep = "."),
                    paste(region.2, region.1, sep = ".")))
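The same ordering idea can be expressed without ifelse by using base R's pmin and pmax, which take element-wise minima and maxima and also work on character vectors. This is a sketch of an alternative rather than the answerer's method; the names r1 and r2 are illustrative. The columns are converted with as.character first because pmin and pmax are not meaningful for unordered factors.

```r
# Same example data as in the question
df <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(1, 3, 2, 4, 1),
  region.1 = c("SE", "SE", "SW", "S", "SW"),
  region.2 = c("SW", "S", "SE", "SE", "SE")
)

# Hypothetical helpers: character copies of the two region columns
r1 <- as.character(df$region.1)
r2 <- as.character(df$region.2)

# The alphabetically smaller label always goes first, so SW/SE and SE/SW
# both end up in the same group
df$interact <- paste(pmin(r1, r2), pmax(r1, r2), sep = ".")

unique(df$interact)
```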