R: split a string vector by a delimiter and rearrange

Date: 2021-04-25 21:41:04

I have a string vector that needs to be split and rearranged into a matrix in a certain way. I know how to do the split and a simple rearrangement, but I'm lost on how to rearrange it the way I want:

library(stringi)

vec <- c("b;a;c", "a;c", "c;b")
# simplify = TRUE returns a character matrix and pads short rows with ""
q <- stri_split_fixed(vec, ";", simplify = TRUE)
View(q)

V1  V2  V3
b   a   c
a   c    
c   b    

Desired output

V1  V2  V3
a   b   c
a       c 
    b   c 

Thank you! EDIT:

The letters above are for simplicity. The real options are (not an exhaustive list): D-Amazon Marketplace, U-Amazon, D-Amazon, U-Jet, etc. They all start with U or D, though.

Order: alphabetical, but grouped by retailer. If that is too complicated, no particular order is OK.

1 solution

#1


This solution generates a boolean matrix with each element of the input vector as a row and each possible option as a column.

possible_options = c('a', 'b', 'c')
result <- sapply(possible_options, function(x) apply(q, 1, function(y) x %in% y))
result
         a     b    c
[1,]  TRUE  TRUE TRUE
[2,]  TRUE FALSE TRUE
[3,] FALSE  TRUE TRUE
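
The boolean matrix can be turned into the letter layout shown in the question by writing each column's name wherever the matrix is TRUE. The following is a minimal sketch, not necessarily what the answer's author had in mind. It assumes base R's strsplit in place of stri_split_fixed so that it runs without stringi, and the names parts and layout are introduced here only for illustration.

```r
# Rebuild the question's input with base R only (strsplit instead of stringi).
vec <- c("b;a;c", "a;c", "c;b")
parts <- strsplit(vec, ";", fixed = TRUE)

possible_options <- c("a", "b", "c")

# Boolean matrix: one row per element of vec, one column per option.
result <- do.call(rbind, lapply(parts, function(p) possible_options %in% p))
colnames(result) <- possible_options

# Write the column's own name where the matrix is TRUE, "" elsewhere.
layout <- ifelse(result, colnames(result)[col(result)], "")

# Relabel the columns V1, V2, ... as in the question's display.
dimnames(layout) <- list(NULL, paste0("V", seq_len(ncol(layout))))
layout
```

The ifelse line only needs a logical matrix with column names, so it should work equally well on the result produced by the sapply call above.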

This solution requires a list of all the options up front. If you don't have one, you can either build a list of every candidate option (for example, all lowercase and uppercase letters) and then drop the columns that never match:

result <- sapply(c(letters, LETTERS), function(x) apply(q, 1, function(y) x %in% y))
# keep only the options that occur at least once (drop = FALSE keeps a matrix)
result <- result[, colSums(result) > 0, drop = FALSE]
result
         a     b    c
[1,]  TRUE  TRUE TRUE
[2,]  TRUE FALSE TRUE
[3,] FALSE  TRUE TRUE

Or extract them from q itself:

# distinct non-empty options, sorted alphabetically
opts <- sort(unique(q[q != ""]))
result <- sapply(opts , function(x) apply(q, 1, function(y) x %in% y))
result
         a     b    c
[1,]  TRUE  TRUE TRUE
[2,]  TRUE FALSE TRUE
[3,] FALSE  TRUE TRUE
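
For the real option names mentioned in the question's edit, the extracted options could be ordered by retailer rather than plainly alphabetically. This is a rough sketch under two assumptions that the question does not confirm: the retailer is whatever follows the leading "U-" or "D-", and "Amazon Marketplace" counts as a separate retailer from "Amazon". The sample vec is hypothetical, built only from the names the question lists, and base R's strsplit stands in for stri_split_fixed.

```r
# Hypothetical sample built from the option names listed in the question.
vec <- c("U-Jet;D-Amazon", "D-Amazon Marketplace;U-Amazon", "U-Amazon;U-Jet")
parts <- strsplit(vec, ";", fixed = TRUE)

# Distinct options across all elements of vec.
opts <- unique(unlist(parts))

# Assumption: the retailer is the text after the leading "U-" or "D-".
retailer <- sub("^[UD]-", "", opts)

# Sort by retailer first, then by the full option name within a retailer.
opts <- opts[order(retailer, opts)]

# Boolean matrix with columns in the retailer-grouped order.
result <- do.call(rbind, lapply(parts, function(p) opts %in% p))
colnames(result) <- opts
result
```

Sorting on the retailer first keeps the U- and D- variants of the same retailer in adjacent columns, which a plain alphabetical sort would not do because it groups on the prefix letter.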
