Sorry if this has been answered before; I'm not even sure how to search for it. I'm happy with any automated solution in R, VBA, or SPSS.
I have a huge set of demographic data like this:
ID <- c(1, 2, 3, 4, 5)
State <- c("FL", "FL", "FL", "FL", "FL")
County <- c("Lake", "Lake", "Lake", "Orange", "Orange")
Household <- c(2, 1, 3, 2, 1)
First.Gender <- c("Male", "Female", "Male", "Female", "Male")
Second.Gender <- c("Male", "-", "Female", "Female", "-")
Third.Gender <- c("-", "-", "Male", "-", "-")
Gender_Example <- data.frame(ID, State, County, Household, First.Gender, Second.Gender, Third.Gender)
and I'd like to find a way to create new rows based on what's in the column (without creating blank rows). Something that looks like this:
ID_i <- c(1, 1, 2, 3, 3, 3, 4, 4, 5) # _i designates my ideal set
State_i <- c("FL", "FL", "FL", "FL", "FL", "FL", "FL", "FL", "FL")
County_i <- c("Lake", "Lake", "Lake", "Lake", "Lake", "Lake", "Orange", "Orange", "Orange")
Household_i <- c(2, 2, 1, 3, 3, 3, 2, 2, 1)
Gender_i <- c("Male", "Male", "Female", "Male", "Female", "Male", "Female", "Female", "Male")
Gender_ideal <- data.frame(ID_i, State_i, County_i, Household_i, Gender_i)
If this has already been asked, then I'd be happy just to have a link. Thank you!
2 solutions
#1
3
R
In R, one of your best choices is melt from "data.table" (which lets you use "patterns" to identify your measure variables). With that, you would do:
library(data.table)
melt(setDT(Gender_Example), measure.vars = patterns("Gender$"))[value != "-"]
Alternatively, there's the "tidyverse" approach.
library(tidyverse)
Gender_Example %>%
gather(variable, value, ends_with("Gender")) %>%
filter(value != "-")
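As a hedged aside: in tidyr 1.0.0 and later, gather is superseded by pivot_longer. The sketch below assumes that version is installed; the output column names "variable" and "value" are chosen only to match the code above, and long_tidy is an illustrative name. pivot_longer writes the reshaped rows one original row at a time, so the result should come out grouped by ID, the same order as the Gender_ideal set in the question.

```r
library(dplyr)
library(tidyr)  # pivot_longer() needs tidyr >= 1.0.0

# Same sample data as in the question
Gender_Example <- data.frame(
  ID            = c(1, 2, 3, 4, 5),
  State         = c("FL", "FL", "FL", "FL", "FL"),
  County        = c("Lake", "Lake", "Lake", "Orange", "Orange"),
  Household     = c(2, 1, 3, 2, 1),
  First.Gender  = c("Male", "Female", "Male", "Female", "Male"),
  Second.Gender = c("Male", "-", "Female", "Female", "-"),
  Third.Gender  = c("-", "-", "Male", "-", "-"),
  stringsAsFactors = FALSE
)

long_tidy <- Gender_Example %>%
  # Stack every column whose name ends in "Gender" into name/value pairs
  pivot_longer(ends_with("Gender"), names_to = "variable", values_to = "value") %>%
  # Drop the "-" placeholders so no blank rows are created
  filter(value != "-")

long_tidy
```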
SPSS
In SPSS, you would want to look at varstocases. There's a pretty good writeup here that should help you get started.
Excel
This might depend on the version of Excel you're using. If you are using Excel 2016, you can use the pivot table wizard on your data, and then double-click the grand total to access the underlying "long" table that would have been used to create the pivot table.
The process is outlined in this video.
Alternatively, you can use the Tableau reshaping tool, as described in this video.
#2
1
This might be of help:
library(reshape2)
Gender_ideal <- melt(Gender_Example, id=c(names(Gender_Example)[1:4]))
# Keep only rows with a real value. Indexing with -which(...) would drop every row if no "-" were present.
Gender_ideal <- Gender_ideal[Gender_ideal$value != "-", ]
Gender_ideal
   ID State County Household      variable  value
1   1    FL   Lake         2  First.Gender   Male
2   2    FL   Lake         1  First.Gender Female
3   3    FL   Lake         3  First.Gender   Male
4   4    FL Orange         2  First.Gender Female
5   5    FL Orange         1  First.Gender   Male
6   1    FL   Lake         2 Second.Gender   Male
8   3    FL   Lake         3 Second.Gender Female
9   4    FL Orange         2 Second.Gender Female
13  3    FL   Lake         3  Third.Gender   Male
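If installing packages is not an option, the same reshape can be sketched in base R. This is only one possible approach, not something taken from the answers above: the names id_cols, gender_cols, and long_base are made up for illustration, and it assumes the first four columns are the identifiers. It repeats the identifier columns once per gender column, stacks the gender columns into a single vector, drops the "-" placeholders, and then sorts by ID so the rows are grouped per person as in the question's ideal set.

```r
# Same sample data as in the question
Gender_Example <- data.frame(
  ID            = c(1, 2, 3, 4, 5),
  State         = c("FL", "FL", "FL", "FL", "FL"),
  County        = c("Lake", "Lake", "Lake", "Orange", "Orange"),
  Household     = c(2, 1, 3, 2, 1),
  First.Gender  = c("Male", "Female", "Male", "Female", "Male"),
  Second.Gender = c("Male", "-", "Female", "Female", "-"),
  Third.Gender  = c("-", "-", "Male", "-", "-"),
  stringsAsFactors = FALSE
)

id_cols     <- c("ID", "State", "County", "Household")             # assumed identifier columns
gender_cols <- grep("Gender$", names(Gender_Example), value = TRUE)
n           <- nrow(Gender_Example)

long_base <- data.frame(
  # Repeat the identifier rows once for each gender column (1..n, 1..n, ...)
  Gender_Example[rep(seq_len(n), times = length(gender_cols)), id_cols],
  # Stack the gender columns end to end, in the same column order
  Gender = unlist(Gender_Example[gender_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)

long_base <- long_base[long_base$Gender != "-", ]   # drop the placeholders
long_base <- long_base[order(long_base$ID), ]       # group rows by ID
rownames(long_base) <- NULL                         # tidy up the row names

long_base
```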