Combine and append columns of different lengths by row number in R

Time: 2021-02-26 13:21:35

I'm working with biochemical data from subjects, analysing the results by sex. I have 19 biochemical tests to analyse for each sex, for each of two drugs (haematology and anatomy tests coming later).

For reasons of reproducibility of results and for preventing transcription errors, I am trying to summarise each test into one table. Included in the table output, I need a column for the Dunnett post hoc comparison p-values. Because the Dunnett test compares to the control results, with a control and 3 drug levels I only get 3 p-values. However, I have 4 mean and sd values.

Using ddply to get the mean and sd results (having limited the number of significant figures), I get a dataset that looks like this:

 Sex <- c(rep("F",4), rep("M",4))
 Druglevel <- c(rep(0:3,2))
 Sample <- c(rep(10,8))
 Mean <- c(0.44, 0.50, 0.46, 0.49, 0.48, 0.55, 0.47, 0.57)
 sd <- c(0.07, 0.07, 0.09, 0.12, 0.18, 0.19, 0.13, 0.41)
 Drug1Biochem1 <- data.frame(Sex, Druglevel, Sample, Mean, sd)

I have used glht from the multcomp package to run the Dunnett tests on the aov object from an ordinary ANOVA. I've extracted the p-values from the glht summary and rounded them to three decimal places. The male and female analyses were run as separate ANOVAs, so I have one set of output for each sex. The female results are:

femaleR <- c(0.371, 0.973, 0.490) 

and the male results are:

 maleR <- c(0.862, 0.999, 0.738)
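For context, p-values of this kind could be extracted roughly as follows. This is only a sketch on simulated data, since the raw measurements are not part of this post; the column names value and Druglevel and the object names dat, fit and dunnett are assumptions. The multcomp step is skipped if the package is not installed.

```r
# Simulated stand-in for one sex: 4 dose groups (0 = control) of 10 animals each.
# These numbers are made up for illustration; they are not the study data.
set.seed(1)
dat <- data.frame(
  Druglevel = factor(rep(0:3, each = 10)),
  value     = rnorm(40, mean = 0.5, sd = 0.1)
)

# Ordinary one-way ANOVA
fit <- aov(value ~ Druglevel, data = dat)

if (requireNamespace("multcomp", quietly = TRUE)) {
  # Dunnett contrasts compare every other level with the first level (control)
  dunnett <- multcomp::glht(fit, linfct = multcomp::mcp(Druglevel = "Dunnett"))
  # One adjusted p-value per non-control level, rounded to three decimals
  pvals <- round(as.numeric(summary(dunnett)$test$pvalues), 3)
  print(pvals)
}
```

With a control and three drug levels this yields three p-values, which is why the p-value vector is one element shorter than each sex's block of means.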

How can I append a column for the p-values to my original dataframe (Drug1Biochem1) so that both femaleR and maleR are in that final column, with row 1 and row 5 of that column empty (i.e. no p-values for the control)?

I wish to output the resulting combination to html, which can be inserted into a Word document so no transcription errors occur. I have set a seed value so that the results of the program are reproducible (when I finally stop debugging).
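The HTML step could look something like the sketch below, which uses base R only. It assumes the combined table already exists as a data frame (typed in by hand here), with blank strings marking the control rows. The helper df_to_html and the output file name are hypothetical, and the helper does no HTML escaping, which is fine for these plain values but not for arbitrary text.

```r
# Combined table, typed in for illustration; "" marks the control rows
final <- data.frame(
  Sex       = rep(c("F", "M"), each = 4),
  Druglevel = rep(0:3, 2),
  Sample    = 10,
  Mean      = c(0.44, 0.50, 0.46, 0.49, 0.48, 0.55, 0.47, 0.57),
  sd        = c(0.07, 0.07, 0.09, 0.12, 0.18, 0.19, 0.13, 0.41),
  p_value   = c("", "0.371", "0.973", "0.490", "", "0.862", "0.999", "0.738")
)

# Hypothetical helper: turn a data frame into the lines of a plain HTML table
df_to_html <- function(df) {
  header <- paste0("<th>", names(df), "</th>", collapse = "")
  # format() keeps trailing zeros (0.50); trimws() drops the padding it adds
  cols <- lapply(df, function(col) paste0("<td>", trimws(format(col)), "</td>"))
  body <- do.call(paste0, unname(cols))  # one string of cells per row
  c("<table>", paste0("<tr>", c(header, body), "</tr>"), "</table>")
}

html_lines <- df_to_html(final)
out_file <- file.path(tempdir(), "Drug1Biochem1.html")  # hypothetical path
writeLines(html_lines, out_file)
```

Packages such as xtable (print(xtable(df), type = "html")) or knitr (kable(df, format = "html")) can produce similar output with more formatting options.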

In summary, I would like a data frame (or table, or whatever I can output to html) that has the following format:

 Sex       Druglevel       Sample     Mean     sd     p-value
 F         0               10         0.44     0.07   
 F         1               10         0.50     0.07   0.371
 F         2               10         0.46     0.09   0.973
 F         3               10         0.49     0.12   0.490
 M         0               10         0.48     0.18   
 M         1               10         0.55     0.19   0.862
 M         2               10         0.47     0.13   0.999
 M         3               10         0.57     0.41   0.738

For each test, I wish to reproduce this exact table. There will always be 4 groups per sex, and there will never be a p-value for the control, which will always be summarised in row 1 (F) and row 5 (M).

1 Solution

#1


You could try merge with all=TRUE, which keeps the control rows and fills their p-values with NA:

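# Lookup table of p-values: one row per sex and non-control drug level
# (Druglevel=1:3 is recycled across both sexes)
dN <- data.frame(Sex=rep(c('M', 'F'), each=3), Druglevel=1:3, 
                 pval=c(maleR, femaleR))

# all=TRUE keeps the control rows (Druglevel 0), which have no match in dN
merge(Drug1Biochem1, dN, by=c('Sex', 'Druglevel'), all=TRUE)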
#   Sex Druglevel Sample Mean   sd  pval
#1   F         0     10 0.44 0.07    NA
#2   F         1     10 0.50 0.07 0.371
#3   F         2     10 0.46 0.09 0.973
#4   F         3     10 0.49 0.12 0.490
#5   M         0     10 0.48 0.18    NA
#6   M         1     10 0.55 0.19 0.862
#7   M         2     10 0.47 0.13 0.999
#8   M         3     10 0.57 0.41 0.738
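Since the same table is needed for every test, the merge could be wrapped in a small function. The sketch below is one way to do that and is not part of the original answer: the helper name add_pvalues is hypothetical, and it assumes the control is always Druglevel 0 and that the sexes are coded "F" and "M". It also makes a display copy in which the control rows show a blank cell instead of NA.

```r
# Data from the question
Drug1Biochem1 <- data.frame(
  Sex       = rep(c("F", "M"), each = 4),
  Druglevel = rep(0:3, 2),
  Sample    = 10,
  Mean      = c(0.44, 0.50, 0.46, 0.49, 0.48, 0.55, 0.47, 0.57),
  sd        = c(0.07, 0.07, 0.09, 0.12, 0.18, 0.19, 0.13, 0.41)
)
femaleR <- c(0.371, 0.973, 0.490)
maleR   <- c(0.862, 0.999, 0.738)

# Hypothetical helper: attach Dunnett p-values to a summary table that has
# one control row (Druglevel 0) per sex
add_pvalues <- function(summary_df, female_p, male_p) {
  treated <- sort(setdiff(unique(summary_df$Druglevel), 0))
  # Guard against a p-value vector of the wrong length
  stopifnot(length(female_p) == length(treated),
            length(male_p) == length(treated))
  pvals <- data.frame(
    Sex       = rep(c("F", "M"), each = length(treated)),
    Druglevel = rep(treated, times = 2),
    pval      = c(female_p, male_p)
  )
  # all.x = TRUE keeps the control rows, which get NA in pval
  merge(summary_df, pvals, by = c("Sex", "Druglevel"), all.x = TRUE)
}

result <- add_pvalues(Drug1Biochem1, femaleR, maleR)

# Display copy: blank cells instead of NA for the controls
shown <- result
shown$pval <- ifelse(is.na(shown$pval), "", sprintf("%.3f", shown$pval))
print(shown, row.names = FALSE)
```

Because the rows are matched on Sex and Druglevel rather than on position, this does not depend on the control sitting in rows 1 and 5. If that layout really is guaranteed, Drug1Biochem1$pval <- c(NA, femaleR, NA, maleR) gives the same column in one line.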
