How do you perform a treatment-emergent data transformation in the S programming language?

Time: 2022-04-16 15:02:07

How would you do this in R? (This is a data preparation task.) From the adverse events dataset, derive the treatment-emergent adverse events dataset: for each body system and preferred term, one row per patient such that either:

  1. That adverse event occurred in the post-baseline period but not in the baseline period, or
  2. Even though the event did occur in the baseline period, it occurred post-baseline at a higher severity than observed during baseline.

Variables:

severity = 1, 2, 3 (integer codes for mild, moderate, severe)
patid visit bodysys prefterm
Baseline rows are rows such that visit<=2
Post baseline rows are rows such that visit>2

Here is the data prep in SAS, in 23 lines of code:

data base1_dset ;
 set ae_dset ;
 if visit<=2 ;

proc sort data=base1_dset ;
 by patid bodysys prefterm severity ;

data base2_dset ;
 set base1_dset ;
 by patid bodysys prefterm severity ;
 if last.prefterm ;

data post1_dset ;
 set ae_dset ;
 if visit> 2 ;

proc sort data=post1_dset ;
 by patid bodysys prefterm severity ;

data post2_dset ;
 set post1_dset ;
 by patid bodysys prefterm severity ;
 if last.prefterm ;
 rename severity = severity2 ;

data new_ae_dset ;
 merge base2_dset post2_dset ;
 by patid bodysys prefterm ;
 if severity2>severity or severity=. ;

And here is the data prep in Vilno Data Transformation, in 12 lines of code (for more, see http://fivetimesfaster.blogspot.com):

inlist ae_dset ;
if not ( visit<=2 ) deleterow ;
select severity=max(severity) by patid bodysys prefterm ;
sendoff(base2_dset) patid bodysys prefterm severity ;

inlist ae_dset ;
if not ( visit>2 ) deleterow ;
select severity2=max(severity) by patid bodysys prefterm ;
sendoff(post2_dset) patid bodysys prefterm severity2 ;

inlist base2_dset post2_dset ;
mergeby patid bodysys prefterm ;
if not ( severity2>severity or severity is null ) deleterow ;
sendoff(new_ae_dset) patid bodysys prefterm severity2 ;

How would you do this in R?

Thanks, Robert

PS: the formatting of the code examples is horrendous. Why is * ignoring some of my return/newline characters?

1 solution

#1

This seems to do more or less what you are asking (at least if the variables are numeric). There are probably better ways.

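# Split into baseline (visit <= 2) and post-baseline (visit > 2) rows
smallvisit       <- ae_dset[ ae_dset$visit <= 2, ]
bigvisit         <- ae_dset[ ae_dset$visit >  2, ]
nams             <- c("patid", "bodysys", "prefterm")
# Sort baseline rows by the key columns and split them into one group per key
smallvisitsorted <- smallvisit[ do.call( order, smallvisit[nams] ), ]
smallvisitsplit  <- split( smallvisitsorted, smallvisitsorted[nams], drop=TRUE )
# Keep the last row of each group (not necessarily the highest severity)
last             <- function(a){ tail( a, 1 ) }
smallvisitlast   <- as.data.frame( t( sapply( smallvisitsplit, last ) ) )
# Left-join post-baseline rows to baseline; keep new or more severe events
mergedvisit      <- merge( bigvisit, smallvisitlast, by=nams, all.x=TRUE )
new_ae_dset      <- mergedvisit[ mergedvisit$severity.x > mergedvisit$severity.y |
                                is.na( mergedvisit$severity.y ) , ]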

For example, if ae_dset looks like

   patid bodysys prefterm visit severity
1      5       9        2     1        3
2     22       1        5     5        2
3     11       2        9     3        3
4     11       2        9     2        2
5     22       3        3     3        1
6      3       4        6     1        2
7     22       3        3     2        2
8     22       3        3     4        3
9     11       2        9     1        1
10     3       3        6     5        2
11     4       3        7     7        3

then, using this code, new_ae_dset will look like

  patid bodysys prefterm visit.x severity.x visit.y severity.y
1     3       3        6       5          2      NA         NA
2     4       3        7       7          3      NA         NA
3    11       2        9       3          3       1          1
4    22       1        5       5          2      NA         NA
6    22       3        3       4          3       2          2
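
As one possible "better way", here is a minimal base R sketch that follows the SAS and Vilno logic more closely: it takes the maximum severity per patid/bodysys/prefterm in each period with aggregate(), then merges the two summaries. The names keys, base_max, post_max and merged are my own, and the data frame simply re-types the example table above. Treat it as a sketch under those assumptions rather than a tested replacement.

```r
# Example data re-typed from the table above.
ae_dset <- data.frame(
  patid    = c(5, 22, 11, 11, 22, 3, 22, 22, 11, 3, 4),
  bodysys  = c(9, 1, 2, 2, 3, 4, 3, 3, 2, 3, 3),
  prefterm = c(2, 5, 9, 9, 3, 6, 3, 3, 9, 6, 7),
  visit    = c(1, 5, 3, 2, 3, 1, 2, 4, 1, 5, 7),
  severity = c(3, 2, 3, 2, 1, 2, 2, 3, 1, 2, 3)
)

keys <- c("patid", "bodysys", "prefterm")  # hypothetical helper name

# Highest severity per patient / body system / preferred term in each period.
base_max <- aggregate(severity ~ patid + bodysys + prefterm,
                      data = ae_dset[ae_dset$visit <= 2, ], FUN = max)
post_max <- aggregate(severity ~ patid + bodysys + prefterm,
                      data = ae_dset[ae_dset$visit > 2, ], FUN = max)
names(post_max)[names(post_max) == "severity"] <- "severity2"

# Keep every post-baseline group; baseline severity is NA when the event
# was not seen at baseline.
merged <- merge(post_max, base_max, by = keys, all.x = TRUE)

# Treatment-emergent: new after baseline, or more severe than at baseline.
new_ae_dset <- merged[is.na(merged$severity) |
                      merged$severity2 > merged$severity, ]
print(new_ae_dset)
```

On the example data this also keeps five rows, one per post-baseline group, but the baseline severity for patient 11 comes out as 2 (the baseline maximum) rather than 1.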
