I have a Cox proportional hazards model, set up in R with the following code, that predicts mortality. Covariates A, B and C (e.g. age, sex, race) are included only to adjust for confounding; the predictor we are really interested in is X, which is a continuous variable.
cox.model <- coxph(Surv(time, dead) ~ A + B + C + X, data = df)
Now, I'm having trouble plotting a Kaplan-Meier curve for this. I've been searching for how to create this figure but haven't had much luck. I'm not sure whether plotting a Kaplan-Meier curve for a Cox model is even possible. Does the Kaplan-Meier estimator adjust for my covariates, or does it not need them?
What I did try is below, but I've been told this isn't right.
plot(survfit(cox.model), xlab = 'Time (years)', ylab = 'Survival Probabilities')
I also tried to plot a figure that shows cumulative hazard of mortality. I don't know if I'm doing it right since I've tried it a few different ways and get different results. Ideally, I would like to plot two lines, one that shows the risk of mortality for the 75th percentile of X and one that shows the 25th percentile of X. How can I do this?
I could list everything else I've tried, but I don't want to confuse anyone!
Many thanks.
2 Answers
#1
6
Here is an example taken from this paper.
url <- "http://socserv.mcmaster.ca/jfox/Books/Companion/data/Rossi.txt"
Rossi <- read.table(url, header=TRUE)
Rossi[1:5, 1:10]
#   week arrest fin age  race wexp         mar paro prio educ
# 1   20      1  no  27 black   no not married  yes    3    3
# 2   17      1  no  18 black   no not married  yes    8    4
# 3   25      1  no  19 other  yes not married  yes   13    3
# 4   52      0 yes  23 black  yes     married  yes    1    5
# 5   52      0  no  19 other  yes not married  yes    3    3
library(survival)  # provides Surv(), coxph() and survfit()
mod.allison <- coxph(Surv(week, arrest) ~
                       fin + age + race + wexp + mar + paro + prio,
                     data=Rossi)
mod.allison
# Call:
# coxph(formula = Surv(week, arrest) ~ fin + age + race + wexp +
#     mar + paro + prio, data = Rossi)
#
#                   coef exp(coef) se(coef)      z      p
# finyes         -0.3794     0.684   0.1914 -1.983 0.0470
# age            -0.0574     0.944   0.0220 -2.611 0.0090
# raceother      -0.3139     0.731   0.3080 -1.019 0.3100
# wexpyes        -0.1498     0.861   0.2122 -0.706 0.4800
# marnot married  0.4337     1.543   0.3819  1.136 0.2600
# paroyes        -0.0849     0.919   0.1958 -0.434 0.6600
# prio            0.0915     1.096   0.0286  3.194 0.0014
#
# Likelihood ratio test=33.3 on 7 df, p=2.36e-05 n= 432, number of events= 114
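Note that the model uses fin, age, race, wexp, mar, paro and prio to predict arrest. As described in this document, survfit() applied to a coxph fit estimates the survival function from the fitted model, by default at the mean values of the covariates, rather than computing an unadjusted Kaplan-Meier curve.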
plot(survfit(mod.allison), ylim=c(0.7, 1), xlab="Weeks",
ylab="Proportion Not Rearrested")
We get a plot (with a 95% confidence interval) for the survival rate. For the cumulative hazard rate you can do
# plot(survfit(mod.allison)$cumhaz)
but this doesn't give confidence intervals. However, no worries! We know that H(t) = -ln(S(t)) and we have confidence intervals for S(t). All we need to do is
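sfit <- survfit(mod.allison)
cumhaz.upper <- -log(sfit$lower)  # lower bound on S(t) gives the upper bound on H(t)
cumhaz.lower <- -log(sfit$upper)  # upper bound on S(t) gives the lower bound on H(t)
cumhaz <- sfit$cumhaz # same as -log(sfit$surv)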
Then just plot these
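# plot against the event times rather than the vector index
plot(sfit$time, cumhaz, type="s", xlab="weeks ahead", ylab="cumulative hazard",
     ylim=range(c(cumhaz.lower, cumhaz.upper)))
lines(sfit$time, cumhaz.lower, type="s", lty=2)
lines(sfit$time, cumhaz.upper, type="s", lty=2)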
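You'll want to use survfit(..., conf.int=0.50) to get bands for 75% and 25% instead of 97.5% and 2.5%. Note that these are confidence limits for the estimated curve, not curves evaluated at the 25th and 75th percentiles of X.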
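To make the difference between a plain Kaplan-Meier curve and the curve that survfit() returns for a Cox fit concrete, here is a minimal sketch. It assumes the `lung` data set shipped with the survival package as a stand-in (it is not the Rossi data used above), and the covariates chosen are only illustrative. It also checks numerically the identity H(t) = -ln(S(t)) used above.

```r
library(survival)

# Assumption: the built-in `lung` data stands in for the real data set.
d <- na.omit(lung[, c("time", "status", "age", "sex", "ph.karno")])

# Kaplan-Meier: no covariates enter the estimate at all.
km <- survfit(Surv(time, status) ~ 1, data = d)

# Cox model, then the model-based survival curve
# (with no newdata, evaluated at the default covariate values).
fit <- coxph(Surv(time, status) ~ age + sex + ph.karno, data = d)
adj <- survfit(fit)

# Check H(t) = -log(S(t)) for the Cox-based curve.
max_diff <- max(abs(adj$cumhaz - (-log(adj$surv))))

# Overlay the two curves.
plot(km, conf.int = FALSE, xlab = "Days", ylab = "Survival probability")
lines(adj, conf.int = FALSE, col = "red")
legend("topright", legend = c("Kaplan-Meier", "Cox model (default covariates)"),
       col = c("black", "red"), lty = 1, bty = "n")
```

The two curves are usually close but not identical: the Kaplan-Meier curve describes the sample as a whole, while the Cox-based curve describes one hypothetical covariate pattern.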
#2
2
The request for estimated survival curves at the 25th and 75th percentiles of X first requires determining those percentiles and specifying values for all the other covariates in a data frame to be used as the newdata argument to survfit:
You can use the data suggested by the other respondent from Fox's website, although on my machine it required building a url object:
url <- url("http://socserv.mcmaster.ca/jfox/Books/Companion/data/Rossi.txt")
Rossi <- read.table(url, header=TRUE)
It's probably not the best example for this question, but it does have a numeric variable for which we can calculate the quartiles:
> summary(Rossi$prio)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  0.000   1.000   2.000   2.984   4.000  18.000
So this would be the model fit and survfit calls:
mod.allison <- coxph(Surv(week, arrest) ~
fin + age + race + prio ,
data=Rossi)
prio.fit <- survfit(mod.allison,
newdata= data.frame(fin="yes", age=30, race="black", prio=c(1,4) ))
plot(prio.fit, col=c("red","blue"))
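The same approach can be sketched for the original question, which asked for cumulative hazard rather than survival. This is an illustration under stated assumptions: it uses the built-in `lung` data from the survival package instead of the asker's data frame, treats `ph.karno` as the continuous predictor X, and holds the other covariates at arbitrary reference values (median age, sex = 1). The quartiles are computed with quantile() instead of being typed in by hand.

```r
library(survival)

# Assumption: `lung` stands in for df, and ph.karno plays the role of X.
d <- na.omit(lung[, c("time", "status", "age", "sex", "ph.karno")])
fit <- coxph(Surv(time, status) ~ age + sex + ph.karno, data = d)

# 25th and 75th percentiles of the predictor of interest.
x_q <- quantile(d$ph.karno, probs = c(0.25, 0.75))

# One row per curve; the other covariates are fixed at reference values.
nd <- data.frame(age = median(d$age), sex = 1, ph.karno = as.numeric(x_q))

# Model-based curves for the two covariate patterns.
sf <- survfit(fit, newdata = nd)

# fun = "cumhaz" plots the cumulative hazard instead of survival.
plot(sf, fun = "cumhaz", conf.int = FALSE, col = c("red", "blue"),
     xlab = "Days", ylab = "Cumulative hazard")
legend("topleft", legend = c("X at 25th percentile", "X at 75th percentile"),
       col = c("red", "blue"), lty = 1, bty = "n")
```

The reference values chosen for the other covariates shift both curves up or down together; under proportional hazards, the ratio between the two cumulative hazards depends only on the difference in X.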