Problem I of takehome

We saw in homework 2 that using offset() in survreg() (also available in coxph()) can produce a confidence interval by appealing to Wilks' theorem (the chi-square limiting distribution of the -2 log likelihood ratio). In the AFT model a similar thing can be done, but it is more tedious. Here is an example.

> library(survival)
> library(rankreg)
> data(myeloma)
> rankaft(x=cbind(myeloma[,3],myeloma[,4]),y=myeloma[,1],delta=myeloma[,2])$beta

We see that the Gehan rank estimator of the slope beta in the AFT model is:

betag -15.01117 1.317596

So beta1 hat = -15.01117 and beta2 hat = 1.317596. The estimated model is

Y = -15.01117*X1 + 1.317596*X2 + e

How do we find a 90% confidence interval for beta1? There is nothing like offset(-12*myeloma[,3]) in rankaft(), so we have to do the following two steps. Suppose we want to fix beta1 at -12.

Step 1.

> rankaft(x=cbind(myeloma[,4]),y=(myeloma[,1]-(-12*myeloma[,3])),delta=myeloma[,2])$beta

We get:

betag 1.353403

Step 2.

> RankRegV(x=cbind(myeloma[,3],myeloma[,4]),y=myeloma[,1],d=myeloma[,2], beta=c(-12,1.353403))

We see (among other things):

$chisquare
          [,1]
[1,] 0.3663783

This means that if we did have an offset() option, the output test statistic would have a chisquare value of 0.3663783 when beta1 is fixed at -12.

We can keep repeating the above two steps, with other values replacing -12, until we get a chisquare value of 2.7 (= 1.645^2).
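The repeated two-step search can also be handed to a root finder. What follows is only a minimal sketch under stated assumptions, not something rankreg provides: the helper names profile_ci() and chisq_beta1() are hypothetical, the first entry of $beta is assumed to be the Gehan estimate (betag), and the search brackets in the example call are guesses. It uses the exact cutoff qchisq(0.90, 1), about 2.706, instead of the rounded 2.7.

```r
# Hypothetical helper: find the two values of b at which chisq_at(b) crosses
# the chi-square(1) cutoff, one on each side of the point estimate.
profile_ci <- function(chisq_at, estimate, lower, upper, level = 0.90) {
  cutoff <- qchisq(level, df = 1)          # about 2.706 for level = 0.90
  f <- function(b) chisq_at(b) - cutoff    # negative near the estimate
  # uniroot() needs f to change sign on each bracket, so lower and upper
  # must be far enough from the estimate that the chi-square exceeds cutoff.
  lo <- uniroot(f, lower = lower, upper = estimate, tol = 1e-6)$root
  hi <- uniroot(f, lower = estimate, upper = upper, tol = 1e-6)$root
  c(lower = lo, upper = hi)
}

# Hypothetical wrapper around steps 1 and 2 for the myeloma data.
# Requires library(rankreg) and data(myeloma) to have been run first.
chisq_beta1 <- function(b1) {
  # Step 1: re-estimate beta2 with beta1 fixed at b1 (the manual "offset").
  fit <- rankaft(x = cbind(myeloma[, 4]),
                 y = myeloma[, 1] - b1 * myeloma[, 3],
                 delta = myeloma[, 2])
  b2 <- as.numeric(fit$beta)[1]  # ASSUMPTION: first entry is betag (Gehan)
  # Step 2: chi-square statistic evaluated at (b1, b2).
  out <- RankRegV(x = cbind(myeloma[, 3], myeloma[, 4]),
                  y = myeloma[, 1], d = myeloma[, 2], beta = c(b1, b2))
  as.numeric(out$chisquare)
}

# Example call (brackets are guesses; widen them if uniroot() reports
# that the end points are not of opposite sign):
# profile_ci(chisq_beta1, estimate = -15.01117, lower = -40, upper = 0)
```

Because a rank-based statistic need not change smoothly with beta1, the roots returned this way should be treated as approximate and checked by running the two steps by hand at the reported end points.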
There should be two values of beta1 at which the chisquare value equals 2.7, and these are the end points of the 90% confidence interval for beta1.

Here are the two steps with offset(-6*myeloma[,3]), which almost gives the chisquare value we need:

> rankaft(x=cbind(myeloma[,4]),y=(myeloma[,1]-(-6*myeloma[,3])),delta=myeloma[,2])$beta
          xnew
betag 1.324358
betal 2.697085

> RankRegV(x=cbind(myeloma[,3],myeloma[,4]),y=myeloma[,1],d=myeloma[,2], beta=c(-6,1.324358))
$VEF
          [,1]       [,2]
[1,] 141.54615  -31.92112
[2,] -31.92112 6071.29246

$chisquare
         [,1]
[1,] 2.612443

$Pval
          [,1]
[1,] 0.2708415

============================================

OK, I assume you got the above procedure. Now you need to work on a different data set, data(stanford2), with the following changes: delete the cases where t5 is missing (NA) and delete the cases where time is smaller than 5.

We are interested in fitting a model where Y = log10(time) and the 2 covariates are age and t5. Find a 90% confidence interval for the age coefficient alone, using the AFT model as illustrated above.

Take home problem 2

Use the cancer data set.

> library(survival)
> data(cancer)
> ?cancer

(a) Fit a Cox model relating the survival time (time) of cancer patients to age, sex, ph.ecog and ph.karno. (Notice the definition of status.) Write some comments on the findings of your statistical analysis. (Additional model fitting and analysis may be needed, depending on your recommendation.)

Estimate the cumulative hazard function, based on the fitted model in (a), for a hypothetical person with age = 50, sex = male, ph.ecog = 1, ph.karno = 70. Plot the curve.
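As a starting point for problem 2, the mechanics of fitting the model and plotting a cumulative hazard for one covariate profile might look like the sketch below. It assumes the coding documented in ?cancer (status: 1 = censored, 2 = dead; sex: 1 = male, 2 = female); the object names fit, newperson and sf are illustrative only, and the comments and any further model checking asked for in (a) are not covered.

```r
library(survival)
data(cancer)

# status == 2 marks a death, so it is used as the event indicator.
fit <- coxph(Surv(time, status == 2) ~ age + sex + ph.ecog + ph.karno,
             data = cancer)
summary(fit)

# Hypothetical person: age 50, male (sex = 1), ph.ecog = 1, ph.karno = 70.
newperson <- data.frame(age = 50, sex = 1, ph.ecog = 1, ph.karno = 70)

# survfit() on a coxph fit gives the predicted curve for that profile;
# fun = "cumhaz" plots it on the cumulative hazard scale.
sf <- survfit(fit, newdata = newperson)
plot(sf, fun = "cumhaz", xlab = "Time", ylab = "Cumulative hazard")
```

Rows with missing ph.ecog or ph.karno are dropped by the default na.action, which is worth mentioning when commenting on the fit.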