Regression analysis of semi-competing risk data and precision medicine
Format
Thesis
Abstract
Right-censored failure time data commonly occur in various fields, including economics, medical studies and public health, and a large literature on their analysis has been established. However, some problems related to their analysis have not yet been investigated. In this dissertation, we discuss three such topics and provide statistical methods for them. The first part of this dissertation focuses on the semi-competing risk data problem, which is often encountered in clinical studies when there are related endpoints. The data structure consists of a terminal event and a non-terminal event, where the terminal event may censor the non-terminal event, but not vice versa. Such a relationship between the two events complicates estimation and has been well studied over the past twenty years. However, most of the existing methods, whether based on copula models or illness-death models, require the specification of the underlying correlation between the non-terminal event and the terminal event. In Chapter 2, we propose an alternative conditional approach, which is more attractive and natural if the non-terminal event is of main interest. In the proposed method, a class of flexible additive and multiplicative models and the additive hazards model are employed to model the non-terminal and terminal events, respectively. For inference, an estimating equation-based procedure is developed and the asymptotic properties of the resulting estimators are established. In addition, a model checking procedure is provided. The numerical results indicate that the proposed methodology works well in practical situations, and it is applied to the real data set that motivated this study. The second and third parts of this dissertation focus on precision medicine.
There are two types of heterogeneity considered in this dissertation: the same treatment can have different effects for different patients in a clinical study, or the effect can be heterogeneous for the same patient across different quantiles of the survival time. The first type of heterogeneity is handled by subgroup analysis, and the second by quantile regression. There is a large literature on subgroup analysis and on quantile regression for censored data, but no existing method considers the case where both types of heterogeneity are present. In Chapter 3, to address such double heterogeneity, we propose a pairwise fusion penalty approach that can identify the subgroup structure and estimate the covariate effects simultaneously. It is similar in spirit to regularized variable selection, but the penalty term is applied to the pairwise differences between the coefficients of subjects. For the implementation of the proposed method, an alternating direction method of multipliers algorithm is developed and the asymptotic properties of the resulting estimators are established. To assess the empirical performance of the proposed methodology, a simulation study is performed, and it indicates that the method works well in practical situations. Finally, the method is applied to the well-known Stanford heart transplant data, and the results suggest the possible existence of a threshold with respect to the diagnostic effect of the T5 mismatch score. In Chapter 4, we focus on censored quantile regression (CQR) and propose a prediction method for CQR with high-dimensional covariates. Instead of variable selection, we adopt the model averaging framework since we are more interested in prediction.
Unlike variable selection methods, model averaging methods do not select a single best model but assign different weights to a group of candidate models, so that prediction accuracy is improved, especially when the noise level is high and affects model selection. We use the jackknife criterion to search for the optimal weights of the submodels. To evaluate the prediction performance of the proposed method, we conduct a simulation study and apply the method to a real data example, comparing its prediction error with that of variable selection methods established for high-dimensional CQR.
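The semi-competing risk data structure described in the abstract, in which the terminal event may censor the non-terminal event but not vice versa, can be illustrated with a minimal sketch. It is not taken from the dissertation: the function name `observe_semicompeting`, the argument names, and the toy numbers are all hypothetical, and it assumes the usual setup with latent event times and an independent censoring time.

```python
import numpy as np


def observe_semicompeting(t1, t2, c):
    """Map latent times to observed semi-competing risk data.

    t1: latent non-terminal event times (illustrative name)
    t2: latent terminal event times (illustrative name)
    c:  censoring times (illustrative name)
    Returns (y1, d1, y2, d2): observed times and event indicators.
    """
    t1, t2, c = (np.asarray(a, dtype=float) for a in (t1, t2, c))
    # The terminal event can only be censored by c.
    y2 = np.minimum(t2, c)
    d2 = (t2 <= c).astype(int)
    # The non-terminal event is censored by whichever comes first:
    # the terminal event or the censoring time.
    y1 = np.minimum(t1, y2)
    d1 = (t1 <= y2).astype(int)
    return y1, d1, y2, d2


# Toy data for three subjects (made-up values).
y1, d1, y2, d2 = observe_semicompeting(
    t1=[1.0, 5.0, 2.0], t2=[3.0, 4.0, 6.0], c=[10.0, 10.0, 1.5]
)
print(y1, d1, y2, d2)
```

In the toy data, the second subject experiences the terminal event before the non-terminal event, so the non-terminal time is censored at the terminal time, while the terminal event itself is still observed.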
Degree
Ph. D.
