Statistical methods for clustered competing risk data when the event types are only available in a training dataset.

Wu, Y., Yang, C., & Wang, M. (2026). Statistical methods for clustered competing risk data when the event types are only available in a training dataset. Statistical Methods in Medical Research, 9622802251415022.

Abstract

We develop methods to analyze clustered competing risks data when the event types are only available in a training dataset and are missing in the main study. We propose to estimate the exposure effects through the cause-specific proportional hazards frailty model where random effects are introduced into the model to account for the within-cluster correlation. We propose a weighted penalized partial likelihood method where the weights represent the probabilities of the occurrence of events, and the weights can be obtained by fitting a classification model for the event types on the training dataset. Alternatively, we propose an imputation approach where the missing event types are imputed based on the predictions from the classification model. We derive the analytical variances, and evaluate the finite sample properties of our methods in an extensive simulation study. As an illustrative example, we apply our methods to estimate the associations between tinnitus and metabolic, sensory and metabolic+sensory hearing loss in the Conservation of Hearing Study Audiology Assessment Arm.
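
The weighting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes only two event types, uses a plain logistic regression as the classification model, and evaluates a weighted Cox partial log-likelihood without the frailty (random-effect) penalty term and without tie handling. All function names, variable names, and the simulated data are hypothetical.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25, ridge=1e-6):
    """Logistic regression by Newton-Raphson; returns coefficients, intercept first."""
    Xd = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))
        grad = Xd.T @ (y - p) - ridge * beta
        hess = (Xd * (p * (1 - p))[:, None]).T @ Xd + ridge * np.eye(Xd.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def predict_proba(beta, X):
    """Predicted probability that an event is of type 1."""
    Xd = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-Xd @ beta))

def weighted_cox_loglik(beta, time, event, weight, Z):
    """Weighted partial log-likelihood for one cause (assumption: no frailty, no ties)."""
    eta = Z @ beta
    ll = 0.0
    for i in np.flatnonzero(event):
        at_risk = time >= time[i]  # subjects still under observation at time[i]
        ll += weight[i] * (eta[i] - np.log(np.sum(np.exp(eta[at_risk]))))
    return ll

rng = np.random.default_rng(0)

# Training dataset: event types are observed (1 = cause of interest, 0 = other cause).
n_train, n_main = 200, 300
X_train = rng.normal(size=(n_train, 2))
type_train = rng.binomial(1, 1.0 / (1.0 + np.exp(-(0.5 + X_train[:, 0]))))
clf = fit_logistic(X_train, type_train)

# Main study: event times and indicators are observed, event types are missing.
X_main = rng.normal(size=(n_main, 2))   # predictors of the event type
Z = rng.normal(size=(n_main, 1))        # exposure of interest
time = rng.exponential(size=n_main)
event = rng.binomial(1, 0.7, size=n_main)

# Weights: predicted probability that each main-study event is of type 1.
w = predict_proba(clf, X_main)
ll0 = weighted_cox_loglik(np.array([0.0]), time, event, w, Z)
```

In this sketch the exposure effect would be estimated by maximizing `weighted_cox_loglik` over `beta`. The imputation alternative mentioned in the abstract would, under the same assumptions, replace each weight with an event type drawn from the predicted probabilities before fitting an ordinary cause-specific model.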

Last updated on 04/01/2026