Refining Kaplan-Meier Estimation with the Generalized Pareto Model for Survival Analysis
The Kaplan-Meier estimator is widely recognized as the leading nonparametric method for estimating survival functions from censored data. However, it struggles with tail estimation and cannot extrapolate beyond the largest observed data point, a limitation that is especially restrictive when the largest observation is censored. To address these limitations, we refine the Kaplan-Meier estimator by fitting a generalized Pareto model to the upper tail of the survival function. This approach improves tail estimation and extends survival estimates beyond the observed maximum, whether or not the largest observation is censored. We derive the joint asymptotic behavior of the Kaplan-Meier estimator in the central and tail regions by analyzing exceedances over a high, finite threshold, which leads to more accurate approximations. Furthermore, we establish that confidence intervals from a random weighted bootstrap method are asymptotically correct, and we demonstrate their coverage performance through numerical experiments. We illustrate the estimation and inference advantages of our refined estimator in an application to the National Job Training Partnership Act study.
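The idea of splicing a parametric tail onto the Kaplan-Meier curve can be illustrated with a minimal Python sketch. It is not the estimator analyzed in the paper. It assumes a user-chosen threshold, fits the generalized Pareto parameters to the exceedances by censored maximum likelihood (an assumption made here for illustration), and multiplies the Kaplan-Meier value at the threshold by the fitted tail survival. All function names (`km_survival`, `km_at`, `fit_gpd_censored`, `hybrid_survival`) are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def km_survival(times, events):
    """Kaplan-Meier estimate: distinct event times and S(t) just after each."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    event_times = np.unique(times[events])
    surv = np.empty(len(event_times))
    s = 1.0
    for i, t in enumerate(event_times):
        at_risk = np.sum(times >= t)             # subjects still under observation
        deaths = np.sum((times == t) & events)   # uncensored failures at t
        s *= 1.0 - deaths / at_risk
        surv[i] = s
    return event_times, surv


def km_at(t, event_times, surv):
    """Evaluate the KM step function at t (S = 1 before the first event)."""
    idx = np.searchsorted(event_times, t, side="right") - 1
    return 1.0 if idx < 0 else float(surv[idx])


def fit_gpd_censored(excess, events):
    """Censored MLE of GPD (xi, sigma) for excesses over the threshold.

    Assumption for this sketch: uncensored excesses contribute the log
    density, censored excesses contribute the log survival function.
    """
    excess = np.asarray(excess, dtype=float)
    events = np.asarray(events, dtype=bool)

    def nll(params):
        xi, log_sigma = params
        sigma = np.exp(log_sigma)
        z = 1.0 + xi * excess / sigma
        if np.any(z <= 0):
            return np.inf                        # outside the GPD support
        log_s = -excess / sigma if abs(xi) < 1e-8 else -np.log(z) / xi
        # log f(y) = log S(y) - log(sigma) - log(z)
        return -(np.sum(log_s) - np.sum(events * (log_sigma + np.log(z))))

    start = np.array([0.1, np.log(np.mean(excess))])
    res = minimize(nll, start, method="Nelder-Mead")
    return res.x[0], float(np.exp(res.x[1]))


def hybrid_survival(t, times, events, threshold):
    """KM below the threshold, KM(threshold) times the GPD tail above it."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    event_times, surv = km_survival(times, events)
    if t <= threshold:
        return km_at(t, event_times, surv)
    above = times > threshold
    xi, sigma = fit_gpd_censored(times[above] - threshold, events[above])
    y = t - threshold
    z = 1.0 + xi * y / sigma
    if z <= 0:
        return 0.0                               # beyond a finite fitted endpoint
    tail = np.exp(-y / sigma) if abs(xi) < 1e-8 else z ** (-1.0 / xi)
    return km_at(threshold, event_times, surv) * float(tail)
```

Because the tail factor is a proper survival function of the excess, the spliced curve stays defined beyond the largest observation even when that observation is censored. In practice the threshold choice matters, and the bootstrap confidence intervals described above are not reproduced in this sketch.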