FPBoost: Fully Parametric Gradient Boosting for Survival Analysis
Survival analysis is a statistical framework for modeling time-to-event data. It plays a pivotal role in medicine, reliability engineering, and social science research, where understanding event dynamics even with few data samples is critical. Recent advancements in machine learning, particularly those employing neural networks and decision trees, have introduced sophisticated algorithms for survival modeling. However, many of these methods rely on restrictive assumptions about the underlying event-time distribution, such as proportional hazards, time discretization, or accelerated failure time. In this study, we propose FPBoost, a survival model that combines a weighted sum of fully parametric hazard functions with gradient boosting. Distribution parameters are estimated with decision trees trained by maximizing the full survival likelihood. We show how FPBoost is a universal approximator of hazard functions, offering full event-time modeling flexibility while maintaining interpretability through the use of well-established parametric distributions. We evaluate concordance and calibration of FPBoost across multiple benchmark datasets, showcasing its robustness and versatility as a new tool for survival estimation.
💡 Research Summary
The paper introduces FPBoost, a novel survival‑analysis algorithm that models the hazard function as a weighted sum of fully parametric “heads,” each representing a known probability distribution (e.g., Weibull or Log‑Logistic). Unlike traditional Cox‑type semi‑parametric methods that rely on the proportional‑hazards assumption and partial likelihood, or recent deep‑learning approaches that discretize time or impose accelerated‑failure‑time constraints, FPBoost directly maximizes the full survival likelihood. This allows the model to use all available information from both observed events and censored observations without simplifying approximations.
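To make the "full survival likelihood" concrete, here is a minimal sketch for a single Weibull hazard (function names and the single-distribution simplification are illustrative, not the paper's implementation). Observed events (δ=1) contribute a log-hazard term, and censored rows (δ=0) still contribute through the cumulative hazard, so no data is discarded:

```python
import math

def weibull_hazard(t, eta, k):
    # Weibull hazard: h(t) = (k/eta) * (t/eta)^(k-1)
    return (k / eta) * (t / eta) ** (k - 1)

def weibull_cum_hazard(t, eta, k):
    # Cumulative hazard: H(t) = (t/eta)^k, so log S(t) = -H(t)
    return (t / eta) ** k

def neg_log_likelihood(times, events, eta, k):
    # Full right-censored survival NLL: -sum_i [delta_i * log h(t_i) - H(t_i)].
    # Events (delta=1) add the log-hazard term; censored rows (delta=0)
    # still contribute -H(t_i), i.e. their log survival probability.
    nll = 0.0
    for t, d in zip(times, events):
        nll -= d * math.log(weibull_hazard(t, eta, k)) - weibull_cum_hazard(t, eta, k)
    return nll
```

This is the objective a gradient-boosting loop would decrease step by step; the contrast with Cox's partial likelihood is that censored observations enter directly rather than only through risk sets.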
Each head contributes a hazard term h_j(t | η_j, k_j) parameterized by a scale η and a shape k. Both parameters, as well as the head weight w_j, are predicted from the covariate vector x by regression trees within a gradient‑boosting framework. Non‑negativity of η, k, and w_j is enforced through ReLU or sigmoid/softmax activations, preserving interpretability: a large weight on a Weibull head suggests aging‑related risk, while a dominant Log‑Logistic head indicates early‑failure or infant‑mortality patterns.
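The multi-head combination described above can be sketched as follows. This is a simplified illustration, not the paper's code: both heads are Weibull for brevity, `softplus` stands in for the ReLU positivity constraint, and `raw_params` plays the role of the per-head raw tree outputs:

```python
import math

def softplus(z):
    # Smooth positivity map standing in for the ReLU constraint on eta and k
    return math.log1p(math.exp(z))

def softmax(zs):
    # Normalizes raw head-weight outputs into non-negative weights
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def mixture_hazard(t, raw_params):
    # raw_params: list of per-head raw outputs (eta_raw, k_raw, w_raw),
    # as the boosted trees would produce for one covariate vector x.
    # Returns h(t | x) = sum_j w_j * h_j(t | eta_j, k_j) with Weibull heads.
    weights = softmax([w_raw for (_, _, w_raw) in raw_params])
    hazard = 0.0
    for (eta_raw, k_raw, _), w in zip(raw_params, weights):
        eta, k = softplus(eta_raw), softplus(k_raw)
        hazard += w * (k / eta) * (t / eta) ** (k - 1)
    return hazard
```

Because each head keeps a named parametric form, inspecting the learned weights recovers the interpretability described above: a dominant head identifies which failure regime the model attributes to a given subject.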
The authors prove a universal approximation theorem (Theorem 3.1): any continuous, non‑negative hazard function on a bounded interval can be approximated arbitrarily well by a finite collection of Weibull heads with appropriate weights. The proof leverages the Weierstrass approximation theorem by showing a single Weibull head is equivalent to a monomial of arbitrary degree, establishing FPBoost as a universal hazard approximator given enough heads.
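The central algebraic step of that argument, as far as it can be reconstructed from the summary (a sketch, not the paper's exact proof), is that a Weibull hazard is a monomial in t up to a constant:

```latex
% Weibull hazard with scale \eta and shape k:
h(t \mid \eta, k)
  = \frac{k}{\eta}\left(\frac{t}{\eta}\right)^{k-1}
  = \underbrace{\frac{k}{\eta^{k}}}_{c_{k}}\; t^{\,k-1}.
% A head with shape k = n+1 and weight w therefore contributes
% w\, c_{n+1}\, t^{n}, a scaled monomial of degree n, so a finite
% weighted sum of Weibull heads realizes polynomial hazards on [0, T].
% The summary states the proof then invokes the Weierstrass
% approximation theorem to approximate any continuous non-negative
% hazard on the bounded interval.
```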
Empirically, FPBoost is evaluated on eight publicly available right‑censored, single‑event datasets spanning clinical, industrial, and customer‑churn domains. Performance is measured with the concordance index (C‑index) for discrimination and the integrated Brier score (IBS) for calibration. Across the majority of datasets, FPBoost attains the highest or statistically indistinguishable C‑index and IBS compared with strong baselines such as Cox‑Boost, XGBoost‑Cox, DeepHit, Deep Survival Machines (DSM), Random Survival Forests, and BoXHED. The advantage is especially pronounced in small‑sample, high‑censoring settings where FPBoost’s likelihood‑based training mitigates over‑fitting.
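For readers unfamiliar with the discrimination metric used above, here is a from-scratch sketch of Harrell's C-index for right-censored data (in practice one would use a library routine such as scikit-survival's `concordance_index_censored`; this simplified O(n²) version omits its tie handling for censored times):

```python
def concordance_index(times, events, risk_scores):
    # Among comparable pairs (the earlier time is an observed event),
    # count pairs where the subject with the shorter survival time
    # received the higher predicted risk; ties in risk count as 0.5.
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable
```

A value of 1.0 means perfect ranking, 0.5 is random; the integrated Brier score complements it by measuring how well the predicted survival probabilities themselves are calibrated over time.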
Interpretability is demonstrated by visualizing the learned head weights: datasets with monotonic increasing risk exhibit dominant Weibull contributions, whereas datasets with early‑peak risk show larger Log‑Logistic weights. This decomposition offers domain experts actionable insights into underlying failure mechanisms.
The paper contributes four main items: (1) a detailed description of the FPBoost architecture and its gradient‑boosting training pipeline; (2) a theoretical analysis establishing universal hazard approximation; (3) an extensive experimental comparison showing superior or competitive discrimination and calibration; and (4) an open‑source Python implementation compatible with scikit‑survival, enabling seamless integration into existing pipelines.
Limitations include the need to pre‑specify the number and type of heads, and increased computational cost when many heads or deep trees are employed. Future work is suggested on automatic head selection, Bayesian weight priors, and extensions to multi‑event or time‑varying covariate scenarios.
In summary, FPBoost bridges the gap between interpretable parametric survival models and the flexible, high‑capacity learning of gradient‑boosted trees, delivering a theoretically sound, practically robust tool for modern survival analysis without relying on proportional‑hazards or time‑discretization assumptions.