Generalizing the Finkelstein-Schoenfeld Test to Incorporate Multiple Alternating Thresholds

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

Composite endpoints consisting of both terminal and non-terminal events, such as death and hospitalization, are frequently used in cardiovascular clinical trials. The Finkelstein-Schoenfeld (FS) test provides a way to employ a hierarchical structure to combine fatal and non-fatal events by giving death information an absolute priority, which may limit the contribution of clinically meaningful non-fatal events. To provide a more flexible alternative, we propose the Finkelstein-Schoenfeld with Multiple Thresholds (FS-MT) test, which extends the standard FS test by incorporating multiple thresholds applied sequentially and alternating across endpoints. A weighted adaptive approach is also developed to help determine the thresholds in FS-MT. The proposed approach retains the statistical properties of the FS test while allowing more flexible use of information from lower-priority events. We evaluate the operating characteristics of the proposed test through simulations that vary the follow-up time, the correlation between events, and the treatment effect sizes. A case study based on the Digitalis Investigation Group clinical trial data is presented to further illustrate our proposed method. An R package "FSMT" that implements the proposed methodology has been developed.

💡 Research Summary

The paper addresses a well‑known limitation of the traditional Finkelstein‑Schoenfeld (FS) test: its strict hierarchical ordering gives absolute priority to the most severe event (typically death), which can suppress the contribution of clinically important non‑fatal events such as hospitalizations. To overcome this, the authors introduce the Finkelstein‑Schoenfeld with Multiple Thresholds (FS‑MT) test. In FS‑MT each component endpoint is assigned not one but at least two pre‑specified thresholds (e.g., d ≥ 0 for survival time and t₁ ≥ 0 for time‑to‑hospitalization). Pairwise comparisons are performed sequentially, starting with the larger thresholds; only when a comparison is tied at a given level does the procedure move to the next, lower‑threshold level. This “alternating thresholds” scheme relaxes the rigid priority while preserving the pairwise‑comparison foundation of the original FS test.
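The alternating scheme can be sketched in a few lines of Python. This is an illustrative sketch only, not the FSMT package's code: the `Subject` record, the function names, the strict `>` rule for declaring a win, and the censoring convention (a win is declared only when the loser's event was observed and the winner's observed time exceeds it by more than the threshold) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Subject:
    # Hypothetical record layout, chosen for this sketch only.
    death_time: float   # observed time to death or censoring
    death_event: bool   # True if death was observed
    hosp_time: float    # observed time to first hospitalization or censoring
    hosp_event: bool    # True if hospitalization was observed


def compare_time(y_i: float, e_i: bool, y_j: float, e_j: bool,
                 threshold: float) -> int:
    """Return +1 if i demonstrably outlasts j by more than `threshold`,
    -1 for the reverse, and 0 if the pair is tied or indeterminate."""
    if e_j and y_i - y_j > threshold:
        return 1
    if e_i and y_j - y_i > threshold:
        return -1
    return 0


def fsmt_pair_score(i: Subject, j: Subject,
                    death_thresholds: Sequence[float],
                    hosp_thresholds: Sequence[float]) -> int:
    """Alternate between endpoints, largest thresholds first (assumed order:
    death, hospitalization, death, hospitalization, ...)."""
    for d, t in zip(death_thresholds, hosp_thresholds):
        score = compare_time(i.death_time, i.death_event,
                             j.death_time, j.death_event, d)
        if score != 0:
            return score
        score = compare_time(i.hosp_time, i.hosp_event,
                             j.hosp_time, j.hosp_event, t)
        if score != 0:
            return score
    return 0  # tied at every level


# Subject a: alive at day 400 (censored), hospitalized on day 100.
# Subject b: died on day 390, never hospitalized.
a = Subject(400.0, False, 100.0, True)
b = Subject(390.0, True, 390.0, False)

standard_fs = fsmt_pair_score(a, b, (0.0,), (0.0,))        # 1
fs_mt = fsmt_pair_score(a, b, (30.0, 0.0), (15.0, 0.0))    # -1
```

With a single zero threshold per endpoint the function behaves like a standard FS-style hierarchical comparison, and the 10-day survival difference decides the pair. With thresholds of 30 and 15 days followed by zero thresholds, that small survival difference counts as a tie at the first level, so the much earlier hospitalization decides the pair instead.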

Mathematically, the authors define comparison functions Dᵢⱼ(d) and Tᵢⱼ¹(t₁) that incorporate censoring indicators and the sign of the observed differences, and then construct four stage‑wise win‑loss scores Uᵢⱼ₁…Uᵢⱼ₄. The overall test statistic S is the sum of Uᵢⱼ over all treated–control pairs, exactly as in the classic FS test. They prove that under the null hypothesis S follows an asymptotic normal distribution with mean zero and a closed‑form variance estimator, guaranteeing that the FS‑MT test retains the same type‑I error control and inference machinery (e.g., Z‑test, confidence intervals) as the original method.
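To make the inference step concrete, the sketch below computes a pairwise-score statistic and a Z value from an antisymmetric score matrix over the pooled sample. It uses the permutation variance of the classic FS test, V = m(N − m) / (N(N − 1)) · Σ Uᵢ², where m is the number of treated subjects and Uᵢ is the row sum of subject i's scores. The summary does not reproduce the FS‑MT variance estimator, so treating it as having this form is an assumption, and the function name `fs_statistic` is hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def fs_statistic(scores: Sequence[Sequence[int]],
                 treated: Sequence[bool]) -> Tuple[float, float, float]:
    """Return (S, variance, Z) for an antisymmetric pairwise score matrix.

    scores[i][j] is +1, -1, or 0 for subject i versus subject j in the
    pooled sample, with scores[i][j] == -scores[j][i].
    """
    n = len(treated)
    m = sum(treated)  # number of treated subjects
    # Row sums: each subject's net wins against everyone else.
    u: List[float] = [sum(row) for row in scores]
    # Summing U_i over treated subjects equals the sum over treated-control
    # pairs, because within-arm pairs cancel by antisymmetry.
    s = sum(u_i for u_i, is_treated in zip(u, treated) if is_treated)
    variance = m * (n - m) / (n * (n - 1)) * sum(u_i ** 2 for u_i in u)
    z = s / math.sqrt(variance) if variance > 0 else 0.0
    return s, variance, z


# Toy example: scores derived from a single fully observed outcome.
x = [3, 1, 4, 2]
toy_scores = [[(xi > xj) - (xi < xj) for xj in x] for xi in x]
s, variance, z = fs_statistic(toy_scores, [True, False, True, False])
```

In the toy example the row sums are 1, −3, 3, and −1, so S = 4 and the variance is 20/3. Any pairwise scoring rule, single-threshold or multi-threshold, can be plugged in as long as the matrix stays antisymmetric.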

A key practical challenge is the choice of thresholds. The authors propose a data‑driven weighted adaptive approach: minimal clinically important differences (MCIDs) are pre‑specified for each endpoint, then the observed frequencies of exceeding these differences are used to assign weights that shrink the thresholds toward the smallest clinically relevant values. Simulation studies demonstrate that this adaptive weighting does not inflate the test size while substantially increasing power, especially when the top‑level event is frequent and unresponsive to treatment.

The simulation design varies three factors: (1) maximum follow‑up time, (2) correlation between death and hospitalization times, and (3) magnitude of treatment effects on each endpoint (hazard ratios). Results show that FS‑MT consistently outperforms the standard FS test when the primary (death) endpoint dominates the comparison, allowing the secondary (hospitalization) endpoint to contribute meaningful information. Power gains of 10–15% are observed at the same nominal α level, and the method remains robust across different censoring patterns and correlation structures.
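One way to generate data with these three knobs is sketched below. The data-generating model is an assumption for illustration and is not taken from the paper: death and hospitalization times are exponential, a shared gamma frailty induces positive correlation between them, and the only censoring is administrative at the maximum follow-up time. The function name, baseline hazards, and tuple layout are hypothetical.

```python
import random
from typing import List, Tuple

# (death_time, death_event, hosp_time, hosp_event)
Record = Tuple[float, bool, float, bool]


def simulate_arm(n: int, hr_death: float, hr_hosp: float,
                 max_follow_up: float, frailty_var: float,
                 rng: random.Random,
                 base_death: float = 0.1,
                 base_hosp: float = 0.3) -> List[Record]:
    """Simulate one trial arm under an assumed shared-frailty model."""
    records: List[Record] = []
    for _ in range(n):
        # Gamma frailty with mean 1 and variance `frailty_var`;
        # a larger variance gives a stronger correlation between endpoints.
        if frailty_var > 0:
            w = rng.gammavariate(1.0 / frailty_var, frailty_var)
        else:
            w = 1.0
        t_death = rng.expovariate(w * base_death * hr_death)
        t_hosp = rng.expovariate(w * base_hosp * hr_hosp)
        # Administrative censoring at the end of follow-up.
        death_event = t_death <= max_follow_up
        end = min(t_death, max_follow_up)
        # Hospitalization is observable only before death or end of follow-up.
        hosp_event = t_hosp < end
        records.append((end, death_event, min(t_hosp, end), hosp_event))
    return records


rng = random.Random(2024)
control = simulate_arm(200, hr_death=1.0, hr_hosp=1.0,
                       max_follow_up=3.0, frailty_var=0.5, rng=rng)
treated = simulate_arm(200, hr_death=0.9, hr_hosp=0.7,
                       max_follow_up=3.0, frailty_var=0.5, rng=rng)
```

Repeating this over many seeds and counting how often a test rejects gives an empirical size (when both hazard ratios are 1) or power (otherwise).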

For empirical illustration, the authors re‑analyze data from the Digitalis Investigation Group (DIG) trial. While the original FS analysis found no significant effect of digitalis on mortality and consequently down‑weighted the hospitalization benefit, the FS‑MT analysis, using thresholds (d = 30 days, t₁ = 15 days) and the adaptive weighting scheme, detected a statistically significant win ratio favoring digitalis for reducing hospitalizations. This case study underscores how FS‑MT can reveal treatment benefits that are masked by a strict hierarchy.

Finally, the authors release an R package “FSMT” that implements the full methodology: specification of multiple thresholds, weighted adaptive threshold selection, computation of the win‑loss matrix, variance estimation, and bootstrap confidence intervals. The package is compatible with existing generalized pairwise comparison (GPC) tools (e.g., winratio, gpc) and includes functions for simulation studies.

In summary, FS‑MT extends the FS test by incorporating alternating multiple thresholds, thereby softening the hierarchical dominance of fatal events while preserving the statistical properties of the original test. The weighted adaptive threshold selection offers a pragmatic way to choose clinically relevant thresholds without sacrificing type‑I error control. Simulation and real‑world data demonstrate improved sensitivity to lower‑priority outcomes, making FS‑MT a valuable addition to the toolbox for analyzing composite endpoints in cardiovascular and other clinical trials.
