An interpretable data-driven approach to optimizing clinical fall risk assessment
In this study we aim to better align fall risk prediction from the Johns Hopkins Fall Risk Assessment Tool (JHFRAT) with additional clinically meaningful measures via a data-driven modeling approach. We conducted a retrospective analysis of 54,209 inpatient admissions from three Johns Hopkins Health System hospitals between March 2022 and October 2023. A total of 20,208 admissions were included as high fall risk encounters, and 13,941 were included as low fall risk encounters. To incorporate clinical knowledge and maintain interpretability, we employed constrained score optimization (CSO) models on JHFRAT assessment data and additional electronic health record (EHR) variables. The model demonstrated significant improvements in predictive performance over the current JHFRAT (CSO AUC-ROC=0.91, JHFRAT AUC-ROC=0.86). The constrained score optimization models performed similarly with and without the EHR variables. Although the benchmark black-box model (XGBoost) improves upon the performance metrics of the knowledge-based constrained logistic regression (AUC-ROC=0.94), the CSO demonstrates more robustness to variations in risk labeling. This evidence-based approach provides a robust foundation for health systems to systematically enhance inpatient fall prevention protocols and patient safety using data-driven optimization techniques, contributing to improved risk assessment and resource allocation in healthcare settings.
💡 Research Summary
This paper presents a data‑driven, interpretable approach to improve the Johns Hopkins Fall Risk Assessment Tool (JHFRAT), a widely used clinical score for inpatient fall risk stratification. The authors conducted a retrospective cohort study of 54,209 adult, non‑psychiatric admissions across three Johns Hopkins hospitals between March 2022 and October 2023. After applying inclusion criteria (length of stay ≥48 h, at least three JHFRAT recordings), they derived a proxy “true” fall‑risk label based on the frequency of resource‑intensive, targeted fall‑prevention interventions recorded in the electronic health record (EHR). Specifically, a three‑day sliding window with ≥6 targeted interventions classified an encounter as high‑risk, ≤1 intervention as low‑risk, and all others as indeterminate. Indeterminate encounters that matched the intervention pattern of a high‑risk fall encounter were re‑labeled as high‑risk, yielding a final analytic set of 13,945 high‑risk and 20,265 low‑risk admissions.
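The sliding-window labeling rule described above can be sketched as follows. This is an illustrative reconstruction from the summary, not the authors' code: the window length and the ≥6 / ≤1 thresholds come from the text, while the function name and data representation (one list of intervention dates per admission) are assumptions.

```python
from datetime import date, timedelta

# Proxy-label thresholds as described in the summary; the data
# structures and function name are illustrative assumptions.
HIGH_THRESHOLD = 6   # >= 6 targeted interventions in a window -> high risk
LOW_THRESHOLD = 1    # <= 1 intervention in every window -> low risk
WINDOW_DAYS = 3      # three-day sliding window

def label_encounter(intervention_dates: list) -> str:
    """Return 'high', 'low', or 'indeterminate' for one admission."""
    if not intervention_dates:
        return "low"
    days = sorted(intervention_dates)
    # Slide a 3-day window anchored at each intervention date and
    # record the densest window.
    max_in_window = 0
    for d in days:
        window_end = d + timedelta(days=WINDOW_DAYS)
        count = sum(1 for x in days if d <= x < window_end)
        max_in_window = max(max_in_window, count)
    if max_in_window >= HIGH_THRESHOLD:
        return "high"
    if max_in_window <= LOW_THRESHOLD:
        return "low"
    return "indeterminate"
```

Anchoring windows at observed intervention dates is sufficient here because the densest window always starts at some intervention; the subsequent re-labeling of indeterminate encounters that match high-risk fall patterns would be a separate pass.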
The core methodological contribution is the use of Constrained Score Optimization (CSO), a recent framework for ordinal classification that preserves the additive structure of an existing clinical score while allowing its item weights to be recalibrated. The CSO formulation incorporates several constraints: (1) monotonicity within each ordinal JHFRAT domain (age, medication, patient‑care equipment) so that higher risk levels receive non‑decreasing weights; (2) non‑negativity of all weights; (3) class‑balanced weighting to address the ~60 %/40 % high‑low risk imbalance. The objective function maximizes a weighted combination of log‑likelihoods evaluated at the two established JHFRAT thresholds (the low/moderate and moderate/high cut‑points), effectively aligning the recalibrated score with the study's proxy labels while keeping the original cut‑points intact.
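To make the constraint structure concrete, here is a minimal plain-Python sketch of recalibrating additive item weights under non-negativity and within-domain monotonicity. It is NOT the authors' CVXPY/MOSEK formulation: it uses projected gradient descent on a plain logistic loss, the monotone projection is a crude cumulative-max pass rather than exact isotonic projection, and the domain groupings and data are illustrative assumptions.

```python
import math

def project(w, domains):
    """Enforce the two structural constraints from the summary:
    clip weights to >= 0, then force non-decreasing weights inside
    each ordinal domain (item indices ordered low -> high risk level)
    via a cumulative-max pass (a crude stand-in for exact isotonic
    projection)."""
    w = [max(0.0, v) for v in w]
    for idx in domains:
        for a, b in zip(idx, idx[1:]):
            w[b] = max(w[b], w[a])
    return w

def fit(X, y, domains, lr=0.5, steps=500):
    """Projected-gradient logistic regression over binary score items.
    X: list of binary feature rows; y: 0/1 labels; domains: list of
    index lists, each an ordinal group of items."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(steps):
        gw, gb = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            r = 1.0 / (1.0 + math.exp(-z)) - yi  # sigmoid residual
            gb += r / n
            gw = [g + r * xj / n for g, xj in zip(gw, xi)]
        w = project([wj - lr * g for wj, g in zip(w, gw)], domains)
        b -= lr * gb
    return w, b
```

The paper's convex formulation (solved with CVXPY and MOSEK) would additionally weight the likelihood terms by class balance and evaluate them at the existing JHFRAT cut-points; those refinements are omitted here for brevity.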
Two CSO models were trained: (a) an Optimized CSO using only the 18 binary JHFRAT items, and (b) an Augmented CSO that adds 22 binary EHR‑derived variables (demographics, AM‑PAC and JH‑HLM mobility scores, ICD‑10 codes, service category, etc.). For comparison, a gradient‑boosted decision tree model (XGBoost) was also trained on both the JHFRAT‑only and the expanded feature sets. All models were trained on an 80 %/20 % train‑test split stratified by risk label, with 5‑fold cross‑validation. CSO optimization employed CVXPY with the MOSEK solver (tolerance 1e-8, maximum 1e6 iterations); XGBoost used 100 trees, learning rate 0.1, and unrestricted depth.
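The evaluation protocol above (stratified hold-out split, AUC-ROC scoring) can be sketched in plain Python. The paper's pipeline presumably used standard ML tooling; the helper names below and the rank-based AUC formulation are assumptions, shown only to make the protocol concrete.

```python
import random

def stratified_split(y, test_frac=0.2, seed=0):
    """Split indices 0..len(y)-1 into train/test, preserving the
    label proportions within each split (as in the paper's 80/20
    stratified hold-out)."""
    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(set(y)):
        idx = [i for i, yi in enumerate(y) if yi == label]
        rng.shuffle(idx)
        cut = int(len(idx) * test_frac)
        test += idx[:cut]
        train += idx[cut:]
    return train, test

def auc_roc(scores, labels):
    """AUC-ROC as the probability that a random positive outscores a
    random negative (ties count one half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise-comparison form of AUC is quadratic in sample size; production code would use a sorted-rank implementation (e.g. scikit-learn's `roc_auc_score`), but the definition is identical.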
Performance results show that CSO markedly improves discrimination over the original JHFRAT (AUC‑ROC 0.91 vs. 0.86). Adding the extra EHR variables does not substantially increase CSO performance, suggesting that the original JHFRAT items already capture most of the predictive signal. XGBoost achieves the highest AUC‑ROC (0.94) but is more sensitive to label definition changes. Sensitivity analyses varying the high‑risk intervention threshold from 4 to 8 interventions per three‑day window reveal that CSO’s AUC varies by less than 0.02, whereas XGBoost’s performance fluctuates more markedly, indicating that CSO is more robust to uncertainty in the proxy labels.
The authors acknowledge several limitations: (1) true fall events are rare (0.92 % of encounters), so the proxy label based on interventions may not perfectly reflect underlying fall propensity; (2) the study is confined to a single health system, limiting external generalizability; (3) missing JHFRAT or mobility assessments were not imputed, potentially biasing results. They propose future work to incorporate sensor‑based mobility data, validate the approach across multiple institutions, and refine the labeling strategy using more granular clinical outcomes.
In conclusion, the paper demonstrates that a constrained, score‑recalibration approach can substantially enhance an existing clinical risk tool while preserving its interpretability and operational workflow. The CSO framework offers a pragmatic pathway for health systems to modernize legacy scores with modern EHR data, achieving better patient‑level risk stratification without sacrificing transparency—a critical requirement for bedside decision support.