A Mobile Application Front-End for Presenting Explainable AI Results in Diabetes Risk Estimation
Diabetes is a significant and continuously rising health challenge in Indonesia. Although many artificial intelligence (AI)-based health applications have been developed for early detection, most function as “black boxes,” lacking transparency in their predictions. Explainable AI (XAI) methods offer a solution, yet their technical outputs are often incomprehensible to non-expert users. This research aims to develop a mobile application front-end that presents XAI-driven diabetes risk analysis in an intuitive, understandable format. Development followed the waterfall methodology, comprising requirements analysis, interface design, implementation, and evaluation. Based on user preference surveys, the application adopts two primary visualization types, bar charts and pie charts, to convey the contribution of each risk factor. These are complemented by personalized textual narratives generated via integration with GPT-4o. The application was developed natively for Android using Kotlin and Jetpack Compose. The resulting prototype translates SHAP (SHapley Additive exPlanations) values, a key XAI approach, into accessible graphical visualizations and narratives. Evaluation through user comprehension testing (Likert-scale questionnaires and interviews) and technical functionality testing confirmed that the research objectives were met. The combination of visualization and textual narrative effectively enhanced user understanding (average score 4.31/5) and supported preventive action, backed by a 100% technical testing success rate.
💡 Research Summary
This paper addresses the growing public-health challenge of diabetes in Indonesia by developing a mobile front-end that translates complex Explainable AI (XAI) outputs into forms that laypeople can readily understand and act upon. The authors focus on SHAP (SHapley Additive exPlanations) values generated by an XGBoost-based risk model and aim to present these contributions through intuitive visualizations and natural-language explanations.
The development followed a classic waterfall software-development life-cycle, progressing through requirements analysis, design, implementation, and evaluation. In the requirements phase, three consecutive surveys involving 333, 183, and 114 participants respectively were conducted. The first survey identified three core user pain points: distrust of AI-generated health information, lack of actionable recommendations, and difficulty interpreting complex data. The second survey compared default XAI library visualizations (LIME bar, SHAP bar, waterfall, force plots) with simplified graphics (pie chart, bar chart, radar chart). Users overwhelmingly preferred the simplified formats, with pie and bar charts scoring 8.69 and 8.54 out of 10, and 85% indicating that textual narration was essential. The third survey narrowed the choice to three top-performing simplified charts: pie, standard bar (all bars extending right), and diverging bar (bars extending left/right to show positive/negative influence). Preference was split, but the standard bar chart received the most votes (39.5%). These findings directly shaped the UI design.
Design decisions centered on offering two visualization options, bar and pie charts, implemented with the MPAndroidChart library and integrated into Jetpack Compose via AndroidView. An MVVM pattern combined with Clean Architecture separates the data, business-logic, and UI layers. Raw SHAP values are transformed into percentage contributions using a simple normalization formula: each feature's absolute SHAP value divided by the sum of absolute SHAP values across all features. Color coding (red for risk-increasing factors, green for risk-decreasing) and a custom legend displaying both percentages and feature abbreviations further reduce cognitive load.
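The normalization step described above can be sketched in plain Kotlin. This is a minimal illustration, not the authors' code; the function name, feature names, and SHAP values are all assumptions chosen for the example.

```kotlin
// Sketch of the percentage normalization: each feature's share is
// |SHAP| / sum of |SHAP| over all features, expressed as a percentage.
// Feature names and values below are illustrative, not from the paper's dataset.
fun normalizeShap(shap: Map<String, Double>): Map<String, Double> {
    val total = shap.values.sumOf { kotlin.math.abs(it) }
    if (total == 0.0) return shap.mapValues { 0.0 }
    return shap.mapValues { (_, v) -> kotlin.math.abs(v) / total * 100.0 }
}

fun main() {
    val contributions = normalizeShap(
        mapOf("BMI" to 0.34, "Age" to -0.17, "Glucose" to 0.49)
    )
    // The sign of each raw SHAP value would be kept separately to drive the
    // red/green color coding; the percentages themselves are unsigned.
    contributions.forEach { (f, pct) -> println("$f: ${"%.1f".format(pct)}%") }
}
```

Keeping the sign out of the percentage and encoding it as color matches the paper's design: the chart answers "how much does this factor matter?" while color answers "in which direction?".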
To complement the graphics, the system generates concise, personalized narratives using OpenAI’s GPT-4o model. A carefully engineered system prompt defines a “Medical AI Explainer” persona, specifies the task, enforces a strict JSON output schema, and supplies a knowledge base containing feature definitions and global importance statistics. Few-shot examples and markdown-styled tables are used to improve model comprehension and limit hallucination. The resulting JSON is parsed and displayed as a narrative card for each risk factor, typically consisting of two to three sentences that (1) state whether the factor raises or lowers risk, (2) quantify its influence, and (3) provide a brief definition. An example for BMI reads: “Your BMI of 24.7 contributes 17.0% to your overall risk. Being overweight is associated with higher diabetes risk. Your value is above the ideal range of 18.5-24.9.”
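A minimal Kotlin sketch of this prompt-assembly pattern is shown below. The persona wording, schema fields, knowledge-base row, and all identifiers here are illustrative assumptions, not the authors' actual prompt; a real implementation would also attach few-shot examples and send the two prompts to the GPT-4o chat API.

```kotlin
// Hypothetical input record: one risk factor with its raw value and its
// normalized percentage contribution.
data class FactorInput(val name: String, val value: Double, val contributionPct: Double)

// Assumed system prompt following the pattern described above: persona,
// task definition, strict JSON schema, and a markdown-style knowledge base.
fun buildSystemPrompt(): String = """
    You are a Medical AI Explainer. For each risk factor, write 2-3 sentences
    that (1) state whether it raises or lowers diabetes risk, (2) quantify its
    influence as a percentage, and (3) give a brief definition.
    Respond ONLY with JSON matching this schema:
    {"narratives": [{"factor": "<name>", "text": "<explanation>"}]}

    Knowledge base (feature | definition | ideal range):
    | BMI | Body Mass Index, weight divided by height squared | 18.5-24.9 |
""".trimIndent()

// The user prompt carries the patient-specific factors in a compact form.
fun buildUserPrompt(factors: List<FactorInput>): String =
    factors.joinToString("\n") { "${it.name}=${it.value} (${it.contributionPct}% of risk)" }

fun main() {
    println(buildSystemPrompt())
    println(buildUserPrompt(listOf(FactorInput("BMI", 24.7, 17.0))))
}
```

Enforcing a JSON schema in the system prompt, as the paper describes, lets the app parse the response into one narrative card per factor and reject malformed output instead of rendering it.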
Functional testing employed the Espresso framework to run 111 end-to-end automated scenarios covering registration, data entry, risk simulation, and result visualization under both happy-path and error conditions. The suite achieved a 100% pass rate, surpassing the predefined 90% reliability target.
User comprehension was assessed through a Likert‑scale questionnaire (5‑point) and semi‑structured interviews. The combined visualization‑plus‑narrative condition yielded an average score of 4.31 / 5 (SD = 0.42), indicating high clarity and perceived usefulness. Participants reported that the textual explanations clarified the direction and magnitude of each factor, while the charts provided an at‑a‑glance summary of contributions.
The authors conclude that integrating simplified visualizations with LLM‑generated narratives effectively bridges the “comprehension gap” inherent in XAI outputs for non‑expert users. Technically, the choice of Kotlin and Jetpack Compose ensures modern, performant Android development, while the MVVM + Clean Architecture foundation supports maintainability and future extensions. Limitations include reliance on continuous internet connectivity for LLM calls, potential privacy concerns around transmitting personal health data, and the absence of longitudinal clinical outcome studies. Future work is suggested in areas such as multilingual support, offline LLM inference, expansion to other chronic diseases, and real‑world deployment trials to evaluate impact on health behavior and outcomes.