Explaining AI Without Code: A User Study on Explainable AI

Notice: This research summary and analysis were automatically generated using AI. For accuracy, please refer to the original arXiv source.

The increasing use of Machine Learning (ML) in sensitive domains such as healthcare, finance, and public policy has raised concerns about the transparency of automated decisions. Explainable AI (XAI) addresses this by clarifying how models generate predictions, yet most methods demand technical expertise, limiting their value for novices. This gap is especially critical in no-code ML platforms, which seek to democratize AI but rarely include explainability. We present a human-centered XAI module in DashAI, an open-source no-code ML platform. The module integrates three complementary techniques—Partial Dependence Plots (PDP), Permutation Feature Importance (PFI), and KernelSHAP—into DashAI’s workflow for tabular classification. A user study (N = 20; ML novices and experts) evaluated usability and the impact of explanations. Results show: (i) high task success ($\geq 80\%$) across all explainability tasks; (ii) novices rated explanations as useful, accurate, and trustworthy on the Explanation Satisfaction Scale (ESS, Cronbach’s $α$ = 0.74, a measure of internal consistency), while experts were more critical of sufficiency and completeness; and (iii) explanations improved perceived predictability and confidence on the Trust in Automation scale (TiA, $α$ = 0.60), with novices showing higher trust than experts. These findings highlight a central challenge for XAI in no-code ML: making explanations both accessible to novices and sufficiently detailed for experts.


💡 Research Summary

The paper addresses a critical gap between the rapid growth of Explainable AI (XAI) methods and their accessibility in no‑code machine‑learning platforms, which aim to democratize AI but often lack built‑in interpretability tools. The authors introduce a human‑centered XAI module integrated into DashAI, an open‑source, graphical, no‑code ML environment. The module combines three complementary techniques—Partial Dependence Plots (PDP), Permutation Feature Importance (PFI), and KernelSHAP—to deliver both global (model‑wide) and local (instance‑specific) explanations for tabular classification models.

System design: DashAI’s workflow (Dataset → Experiment → Model → Prediction) is extended with an “Explainers” tab. After model training, users can open an explanation dashboard and, with a single click, generate PDP, PFI, or KernelSHAP visualizations. PDP shows how varying a single feature changes the predicted probability, providing an intuitive global view. PFI ranks features by the drop in model performance when each feature is randomly permuted, offering a straightforward importance ranking. KernelSHAP decomposes an individual prediction into additive feature contributions, giving a detailed local rationale. By embedding these methods directly into the UI, the authors eliminate the need for external libraries or code.
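The three techniques described above are standard and available outside DashAI as well. As a rough sketch (not DashAI's actual implementation), PFI and PDP can be generated with scikit-learn on any trained tabular classifier; KernelSHAP comes from the separate `shap` package. The dataset and model below are illustrative choices, not from the paper:

```python
# Illustrative sketch of the three explanation types on a toy tabular
# classifier; this is NOT DashAI's internal code.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance, partial_dependence
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# PFI (global): drop in test score when each feature is shuffled,
# yielding a straightforward importance ranking.
pfi = permutation_importance(model, X_test, y_test, n_repeats=5, random_state=0)
ranking = sorted(zip(X.columns, pfi.importances_mean), key=lambda t: -t[1])

# PDP (global): averaged predicted response as one feature varies,
# here for the top-ranked feature from PFI.
pdp = partial_dependence(model, X_test, features=[ranking[0][0]], kind="average")

# KernelSHAP (local) would decompose a single prediction into additive
# feature contributions via the `shap` package, e.g.:
#   import shap
#   explainer = shap.KernelExplainer(model.predict_proba, shap.sample(X_train, 50))
#   shap_values = explainer.shap_values(X_test.iloc[:1])
print("most important feature:", ranking[0][0])
```

DashAI wraps equivalent computations behind its "Explainers" tab so that users trigger them with a click rather than writing code like this.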

User study: Twenty participants (10 novices, 10 experts) were recruited and randomly assigned to two conditions. Scenario A presented explanations sequentially (PDP → PFI → SHAP), while Scenario B displayed all three simultaneously. Each participant completed five explanation‑focused tasks (open dashboard, generate and interpret PFI, generate and interpret PDP, generate and interpret KernelSHAP for a chosen instance, and compare the three methods). The study also collected responses to the Explanation Satisfaction Scale (ESS) and the Trust in Automation (TiA) questionnaire, followed by a brief semi‑structured interview.

Results – Usability: All explanation tasks achieved ≥ 80 % success, with 100 % success for opening the dashboard and generating PFI, 90 % for PDP, 80 % for KernelSHAP, and 90 % for the combined comparison. Errors were mainly in the more complex local explanation generation and multi‑method comparison steps, suggesting that UI refinements could further improve the KernelSHAP workflow.

Results – Satisfaction (ESS): The ESS demonstrated acceptable internal consistency (Cronbach α = 0.74). Novices gave high median scores (≈ 5/5) for usefulness, accuracy, and trust, while also rating comprehension, sufficiency, completeness, usability, and overall satisfaction around 4/5. Experts showed more variance, especially lowering scores for sufficiency of details and completeness, indicating higher expectations for diagnostic depth. No significant difference emerged between the sequential and combined presentation modes (Mann‑Whitney U = 38, p = 0.38). Logistic regression suggested novices were marginally more positive overall (p ≈ 0.08).
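For readers unfamiliar with the two statistics reported here, both are easy to compute. The sketch below uses made-up Likert data (the paper's raw responses are not available); the 8-item, 20-participant shape is an assumption for illustration:

```python
# Sketch of Cronbach's alpha and the Mann-Whitney U test on
# ILLUSTRATIVE random Likert data, not the study's actual responses.
import numpy as np
from scipy.stats import mannwhitneyu

def cronbach_alpha(items: np.ndarray) -> float:
    """Internal consistency of a (n_respondents, n_items) Likert matrix."""
    k = items.shape[1]
    sum_item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - sum_item_vars / total_var)

rng = np.random.default_rng(0)
scores = rng.integers(3, 6, size=(20, 8))  # hypothetical 8-item scale, N = 20
alpha = cronbach_alpha(scores)

# Mann-Whitney U compares per-participant means between the two
# presentation conditions (sequential vs. combined), 10 per group.
seq, comb = scores[:10].mean(axis=1), scores[10:].mean(axis=1)
u_stat, p_value = mannwhitneyu(seq, comb)
```

With random data the alpha is near zero; the paper's α = 0.74 indicates the real ESS items covaried substantially.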

Results – Trust (TiA): The TiA subscales yielded a lower reliability (α = 0.60), but still revealed that explanations increased perceived predictability and confidence across participants. Novices reported higher overall trust and propensity to rely on the system, whereas experts remained more cautious, reflecting their experience with model limitations.

Interpretation: The study confirms that integrating multiple XAI techniques into a no‑code platform can achieve high usability and user satisfaction, especially for non‑technical users. However, expert users demand richer, more detailed explanations and may benefit from adaptive interfaces that expose deeper diagnostic information (e.g., interaction with SHAP value distributions, feature‑interaction visualizations, or model‑level diagnostics). The lack of significant differences between the two presentation scenarios suggests that the mere availability of explanations, rather than their ordering, drives perceived usefulness.

Limitations and future work: The TiA scale’s modest reliability indicates a need for refined trust metrics or complementary qualitative methods. The sample size (N = 20) is small, limiting statistical power. The authors propose extending the approach to large language models, developing context‑aware, interactive explanations, and implementing adaptive depth‑adjustment based on user expertise.

Overall contribution: The paper delivers a concrete, evaluated implementation of a human-centered XAI module for a no-code ML platform, and demonstrates that such integration can bridge the “explainability gap” for novices while highlighting the necessity of flexible, expert-oriented features. It offers a valuable blueprint for future XAI-enhanced low-code tools and underscores the importance of balancing accessibility with analytical depth.

