Human-like visual computing advances explainability and few-shot learning in deep neural networks for complex physiological data

📝 Original Info

  • Title: Human-like visual computing advances explainability and few-shot learning in deep neural networks for complex physiological data
  • ArXiv ID: 2512.22349
  • Date: 2025-12-26
  • Authors: Alaa Alahmadi, Mohamed Hasan

📝 Abstract

Machine vision models, particularly deep neural networks, are increasingly applied to physiological signal interpretation, including electrocardiography (ECG), yet they typically require large training datasets and offer limited insight into the causal features underlying their predictions. This lack of data efficiency and interpretability constrains their clinical reliability and alignment with human reasoning. Here, we show that a perception-informed pseudo-colouring technique, previously demonstrated to enhance human ECG interpretation, can improve both explainability and few-shot learning in deep neural networks analysing complex physiological data. We focus on acquired, drug-induced long QT syndrome (LQTS) as a challenging case study characterised by heterogeneous signal morphology, variable heart rate, and scarce positive cases associated with life-threatening arrhythmias such as torsades de pointes. This setting provides a stringent test of model generalisation under extreme data scarcity. By encoding clinically salient temporal features, such as QT-interval duration, into structured colour representations, models learn discriminative and interpretable features from as few as one or five training examples. Using prototypical networks and a ResNet-18 architecture, we evaluate one-shot and few-shot learning on ECG images derived from single cardiac cycles and full 10-second rhythms. Explainability analyses show that pseudo-colouring guides attention toward clinically meaningful ECG features while suppressing irrelevant signal components. Aggregating multiple cardiac cycles further improves performance, mirroring human perceptual averaging across heartbeats. Together, these findings demonstrate that human-like perceptual encoding can bridge data efficiency, explainability, and causal reasoning in medical machine intelligence.

💡 Deep Analysis

Figure 1

📄 Full Content

Statistical machine vision models have become increasingly effective in the analysis and interpretation of complex medical data, including imaging and signal-based modalities such as electrocardiograms (ECGs) and electroencephalograms (EEGs) [1,2,3]. These electro-physiological signals are among the most challenging data types for clinical interpretation, owing to their temporal complexity, inter-individual variability, and susceptibility to noise and artefacts [4,5,6]. Advances in artificial intelligence therefore hold significant promise for augmenting clinicians' ability to visually monitor and interpret such signals, although several fundamental challenges remain unresolved [7].

A central challenge in medical machine intelligence is achieving decision-making processes that are interpretable and trustworthy in a human-like manner: intuitive, clinically grounded, and aligned with expert reasoning [8,9,7]. At the same time, many clinically important conditions are rare, heterogeneous, or sparsely labelled, limiting the availability of representative training data and undermining the reliability and generalisability of data-hungry deep learning models [10,7]. These challenges are particularly pronounced for electro-physiological signals, which lack explicit spatial structure and exhibit substantial physiological overlap between normal and pathological patterns. As a result, clinically salient features are often entangled with benign signal variability, complicating abstraction, causal reasoning, and robust generalisation by machine vision systems [11,12].

Here, we investigate a fundamentally different approach to representing and pre-processing ECG signals that draws inspiration from human perceptual strategies rather than purely statistical optimisation. Focusing on long QT syndrome (LQTS), a clinically serious and visually subtle disorder associated with life-threatening arrhythmias such as torsades de pointes, we build on our prior work demonstrating that perception-informed pseudo-colouring can significantly enhance human ECG interpretation. We show that encoding clinically meaningful temporal features into structured colour representations enables deep neural networks to perceive ECG information in a more human-like manner. This approach simultaneously improves model accuracy, interpretability, and robustness, while enabling effective generalisation from very small numbers of training examples. Together, these results suggest that integrating human perceptual principles into machine learning pipelines offers a promising pathway toward more explainable, data-efficient, and clinically aligned machine intelligence for complex physiological signals.

2 Material and methods

We used the ECGRDVQ database, which contains 12-lead ECG signal recordings of 22 healthy subjects who participated in a randomized, double-blind, 5-period crossover clinical trial aimed at assessing the effect of four known QT-prolonging drugs versus placebo. The open ECG datasets are available online from the PhysioNet database [13].

The 10-second lead-II recording was selected from each 12-lead ECG, as this lead is typically used to measure the QT interval [14]. Heart rates ranged from 40 to 96 beats per minute (bpm), and QT-interval values ranged from 300 to 579 ms. We used the same ECGs (n = 5050) as in our previous studies of the pseudo-coloring technique mapped according to the QT nomogram, a clinical risk-assessment method designed specifically to identify patients at risk of drug-induced torsades de pointes (TdP), a life-threatening arrhythmia, according to heart rate [15]. Those studies evaluated pseudo-coloring with human interpretation [16] and with a rule-based explainable algorithm [17].
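To make the reported heart-rate range concrete, the short sketch below converts beats per minute into a mean R-R interval and into the average number of cardiac cycles that fit in a 10-second recording. The function names are illustrative only and are not taken from the study.

```python
def rr_interval_s(heart_rate_bpm: float) -> float:
    """Mean R-R interval in seconds for a heart rate given in beats per minute."""
    return 60.0 / heart_rate_bpm


def beats_in_window(heart_rate_bpm: float, window_s: float = 10.0) -> float:
    """Average number of cardiac cycles contained in a recording window."""
    return heart_rate_bpm * window_s / 60.0


# The two extremes of the heart-rate range reported in the text.
for hr in (40, 96):
    print(f"{hr} bpm: RR = {rr_interval_s(hr):.3f} s, "
          f"about {beats_in_window(hr):.1f} beats in 10 s")
```

At the extremes of the reported range, a 10-second recording therefore holds between roughly 7 and 16 heartbeats, which is the amount of per-beat evidence available when a full rhythm is used instead of a single cardiac cycle.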

As part of the clinical trial methodology, medical experts calculated the QT-interval and heart-rate (HR) values for all ECGs. These QT/HR values were used as the ground truth for evaluating the model's binary classification performance. According to the QT/HR pair plots of all ECGs on the nomogram [15], there were 180 positive ECGs showing a high risk of developing life-threatening TdP arrhythmia from drug-induced long QT syndrome, while the remaining negative ECGs (n = 4870) fell below the nomogram line (no TdP risk). Such severe class imbalance is a common problem in medical machine learning research, where rare or positive cases typically have few examples. We therefore used few-shot and one-shot learning techniques (see section 2.3 for more details) to help overcome this problem, together with a new approach to image representation and processing.
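The class counts above, and the general idea of prototype-based few-shot classification, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the embeddings are random toy vectors standing in for the output of an image encoder, and the episode size, embedding dimension, and all function names are assumptions chosen for illustration.

```python
import numpy as np

N_POS, N_NEG = 180, 4870  # class counts reported in the text


def prevalence(n_pos, n_neg):
    """Fraction of positive examples in the dataset."""
    return n_pos / (n_pos + n_neg)


def sample_episode(labels, k_shot, n_query, rng):
    """Draw k_shot support and n_query query indices per class (2-way episode)."""
    support, query = [], []
    for cls in (0, 1):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        support.extend(idx[:k_shot])
        query.extend(idx[k_shot:k_shot + n_query])
    return np.array(support), np.array(query)


def class_prototypes(emb, labels):
    """Prototype of a class = mean embedding of its support examples."""
    return np.stack([emb[labels == c].mean(axis=0) for c in (0, 1)])


def nearest_prototype(query_emb, protos):
    """Assign each query to the closest prototype (squared Euclidean distance)."""
    d = ((query_emb[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return d.argmin(axis=1)


rng = np.random.default_rng(0)
labels = np.array([1] * N_POS + [0] * N_NEG)
# Toy 8-D embeddings (assumption): positives are shifted so the classes separate.
emb = rng.normal(size=(labels.size, 8)) + 3.0 * labels[:, None]

support, query = sample_episode(labels, k_shot=1, n_query=20, rng=rng)
protos = class_prototypes(emb[support], labels[support])
pred = nearest_prototype(emb[query], protos)

print(f"positive prevalence = {prevalence(N_POS, N_NEG):.4f}")
print(f"toy one-shot accuracy = {(pred == labels[query]).mean():.2f}")
```

Because each episode draws the same number of support examples per class, the classifier sees a balanced task even though positives make up only about 3.6% of the 5050 ECGs. The accuracy printed here reflects the synthetic embeddings only and says nothing about the results of the study.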

In this study, we re-visualized each 10-second ECG recording as four image representations:

• a 256 x 256 single heartbeat image representation with and without pseudo-coloring.

• a 2048 x 256 10-second heart rhythm (i.e. multiple heartbeats) image representation, with and w
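The image representations listed above can be sketched as a simple rasterisation of a one-dimensional signal into an RGB array, with an optional colour fill between the trace and its baseline. This is only an illustration under stated assumptions: the synthetic beat, the median baseline, and the blue-to-red time ramp are hypothetical and do not reproduce the nomogram-based colour mapping used in the study, and 2048 x 256 is read here as width x height.

```python
import numpy as np


def render_ecg(signal, width=256, height=256, colour=False):
    """Rasterise a 1-D signal into a (height, width, 3) uint8 RGB image."""
    signal = np.asarray(signal, dtype=float)
    img = np.full((height, width, 3), 255, dtype=np.uint8)  # white background
    # Resample to exactly one sample per pixel column.
    x = np.interp(np.linspace(0, len(signal) - 1, width),
                  np.arange(len(signal)), signal)
    lo, hi = x.min(), x.max()
    span = hi - lo if hi > lo else 1.0

    def row(v):
        # Row 0 is the top of the image, so larger amplitudes map to smaller rows.
        return int(round((hi - v) / span * (height - 1)))

    base = row(np.median(x))  # assumption: the median approximates the baseline
    for c in range(width):
        r = row(x[c])
        if colour:
            # Hypothetical ramp: blue at the left edge, red at the right edge.
            t = c / (width - 1)
            top, bottom = min(r, base), max(r, base)
            img[top:bottom + 1, c] = (int(255 * t), 0, int(255 * (1 - t)))
        img[r, c] = (0, 0, 0)  # draw the trace itself in black
    return img


# Synthetic beat (assumption): a narrow R-like peak followed by a broad T-like wave.
t = np.linspace(0.0, 1.0, 500)
beat = np.exp(-((t - 0.3) / 0.01) ** 2) + 0.3 * np.exp(-((t - 0.6) / 0.05) ** 2)

grey_beat = render_ecg(beat, 256, 256, colour=False)
colour_beat = render_ecg(beat, 256, 256, colour=True)
colour_rhythm = render_ecg(np.tile(beat, 10), 2048, 256, colour=True)
print(grey_beat.shape, colour_beat.shape, colour_rhythm.shape)
```

In this sketch the colour depends only on horizontal position in the image. A mapping that encodes QT-interval duration would instead need to measure time relative to a landmark within each cardiac cycle, which is not attempted here.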


