Exploiting Weak Head Movements for Camera-based Respiration Detection
In recent years, considerable progress has been made in the non-contact detection of the respiration rate from video sequences. Common techniques either directly assess the movement of the chest due to breathing or analyze the subtle color changes that arise from hemodynamic properties of the skin tissue by means of remote photoplethysmography (rPPG). However, extracting hemodynamic parameters via rPPG is often difficult, especially if the skin is not visible to the camera. In contrast, extracting respiratory signals from chest movements has proven to be a robust method. However, the detectability of chest regions cannot be guaranteed in every application scenario, for instance if the camera setting is optimized to provide close-up images of the head. In such a case, an alternative method for respiration detection is required. It is reasonable to assume that the mechanical coupling between chest and head induces minor movements of the head which, as in rPPG, can be detected from subtle intensity changes as well. Although the strength of these movements is expected to be much smaller in scale, sensing these intensity variations could provide a respiration signal suitable for subsequent respiratory rate analysis. To investigate this coupling, we conducted an experimental study with 12 subjects and applied motion- and rPPG-based methods to estimate the respiratory frequency from both head regions and the chest. Our results show that it is possible to derive signals correlated to chest movement from facial regions. The method is a feasible alternative to rPPG-based respiratory rate estimation when rPPG signals cannot be derived reliably and chest movement detection cannot be applied either.
💡 Research Summary
The paper addresses the problem of contact‑free respiratory rate monitoring when the chest is not visible and conventional remote photoplethysmography (rPPG) is unreliable because skin regions are occluded. The authors hypothesize that the mechanical coupling between the thorax and the head induces subtle head movements that can be captured in video recordings. To test this, twelve healthy volunteers were recorded with a 120 fps RGB camera while breathing in synchrony with a metronome at rates ranging from 10 to 18 bpm. Videos included the face and upper chest. Regions of interest (ROIs) were divided into 10‑pixel squares; for each ROI the average red, green, and blue pixel intensities were computed, low‑pass filtered at 4 Hz, and used as raw motion signals. An adaptive least‑mean‑square filter combined red and green channels to derive rPPG signals, compensating for global illumination changes.
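The ROI-averaging and low-pass filtering steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the Butterworth filter choice, and the filter order are assumptions; only the 10-pixel ROI size, the per-channel averaging, the 120 fps frame rate, and the 4 Hz cutoff come from the summary.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 120.0  # camera frame rate in Hz (from the study setup)

def roi_mean_signals(frames, roi_size=10):
    """Average R, G, B pixel intensities per 10x10-pixel ROI.

    frames: uint8/float array of shape (T, H, W, 3).
    Returns an array of shape (T, n_rois, 3), one RGB time
    series per ROI, used as the raw motion signal.
    """
    T, H, W, _ = frames.shape
    ny, nx = H // roi_size, W // roi_size
    # crop so the image tiles evenly into roi_size squares
    cropped = frames[:, :ny * roi_size, :nx * roi_size, :]
    blocks = cropped.reshape(T, ny, roi_size, nx, roi_size, 3)
    return blocks.mean(axis=(2, 4)).reshape(T, ny * nx, 3)

def lowpass_4hz(signal, fs=FS, cutoff=4.0, order=4):
    """Zero-phase low-pass filtering at 4 Hz (Butterworth assumed)."""
    b, a = butter(order, cutoff / (fs / 2), btype="low")
    return filtfilt(b, a, signal, axis=0)
```

A typical call would be `lowpass_4hz(roi_mean_signals(frames))`, yielding per-ROI RGB traces sampled at 120 Hz with content above 4 Hz suppressed.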
All signals were segmented into 30‑second windows with 1‑second steps. Empirical Mode Decomposition (EMD) was applied to each ROI signal, producing intrinsic mode functions (IMFs). The IMF whose power‑spectral density exhibited the highest peak within the plausible respiratory band (0.1–0.4 Hz) was selected as the respiratory surrogate. The frequency of this IMF gave a per‑ROI respiratory estimate; a signal‑to‑noise ratio (SNR) derived from the PSD served as a weight, and the weighted median across ROIs yielded the final respiratory rate for the window.
Results showed that chest‑based raw RGB estimation achieved near‑perfect accuracy (error ≈ ±0.1 bpm). For facial regions, rPPG provided the smallest mean error, but the red‑channel raw motion signal achieved comparable median error with considerably lower variance than the green or blue channels. SNR analysis confirmed that the chest exhibits the highest SNR (median ≈ ‑2.5 dB), while facial red‑channel signals have the best SNR among head regions (median ≈ ‑0.5 dB). rPPG processing reduced SNR further (median ≈ ‑7 dB) but still retained usable information, especially on cheeks and forehead where blood perfusion is high.
The study demonstrates that subtle head movements, captured via raw pixel intensity changes, can serve as a viable alternative to rPPG for respiratory rate estimation when skin is not visible. The red channel proved most robust, suggesting that multi‑channel sensor fusion could improve resilience to motion artifacts and illumination changes. Future work will explore the impact of body posture and real‑world clinical settings, aiming to develop a fully contact‑free respiratory monitoring system that works even when the face is covered or the subject is prone.