Pulse rate estimation using imaging photoplethysmography: generic framework and comparison of methods on a publicly available dataset

Objective: to establish an algorithmic framework and a benchmark dataset for comparing methods of pulse rate estimation using imaging photoplethysmography (iPPG). Approach: first we reveal essential steps of pulse rate estimation from facial video an…

Authors: Anton M. Unakafov

Anton M. Unakafov∗ 1,2,3

1 Georg-Elias-Müller-Institute of Psychology, University of Goettingen, Goettingen, Germany
2 Theoretical Neurophysics Group, Max Planck Institute for Dynamics and Self-Organization, Goettingen, Germany
3 Leibniz ScienceCampus Primate Cognition, Goettingen, Germany

October 24, 2017

Abstract

Objective: to establish an algorithmic framework and a benchmark dataset for comparing methods of pulse rate estimation using imaging photoplethysmography (iPPG). Approach: first we reveal essential steps of pulse rate estimation from facial video and review methods applied at each of the steps. Then we investigate performance of these methods for the DEAP dataset www.eecs.qmul.ac.uk/mmv/datasets/deap/ containing facial videos and reference contact photoplethysmograms. Main results: best assessment precision is achieved when pulse rate is estimated using continuous wavelet transform from iPPG extracted by the POS method (overall mean absolute error below 2 heart beats per minute). Significance: we provide a generic framework for theoretical comparison of methods for pulse rate estimation from iPPG and report results for the most popular methods on a publicly available dataset that can be used as a benchmark.

Keywords: Imaging photoplethysmography, Pulse rate, Signal processing, Heart rate, Benchmark

1 Introduction

Heart rate (HR) is an important indicator of functional status, psycho-emotional state and health conditions in general.
Traditionally HR is estimated from an electrocardiogram or a photoplethysmogram (PPG); however, both techniques require contact sensors, which can be disadvantageous (Perry and Watkins, 2011), whereas non-contact HR estimation is useful, for example, for detecting driver drowsiness or abnormal state (Sahayadhas et al., 2012).

For a non-contact assessment of pulse rate (the equivalent of HR obtained from indirect peripheral measurements), imaging photoplethysmogram (iPPG) analysis has been proposed (Takano and Ohta, 2007; Verkruysse et al., 2008). Similarly to contact PPG, iPPG is acquired by measuring variations in the intensity of light reflected by the skin (see Allen (2007); Tamura et al. (2014) for details), but a video camera is used instead of a simple photodetector. Then iPPG is computed from a sequence of images, usually acquired from the face or palm. Theoretical underpinnings of imaging photoplethysmography are provided in Hülsbusch (2008); Kamshilin et al. (2015); Wang et al. (2017).

Rapid development of iPPG analysis (Tarassenko et al., 2014; McDuff et al., 2015; Sun and Thakor, 2016) emphasizes the importance of comparing various algorithms for iPPG-based pulse rate estimation. Theoretical comparison is complicated since algorithms for iPPG acquisition consist of multiple steps that are often non-uniformly described. For empirical comparison a publicly available benchmark dataset is required, since pulse rate estimates reported in different studies are not comparable due to the differences in experimental conditions. However, to the best of our knowledge no suitable dataset has been proposed for iPPG benchmarking¹.

∗ e-mail: anton@nld.ds.mpg.de
To overcome the problems of comparing algorithms we suggest a generic algorithmic framework describing the main steps of iPPG-based pulse rate estimation; we discuss popular methods employed at various steps and compare their performance on a publicly available dataset (Koelstra et al., 2012) containing facial video and reference contact PPG. We report experimental results demonstrating how the choice of the methods for each step influences the overall quality of pulse rate estimation. Our framework consists of five steps². Methods used at single steps of pulse rate estimation were previously compared in (Holton et al., 2013; Cui et al., 2015; Wang et al., 2017); here we combine methods used at each of the five steps to find their optimal configurations.

2 Materials and Methods

2.1 Dataset Description

The Dataset for Emotion Analysis using EEG, Physiological and video signals (DEAP, Koelstra et al. (2012)) contains physiological recordings and frontal face videos of 22 human volunteers watching music videos in 40 one-minute trials. We denote trials as PxTy, where x is the number of the participant in DEAP and y is the number of the trial. Altogether, the DEAP dataset provides 874 one-minute trials with facial video and reference contact PPG data (37 trials for P11; 39 for P3, P5 and P14; 40 for the other participants). We reject 13 trials where a large part of the face was occluded (P4 T17; P6 T24; P12 T14, T18; P15 T12, T16, T23; P18 T4, T10; P22 T13, T18-T20), since for these videos stable iPPG acquisition was impossible; this leaves 861 trials for the analysis. Videos were recorded in DV PAL format using a SONY DCR-HC27E camcorder and transcoded to 50 FPS deinterlaced video using the h264 codec. The resolution of all the videos is 720 × 586. Contact PPG was acquired from the left thumb.
We computed reference pulse rate values from PPG by determining intervals between diastolic minima (Schäfer and Vagedes, 2013) using the method proposed in Elgendi et al. (2013)³.

2.2 Methods

In this section we propose a generic algorithmic framework of iPPG-based pulse rate estimation. It takes as input a sequence of $T$ RGB frames; the $t$-th frame for $t = 1, 2, \ldots, T$ consists of pixels given by vectors $c_{i,j}(t) = \big(r_{i,j}(t), g_{i,j}(t), b_{i,j}(t)\big)^{\top}$, where $r_{i,j}(t)$, $g_{i,j}(t)$, $b_{i,j}(t)$ are the red, green and blue channels for the pixel with coordinates $(i, j)$; $v^{\top}$ stands for the transposed vector $v$. The algorithm consists of five steps schematically shown in Fig. 1; below we consider them in detail.

¹ Uncompressed video normally used for iPPG acquisition is too large for publishing on-line. An attempt to compare existing algorithms on the publicly available MAHNOB dataset (Soleymani et al., 2012) was made in (Li et al., 2014). However, this dataset seems to be unsuitable for iPPG benchmarking as the videos underwent strong compression, making consistent iPPG extraction impossible (Wang et al., 2017).

² A similar three-step framework was proposed in (Rouast et al., 2016), but that article gives an overview of iPPG acquisition while here we focus on the algorithmic details of iPPG processing steps.

³ When applying this method to contact PPG from DEAP, we realized that it does not detect some diastolic minima since their amplitudes vary significantly. To alleviate this problem we introduce two modifications. First, to detect minima with varying amplitudes we determine the offset level $\alpha$ (Elgendi et al., 2013, Eq. (7)) not as the mean of the whole signal, but as a running mean over a window of 7 s. Second, to reject false positives that may arise from the first modification we add a post-processing step: after the method detects diastolic minima $\mathrm{DM}_1, \mathrm{DM}_2, \ldots, \mathrm{DM}_N$, we reject $\mathrm{DM}_i$ if it holds
$$\mathrm{DM}_{i+1} - \mathrm{DM}_{i-1} < \frac{2}{3} \min\left( \mathrm{DM}_{i-1} - \mathrm{DM}_{i-3},\ \frac{2.3}{N-1} \sum_{j=2}^{N} \big(\mathrm{DM}_j - \mathrm{DM}_{j-1}\big) \right),$$
where the coefficients $\frac{2}{3}$ and $2.3$ are selected empirically.

Figure 1: Five steps of pulse rate estimation from facial video using iPPG.

1. For every frame $t = 1, 2, \ldots, T$ select the region of interest $\mathrm{ROI}(t)$ as a set of pixels containing PPG-related information, and compute average color intensities over ROI (color signals):
$$c^0(t) = \big(r^0(t), g^0(t), b^0(t)\big)^{\top} = \frac{1}{|\mathrm{ROI}(t)|} \sum_{(i,j) \in \mathrm{ROI}(t)} c_{i,j}(t), \tag{1}$$
where $|\mathrm{ROI}(t)|$ is the number of pixels in $\mathrm{ROI}(t)$ (see Subsection 2.2.1 for $\mathrm{ROI}(t)$ selection).
2. Compute refined color signals $c(t) = \big(r(t), g(t), b(t)\big)^{\top}$ by pre-processing $c^0(t)$ (Subsection 2.2.2).
3. Extract raw iPPG as a combination of refined color signals: $\mathrm{iPPG}^0(t) = w_r(t) r(t) + w_g(t) g(t) + w_b(t) b(t)$ with weights $w_r(t), w_g(t), w_b(t) \in \mathbb{R}$ (see Subsection 2.2.3 for weights calculation).
4. Post-process the raw signal $\mathrm{iPPG}^0(t)$ to get the refined signal $\mathrm{iPPG}(t)$ (Subsection 2.2.4).
5. Estimate pulse rates from the processed iPPG signal (Subsection 2.2.5).

We test several popular methods for every step of the estimation algorithm (Figure 2) in order to find out which combinations of methods provide the most precise pulse rate estimation.

Figure 2: Scheme of considered methods. Big blocks represent the five steps of pulse rate estimation (see Figure 1); each box inside a block represents a sub-step and contains a list of methods used at this sub-step. We try various combinations of methods, each time taking one method for every sub-step.

2.2.1 Selecting Region of Interest

To compute color signals $c^0(t)$ by (1), color intensities $c_{i,j}(t)$ are averaged over ROI.
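As a minimal illustration of Eq. (1), the sketch below averages RGB vectors over a set of ROI pixel coordinates for a single frame. It is not the implementation used in this study: the nested-list frame layout, the function name `roi_mean_color` and the toy pixel values are assumptions made only for the example.

```python
from typing import Iterable, List, Tuple

Pixel = Tuple[float, float, float]  # (r, g, b) intensities of one pixel


def roi_mean_color(frame: List[List[Pixel]],
                   roi: Iterable[Tuple[int, int]]) -> Pixel:
    """Average the RGB vectors c_{i,j}(t) over the ROI pixels, as in Eq. (1)."""
    coords = list(roi)
    if not coords:
        raise ValueError("ROI must contain at least one pixel")
    sums = [0.0, 0.0, 0.0]
    for i, j in coords:
        for channel in range(3):
            sums[channel] += frame[i][j][channel]
    n = len(coords)  # |ROI(t)|
    return (sums[0] / n, sums[1] / n, sums[2] / n)


# Hypothetical 2x2 frame; the ROI keeps only the two pixels of the first column.
frame = [[(200.0, 120.0, 100.0), (10.0, 10.0, 10.0)],
         [(180.0, 100.0, 80.0), (250.0, 250.0, 250.0)]]
roi = [(0, 0), (1, 0)]
c0 = roi_mean_color(frame, roi)  # one sample of the color signal c^0(t)
```

Repeating this for every frame yields the three raw color signals that the later steps operate on.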
As PPG-induced variations of facial color are weak in comparison with noise and artifacts, the aim of ROI selection is to choose pixels containing maximal pulsatile information, so that averaging reduces noise while preserving the iPPG signal. ROI selection consists of two sub-steps: initial choice of the facial region for iPPG acquisition (ROI choice) and excluding irrelevant pixels (ROI refinement).

ROI choice. The most popular approach is to take ROI as a rectangle encompassing the whole-face region (Lewandowska et al., 2011; Poh et al., 2011; de Haan and Jeanne, 2013; Mannapperuma et al., 2014). Other popular regions are the whole face excluding the eye region (McDuff et al., 2014b; Li et al., 2014) and the forehead (Verkruysse et al., 2008). In the DEAP dataset, for some participants the EEG cap covers most of the forehead, which hinders using the forehead region; therefore we consider the whole-face region and the facial region below the eyes. In both cases we detect the facial rectangle for each frame by the commonly used cascade classifier (Lienhart, 2000) constructed by means of the Viola-Jones algorithm (Viola and Jones, 2001). We take the width of ROI equal to 80% of the estimated face width as recommended in (Poh et al., 2011).

ROI refinement. Even when ROI is selected properly, some pixels may not contain the iPPG signal. Examples include non-skin pixels (for instance, hair), over- or under-lit areas, and damaged pixels in the sensor. To exclude such pixels ROI-refinement methods are used; here we consider two of them. First, non-skin pixels are discarded. This is an essential part of ROI refinement for DEAP since in many videos cables hang in front of participants' faces.
We use simple HSV masking⁴: pixels with hue, saturation or value outside of the ranges [0°, 46°], [23, 132] and [88, 255], respectively, are considered non-skin and discarded (the ranges are selected empirically as providing effective skin selection for the entire dataset).

Then we reject pixels that differ considerably from other pixels in ROI (outliers). Namely, we discard pixels $(i, j)$ that do not satisfy the following inequality (Tasli et al., 2014):
$$\big| c^0(t) - c_{i,j}(t) \big| < \gamma\, \sigma_{\mathrm{ROI}}(t),$$
where
$$\sigma_{\mathrm{ROI}}(t) = \sqrt{ \frac{1}{|\mathrm{ROI}(t)|} \sum_{(i,j) \in \mathrm{ROI}(t)} \big( c_{i,j}(t) - c^0(t) \big)^2 }.$$
In (Tasli et al., 2014) $\gamma = 3$ is used; since this value does not provide effective outlier rejection for DEAP videos, we take $\gamma = 1.5$.

Another important part of ROI refinement is motion compensation (Kumar et al., 2015; Wang et al., 2015). Here we do not use it since there are no prominent head movements in videos from the DEAP dataset.

2.2.2 Pre-processing of Color Signals

At this step refined color signals $c(t)$ are computed from raw signals $c^0(t)$ for $t = 1, \ldots, T$ by suppressing noise and artefacts. To preserve relevant information, frequency components in the human heart rate bandwidth (40–240 beats per minute (BPM), which corresponds to 0.65–4 Hz) should not be suppressed. Typical pre-processing sub-steps are detrending, band-pass and moving average filtering (see Figure 2, Step 2). They are often used in combination (Holton et al., 2013; Li et al., 2014), but some sub-steps can be omitted or applied at post-processing (Step 4, see Subsection 2.2.4).

Detrending is important since the pulsatile component of iPPG has much lower amplitude than the slowly-varying baseline (Hülsbusch, 2008).
A simple detrending method consists in mean-centering and scaling the signal (MCaS, de Haan and Jeanne (2013)):
$$c(t) = \frac{c^0(t) - m(t, L)}{m(t, L)},$$
where $m(t, L) = \frac{1}{L} \sum_{k=0}^{L-1} c^0(t - k)$ is an $L$-point running mean of the raw color vectors $c^0(t)$ and the division is element-wise; we take $L$ corresponding to 1 s. Using MCaS is required for many methods of iPPG extraction (see Subsection 2.2.3).

Another popular detrending method is the smoothness priors approach (SPA, Tarvainen et al. (2002)) used in (Poh et al., 2011; Li et al., 2014). To remove the trend without affecting the heart rate bandwidth we employ SPA with control parameter $\lambda = 300$, which suppresses frequencies below 0.55 Hz, see (Tarvainen et al., 2002) for details.

⁴ The HSV color model is generally considered to be most useful for skin detection (Zarit et al., 1999; Vezhnevets et al., 2003); another popular choice is the YCbCr model (Bousefsaf et al., 2013). See also (Wang et al., 2015) for a more elaborate approach to skin selection for iPPG acquisition.

Moving average (MA) filtering smooths the signal and suppresses high-frequency noise. MA filtering with an $M$-point average is provided by the following equation:
$$c(t) = \frac{1}{M} \sum_{k=0}^{M-1} c^0(t - k).$$
When choosing $M$ one should take into account that an $M$-point MA filter suppresses frequencies $\frac{n}{M} F_{SR}$ for $n = 1, 2, \ldots$, where $F_{SR}$ is the sampling rate of the signal, see (Smith, 1997, Chapter 16) for details. Since human pulse rate can reach 4 Hz, the first suppressed frequency $F_{SR}/M$ should stay above 4 Hz, so we recommend using $M < \frac{1}{4} F_{SR}$. For instance, $F_{SR} = 50$ Hz requires $M \le 12$, thus we consider MA filtering with 3-, 6-, 9- or 12-point average.

Band-pass filtering suppresses frequency components outside the heart rate bandwidth.
Here we employ two commonly used filters, either the 255th order finite impulse response (FIR) filter with linear phase designed using the Hamming window (Lewandowska et al., 2011; Poh et al., 2011; Li et al., 2014) or the 5th order Butterworth infinite impulse response (IIR) filter (Sun et al., 2013).

2.2.3 Extracting Photoplethysmogram from Color Signals

This step (Figure 2, Step 3) can be represented as:
$$\mathrm{iPPG}^0(t) = w(t) \cdot c(t) = w_r(t) r(t) + w_g(t) g(t) + w_b(t) b(t),$$
where $w(t) = \big(w_r(t), w_g(t), w_b(t)\big)^{\top} \in \mathbb{R}^3$ are the weights of the color signals. For computing these weights the following methods are often used:

• Estimating iPPG by the green signal (G method). This approach is popular (Tarassenko et al., 2014; Cui et al., 2015) due to its simplicity; in this case $w(t) = (0, 1, 0)^{\top}$, that is $\mathrm{iPPG}^0(t) = g(t)$.

• Estimating iPPG by the green signal while the red signal is considered as containing artefacts only (green-red difference or GRD method). Here $w(t) = (-1, 1, 0)^{\top}$, thus $\mathrm{iPPG}^0(t) = g(t) - r(t)$. This method was first proposed in (Hülsbusch, 2008, Chapter 6) as a robust alternative to the G method.

• Adaptive green-red difference (aGRD, Feng et al. (2015)) computes iPPG as
$$\mathrm{iPPG}^0(t) = \big\| c^0(t) \big\| \left( \frac{g(t)}{g^0(t)} - \frac{r(t)}{r^0(t)} \right), \tag{2}$$
where $\| c^0(t) \| = \sqrt{ \big(r^0(t)\big)^2 + \big(g^0(t)\big)^2 + \big(b^0(t)\big)^2 }$. Pre-processing is essential for this method since otherwise $g(t) \equiv g^0(t)$ and $r(t) \equiv r^0(t)$ result in $\mathrm{iPPG}^0(t) \equiv 0$. Originally, band-pass filtering is used (Feng et al., 2015).

• Decomposing color signals into components by means of blind source separation (BSS) and choosing the component with the most prominent peak in the heart rate bandwidth.
  Independent component analysis (ICA) is the most popular BSS technique for iPPG computation (Holton et al., 2013; Wang et al., 2017). Here we use the JADE algorithm of ICA by Cardoso (1999) as suggested in (Poh et al., 2010, 2011)⁵.

• The CHROM method (de Haan and Jeanne, 2013) employs a model of PPG-induced variations in color intensity and defines the iPPG signal as
$$\mathrm{iPPG}^0(t) = x_1(t) - \frac{\sigma_1(t, L)}{\sigma_2(t, L)}\, x_2(t), \tag{3}$$
where $\sigma_1(t, L)$, $\sigma_2(t, L)$ are $L$-point running standard deviations of $x_1(t) = 0.77\, r(t) - 0.51\, g(t)$ and $x_2(t) = 0.77\, r(t) + 0.51\, g(t) - 0.77\, b(t)$, respectively:
$$\sigma_i(t, L) = \sqrt{ \frac{1}{L-1} \sum_{k=0}^{L-1} x_i(t - k)^2 - \frac{1}{L(L-1)} \left( \sum_{k=0}^{L-1} x_i(t - k) \right)^{2} } \tag{4}$$
for $i = 1, 2$. We follow (de Haan and Jeanne, 2013) in taking $L$ corresponding to 1.6 s.

• The recently proposed POS method (Wang et al., 2017) can be considered as an improved and simplified version of CHROM:
$$\mathrm{iPPG}^0(t) = x_1(t) + \frac{\sigma_1(t, L)}{\sigma_2(t, L)}\, x_2(t),$$
where $\sigma_1(t, L)$ and $\sigma_2(t, L)$ are $L$-point running standard deviations (4) of $x_1(t) = g(t) - b(t)$ and $x_2(t) = g(t) + b(t) - 2 r(t)$, respectively. We take $L$ corresponding to 1.6 s as suggested in (Wang et al., 2017).

In order to make CHROM and POS compliant with our generic framework, we introduce minor algorithmic changes not affecting the nature of the methods. Namely, we use running means and standard deviations instead of computing the iPPG signal in overlapped windows. For the considered dataset the performance of our modified versions is slightly better than that of the original methods.

Note that special pre-processing is required for some iPPG extraction methods, namely MCaS detrending for GRD, CHROM and POS and band-pass filtering for aGRD.
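The running-window POS variant described above, together with the MCaS detrending it requires, can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the code used for the reported results: the handling of the first samples (shorter windows), the fallback when $\sigma_2$ is zero, all function names and the synthetic test signal are assumptions.

```python
import math
from typing import List, Sequence


def running_mean(x: Sequence[float], L: int) -> List[float]:
    """Causal L-point running mean; windows are shorter near the start (assumption)."""
    out = []
    for t in range(len(x)):
        window = x[max(0, t - L + 1): t + 1]
        out.append(sum(window) / len(window))
    return out


def running_std(x: Sequence[float], L: int) -> List[float]:
    """Causal L-point running standard deviation following Eq. (4)."""
    out = []
    for t in range(len(x)):
        window = x[max(0, t - L + 1): t + 1]
        n = len(window)
        if n < 2:  # Eq. (4) needs at least two samples
            out.append(0.0)
            continue
        s = sum(window)
        s2 = sum(v * v for v in window)
        var = s2 / (n - 1) - s * s / (n * (n - 1))
        out.append(math.sqrt(max(var, 0.0)))  # clip tiny negative round-off
    return out


def mcas(x0: Sequence[float], L: int) -> List[float]:
    """Mean-centering and scaling (MCaS) of one raw color channel."""
    return [(v - m) / m for v, m in zip(x0, running_mean(x0, L))]


def pos_ippg(r0: Sequence[float], g0: Sequence[float], b0: Sequence[float],
             fs: float = 50.0) -> List[float]:
    """Raw iPPG by the running-window POS variant."""
    L_mcas = int(round(1.0 * fs))  # 1 s window for MCaS
    L_std = int(round(1.6 * fs))   # 1.6 s window for the running std
    r, g, b = (mcas(ch, L_mcas) for ch in (r0, g0, b0))
    x1 = [gv - bv for gv, bv in zip(g, b)]
    x2 = [gv + bv - 2.0 * rv for rv, gv, bv in zip(r, g, b)]
    s1, s2 = running_std(x1, L_std), running_std(x2, L_std)
    # Where sigma_2 is zero the ratio is undefined; 0 is used there (assumption).
    return [a + (sa / sb if sb > 0.0 else 0.0) * c
            for a, c, sa, sb in zip(x1, x2, s1, s2)]


# Synthetic 4 s example at 50 Hz: a 72 BPM pulse with channel-specific amplitudes.
fs = 50.0
pulse = [math.sin(2.0 * math.pi * 1.2 * t / fs) for t in range(200)]
r0 = [100.0 * (1.0 + 0.003 * p) for p in pulse]
g0 = [120.0 * (1.0 + 0.008 * p) for p in pulse]
b0 = [90.0 * (1.0 + 0.005 * p) for p in pulse]
ippg0 = pos_ippg(r0, g0, b0, fs)
```

Replacing the two projections by $x_1 = 0.77 r - 0.51 g$ and $x_2 = 0.77 r + 0.51 g - 0.77 b$ and flipping the sign of the ratio term gives the corresponding CHROM sketch of Eq. (3).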
When testing the effect of pre-processing (see Figure 2, Step 2) we always use the required pre-processing with these iPPG extraction methods.

⁵ ICA-based iPPG extraction incorporates specific pre-processing (subtracting the mean and dividing by the standard deviation of each channel (McDuff et al., 2014a); see (de Haan and Van Leest, 2014) for the criticism of this approach) and post-processing (inverting iPPG if it was flipped during ICA (McDuff et al., 2014a)). We use these methods as a part of ICA but do not describe them separately due to their limited applicability for non-ICA-based iPPG extraction.

2.2.4 Post-processing of Imaging Photoplethysmogram

Post-processing (Figure 2, Step 4) improves the quality of the iPPG signal and is especially necessary if noise and artifacts were not removed at pre-processing (Step 2) or if iPPG was extracted at Step 3 in a non-linear fashion (which is the case for aGRD, ICA, CHROM and POS). Here we consider three typical sub-steps of post-processing: band-pass, MA and adaptive band-pass (ABP) filtering.

Band-pass and MA filtering described in Subsection 2.2.2 as pre-processing sub-steps can also be used at post-processing (Poh et al., 2010, 2011); this results in a different iPPG signal for all considered methods of iPPG extraction except for the linear G and GRD methods.

ABP filtering assumes that frequency components of the iPPG signal pertaining to pulse rate have relatively high power; then weak components correspond to noise and should be suppressed (Hülsbusch, 2008; Bousefsaf et al., 2013; Wang et al., 2015; Feng et al., 2015). Here we use a two-step wavelet filtering suggested in (Bousefsaf et al., 2016)⁶. Modifications of the iPPG signal provided by MA, band-pass and wavelet filtering are shown in Figure 3.
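The constraint on the moving-average length from Subsection 2.2.2, which applies equally when MA filtering is used at post-processing, can be checked numerically. The sketch below is a hedged illustration: the helper names are hypothetical and the start-up behaviour of the filter (shorter windows for the first samples) is an assumption. It computes the largest $M$ whose first suppressed frequency $F_{SR}/M$ stays above 4 Hz and confirms that a sinusoid at that frequency is cancelled.

```python
import math
from typing import List, Sequence


def moving_average(x0: Sequence[float], M: int) -> List[float]:
    """Causal M-point moving average; windows are shorter near the start (assumption)."""
    out = []
    for t in range(len(x0)):
        window = x0[max(0, t - M + 1): t + 1]
        out.append(sum(window) / len(window))
    return out


def ma_null_frequencies(M: int, fs: float) -> List[float]:
    """Suppressed frequencies n * fs / M (n = 1, 2, ...) up to the Nyquist frequency."""
    return [n * fs / M for n in range(1, M // 2 + 1)]


def max_ma_length(fs: float, f_max: float = 4.0) -> int:
    """Largest M whose first null fs / M stays strictly above f_max."""
    M = 1
    while fs / (M + 1) > f_max:
        M += 1
    return M


fs = 50.0
M = max_ma_length(fs)                       # 12 for fs = 50 Hz
first_null = ma_null_frequencies(M, fs)[0]  # about 4.17 Hz, just above 4 Hz

# A sinusoid at the first null is cancelled once the window is full.
tone = [math.sin(2.0 * math.pi * first_null * t / fs) for t in range(100)]
residual = max(abs(v) for v in moving_average(tone, M)[M - 1:])
```

For a 50 Hz signal this reproduces the limit $M \le 12$; with $M = 13$ the first null would fall at about 3.85 Hz, inside the heart rate bandwidth.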
Figure 3: Effect of various post-processing methods on iPPG extracted from P1 T24 data using POS. [Traces: raw iPPG; MA filtered iPPG; MA and band-pass filtered iPPG; MA, band-pass and wavelet filtered iPPG. Horizontal axis: time (s).]

2.2.5 Estimation of Pulse Rate

We consider here the four most popular methods of pulse rate estimation (Figure 2, Step 5).

• Interbeat interval (IBI) estimation is the most direct way to assess pulse rate; however, this approach is rarely used for iPPG since precise IBI estimation is often problematic (Schäfer and Vagedes, 2013; Elgendi et al., 2013; Kamshilin et al., 2016). IBI corresponds to a cardiac cycle; thus momentary pulse rate is equal to the inverse IBI duration. IBI is usually defined for iPPG as the time between successive systolic peaks (Schäfer and Vagedes, 2013) using some method of peak detection; here we employ the method from Elgendi et al. (2013) with modifications described in Subsection 2.1, see Figure 4 for an illustration. For accurate IBI estimation we increase the sampling rate of the iPPG signal from 50 to 250 Hz using cubic spline interpolation as suggested in (Takano and Ohta, 2007).

• Another approach is to assess average pulse rate as the frequency corresponding to maximal power spectral density (PSD). By computing PSD over $N$ points one estimates the average pulse rate value over the time interval $\tau = N / F_{SR}$, where $F_{SR}$ is the sampling rate of the iPPG signal ($F_{SR} = 50$ Hz for DEAP). PSD is usually estimated by Discrete Fourier Transform (DFT) or by autoregressive (AR) modeling.

  DFT is a direct way to estimate PSD (Poh et al., 2011; de Haan and Jeanne, 2013). Yet, DFT is often criticized (Hülsbusch and Blazek, 2002; Holton et al., 2013) since its frequency resolution is $60/\tau$ BPM ($1/\tau$ Hz), which leads to a crude estimation of pulse rate for $\tau < 20$ s, while taking $\tau > 20$ s hinders tracking of pulse rate variations.
  Here we use $N = 1024$, which results in averaging pulse rate over $\tau = 20.48$ s.

  AR modeling considers iPPG as an output of a linear system with added white noise (Takano and Ohta, 2007; Tarassenko et al., 2014); parameters of this system are estimated to compute PSD. In comparison with DFT, AR modeling yields improved resolution for short samples. We implement AR modeling using Burg's method (Matlab function pburg) and employ models either of 23rd order (for the iPPG signal with wavelet filtering at Step 4) or 34th order (without wavelet filtering) as these settings provide the best pulse rate estimation (we have tested orders 5, ..., 80).

• Continuous Wavelet Transform (CWT) provides a promising alternative to DFT and AR modeling (Hülsbusch and Blazek, 2002). We implement CWT using the Matlab function cwtft; we take the Morlet wavelet (Hülsbusch, 2008; Bousefsaf et al., 2013) and scales⁷ corresponding to 0.325–25 Hz with factor $2^{0.03125}$.

⁶ First we perform the continuous wavelet transform of iPPG and filter the wavelet coefficients with a wide Gaussian window centered at the scale corresponding to the maximum of squared wavelet coefficients averaged over a 15 s temporal running window. Then we apply the usual Gaussian filter. The filtered signal is reconstructed by performing the inverse continuous wavelet transform. See Bousefsaf et al. (2016) for details.

Figure 4: Contact PPG and iPPG signal (extracted using POS and post-processed by MA, band-pass and wavelet filtering) for P1 T24; red circles indicate diastolic minima for PPG and systolic peaks for iPPG detected using the algorithm from Elgendi et al. (2013) with modifications described in Subsection 2.1. Note that for the contact PPG signal interbeat intervals are estimated from diastolic minima since they are clearer and more prominent than peaks. [Traces: contact PPG; iPPG (POS). Horizontal axis: Time [s].]
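To make the resolution of the DFT-based estimate concrete, the following sketch picks the DFT bin with maximal power inside the heart rate bandwidth for one 1024-sample epoch. It is a plain-Python illustration under stated assumptions, not the code behind the reported results: the function name, the direct (non-FFT) evaluation of the transform, the mean removal and the synthetic 72 BPM sinusoid are all choices made for the example.

```python
import cmath
import math
from typing import Sequence


def dft_pulse_rate(ippg: Sequence[float], fs: float = 50.0,
                   lo_hz: float = 0.65, hi_hz: float = 4.0) -> float:
    """Pulse rate (BPM) at the DFT bin with maximal power inside [lo_hz, hi_hz]."""
    n = len(ippg)
    mean = sum(ippg) / n
    best_bpm, best_power = float("nan"), -1.0
    for k in range(1, n // 2 + 1):
        f = k * fs / n  # frequency of bin k in Hz
        if f < lo_hz or f > hi_hz:
            continue
        coeff = sum((x - mean) * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(ippg))
        power = abs(coeff) ** 2
        if power > best_power:
            best_bpm, best_power = 60.0 * f, power
    return best_bpm


fs, n = 50.0, 1024              # one epoch of n / fs = 20.48 s
resolution_bpm = 60.0 * fs / n  # spacing between DFT bins, about 2.93 BPM
true_bpm = 72.0
signal = [math.sin(2.0 * math.pi * (true_bpm / 60.0) * t / fs) for t in range(n)]
estimate = dft_pulse_rate(signal, fs)  # snaps to the nearest DFT bin
```

Because the estimate can only take values on the bin grid, even a noise-free sinusoid is recovered only to within about half the bin spacing, which is the crudeness referred to above for short windows.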
Since DFT and AR modeling estimate only average pulse rate, in order to make all estimates comparable, we average pulse rate estimates for IBI and CWT in windows of $\tau = 20.48$ s.

Note that methods of pulse rate estimation have been recently compared in (Cui et al., 2015) for iPPG extracted using the G method, but in that study CWT and AR modeling were not considered, while DFT was used either with long windows of 30 s, resulting in low time resolution, or with short windows of 2 s, providing very low frequency resolution.

2.3 Metrics

To investigate the quality of pulse rate estimation, we split each trial (see Section 2.1) into epochs of 20.48 s with 9.88 s (approximately 50%) overlap and get five epochs per trial. For each epoch $i$ we compare the estimated average pulse rate $\mathrm{PR}_i$ with the averaged reference value $\mathrm{PR}^{\mathrm{ref}}_i$. The following quantities are used to assess estimation performance for the epochs of each participant.

Mean absolute error (MAE) is given by
$$\mathrm{MAE} = \frac{1}{N} \sum_{i} \big| \mathrm{PR}_i - \mathrm{PR}^{\mathrm{ref}}_i \big|, \tag{5}$$
where $N$ = 5 epochs per trial × number of trials and $i$ is the epoch index. MAE ≈ 3 BPM was observed in (Tarassenko et al., 2014) for epochs comprising 4 heart beats (approximately 4 s) and MAE ≈ 2.5 BPM on average in (Lewandowska et al., 2011) for 30 s epochs.

⁷ We choose these scales to have a sufficiently good coverage of the human heart rate bandwidth (0.65–4 Hz): 0.325 Hz is half the minimal pulse rate, while 25 Hz is half of the iPPG signal sampling rate. Factor $2^{0.03125}$ provides 32 scales per octave.

Root-mean-square error (RMSE) is given by
$$\mathrm{RMSE} = \sqrt{ \frac{1}{N} \sum_{i} \big( \mathrm{PR}_i - \mathrm{PR}^{\mathrm{ref}}_i \big)^2 }. \tag{6}$$
RMSE is more sensitive to large estimation errors than MAE, so a small number of large errors results in high RMSE and low MAE.
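The different sensitivity of the two error measures is easy to see on a toy example. The sketch below uses the standard definitions of MAE and RMSE; the pulse rate values are invented for illustration and are not taken from DEAP.

```python
import math
from typing import Sequence


def mae(est: Sequence[float], ref: Sequence[float]) -> float:
    """Mean absolute error over epochs, Eq. (5)."""
    return sum(abs(e - r) for e, r in zip(est, ref)) / len(ref)


def rmse(est: Sequence[float], ref: Sequence[float]) -> float:
    """Root-mean-square error over epochs (standard definition)."""
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(est, ref)) / len(ref))


ref = [70.0] * 10                 # hypothetical reference pulse rates, BPM
uniform = [71.0] * 10             # every epoch is off by 1 BPM
one_outlier = [70.0] * 9 + [80.0]  # a single epoch is off by 10 BPM

mae_uniform, rmse_uniform = mae(uniform, ref), rmse(uniform, ref)
mae_outlier, rmse_outlier = mae(one_outlier, ref), rmse(one_outlier, ref)
```

Both error patterns have the same MAE of 1 BPM, but the single 10 BPM outlier raises RMSE from 1 BPM to about 3.16 BPM.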
Pulse rate estimates from uncompressed video of stationary subjects usually have RMSE in the range of 1–2 BPM for epochs of 30–60 s (Poh et al., 2011; Li et al., 2014; Bousefsaf et al., 2016)⁸.

Percentage of epochs (PE) for which pulse rate is estimated with error below 3.5 BPM⁹ is given by
$$\mathrm{PE}_{3.5} = \frac{1}{N} \Big| \big\{ i : \big| \mathrm{PR}_i - \mathrm{PR}^{\mathrm{ref}}_i \big| < 3.5\ \text{BPM} \big\} \Big|. \tag{7}$$

We also assess the quality of the iPPG signal by the signal-to-noise ratio (SNR) defined as (de Haan and Jeanne, 2013):
$$\mathrm{SNR} = \frac{1}{N} \sum_{i} 10 \log_{10} \frac{ \sum_{f = 40\ \text{BPM}}^{240\ \text{BPM}} \big( \hat{S}_i(f) \big)^2 U_i(f) }{ \sum_{f = 40\ \text{BPM}}^{240\ \text{BPM}} \big( \hat{S}_i(f) \big)^2 \big( 1 - U_i(f) \big) }, \tag{8}$$
where $\hat{S}_i(f)$ is the spectrum of the $i$-th iPPG epoch computed by using DFT and $U_i(f)$ indicates whether frequency component $f$ is attributed to the signal ($U_i(f) = 1$) or to noise ($U_i(f) = 0$):
$$U_i(f) = \begin{cases} 1, & \text{if } \big| f - \mathrm{PR}^{\mathrm{ref}}_i \big| \le \Delta f \text{ or if } \big| f - 2\,\mathrm{PR}^{\mathrm{ref}}_i \big| \le 2 \Delta f, \\ 0, & \text{otherwise.} \end{cases}$$
In order to make results comparable with those in (Wang et al., 2017) we take $\Delta f = \frac{50 \cdot 60}{1024} \approx 2.93$ BPM.

3 Results and Discussion

3.1 Overview

In Table 1 we present quality metrics for pulse rate estimates and iPPG extraction methods under the best pre- and post-processing (MCaS detrending is beneficial for all methods; other pre- and post-processing methods providing the best results are summarized in Table 2). In all cases the best results are obtained for the whole face ROI with skin selection and outlier rejection (see Figure 2, Step 1). The lowest estimation errors are achieved when using POS for iPPG extraction and CWT for pulse rate estimation. Altogether, values of quality metrics are comparable with those reported in the literature for pulse rate estimation from uncompressed video (see Subsection 2.3).

Below we discuss the influence of various steps on the pulse rate estimation quality.
We begin with methods for iPPG extraction (see Figure 2, Step 3), since results for different methods vary considerably (Subsection 3.2). We proceed with ROI selection (Step 1), pre- and post-processing (Steps 2, 4) and finish with pulse rate estimation (Step 5) in Subsections 3.3–3.5, respectively.

⁸ RMSE < 1 BPM observed in (de Haan and Jeanne, 2013) is obtained for video recorded under dedicated professional illumination, which makes results incomparable with those for DEAP. On the other hand, RMSE > 7.6 BPM reported in (Li et al., 2014) for the MAHNOB dataset is too high and indicates limited usefulness of this dataset for iPPG-based pulse rate estimation.

⁹ In (Holton et al., 2013) the best method estimates pulse rate with error below 6 BPM for PE₆ = 87% of epochs. Here we are interested in the percentage of epochs for which pulse rate is estimated well; precision of 6 BPM seems insufficient for this, so we bound the error by 3.5 BPM (5% of the average human pulse rate of 70 BPM).

Table 1: Quality metrics (averaged over all epochs and participants) for pulse rate estimates computed from iPPG with the best pre- and post-processing (Table 2). Values in each cell stand for MAE (BPM) / RMSE (BPM) / PE₃.₅ (%); best values of metrics for each iPPG extraction method are shown in bold.
| iPPG extraction | IBI | DFT | AR | CWT |
|---|---|---|---|---|
| G | 5.91 / 7.42 / 44 | 6.54 / 9.62 / 54 | 5.62 / **6.78** / 44 | **5.35** / 7.62 / **58** |
| GRD | 4.05 / 5.30 / 60 | 3.82 / 6.36 / 73 | 3.99 / 5.11 / 60 | **3.07** / **4.96** / **78** |
| aGRD | 4.41 / 5.70 / 57 | 4.41 / 7.07 / 70 | 4.28 / **5.45** / 58 | **3.59** / 5.55 / **74** |
| ICA | 3.91 / 5.41 / 64 | 3.61 / 6.03 / 75 | 3.71 / 4.87 / 63 | **2.94** / **4.77** / **79** |
| CHROM | 3.46 / 4.64 / 65 | 2.70 / 4.70 / 81 | 3.05 / 4.04 / 69 | **2.08** / **3.46** / **86** |
| POS | 3.13 / 4.30 / 70 | 2.61 / 4.50 / 81 | 2.91 / 3.80 / 71 | **1.99** / **3.25** / **87** |

(Columns IBI, DFT, AR and CWT are the pulse rate estimation methods.)

Table 2: Pre- and post-processing providing the most precise pulse rate estimates (processing for IBI and DFT also results in the highest SNR); IIR and FIR stand for the corresponding band-pass filtering, MAx for moving average filtering with length x, WF for wavelet filtering.

| iPPG extraction | Pre-processing: IBI, DFT, CWT | Pre-processing: AR | Post-processing: IBI, DFT | Post-processing: CWT | Post-processing: AR |
|---|---|---|---|---|---|
| G | – | – | MA12, FIR, WF | MA12, FIR, WF | MA12, FIR, WF |
| GRD | – | – | MA12, FIR, WF | MA12, FIR, WF | MA12, IIR, WF |
| aGRD | FIR | IIR | MA12, WF | MA12, WF | MA12, WF |
| ICA | FIR, MA12 | IIR, MA12 | WF | WF | WF |
| CHROM | – | – | MA12, FIR, WF | MA9, FIR, WF | MA12, WF |
| POS | – | – | MA12, FIR, WF | MA9, FIR | MA12, WF |

3.2 Step 3: Imaging photoplethysmogram extraction

In Table 3 we present performance metrics for all considered iPPG extraction methods. We use pre- and post-processing (Steps 2, 4) ensuring the best SNR (see Table 2) and estimate pulse rate by CWT since this method provides the best results at Step 5.

Table 3: Average SNR and quality metrics for CWT pulse rate estimates from iPPG extracted using pre- and post-processing settings providing the best SNR. In each cell we present values with and without wavelet filtering. For comparison we include overall SNR values from (Wang et al., 2017). Best values for each quality metric are shown in bold.

| iPPG extraction | MAE, BPM | RMSE, BPM | PE₃.₅, % | SNR, dB | reported SNR, dB |
|---|---|---|---|---|---|
| G | 5.40 / 5.60 | 7.70 / 8.20 | 58 / 58 | -3.24 / -4.16 | -1.90 |
| GRD | 3.15 / 3.11 | 5.00 / 5.08 | 77 / 78 | -0.89 / -2.14 | 3.67 |
| aGRD | 3.59 / 3.63 | 5.55 / 5.75 | 74 / 74 | -1.26 / -2.45 | was not considered |
| ICA | 2.98 / 3.02 | 4.88 / 5.11 | 79 / 80 | -0.60 / -1.93 | 1.92 |
| CHROM | 2.10 / 2.12 | 3.39 / 3.56 | 85 / 86 | -0.20 / -1.67 | 3.86 |
| POS | 2.04 / **2.01** | **3.16** / 3.17 | **87** / **87** | **0.30** / -1.19 | 5.16 |

As Table 3 shows, POS has the highest signal-to-noise ratio and provides the most precise pulse rate estimation. The ranking of methods is generally in line with results reported in (Wang et al., 2017), except for GRD performing worse than ICA. We explain this difference by the sensitivity of ICA to the number of source components in the signal; light variation and motion in (Wang et al., 2017) introduce additional components to the color signals and may complicate extraction of pure iPPG by means of ICA.

Note that the average SNR for the DEAP dataset is worse than the values reported in (Wang et al., 2017). This might be due to the compression of videos in DEAP and to the use of professional dedicated lighting for video acquisition in (Wang et al., 2017).

Figure 5 shows average values of SNR and MAE for every participant for the three best iPPG extraction methods (POS, CHROM and ICA). In most cases high SNR corresponds to low MAE, which (as expected) indicates that good quality of iPPG ensures precise pulse rate estimation.

Figure 5: SNR (a) and MAE (b) obtained by ICA, CHROM and POS methods for iPPG extraction in combination with the best pre- and post-processing settings (Table 2) and CWT pulse rate estimation. [Panel (a): Signal-to-Noise Ratio [dB]; panel (b): Mean Average Error [BPM]. Horizontal axis: Number of Participant (1–22). Series: ICA, CHROM, POS.]

3.3 Step 1: ROI Selection

ROI choice.
Results for the whole face region are better than for the region below eyes, both in terms of iPPG quality and pulse rate estimation. Namely, SNR for signal acquired from the whole face region is higher by 0.2–0.3 dB for GRD and aGRD and by 1.1–1.9 dB for the other methods, while MAE of pulse rate estimation is lower (the reduction varies from 1% for GRD and aGRD to more than 10% for other methods).

ROI refinement. Outliers rejection is always beneficial, as it increases SNR of iPPG signal (0.25–0.5 dB) and decreases MAE (10%–15% when CWT or DFT pulse rate estimation is used in combination with GRD, aGRD, CHROM or POS methods and 5%–10% otherwise).

3.4 Steps 2 and 4: Pre-processing and Post-processing

Detrending. Among two detrending methods, MCaS always improves pulse rate estimation while SPA does not provide any positive effect (probably raw color signals in our study are too noisy for successful application of this technique). For G, aGRD and ICA methods of iPPG extraction (Step 3), using MCaS increases SNR (average increase is 1.6, 0.7 and 3 dB, respectively) and improves pulse rate estimation (decrease of MAE is above 12%). For CHROM, POS and GRD methods MCaS is an inherent part of pre-processing, therefore performance without MCaS was not tested.

MA filtering. Quality of iPPG signal and of pulse rate estimation improves as MA filter length M increases and reaches maximum for M = 12 (MA filtering with M > 12 affects heart rate bandwidth and was not tested). The only exception is pulse rate estimation by CWT (Step 5) from iPPG obtained by CHROM and POS methods: in this case best results are observed for M = 9. Figure 6 illustrates this effect for MAE; the effect on other quality metrics is similar.
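
As a rough illustration of why long MA windows start to affect the heart rate band, the following sketch evaluates the magnitude response of an M-point moving average. The sampling rate of 50 Hz and the helper names moving_average and ma_gain are assumptions made for this example only; they are not taken from the study.

```python
import numpy as np

def moving_average(signal, length):
    """Smooth a 1-D signal with a moving average of `length` samples.

    Only fully overlapping windows are returned (mode="valid"),
    so the output is `length - 1` samples shorter than the input.
    """
    kernel = np.ones(length) / length
    return np.convolve(signal, kernel, mode="valid")

def ma_gain(freq_hz, length, fs):
    """Magnitude response of a `length`-point moving average at freq_hz."""
    omega = 2.0 * np.pi * np.atleast_1d(np.asarray(freq_hz, dtype=float)) / fs
    taps = np.arange(length)
    # Frequency response is the mean of the complex exponentials over the taps.
    response = np.exp(-1j * np.outer(omega, taps)).mean(axis=1)
    return np.abs(response)

fs = 50.0  # assumed sampling rate in Hz (illustrative value)
for length in (9, 12):
    first_null_hz = fs / length  # the M-point average fully suppresses fs / M
    gains = ma_gain([1.0, 2.0, 3.0], length, fs)
    print(length, round(first_null_hz, 2), np.round(gains, 2))
```

Under the assumed sampling rate, the first spectral null of the filter lies at fs / M, which is about 4.2 Hz (250 BPM) for M = 12; longer windows move this null, and the attenuation preceding it, further into the range of plausible pulse rates.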
[Figure 6 plot: MAE relative to no filtering versus Length of Moving Average (2–12); curves for ICA, CHROM and POS.]

Figure 6: Average MAE values reflecting the influence of M-point MA filtering on pulse rate estimation using CWT (pre- and post-processing according to Table 2).

Band-pass filtering improves quality of iPPG signal and, in most cases, performance of pulse rate estimation, see Table 4. Surprisingly, band-pass filtering has little positive or even negative effect on pulse rate estimation by AR modeling; we cannot explain this result. The 255th order FIR filter performs slightly better than the 5th order IIR Butterworth filter; this was expected since frequency response of the latter is slightly worse.

Table 4: Effect of band-pass filtering on SNR and MAE estimated from iPPG with pre- and post-processing according to Table 2. In each cell the first value is for the 5th order IIR Butterworth filter and the second for the 255th order FIR filter.

iPPG          decrease of MAE relative to the value without band-pass filtering, %    SNR increase,
extraction    IBI        DFT        AR         CWT                                    dB
G             25 / 21    14 / 20    10 / 10    22 / 31                                0.38 / 0.66
GRD           22 / 13    13 / 20    6 / 5      22 / 31                                0.29 / 0.55
aGRD          22 / 17    15 / 19    7 / 6      21 / 29                                0.31 / 0.57
ICA           19 / 30    7 / 11     3 / 0      14 / 21                                0.22 / 0.42
CHROM         13 / 26    6 / 11     -1 / -6    11 / 16                                0.19 / 0.38
POS           15 / 27    9 / 14     -1 / -6    14 / 20                                0.21 / 0.39

Wavelet filtering results in elimination of large errors in pulse rate estimation which is reflected by prominent decrease of RMSE (see Table 3). However, wavelet filtering only slightly improves MAE and almost does not change PE3.5. Parameter choice for the wavelet filter deserves a separate study: preserving several harmonics of pulse rate is of interest to keep the shape of iPPG signal.

Pre- vs post-processing.
Band-pass and MA filtering are preferable at pre-processing (Step 2) when ICA or aGRD are used at Step 3 and at post-processing (Step 4) for other methods of iPPG extraction, see Table 2. For ICA, POS and CHROM using filtering at a different step considerably decreases quality of iPPG and of pulse rate estimation. This is quite unexpected since originally POS and CHROM were proposed with band-pass filtering as pre-processing (de Haan and Jeanne, 2013; Wang et al., 2017), while for ICA post-processing was recommended (McDuff et al., 2014a). For aGRD band-pass filter is essentially a pre-processing sub-step and we do not observe any difference between using MA filter as pre- and post-processing.

3.5 Step 5: Pulse Rate Estimation

The best results in terms of all metrics are provided by CWT. This method is especially useful since it allows estimating not only average but also momentary pulse rate (see Figure 7a). Other tested methods for pulse rate estimation have certain drawbacks. DFT provides the second best result in terms of MAE and PE3.5 (Table 1), but it has low frequency resolution (see Figure 7b) and highest RMSE. IBI estimation (Figure 7c) has the lowest overall performance, which can be explained by the insufficient quality of iPPG extracted from compressed DEAP videos. IBI filtering techniques (McDuff et al., 2014a) may improve precision of IBI estimation. Finally, AR modeling requires an elaborate choice of model order, as using orders different from those selected in Subsection 2.2.5 resulted in considerably worse pulse rate estimation.
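
The limited frequency resolution of DFT-based estimation can be illustrated with a minimal sketch that takes the pulse rate from the largest spectral peak. The function name dft_pulse_rate, the 0.7–4 Hz search band, the Hann window, the 50 Hz sampling rate and the 20-s analysis window are all assumptions chosen for illustration; they are not the settings used in this study.

```python
import numpy as np

def dft_pulse_rate(ippg, fs, band_hz=(0.7, 4.0)):
    """Return the pulse rate in BPM at the largest DFT peak inside band_hz."""
    x = np.asarray(ippg, dtype=float)
    x = x - x.mean()                      # remove the constant offset
    window = np.hanning(len(x))           # reduce spectral leakage
    spectrum = np.abs(np.fft.rfft(x * window))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    in_band = (freqs >= band_hz[0]) & (freqs <= band_hz[1])
    peak_hz = freqs[in_band][np.argmax(spectrum[in_band])]
    return 60.0 * peak_hz

fs = 50.0                                 # assumed sampling rate in Hz
t = np.arange(1000) / fs                  # assumed 20-s analysis window
true_rate_bpm = 73.8                      # synthetic test signal at 1.23 Hz
signal = np.sin(2.0 * np.pi * (true_rate_bpm / 60.0) * t)

estimate = dft_pulse_rate(signal, fs)
resolution_bpm = 60.0 * fs / len(t)       # spacing of DFT bins in BPM
print(estimate, resolution_bpm)
```

In this sketch the DFT bins are spaced fs / N Hz apart, that is 3 BPM for a 20-s window, so without interpolation or zero-padding the estimate can only fall on this grid; a shorter window, which is needed to follow momentary changes, makes the grid coarser.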
[Figure 7 plots: (a) and (c) Pulse Rate [BPM] versus Time [s], comparing HR from contact PPG with HR from iPPG (CWT) and HR from iPPG (IBI), respectively; (b) and (d) spectrograms with axes Pulse Rate [BPM] versus Time [s].]

Figure 7: Performance of pulse rate estimation methods for iPPG signal extracted using POS method from data P1 T24: momentary pulse rate estimated by CWT (a, smoothed by 1-s moving average) and IBI (c), spectrograms of iPPG signal estimated using DFT (b) and by AR modeling (d).

4 Conclusion

Let us summarize the main results of this work. We have established a generic framework for iPPG-based pulse rate estimation. Using this framework we have compared various methods of iPPG analysis for compressed video from DEAP dataset; best pulse rate estimation is obtained when using the following methods.

Step 1, ROI selection: whole face ROI with skin selection and outliers rejection.
Step 2, pre-processing: mean-centering and scaling; moving average filtering (for ICA) with filter length M close to (1/4) F_SR, where F_SR is sampling rate in Hz; band-pass filtering (for ICA and aGRD) with 255th order FIR filter.
Step 3, iPPG extraction: POS; results for CHROM and ICA are also relatively good.
Step 4, post-processing: moving average and band-pass filtering (if not used at pre-processing), wavelet filtering.
Step 5, pulse rate estimation: Continuous Wavelet Transform.

Let us finish with two problems that may become interesting topics for further research.

• Here we have considered only pulse rate estimation, but one can also use DEAP dataset to investigate estimation of pulse rate variability (McDuff et al., 2014b) and respiratory rate (Tarassenko et al., 2014) from iPPG.
• Up to now iPPG has been used only for human subjects.
However, iPPG acquisition does not seem unfeasible for non-human animals with bare face, for instance, for primates (Changizi et al., 2006). Using iPPG for pulse rate estimation can be beneficial for animal research, where using contact measurements is often undesirable.

Acknowledgements

The author acknowledges funding from the Ministry for Science and Education of Lower Saxony and the Volkswagen Foundation through the program “Niedersächsisches Vorab”. Additional support was provided by the Leibniz Association through funding for the Leibniz ScienceCampus Primate Cognition. The author is grateful to MPI for Dynamics and Self-Organization for the support during the work at the manuscript. The author thanks Dr. S. Möller, Dr. I. Kagan and Prof. F. Wolf for useful discussions and remarks on the manuscript.

References

Allen, J. (2007). Photoplethysmography and its application in clinical physiological measurement, Physiological measurement 28(3): R1.

Bousefsaf, F., Maaoui, C. and Pruski, A. (2013). Continuous wavelet filtering on webcam photoplethysmographic signals to remotely assess the instantaneous heart rate, Biomedical Signal Processing and Control 8(6): 568–574.

Bousefsaf, F., Maaoui, C. and Pruski, A. (2016). Peripheral vasomotor activity assessment using a continuous wavelet analysis on webcam photoplethysmographic signals, Bio-Medical Materials and Engineering 27(5): 527–538.

Cardoso, J.-F. (1999). High-order contrasts for independent component analysis, Neural computation 11(1): 157–192.

Changizi, M., Zhang, Q. and Shimojo, S. (2006). Bare skin, blood and the evolution of primate colour vision, Biology letters 2(2): 217–221.

Cui, Y., Fu, C.-H., Hong, H., Zhang, Y. and Shu, F. (2015). Non-contact time varying heart rate monitoring in exercise by video camera, IEEE International Conference on Wireless Communications & Signal Processing (WCSP), pp. 1–5.

de Haan, G. and Jeanne, V. (2013). Robust pulse rate from chrominance-based rPPG, IEEE Transactions on Biomedical Engineering 60(10): 2878–2886.

de Haan, G. and Van Leest, A. (2014). Improved motion robustness of remote-PPG by using the blood volume pulse signature, Physiological measurement 35(9): 1913.

Elgendi, M., Norton, I., Brearley, M., Abbott, D. and Schuurmans, D. (2013). Systolic peak detection in acceleration photoplethysmograms measured from emergency responders in tropical conditions, PLoS One 8(10): e76585.

Feng, L., Po, L.-M., Xu, X., Li, Y. and Ma, R. (2015). Motion-resistant remote imaging photoplethysmography based on the optical properties of skin, IEEE Transactions on Circuits and Systems for Video Technology 25(5): 879–891.

Holton, B., Mannapperuma, K., Lesniewski, P. and Thomas, J. (2013). Signal recovery in imaging photoplethysmography, Physiological measurement 34(11): 1499.

Hülsbusch, M. (2008). An image-based functional method for opto-electronic detection of skin-perfusion, PhD thesis, RWTH Aachen (in German).

Hülsbusch, M. and Blazek, V. (2002). Contactless mapping of rhythmical phenomena in tissue perfusion using PPGI, Medical Imaging 2002, International Society for Optics and Photonics, pp. 110–117.

Kamshilin, A., Nippolainen, E., Sidorov, I., Vasilev, P., Erofeev, N., Podolian, N. and Romashko, R. (2015). A new look at the essence of the imaging photoplethysmography, Scientific reports 5.

Kamshilin, A., Sidorov, I., Babayan, L., Volynsky, M., Giniatullin, R. and Mamontov, O. (2016). Accurate measurement of the pulse wave delay with imaging photoplethysmography, Biomedical optics express 7(12): 5138–5147.

Koelstra, S., Muhl, C., Soleymani, M., Lee, J.-S., Yazdani, A., Ebrahimi, T., Pun, T., Nijholt, A. and Patras, I. (2012). DEAP: A database for emotion analysis; using physiological signals, IEEE Transactions on Affective Computing 3(1): 18–31.

Kumar, M., Veeraraghavan, A. and Sabharwal, A. (2015). DistancePPG: Robust non-contact vital signs monitoring using a camera, Biomedical optics express 6(5): 1565–1588.

Lewandowska, M., Rumiński, J., Kocejko, T. and Nowak, J. (2011). Measuring pulse rate with a webcam - a non-contact method for evaluating cardiac activity, Federated Conference on Computer Science and Information Systems (FedCSIS), pp. 405–410.

Li, X., Chen, J., Zhao, G. and Pietikainen, M. (2014). Remote heart rate measurement from face videos under realistic situations, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4264–4271.

Lienhart, R. (2000). Stump-based 20x20 gentle adaboost frontal face detector, https://github.com/opencv/opencv/blob/master/data/haarcascades/haarcascade_frontalface_alt.xml. Accessed: 2017-02-20.

Mannapperuma, K., Holton, B., Lesniewski, P. and Thomas, J. (2014). Performance limits of ICA-based heart rate identification techniques in imaging photoplethysmography, Physiological measurement 36(1): 67.

McDuff, D., Estepp, J., Piasecki, A. and Blackford, E. (2015). A survey of remote optical photoplethysmographic imaging methods, 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), IEEE, pp. 6398–6404.

McDuff, D., Gontarek, S. and Picard, R. (2014a). Improvements in remote cardiopulmonary measurement using a five band digital camera, IEEE Transactions on Biomedical Engineering 61(10): 2593–2601.

McDuff, D., Gontarek, S. and Picard, R. (2014b). Remote measurement of cognitive stress via heart rate variability, 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, IEEE, pp. 2957–2960.

Perry, C. and Watkins, S. (2011). Non-Contact Vital Sign Monitoring via Ultra-Wideband Radar, Infrared Video, and Remote Photoplethysmography: Viable Options for Space Exploration Missions, Technical report.

Poh, M.-Z., McDuff, D. and Picard, R. (2010). Non-contact, automated cardiac pulse measurements using video imaging and blind source separation, Optics express 18(10): 10762–10774.

Poh, M.-Z., McDuff, D. and Picard, R. (2011). Advancements in noncontact, multiparameter physiological measurements using a webcam, IEEE Transactions on Biomedical Engineering 58(1): 7–11.

Rouast, P., Adam, M., Chiong, R., Cornforth, D. and Lux, E. (2016). Remote heart rate measurement using low-cost RGB face video: A technical literature review, Frontiers of Computer Science.

Sahayadhas, A., Sundaraj, K. and Murugappan, M. (2012). Detecting driver drowsiness based on sensors: a review, Sensors 12(12): 16937–16953.

Schäfer, A. and Vagedes, J. (2013). How accurate is pulse rate variability as an estimate of heart rate variability?: A review on studies comparing photoplethysmographic technology with an electrocardiogram, International journal of cardiology 166(1): 15–29.

Smith, S. (1997). The scientist and engineer’s guide to digital signal processing, California Technical Pub. San Diego.

Soleymani, M., Lichtenauer, J., Pun, T. and Pantic, M. (2012). A multimodal database for affect recognition and implicit tagging, IEEE Transactions on Affective Computing 3(1): 42–55.

Sun, Y., Hu, S., Azorin-Peris, V., Kalawsky, R. and Greenwald, S. (2013). Noncontact imaging photoplethysmography to effectively access pulse rate variability, Journal of biomedical optics 18(6): 061205–061205.

Sun, Y. and Thakor, N. (2016). Photoplethysmography revisited: from contact to noncontact, from point to imaging, IEEE Transactions on Biomedical Engineering 63(3): 463–477.

Takano, C. and Ohta, Y. (2007). Heart rate measurement based on a time-lapse image, Medical engineering & physics 29(8): 853–857.

Tamura, T., Maeda, Y., Sekine, M. and Yoshida, M. (2014). Wearable photoplethysmographic sensors – past and present, Electronics 3(2): 282–302.

Tarassenko, L., Villarroel, M., Guazzi, A., Jorge, J., Clifton, D. and Pugh, C. (2014). Non-contact video-based vital sign monitoring using ambient light and auto-regressive models, Physiological measurement 35(5): 807.

Tarvainen, M., Ranta-Aho, P. and Karjalainen, P. (2002). An advanced detrending method with application to HRV analysis, IEEE Transactions on Biomedical Engineering 49(2): 172–175.

Tasli, H., Gudi, A. and den Uyl, M. (2014). Remote PPG based vital sign measurement using adaptive facial regions, IEEE International Conference on Image Processing (ICIP), pp. 1410–1414.

Unakafov, A. and Möller, S. (2017). Non-contact video-based vital sign monitoring using ambient light and auto-regressive models, submitted.

Verkruysse, W., Svaasand, L. and Nelson, J. S. (2008). Remote plethysmographic imaging using ambient light, Optics express 16(26): 21434–21445.

Vezhnevets, V., Sazonov, V. and Andreeva, A. (2003). A survey on pixel-based skin color detection techniques, Proceedings Graphicon, Vol. 3, Moscow, Russia, pp. 85–92.

Viola, P. and Jones, M. (2001). Rapid object detection using a boosted cascade of simple features, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 1, IEEE, pp. I–511.

Wang, W., den Brinker, A., Stuijk, S. and de Haan, G. (2017). Algorithmic principles of remote-PPG, IEEE Transactions on Biomedical Engineering 64(7): 1479–1491.

Wang, W., Stuijk, S. and de Haan, G. (2015). Exploiting spatial redundancy of image sensor for motion robust rPPG, IEEE Transactions on Biomedical Engineering 62(2): 415–425.

Zarit, B., Super, B. and Quek, F. (1999). Comparison of five color models in skin pixel classification, Proceedings of the International Workshop on Recognition, Analysis, and Tracking of Faces and Gestures in Real-Time Systems, IEEE, pp. 58–63.
