Head shadow enhancement with low-frequency beamforming improves sound localization and speech perception for simulated bimodal listeners

Benjamin Dieudonné*, Tom Francart

KU Leuven – University of Leuven, Department of Neurosciences, Experimental Oto-rhino-laryngology, Herestraat 49 bus 721, B-3000 Leuven, Belgium.
benjamin.dieudonne@med.kuleuven.be, tom.francart@med.kuleuven.be

Manuscript accepted by Hearing Research (March 6, 2018): https://doi.org/10.1016/j.heares.2018.03.007

Abstract

Many hearing-impaired listeners struggle to localize sounds due to poor availability of binaural cues. Listeners with a cochlear implant and a contralateral hearing aid – so-called bimodal listeners – are amongst the worst performers, as both interaural time and level differences are poorly transmitted. We present a new method to enhance head shadow in the low frequencies. Head shadow enhancement is achieved with a fixed beamformer with contralateral attenuation in each ear. The method results in interaural level differences which vary monotonically with angle. It also improves low-frequency signal-to-noise ratios in conditions with spatially separated speech and noise. We validated the method in two experiments with acoustic simulations of bimodal listening. In the localization experiment, performance improved from 50.5° to 26.8° root-mean-square error compared with standard omni-directional microphones. In the speech-in-noise experiment, speech was presented from the frontal direction. Speech reception thresholds improved by 15.7 dB SNR when the noise was presented from the cochlear implant side, improved by 7.6 dB SNR when the noise was presented from the hearing aid side, and were not affected when noise was presented from all directions.
Apart from bimodal listeners, the method might also be promising for bilateral cochlear implant or hearing aid users. Its low computational complexity makes the method suitable for application in current clinical devices.

Keywords: head shadow enhancement, enhancement of interaural level differences, sound localization, directional hearing, speech in noise, speech intelligibility

PACS: 43.60.Fg, 43.66.Pn, 43.66.Qp, 43.66.Rq, 43.66.Ts, 43.71.-k, 43.71.Es, 43.71.Ky

*Corresponding author

© 2018. This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/

1 Introduction

Poor perception of binaural cues is a problem for many hearing-impaired listeners, leading to poor sound localization and speech understanding in noisy environments. Listeners with a cochlear implant (CI) and a hearing aid in the non-implanted ear – so-called bimodal CI listeners – are amongst the worst performers, as both interaural time differences (ITDs) and interaural level differences (ILDs) are poorly transmitted (Francart & McDermott, 2013). ITDs are most probably not perceived due to (1) the signal processing in clinical CI sound processors, which neglects most temporal fine structure information, (2) tonotopic mismatch between electric and acoustic stimulation, and (3) differences between the processing delays of the two devices (Francart et al., 2009b, 2011b). ILDs are also poorly perceived because (1) the head shadow is most effective at high frequencies, which are often not perceived in the non-implanted ear due to high-frequency hearing loss, (2) the two devices use different dynamic range compression algorithms, and (3) the loudness growth functions differ for electric and acoustic stimulation (Francart et al., 2009a, 2011a).
Moreover, for large angles, the natural ILD-versus-angle function becomes non-monotonic (Shaw, 1974). This means that it is physically impossible to localize sounds unambiguously for all directions with only natural ILDs.

Therefore, several authors have presented sound processing strategies to artificially enhance ILDs, resulting in improved sound localization and improved speech intelligibility in noise. Francart et al. (2009a) have shown improved sound localization in an acoustic simulation of bimodal CI listeners, by adapting the level in the hearing aid to obtain the same broadband ILD as a normal-hearing listener (Francart et al., 2009a, 2013). Lopez-Poveda et al. (2016) implemented a strategy for bilateral CI users, inspired by the contralateral medial olivocochlear reflex. Their strategy attenuated sounds in frequency regions with larger amplitude on the contralateral side, resulting in an increase in speech intelligibility for spatially separated speech and noise. Since both strategies are solely based on level cues that are present in the acoustic signal, they cannot solve the problem of the non-monotonic ILD-versus-angle function.

Francart et al. (2011a) adapted their above-mentioned strategy by applying an artificial ILD based on the angle of incidence, to obtain a monotonic ILD-versus-angle function. They found improved sound localization for real bimodal listeners. However, this strategy relied on a priori information about the angle of incidence of the incoming sound. Brown (2014) extended the strategy by estimating the angle of incidence in different frequency regions based on ITDs, resulting in an improved speech intelligibility for bilateral CI users. Moore et al. (2016) evaluated a similar algorithm for bilateral hearing aid users, and found improved sound localization while speech perception was not significantly affected.

All above-mentioned strategies try to artificially impose an ILD based on estimations of auditory cues that are already present. Unfortunately, these estimations are either suboptimal (if based on non-monotonic ILD cues) or computationally expensive (if based on ITDs). Moreover, they can only handle multiple sound sources if these sources are temporally or spectro-temporally separated, while the spectrograms of multiple concurrent speakers most likely have some overlaps. Recently, Veugen et al. (2017) tried to improve the access to high-frequency ILDs for bimodal listeners without the need for estimations of auditory cues, by applying frequency compression in the hearing aid. However, they did not find a significant improvement in sound localization. Moreover, frequency compression might result in undesired side-effects on speech intelligibility, sound quality, envelope ITDs and interaural coherence (Simpson, 2009; Brown et al., 2016).

In this paper, we present and validate a novel method to enhance low-frequency ILDs without estimating auditory cues or distorting the incoming sound. We enhance the head shadow by supplying each ear with a fixed bilateral electronic beamformer applied in the low frequencies, attenuating sounds coming from its contralateral side (as opposed to conventional fixed (unilateral or bilateral) beamformers that attenuate sounds coming from the rear side). This results in enhanced low-frequency ILDs and resolves non-monotonic ILD-versus-angle functions. Because of its low computational complexity, our method is suitable for application in current clinical devices. As a proof-of-concept, we validate the effect of head shadow enhancement on localization and speech perception in noise for simulated bimodal listeners.
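
As a rough illustration of why contralateral attenuation in each ear produces an ILD that grows monotonically with angle, the sketch below models each ear's beamformer as an idealized two-microphone delay-and-subtract pair in free field. This is a simplification and not the authors' implementation: it ignores the head itself, the 1500 Hz band limit and the low-frequency gain compensation, and it assumes an internal delay equal to the travel time between the two microphones. The spacing (20 cm) and sound speed (340 m/s) are the values quoted in Section 2.1; the function names and the −60 dB floor are hypothetical choices for this example.

```python
import math

SPEED_OF_SOUND = 340.0  # m/s, value quoted in Section 2.1
EAR_DISTANCE = 0.20     # m, approximate microphone spacing quoted in Section 2.1


def relative_delay(azimuth_deg, ear):
    """Delay (s) between the ipsilateral signal and the delayed contralateral one.

    azimuth_deg: source angle, -90 = left, +90 = right.
    ear: "left" or "right".
    Assumes a plane wave in free field (no head) and an internal delay of d/c.
    """
    side = math.sin(math.radians(azimuth_deg))
    if ear == "left":
        side = -side  # for the left ear, the ipsilateral side is negative azimuth
    return EAR_DISTANCE / SPEED_OF_SOUND * (1.0 + side)


def gain(freq_hz, azimuth_deg, ear):
    """Magnitude |1 - exp(-j*2*pi*f*delay)| of a delay-and-subtract pair."""
    return 2.0 * abs(math.sin(math.pi * freq_hz * relative_delay(azimuth_deg, ear)))


def ild_db(freq_hz, azimuth_deg, floor=1e-3):
    """Right-minus-left level difference in dB.

    Gains are floored (hypothetical -60 dB limit) so the ideal null does not
    produce log(0).
    """
    right = max(gain(freq_hz, azimuth_deg, "right"), floor)
    left = max(gain(freq_hz, azimuth_deg, "left"), floor)
    return 20.0 * math.log10(right / left)


angles = list(range(-90, 91, 15))           # same 13 angles as the loudspeaker arc
ilds = [ild_db(500.0, a) for a in angles]   # ILD-versus-angle at 500 Hz
first_null_hz = SPEED_OF_SOUND / (2.0 * EAR_DISTANCE)

for angle, ild in zip(angles, ilds):
    print(f"{angle:+4d} deg: ILD = {ild:+6.1f} dB")
print(f"first comb-filter null for a fully ipsilateral source: {first_null_hz:.0f} Hz")
```

Under these assumptions the ILD at 500 Hz increases strictly from −90° to +90°, and the first comb-filter null for a fully ipsilateral source falls at c/(2d) = 850 Hz, the value quoted in Section 2.1. The very large ILDs at ±90° are an artifact of the ideal null and the chosen floor; a real head and real microphones would limit them.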

2 General methods

2.1 Head shadow enhancement

In the low frequencies (below 1500 Hz), the ear naturally has an omni-directional directivity pattern, resulting in very small ILDs (Moore, 2012, Chapter 7). We enhanced this directivity pattern with an end-fire delay-and-subtract directional microphone applied below 1500 Hz. In each ear, the beamformer attenuated sounds arriving from its contralateral side. Above 1500 Hz, we did not apply any beamforming.

To achieve contralateral attenuation in each ear, a linear microphone array in the lateral direction was realized with a data link between the left- and right-ear devices, as illustrated in Fig. 1(a). The low-frequency gain was boosted with a first-order low-pass Butterworth filter (cutoff at 50 Hz), to compensate for the 6 dB/octave attenuation of the subtractive directional microphone (Dillon, 2001, Chapter 7).

In this set-up, the microphone spacing equals the distance between the ears, approximately 20 cm. Such a large microphone spacing yields good sensitivity of the directional microphone in low frequencies (note that a frontal directional microphone in a behind-the-ear (BTE) device is usually not active in low frequencies because of its strong high-pass characteristic (Ricketts & Henry, 2002)). On the other hand, this large spacing decreases the sensitivity at frequencies above approximately 800 Hz due to the comb filter behavior of a subtractive array (Dillon, 2001, Chapter 7): the first null in the comb filter would appear at 850 Hz when considering a microphone distance of 20 cm and a sound speed of 340 m/s, the second null at 1700 Hz, etc. This comb filtering behavior also affects the directional pattern of the beamformer. Since we only enhanced the head shadow for frequencies below 1500 Hz, the directional pattern and frequency response were not strongly affected by the comb filtering.

In Fig. 1(b) it can be seen that the method results in a cardioid-like directivity pattern for low frequencies, while the natural directivity pattern of the ear remains unchanged for frequencies above 1500 Hz. The directivity patterns are calculated as the spectral power in the respective band with a white noise signal as input to the algorithm.

Figure 1: (a) Block diagram of head shadow enhancement algorithm. Low frequencies of the right ear signal are sent to the left ear device, followed by delay-and-subtract to obtain low-frequency contralateral attenuation. The same method is applied in the right ear device (not shown in the figure). (b) The method results in a cardioid-like directivity pattern for low frequencies (instead of the natural omni-directional one), while the natural directivity pattern of the ear remains unchanged for frequencies above 1500 Hz.

2.2 Simulations of spatial hearing

Spatial hearing was simulated with head-related transfer functions (HRTFs). We measured the response of an omni-directional microphone in a BTE piece placed on the right ear of a CORTEX MK2 human-like acoustical manikin; for each angle, the left-ear HRTF was obtained by taking the HRTF from the right ear for a sound coming from the opposite side of the head (e.g., the left-ear HRTF for a sound coming from −60° equaled the right-ear HRTF for a sound coming from +60°). The manikin was positioned in the center of a localization arc with radius of approximately 1 m, with 13 loudspeakers (type Fostex 6301B) positioned at angles between −90° (left) and +90° (right) in steps of 15°. To also obtain HRTFs for sounds arriving from behind the head, we performed a second measurement in which the manikin was rotated 180°. To simulate an anechoic response, reflections were removed by truncating each HRTF after 2 ms starting from its highest peak.
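
The two post-processing steps described in Section 2.2, mirroring right-ear measurements to obtain left-ear responses and truncating each response 2 ms after its highest peak, can be sketched as follows. The sketch works on time-domain impulse responses stored as plain lists; the function names, the toy data and the 1 kHz sampling rate are hypothetical, and keeping the samples that precede the peak is an assumption, since the text does not say how they were handled.

```python
def mirror_to_left(right_ear_irs):
    """Left-ear response at each angle = right-ear response at the opposite angle.

    right_ear_irs maps azimuth in degrees (-90 = left, +90 = right) to an
    impulse response; the set of angles must be symmetric around 0.
    """
    return {angle: list(right_ear_irs[-angle]) for angle in right_ear_irs}


def truncate_after_peak(ir, sample_rate_hz, window_s=0.002):
    """Zero all samples later than window_s after the largest absolute sample."""
    peak = max(range(len(ir)), key=lambda n: abs(ir[n]))
    last_kept = peak + int(round(window_s * sample_rate_hz))
    return [x if n <= last_kept else 0.0 for n, x in enumerate(ir)]


# Toy right-ear impulse responses (made-up numbers, 1 kHz sampling rate).
right = {
    -60: [0.0, 0.2, 0.1, 0.05, 0.02, 0.01],
    0: [0.0, 0.6, 0.3, 0.10, 0.15, 0.05],
    60: [0.0, 1.0, 0.5, 0.20, 0.30, 0.10],
}
left = mirror_to_left(right)
anechoic_right_60 = truncate_after_peak(right[60], sample_rate_hz=1000)

print(left[-60])          # same samples as right[60]
print(anechoic_right_60)  # samples more than 2 ms after the peak are zeroed
```

With the toy sampling rate, 2 ms corresponds to two samples, so the late samples standing in for reflections are removed while the direct sound is kept.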

2.3 Simulation of bimodal cochlear implant hearing

We simulated bimodal CI hearing according to the methods of Francart et al. (2009a).

CI listening was simulated in the left ear with a noise band vocoder to mimic the behavior of a CI processor: the input signal was sent through a filter bank; within each channel, the envelope was detected with half-wave rectification followed by a 50 Hz low-pass filter; this envelope was used to modulate a noise band of which the spectrum corresponded to the respective filter; the outputs of all channels were summed to obtain a single acoustic signal. In the localization experiment (Experiment 1), the vocoder contained 8 channels, logarithmically spaced between 125 Hz and 8000 Hz. In the speech perception experiment (Experiment 2), we lowered the number of channels to 5 (also logarithmically spaced between 125 Hz and 8000 Hz) to obtain worse speech perception, i.e., to better correspond with real CI listening. The number of channels did not have an influence on the head shadow enhancement algorithm, as both vocoders had the same effect on the long-term spectrum of any input signal.

Severe hearing loss was simulated in the right ear with a sixth order low-pass Butterworth filter with a cutoff frequency of 500 Hz, such that the response rolled off at −36 dB per octave. This corresponds with a ski-slope audiogram of a typical bimodal CI listener.

In this simulation with a vocoder in one ear and a low-pass filter in the other ear, little to no ITD cues could be used to localize sounds, as the vocoder removed all temporal fine structure. Note that we also ramped the on- and offset of our localization stimulus (see Section 3.2) to further reduce potential ITD cues. Therefore, our participants relied (almost) solely on ILD cues during the localization experiment (Francart & McDermott, 2013).
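
To make the numbers in Section 2.3 concrete, the sketch below computes logarithmically spaced band edges between 125 Hz and 8000 Hz for the 8- and 5-channel vocoders, and checks the roll-off of an ideal sixth-order Butterworth low-pass filter with a 500 Hz cutoff. The exact band-edge rule is not given in the text, so equal frequency ratios between adjacent edges are assumed here; the function names are hypothetical, and the filter is evaluated from its analytic magnitude response rather than from a designed digital filter.

```python
import math


def log_spaced_edges(f_low_hz, f_high_hz, n_channels):
    """Return n_channels + 1 band edges, equally spaced on a logarithmic axis."""
    ratio = f_high_hz / f_low_hz
    return [f_low_hz * ratio ** (k / n_channels) for k in range(n_channels + 1)]


def butterworth_lowpass_db(freq_hz, cutoff_hz, order):
    """Magnitude (dB) of an ideal analog Butterworth low-pass filter."""
    return -10.0 * math.log10(1.0 + (freq_hz / cutoff_hz) ** (2 * order))


edges_8 = log_spaced_edges(125.0, 8000.0, 8)  # Experiment 1 vocoder
edges_5 = log_spaced_edges(125.0, 8000.0, 5)  # Experiment 2 vocoder

# Roll-off over one octave far above the cutoff (4 kHz to 8 kHz).
slope_db_per_octave = (
    butterworth_lowpass_db(8000.0, 500.0, 6) - butterworth_lowpass_db(4000.0, 500.0, 6)
)

print([round(f) for f in edges_8])
print([round(f) for f in edges_5])
print(f"roll-off: {slope_db_per_octave:.1f} dB per octave")
```

With these assumptions each of the 8 channels spans 0.75 octave and each of the 5 channels spans 1.2 octaves, and the filter response drops by about 36 dB per octave well above the cutoff, consistent with the −36 dB per octave stated above.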

2.4 Participants

We recruited 8 normal-hearing participants, aged between 24 and 26 years. Their pure tone thresholds were better than or equal to 20 dB HL at all octave frequencies between 125 and 8000 Hz. The study was approved by the Medical Ethical Committee of the University Hospital Leuven (S58970).

3 Experiment 1: Localization

3.1 Experimental set-up

The participant was seated in the same localization room in which the HRTFs were measured. The loudspeakers were labeled with numbers 1–13, corresponding to angles between −90° (left) and +90° (right) in steps of 15°. The loudspeakers served solely as a visual cue. The stimuli were presented through Sennheiser HDA200 over-the-ear headphones via an RME Hammerfall DSP Multiface soundcard, using the software platform APEX 3 (Francart et al., 2008).

3.2 Stimuli

A speech signal was used because of its relevance in realistic listening conditions. We presented the Dutch word “zoem” [ˈzum] from the Lilliput speech material (Van Wieringen, 2013), uttered by a female talker. To limit the potential use of on-/offset ITD cues, we ramped the on- and offset with a 50 ms cosine window.

3.3 Procedure

Localization performance was measured in a condition with head shadow enhancement and a condition without head shadow enhancement; the order of conditions was randomized across subjects. Each condition consisted of a block of 7 runs. The first 4 runs served as training to get used to the simulation; only the last 3 runs were considered in our analysis. Each run consisted of 3 trials per angle, resulting in 39 trials in total per run; the order of trials was randomized in each run. The participant was instructed to look straight ahead during stimulus presentation, and say the number indicated on the loudspeaker of the apparent sound source location after stimulus presentation.
Feedback was always given after the response by turning on a light emitting diode above the correct speaker for 2 s. Note that we did not ask the participants whether they perceived the sound image outside or inside their head.

For calibration, a speech-weighted noise with the same long-term average speech spectrum as the stimulus was constructed. The stimulus was calibrated separately for each condition, such that a signal from the front (0°) was presented at 65 dBA in each ear (calibrated with a B&K Artificial Ear Type 4153). To avoid the use of monaural level cues for localization, the overall level was randomly roved by ±10 dB.

We used three measures to quantify localization performance (all expressed in degrees [°]): the response bias, the response standard deviation (s.d.) and the root-mean-square (RMS) error. They are respectively defined as (at a certain angle for a certain subject):

\[ \text{bias} \triangleq \left| \text{mean response} - \text{target response} \right| \tag{1} \]

\[ \text{s.d.} \triangleq \sqrt{ \sum_{\text{trial}=1}^{N_\text{trials}} \frac{ \left( \text{response}_\text{trial} - \text{mean response} \right)^2 }{ N_\text{trials} - 1 } } \tag{2} \]

\[ \text{RMS error} \triangleq \sqrt{ \sum_{\text{trial}=1}^{N_\text{trials}} \frac{ \left( \text{response}_\text{trial} - \text{target response} \right)^2 }{ N_\text{trials} } } \tag{3} \]

Both the bias and s.d. contribute to the RMS error. With equations 1, 2 and 3, and noting that the summed squared deviations from the target equal the summed squared deviations from the mean plus \(N_\text{trials}\) times the squared bias, the following equality can be deduced:

\[ \text{RMS error} = \sqrt{ \frac{N_\text{trials} - 1}{N_\text{trials}} \, \text{s.d.}^2 + \text{bias}^2 } \tag{4} \]

3.4 Results

The broadband ILDs of the stimulus after bimodal simulation for angles between −90° and +90° with or without head shadow enhancement are shown in Fig. 2(a). Head shadow enhancement resulted in a steeper and monotonic ILD-versus-angle function.

The results of the localization experiment are shown in Fig. 2(b) and (c). Error bars represent the standard deviation across subjects. In Fig. 2(b), the mean response averaged across trials as a function of the presentation angle is plotted per condition.
This is a representation of the response bias for a certain condition: the closer the mean is to the diagonal, the smaller the response bias. In Fig. 2(c), the response s.d. across trials as a function of the presentation angle is plotted per condition. The response s.d. is a measure of the variability in the response for a certain condition: the lower the s.d., the smaller the variability in the response, and thus the more certain the participants were about their response.

It can be seen that head shadow enhancement reduces both the bias and variability in response. Both for the bias and variability, the largest improvement is for large angles, corresponding well with the ILD curves of Fig. 2(a).

[Figure 2, panels (a) to (c): broadband ILD [dBA], mean response [°] and standard deviation on response [°], each plotted against stimulus angle [°] from −90° to +90°, for omni-directional microphones and for head shadow enhancement. Panel annotations as extracted: mean bias = 27.9°, mean bias = 11.2°, mean s.d. = 37.1°, mean bias = 22.0°.]

Figure 2: Due to enhanced interaural level differences (ILDs), head shadow enhancement significantly improved localization performance by 23.7° in RMS error. The RMS error is dependent on both the bias and the uncertainty in the responses. (a) Head shadow enhancement resulted in a steeper and monotonic ILD-versus-angle curve for the speech stimulus “zoem” in a bimodal simulation. (b) The mean response (averaged across trials) is a measure of the bias in the response: the closer to the diagonal (dashed line), the better the response.
Head shadow enhancement decreased the bias especially for large angles, as can be expected from the interaural level difference (ILD) curves for our stimuli. Error bars represent the inter-subject standard deviation. (c) The standard deviation (s.d.) in response (across trials) is a measure of how certain the listener is of his or her response: less uncertainty results in a smaller s.d. Head shadow enhancement decreased the uncertainty for all angles, but especially for large angles. Error bars represent the inter-subject standard deviation.

A Wilcoxon signed-rank test was performed to compare the RMS error averaged across all angles with or without head shadow enhancement. Head shadow enhancement significantly improved localization performance from a mean RMS error of 50.5° to a mean RMS error of 26.8°, i.e., a mean improvement of 23.7° in RMS error (V = 36, p = 0.008, r = −0.67).

3.5 Discussion

Head shadow enhancement yielded a steeper and monotonic ILD-versus-angle function, resulting in a large improvement in sound localization of 23.7° in RMS error.

The strong resemblance of the ILD-versus-angle function (Figure 2(a)) and the mean response curve (Figure 2(b)) confirms that our participants were indeed relying on ILDs for localization and did not use ITDs. The localization error (consisting of both the bias and uncertainty in response) was reduced especially at large angles, as could be expected from the ILD curves of our stimuli. Note that the unusually small s.d. in response at large angles (see Figure 2(c)) can partly be attributed to our experimental set-up: the opportunity for erroneous responses is approximately halved at eccentric angles because there were no speakers beyond ±90°.

In the condition without head shadow enhancement, we found a mean RMS error of 50.5°, which is within the range reported for real bimodal listeners (Potts et al., 2009; Ching et al., 2007). This confirms the validity of our acoustic simulation of bimodal listening. Francart et al. (2011a) have indeed shown that their results for acoustic simulations of bimodal listening (Francart et al., 2009a) could be translated to real bimodal listeners.

We also expect improvements in localization for different populations, as similar methods have already been shown to be effective for real bimodal listeners (Francart et al., 2011a) and bilateral hearing aid users (Moore et al., 2016). Moore et al. (2016) have even shown improvements in localization when low-frequency ILD enhancement was combined with compressive gain. Note that head shadow enhancement might distort low-frequency ITDs, which has to be taken into account when considering bilateral hearing aid users.

4 Experiment 2: Speech perception in noise

4.1 Experimental set-up

The participant was seated in a quiet room. The stimuli were again presented through Sennheiser HDA200 over-the-ear headphones via an RME Hammerfall DSP Multiface soundcard, using the software platform APEX 3 (Francart et al., 2008).

4.2 Stimuli

We used the Flemish (Dutch) Matrix sentence test as target speech (Luts et al., 2015). It consists of 13 lists of 20 sentences uttered by a female speaker. Each sentence has the same grammatical structure (name, verb, numeral, adjective, object). As masking noise, we used stationary speech-weighted noise with the same long-term average speech spectrum as the sentences. We measured speech perception in three spatial conditions, always with target speech from the front: noise at the CI side (S0NCI), noise at the hearing aid side (S0NHA) and uncorrelated noise from all directions (S0N360).

4.3 Procedure

For each condition, we measured the speech reception threshold (SRT), defined as the signal-to-noise ratio (SNR) (at the center of the participant’s head, if the experiment were done in free-field) at which 50% of speech could be understood. We did this according to the adaptive procedure as described by Brand & Kollmeier (2002). The speech was presented at a level of 58 dB SPL during each run (calibrated with stationary speech-weighted noise with a B&K Artificial Ear Type 4153), while the noise level was set according to the presented SNR. For each measurement, we estimated the SRT as the SNR that was calculated based on the response on the last trial.

For each participant, we performed each measurement twice to reduce random variability in the results; before the analysis, we averaged these two repetitions for each measurement. We ended up with a total of 2 (directional processing types) × 3 (spatial conditions) × 2 (repetitions) = 12 measurements for each subject. We performed the tests in blocks per noise direction, while randomizing the order of these blocks and randomizing the conditions within each block. Each participant started with some training lists (S0N360 without head shadow enhancement) to get used to the procedure and the bimodal simulation.

4.4 Results

The frequency-dependent SNRs in the left and right ear for the three different spatial conditions with or without head shadow enhancement are shown in Fig. 3(a). The corresponding SRTs are shown in Fig. 3(b). For each spatial condition, we performed a Wilcoxon signed-rank test to compare the SRTs with or without head shadow enhancement.

With noise from the CI side (S0NCI), the frequency-dependent SNR increased by up to 20 dB in the low frequencies at the hearing aid side, while there was little to no SNR-change at the CI side.
In other words, the head shadow benefit was enhanced in the low frequencies. This resulted in a significant improvement in SRT from −6.0 dB SNR to −21.7 dB SNR, i.e., a mean improvement of 15.7 dB SNR in SRT (V = 36, p = 0.008, r = −0.67).

With noise from the hearing aid side (S0NHA), the frequency-dependent SNR increased by up to 20 dB in the low frequencies at the CI side, while there was little to no SNR-change at the hearing aid side. Thus, the head shadow benefit was again enhanced in the low frequencies. This resulted in a significant improvement in SRT from −7.2 dB SNR to −14.8 dB SNR, i.e., a mean improvement of 7.6 dB SNR in SRT (V = 36, p = 0.008, r = −0.67).

With noise from all directions (S0N360), there was little to no SNR-change at both ears. Consequently, there was no significant difference in SRT without or with head shadow enhancement (SRTs were −5.3 and −5.6 dB SNR respectively, V = 36, p = 0.38, r = −0.22).

[Figure 3, panels (a) and (b): (a) SNR [dB] versus frequency [Hz] (250 to 8000 Hz) in the left ear (CI) and right ear (HA) for S0NCI, S0NHA and S0N360; (b) SRT [dB] for the same three spatial conditions; each for omni-directional microphones and for head shadow enhancement.]

Figure 3: Head shadow enhancement improved signal-to-noise ratio (SNR) and speech understanding in spatial situations where a head shadow benefit is expected, while it never deteriorated SNR or speech intelligibility. (Note the symmetry of the head-related transfer functions (HRTFs) in the SNR plots.)
(a) When the noise came from +90° or −90° (S0NHA or S0NCI), head shadow enhancement increased the SNR by up to 20 dB at some frequencies in the ear with better SNR, while the SNR in the other ear was only slightly affected. When the noise came from all directions (S0N360), there was little to no SNR-change at either ear. (b) Head shadow enhancement yielded an extra head shadow benefit of 7.6 dB SNR (S0NHA) up to 15.7 dB SNR (S0NCI), while it did not affect speech intelligibility with noise from all directions (S0N360).

4.5 Discussion

In spatial situations where one would expect head shadow benefits (S0NCI and S0NHA), head shadow enhancement yielded a large increase in SNR in the ear with better SNR, while there was little to no effect on the SNR in the other ear. This resulted in a large improvement in speech intelligibility: an extra head shadow benefit of 7.6 dB SNR (S0NHA) up to 15.7 dB SNR (S0NCI). Thus, the largest benefit was obtained when the hearing aid side was the ear with the better SNR (S0NCI). On the one hand, this makes sense, as the algorithm works for the whole hearing spectrum at this side (0 to 500 Hz). On the other hand, this implies that our participants were relying mostly on the hearing aid side to understand speech in S0NCI, which might not correspond with a typical bimodal listener with profound hearing loss in the non-implanted ear. While the frequency range of 0 to 500 Hz does correspond quite well with the residual hearing of a typical bimodal listener, we did not take into account degraded frequency selectivity, cognitive ability, etc. Moreover, we used a closed-set speech material (Flemish Matrix, Luts et al., 2015), which might have facilitated speech understanding with this narrow hearing spectrum.

With noise from all directions (S0N360), there was little to no effect on the SNR in both ears.
Although the beamformer yields a small frontal attenuation, it also attenuates noise from the contralateral side in each ear. Consequently, there was no net SNR change with or without head shadow enhancement, nor a significant difference in speech intelligibility.

In the conditions without head shadow enhancement, we found SRTs around −6 dB, which corresponds to the best performers in a real bimodal population (Devocht et al., 2017) (note that Devocht et al. (2017) used the Dutch Matrix sentence test and not the Flemish one). This again confirms the validity of our acoustic simulation of bimodal listening.

5 General discussion

The current study shows the possible effectiveness of a head shadow enhancement algorithm based on fixed beamformers. The algorithm is able to enhance ILDs and SNRs by supplying each ear with a fixed beamformer with contralateral attenuation. We found large improvements in localization ability and speech understanding for simulated bimodal listeners.

We believe that our results can be translated to real bimodal listeners, as performance in our acoustic simulations without head shadow enhancement corresponded well with the performance of real bimodal listeners. Note however that acoustic simulations with young normal-hearing listeners do not take into account all aspects of real bimodal listening, such as degraded frequency selectivity in the ear with residual hearing and cognitive abilities. Moreover, the population of real bimodal listeners has a large variability in performance. Future investigations should determine how their baseline performance interacts with the benefit of head shadow enhancement.
The method might also be detrimental for some part of the population, such as listeners with extremely poor hearing in one ear: if target speech is then presented to the non-implanted ear, they will not be able to rely on their implanted ear because of the strong contralateral attenuation.

The method is also promising for other device configurations, as similar approaches have been shown effective in improving localization performance and speech intelligibility for bilateral hearing aid users (Moore et al., 2016) and bilateral cochlear implant users (Lopez-Poveda et al., 2016; Brown, 2014). The benefit might even be larger than expected, as improved localization allows listeners to orient towards talkers and gain access to visual cues, resulting in an additional improvement in speech intelligibility (van Hoesel, 2015).

Our method is distinguished from previously reported strategies by its simplicity and low computational complexity. The latter also makes it suitable for application in clinical devices. Future investigations should determine how the algorithm interacts with different acoustics and sound processing blocks:

1. We expect that the algorithm will have no difficulties with multiple sound sources, as it is based on a fixed beamformer which naturally handles multiple sources.

2. The effect on ITDs remains to be investigated. As both bilateral and bimodal CI users hardly perceive ITDs, any detrimental effect on ITDs should not be an issue for this population. However, it might decrease performance for bilateral hearing aid users.

3. The effect of low-frequency ILDs for very close sound sources (closer than 1 m, Brungart & Rabinowitz, 1999) also remains to be investigated; however, they may be dealt with by a time-dependent comparator that equalizes left and right microphone signals before the delay-and-subtract takes place.

4. The benefit of head shadow enhancement will most probably depend on the spectrum of the sources: the more information that is carried in the low frequencies, the larger the effect of the beamformer. We expect listeners to be able to adapt to altered ILD cues for different source spectra (Francart et al., 2011a).

5. The beamformer can be combined with any other (monaural) signal processing block, as long as the processing does not strongly distort the signal and the total processing delay is the same in both ears. For optimal performance, it is probably recommended to have head shadow enhancement as a first block in the processing chain. Only frontal directivity has to be applied before head shadow enhancement, as otherwise head shadow enhancement should be applied to both front and rear microphone signals before applying frontal directivity. This would require double the amount of (wireless) data transfer between the two devices, and reduce battery life. Note that frontal directivity is mostly active in higher frequencies, which further reduces the possibility of any deteriorating interaction with head shadow enhancement.

6. Combination of head shadow enhancement with binaural beamformers with frontal directivity is not as straightforward, as those beamformers often use the 4 available microphones to end up with 1 signal that is presented diotically (Buechner et al., 2014). It should however be possible to combine head shadow enhancement with a binaural beamformer that preserves (enhanced) binaural cues. Those designs typically trade off between noise reduction and binaural cue preservation (Van den Bogaert et al., 2009).

6 Conclusions

We presented a new method to enhance head shadow in low frequencies, with a fixed beamformer with contralateral attenuation in each ear.
Head shadow enhancement improved localization performance by almost 24° RMS error relative to 50° RMS error for simulated bimodal CI listeners. It also improved speech intelligibility by up to 15.7 dB SNR in spatial conditions where head shadow is expected to be present, while it never deteriorated speech understanding. The method is also promising for other hearing-impaired populations, such as bilateral cochlear implant users or bilateral hearing aid users. Its low computational complexity makes it suitable for application in clinical devices.

Acknowledgments

This research is funded by the Research Foundation – Flanders (SB PhD fellow at FWO); this research is jointly funded by Cochlear Ltd. and Flanders Innovation & Entrepreneurship (formerly IWT), project 150432; this project has also received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No 637424, ERC starting Grant to Tom Francart). We thank our participants for their patience and enthusiasm during our experiment.

References

Van den Bogaert, T., Doclo, S., Wouters, J., & Moonen, M. (2009). Speech enhancement with multichannel wiener filter techniques in multimicrophone binaural hearing aids. The Journal of the Acoustical Society of America, 125, 360–371.

Brand, T., & Kollmeier, B. (2002). Efficient adaptive procedures for threshold and concurrent slope estimates for psychophysics and speech intelligibility tests. The Journal of the Acoustical Society of America, 111, 2801–2810.

Brown, A. D., Rodriguez, F. A., Portnuff, C. D., Goupell, M. J., & Tollin, D. J. (2016). Time-varying distortions of binaural information by bilateral hearing aids: effects of nonlinear frequency compression. Trends in hearing, 20, 2331216516668303.

Brown, C. A. (2014). Binaural enhancement for bilateral cochlear implant users. Ear and hearing, 35, 580.

Brungart, D. S., & Rabinowitz, W. M. (1999). Auditory localization of nearby sources. head-related transfer functions. The Journal of the Acoustical Society of America, 106, 1465–1479.

Buechner, A., Dyballa, K.-H., Hehrmann, P., Fredelake, S., & Lenarz, T. (2014). Advanced beamformers for cochlear implant users: acute measurement of speech perception in challenging listening conditions. PloS one, 9, e95542.

Ching, T., Van Wanrooy, E., & Dillon, H. (2007). Binaural-bimodal fitting or bilateral implantation for managing severe to profound deafness: a review. Trends in amplification, 11, 161–192.

Devocht, E. M., Janssen, A. M. L., Chalupper, J., Stokroos, R. J., & George, E. L. (2017). The benefits of bimodal aiding on extended dimensions of speech perception: Intelligibility, listening effort, and sound quality. Trends in hearing, 21, 2331216517727900.

Dillon, H. (2001). Hearing aids volume 362. Boomerang press Sydney.

Francart, T., Van den Bogaert, T., Moonen, M., & Wouters, J. (2009a). Amplification of interaural level differences improves sound localization in acoustic simulations of bimodal hearing. The Journal of the Acoustical Society of America, 126, 3209–3213.

Francart, T., Brokx, J., & Wouters, J. (2009b). Sensitivity to interaural time differences with combined cochlear implant and acoustic stimulation. Journal of the Association for Research in Otolaryngology, 10, 131–141.

Francart, T., Lenssen, A., & Wouters, J. (2011a). Enhancement of interaural level differences improves sound localization in bimodal hearing. The Journal of the Acoustical Society of America, 130, 2817–2826.

Francart, T., Lenssen, A., & Wouters, J. (2011b). Sensitivity of bimodal listeners to interaural time differences with modulated single- and multiple-channel stimuli. Audiology and Neurotology, 16, 82–92.

Francart, T., & McDermott, H. J. (2013). Psychophysics, fitting, and signal processing for combined hearing aid and cochlear implant stimulation. Ear and hearing, 34, 685–700.

Francart, T., Van Wieringen, A., & Wouters, J. (2008). Apex 3: a multi-purpose test platform for auditory psychophysical experiments. Journal of Neuroscience Methods, 172, 283–293.

Francart, T., Wouters, J., & Van Dijk, B. (2013). Localisation in a bilateral hearing device system. US Patent 8,503,704.

van Hoesel, R. J. (2015). Audio-visual speech intelligibility benefits with bilateral cochlear implants when talker location varies. Journal of the Association for Research in Otolaryngology, 16, 309–315.

Lopez-Poveda, E. A., Eustaquio-Martín, A., Stohl, J. S., Wolford, R. D., Schatzer, R., & Wilson, B. S. (2016). A binaural cochlear implant sound coding strategy inspired by the contralateral medial olivocochlear reflex. Ear and hearing, 37, e138.

Luts, H., Jansen, S., Dreschler, W., & Wouters, J. (2015). Development and normative data for the flemish/dutch matrix test. Technical Report.

Moore, B. C. (2012). An introduction to the psychology of hearing. Brill.

Moore, B. C., Kolarik, A., Stone, M. A., & Lee, Y.-W. (2016). Evaluation of a method for enhancing interaural level differences at low frequencies. The Journal of the Acoustical Society of America, 140, 2817–2828.

Potts, L. G., Skinner, M. W., Litovsky, R. A., Strube, M. J., & Kuk, F. (2009). Recognition and localization of speech by adult cochlear implant recipients wearing a digital hearing aid in the nonimplanted ear (bimodal hearing). Journal of the American Academy of Audiology, 20, 353–373.

Ricketts, T., & Henry, P. (2002). Low-frequency gain compensation in directional hearing aids. American Journal of Audiology, 11, 29–41.

Shaw, E. (1974). Transformation of sound pressure level from the free field to the eardrum in the horizontal plane. The Journal of the Acoustical Society of America, 56, 1848–1861.

Simpson, A. (2009). Frequency-lowering devices for managing high-frequency hearing loss: A review. Trends in amplification, 13, 87–106.

Van Wieringen, A. (2013). The lilliput, an open-set cvc test for assessing speech in noise in 4-6 yr olds. In second annual B-audio conference 15-16 November 2013.

Veugen, L. C., Chalupper, J., Mens, L. H., Snik, A. F., & van Opstal, A. J. (2017). Effect of extreme adaptive frequency compression in bimodal listeners on sound localization and speech perception. Cochlear Implants International, (pp. 1–12).
