Broadband Multizone Sound Rendering by Jointly Optimizing the Sound Pressure and Particle Velocity

M. Buerger,† C. Hofmann, and W. Kellermann
Chair of Multimedia Communications and Signal Processing, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), 91058 Erlangen, Germany
(Dated: November 9, 2017)

Abstract

In this paper, a recently proposed approach to multizone sound field synthesis, referred to as Joint Pressure and Velocity Matching (JPVM), is investigated analytically using a spherical harmonics representation of the sound field. The approach is motivated by the Kirchhoff–Helmholtz integral equation and aims at controlling the sound field inside the local listening zones by evoking the sound pressure and particle velocity on surrounding contours. Based on the findings of the modal analysis, an improved version of JPVM is proposed which provides both better performance and lower complexity. In particular, it is shown analytically that the optimization of the tangential component of the particle velocity vector, as is done in the original JPVM approach, is very susceptible to errors and thus not pursued anymore. The analysis furthermore provides fundamental insights as to how the spherical harmonics used to describe the 3D variant sound field translate into 2D basis functions as observed on the contours surrounding the zones. By means of simulations, it is verified that discarding the tangential component of the particle velocity vector ultimately leads to an improved performance. Finally, the impact of sensor noise on the reproduction performance is assessed.

∗ The following article has been submitted to the Journal of the Acoustical Society of America. After it is published, it will be found at http://scitation.aip.org/JASA.
† michael.buerger@FAU.de; Corresponding author.

I. INTRODUCTION

Since the introduction of stereophonic sound in the 1930s, the number of loudspeakers utilized for audio reproduction has been increasing steadily. In home entertainment systems, 5.1 systems are common and have even been extended to 22.2 channels. Nowadays, modern cinemas even deploy hundreds of loudspeakers in order to create an immersive audio experience, usually via Wave Field Synthesis (WFS) [1–3], Ambisonics [4–6], or related approaches. In recent years, much research effort has also been invested in personalized audio reproduction, i.e., synthesizing individualized acoustic scenes for multiple listeners in different areas of the reproduction room [7–25]. A prominent use case is given in cars, where the driver could listen to the information provided by a navigation system while all passengers can enjoy their favorite audio content. The different approaches to multizone sound reproduction vary widely: Some methods are based on WFS and utilize analytically derived driving functions for the loudspeakers [10, 11, 16, 26]; others exploit the fact that sound fields can be described using orthogonal basis functions [12, 13, 17], such as cylindrical or spherical harmonics; there are also different multi-point approaches, where the sound pressure [8], the acoustic energy/contrast between the different zones [7, 9], or both quantities are optimized [18]. Multi-point techniques often use a description of the sound field in the spatial domain, but they can also be formulated in the modal domain [13], i.e., using basis functions. No matter how the problem of personal audio is approached, the ultimate goal of all techniques is to evoke a desired sound pressure distribution within a certain region of interest.
All the multi-point approaches above have in common that merely the sound pressure is considered and that the control points are typically distributed in the interior of the local listening areas. For practical applications, the latter implies that real microphones (which need to be utilized in order to capture the acoustic transfer functions for all loudspeakers) obstruct the interior of the actual listening area. To mitigate this problem, the concept of Joint Pressure and Velocity Matching (JPVM) [22] was proposed, which only requires control points on contours around the local listening areas. This can be achieved by not only optimizing the sound pressure, but also taking the particle velocity vector into account. According to the Kirchhoff–Helmholtz integral equation, it suffices to correctly evoke the sound pressure and the (radial component of) the particle velocity vector on a closed contour/surface around a source-free area/volume in order to obtain the desired sound pressure in the entire interior. In this contribution, JPVM is analyzed in the modal domain, which provides insights that cannot easily be explained in the transducer domain. The findings thus obtained are exploited to enhance the performance while decreasing the computational complexity at the same time.

The remainder of this document is structured as follows: First, a brief review of pressure matching [8] is given in Sec. II before an improved version of JPVM is presented in Sec. III. The analysis underlying the modifications of JPVM is carried out in Sec. IV, and the theoretical insights are verified by simulations in Sec. V, which also contains an investigation of the impact of sensor noise on the reproduction performance. Finally, the work is concluded in Sec. VI.

II. REVIEW OF PRESSURE MATCHING (PM)

As a starting point, we want to briefly review the concept of Pressure Matching (PM) [8]. For the analytic description of PM, let $P(\vec{x}, \omega)$ be the sound pressure at position $\vec{x}$ in the region of interest $\mathcal{R}$, and let $\omega = 2\pi f$ denote the angular frequency. The reproduced sound pressure is generated by a set of $N_L$ loudspeakers located at positions $\vec{y}_l$, with $l = 1, \ldots, N_L$. These loudspeakers are typically modeled as infinitely long line sources (producing a height-invariant 2D sound field) or point sources (3D case). Throughout this work, we always consider point sources, the radiation characteristics of which are determined by the 3D Green's function [27],

$$ G(\vec{x} \,|\, \vec{y}_l, \omega) = \frac{1}{4\pi} \cdot \frac{e^{-ik \|\vec{y}_l - \vec{x}\|_2}}{\|\vec{y}_l - \vec{x}\|_2}, \tag{1} $$

where $i$ is the imaginary unit, and the wave number $k = \omega / c$ describes the ratio between angular frequency $\omega$ and speed of sound $c$. The reproduced sound pressure at position $\vec{x}$ can thus be expressed as

$$ P(\vec{x}, \omega) = \sum_{l=1}^{N_L} G(\vec{x} \,|\, \vec{y}_l, \omega) \, \underbrace{W_l(\omega) \, S(\omega)}_{S_l(\omega)}, \tag{2} $$

with $S_l(\omega)$ being the driving signal of the $l$-th loudspeaker, which is obtained by filtering the source signal $S(\omega)$ with the corresponding loudspeaker prefilter $W_l(\omega)$. Equation (2) may be written more compactly as

$$ P(\vec{x}, \omega) = \mathbf{g}^T(\vec{x}, \omega) \, \mathbf{w}(\omega) \, S(\omega), \tag{3} $$

where the superscript $(\cdot)^T$ indicates vector/matrix transposition, the column vector $\mathbf{w}(\omega) = [W_1(\omega), \ldots, W_{N_L}(\omega)]^T$ contains the frequency responses of the loudspeaker prefilters, and the 3D Green's functions for all loudspeakers are captured in $\mathbf{g}(\vec{x}, \omega) = [G(\vec{x} \,|\, \vec{y}_1, \omega), \ldots, G(\vec{x} \,|\, \vec{y}_{N_L}, \omega)]^T$.
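As a minimal numerical sketch of Eqs. (1) to (3), the following snippet evaluates the Green's function for a set of point sources and superimposes the loudspeaker contributions. The speed of sound (343 m/s), the two loudspeaker positions, the prefilter values, and the function names `green_3d` and `reproduced_pressure` are assumptions made purely for illustration; they are not taken from the text.

```python
import numpy as np

C_SOUND = 343.0  # assumed speed of sound in m/s (not specified in the text)


def green_3d(x, y, omega, c=C_SOUND):
    """3D Green's function of Eq. (1).

    x: observation point(s), shape (..., 3); y: source positions, shape (N_L, 3).
    Returns an array of shape (..., N_L).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    k = omega / c                                   # wave number k = omega / c
    dist = np.linalg.norm(x[..., None, :] - y, axis=-1)
    return np.exp(-1j * k * dist) / (4.0 * np.pi * dist)


def reproduced_pressure(x, y, w, omega, s=1.0):
    """Eq. (3): P = g^T(x) w S for prefilters w and source spectrum value s."""
    g = green_3d(x, y, omega)
    return (g @ np.asarray(w)) * s


# Hypothetical setup: two loudspeakers, each 2 m away from the origin.
loudspeakers = np.array([[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
prefilters = np.array([1.0, 0.5j])       # W_l(omega) at a single frequency
omega = 2.0 * np.pi * 1000.0             # 1 kHz
p_origin = reproduced_pressure([0.0, 0.0, 0.0], loudspeakers, prefilters, omega)
```

Because both loudspeakers are equidistant from the origin in this setup, `p_origin` is simply the common Green's function value scaled by the sum of the two prefilter values.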
In a multizone scenario, the region of interest $\mathcal{R}$ accommodates several local listening areas: a 'bright zone' $\mathcal{R}_B$, in which one or more virtual sound sources shall be synthesized, and possibly multiple 'dark zones' $\mathcal{R}_D$ where the acoustic scene synthesized in the bright zone shall not be perceivable. For the sake of simplicity and without loss of generality, we restrict ourselves to a single virtual source to be synthesized in the bright zone $\mathcal{R}_B$ and a single dark zone $\mathcal{R}_D$ throughout this document. Let us describe the desired sound pressure in $\mathcal{R}$ according to $P_{\mathrm{des}}(\vec{x}, \omega) = H_{\mathrm{des}}(\vec{x}, \omega) \, S(\omega)$, where $H_{\mathrm{des}}(\vec{x}, \omega)$ denotes the target transfer function at position $\vec{x} \in \mathcal{R}$. In the bright zone $\mathcal{R}_B$, the desired transfer function $H_{\mathrm{des}}$ preferably represents an elementary solution of the acoustic wave equation, such as a cylindrical wave, a spherical wave, a plane wave, or a superposition thereof. For the dark zone, the ultimate goal is $H_{\mathrm{des}}(\vec{x}, \omega) = 0 \;\forall\, \vec{x} \in \mathcal{R}_D$ such that no acoustic energy is radiated into it.

Following the approach by Poletti [8], the reproduced pressure field is matched with the desired one at a set of $N_M$ discrete positions, referred to as matching points or control points. The corresponding problem formulation for pressure matching is given by

$$ \min_{\mathbf{w}(\omega)} \left\| \mathbf{G}(\omega) \, \mathbf{w}(\omega) - \mathbf{h}^{\mathrm{p}}_{\mathrm{des}}(\omega) \right\|_2^2, \tag{4} $$

where $\mathbf{h}^{\mathrm{p}}_{\mathrm{des}}(\omega) = [H_{\mathrm{des}}(\vec{x}_1, \omega), \ldots, H_{\mathrm{des}}(\vec{x}_{N_M}, \omega)]^T$ is a column vector capturing the desired transfer functions from the virtual target source to the control points in both the bright and the dark zone, and the transfer functions from the loudspeakers to the control points are accommodated in matrix $\mathbf{G}(\omega) = [\mathbf{g}(\vec{x}_1, \omega), \ldots, \mathbf{g}(\vec{x}_{N_M}, \omega)]^T$.
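The problem (4) can be sketched numerically as follows. This is only an illustration under stated assumptions: the geometry (a circular array of 12 loudspeakers, 8 control points per zone, a virtual point source as target), the speed of sound, and the helper names are hypothetical, and a Tikhonov-regularized normal-equation solve with an arbitrarily chosen weight `reg` is used as one common way of obtaining an approximate solution.

```python
import numpy as np


def green_matrix(points, sources, omega, c=343.0):
    """Matrix G of Eq. (4); row mu holds Eq. (1) for every loudspeaker.

    The speed of sound c = 343 m/s is an assumed value.
    """
    k = omega / c
    dist = np.linalg.norm(points[:, None, :] - sources[None, :, :], axis=-1)
    return np.exp(-1j * k * dist) / (4.0 * np.pi * dist)


def circle(center, radius, n):
    """n equally spaced points on a circle in the x-y plane (z = 0)."""
    phi = 2.0 * np.pi * np.arange(n) / n
    return np.column_stack([center[0] + radius * np.cos(phi),
                            center[1] + radius * np.sin(phi),
                            np.zeros(n)])


def pressure_matching(G, h_des, reg):
    """Minimize ||G w - h_des||^2 + reg * ||w||^2 via the normal equations."""
    gram = G.conj().T @ G + reg * np.eye(G.shape[1])
    return np.linalg.solve(gram, G.conj().T @ h_des)


omega = 2.0 * np.pi * 500.0
speakers = circle((0.0, 0.0), 2.0, 12)            # hypothetical array, N_L = 12
bright = circle((0.6, 0.0), 0.2, 8)               # control points, bright zone
dark = circle((-0.6, 0.0), 0.2, 8)                # control points, dark zone
virtual_source = np.array([[0.0, 3.0, 0.0]])      # hypothetical target source

G = green_matrix(np.vstack([bright, dark]), speakers, omega)
h_des = np.concatenate([green_matrix(bright, virtual_source, omega)[:, 0],
                        np.zeros(len(dark))])     # H_des = 0 in the dark zone
w = pressure_matching(G, h_des, reg=1e-6)
residual = np.linalg.norm(G @ w - h_des)
```

The regularization weight trades the matching error `residual` against the energy of the prefilters `w`; a suitable value depends on the setup and is not prescribed here.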
Given that there are no loudspeakers or physical barriers between the different local listening areas, it is physically not feasible to have perfect silence in one area and a desired pressure distribution in another area. This implies that the target sound field can only be correctly reproduced at a limited number of positions, but not in a spatial continuum. If the number of loudspeakers is smaller than the number of control points, an approximate solution of (4) is required, which can be obtained, for example, using Singular Value Decomposition (SVD) or the Moore–Penrose pseudoinverse with Tikhonov regularization [28]. Additional constraints can also be incorporated, e.g., in order to limit the array effort or Loudspeaker Weight Energy (LWE) [29]. It is also possible to impose a constraint on the acoustic contrast between the control points in the individual zones [18]. Alternative approaches, such as LASSO [14, 30], additionally minimize the $L_1$-norm of $\mathbf{w}$ in order to control the number of active loudspeakers and, thus, obtain a sparse solution.

III. ENHANCEMENT OF JOINT PRESSURE AND VELOCITY MATCHING (JPVM)

In this section, we present an improved version of the original JPVM approach [22], which will be referred to as JPVM+. Before the optimization problem for JPVM+ is formulated in Sec. III-B, the underlying representation of the particle velocity vector is discussed in Sec. III-A. For simplicity, we assume that both the loudspeakers and the local listening areas are located in the x-y-plane, i.e., the target sound field is defined for a 2D plane, while the loudspeakers are still modeled as point sources. The extension to three dimensions is not addressed here for the sake of brevity.
However, the necessary incorporation of the z-component of the particle velocity vector, which is required to optimize the sound field on a sphere rather than on a circular contour, can be done analogously to the x- and y-components.

A. The Particle Velocity Vector

Below, we revisit how the particle velocity vector is approximated and incorporated into the optimization problem for the original JPVM approach [22]. The particle velocity vector $\vec{V}(\vec{x}, \omega)$ is, apart from the scalar sound pressure $P(\vec{x}, \omega)$, the second important quantity to characterize sound waves. With time-dependency $e^{i\omega t}$, a relation between the two quantities can be established by Euler's equation [31],

$$ -\operatorname{grad} P(\vec{x}, \omega) = i \omega \rho \, \vec{V}(\vec{x}, \omega), \tag{5} $$

where $\rho$ is the density of the propagation medium. The components of the particle velocity vector can be defined for arbitrary coordinate systems, e.g., along the x- and y-axis. In the original JPVM approach [22], the vector is composed of the radial and tangential component on a circular contour around each local listening area, i.e., around the bright zone $\mathcal{R}_B$ and dark zone $\mathcal{R}_D$, as illustrated in Fig. 1. Here, we are only interested in the radial component $V_{\mathrm{rad}}(\vec{x}, \omega)$, as the tangential component can be discarded (see Sections IV-C and IV-D for the theoretical motivation and Sec. V-C for simulation results).

Due to the complexity of real sound fields, (5) can generally not be evaluated analytically. In practice, the particle velocity can be obtained using a B-Format microphone [32], which provides the sound pressure at its center as well as the three components of the particle velocity vector. Alternatively, these signals can be obtained using an omnidirectional microphone as well as three figure-of-eight microphones arranged orthogonally to each other, which is referred to as native B-Format [33].
If not all components of the particle velocity vector are required, fewer figure-of-eight microphones are obviously sufficient. Nevertheless, B-format and directional microphones are usually much more expensive and bulky than omnidirectional microphones. Therefore, we approximate the particle velocity vector in this work by spatially sampling the sound field at the control points with omnidirectional microphones and by subsequently computing the spatial difference quotient. For this purpose, the control points are arranged pairwise on two concentric circles of slightly different radii, as illustrated in Fig. 2. The inner and outer circle accommodate $M$ control points each, which are located at positions $\vec{x}_{\mathrm{in},\mu}$ and $\vec{x}_{\mathrm{out},\mu}$, respectively, with $\mu = 1, \ldots, M$. The radial component of the particle velocity vector may then be approximated according to

$$ V_{\mathrm{rad}}(\vec{x}_{\mathrm{out},\mu}, \omega) \approx -\frac{1}{i \omega \rho} \, \frac{P(\vec{x}_{\mathrm{in},\mu}, \omega) - P(\vec{x}_{\mathrm{out},\mu}, \omega)}{\|\Delta\vec{x}_\mu\|_2}, \tag{6} $$

where $\Delta\vec{x}_\mu = \vec{x}_{\mathrm{in},\mu} - \vec{x}_{\mathrm{out},\mu}$. For a sound reproduction system, the evoked radial component of the particle velocity can be obtained by inserting (3) into (6), which yields

$$ V_{\mathrm{rad}}(\vec{x}_{\mathrm{out},\mu}, \omega) \approx \frac{\mathbf{w}^T(\omega) \left( \mathbf{g}(\vec{x}_{\mathrm{in},\mu}, \omega) - \mathbf{g}(\vec{x}_{\mathrm{out},\mu}, \omega) \right) S(\omega)}{-i \omega \rho \, \|\Delta\vec{x}_\mu\|_2}. \tag{7} $$

That is, the components of the approximated particle velocity vector can be expressed in terms of loudspeaker prefilters $\mathbf{w}$ and acoustic transfer functions $\mathbf{g}$.

FIG. 1: Radial and tangential component of the particle velocity vector defined on the circular contour around a local listening area.

The tangential component of the particle velocity vector could be described analogously by computing the pressure difference along the angular direction, as is done in the original contribution [22].
For this purpose, additional microphones need to be placed at positions $\vec{x}_{\mathrm{out\_add},\mu}$ on the outer circle, as illustrated in Fig. 3, where groups of three microphones are arranged in an L-shape so as to approximate both components of the particle velocity vector. The tangential component could then be approximated according to

$$ V_{\mathrm{tan}}(\vec{x}_{\mathrm{out},\mu}, \omega) \approx -\frac{1}{i \omega \rho} \, \frac{P(\vec{x}_{\mathrm{out\_add},\mu}, \omega) - P(\vec{x}_{\mathrm{out},\mu}, \omega)}{\|\vec{x}_{\mathrm{out\_add},\mu+1} - \vec{x}_{\mathrm{out\_add},\mu}\|_2}, \tag{8} $$

which can again be expressed in terms of $\mathbf{w}$ and $\mathbf{g}$. However, as already mentioned above, only the radial component is of interest here, and (8) is included for completeness.

From Eqs. (6) to (8) it is obvious that the spacing between the control points plays a crucial role, as is always the case when utilizing differential microphone arrays [34]: Large spacings result in a low spatial aliasing frequency, whereas systems with a small spacing are more prone to sensor noise, which must be taken into account for practical realizations. Therefore, the noise susceptibility of JPVM+ is analyzed in Sec. V-E.

FIG. 2: Pairs of control points located on the contour around a local listening area for JPVM+ such that the radial component of the particle velocity vector can be approximated.

FIG. 3: L-shaped groups of control points located on the contour around a local listening area as required for the original version of JPVM [22].

B. Joint Optimization of Pressure and Particle Velocity

We now derive the optimization problem for the improved version of JPVM, referred to as JPVM+, where the individual steps of the derivation follow the same line as the ones presented in the original paper [22]. To formally describe the cost function for JPVM+, we first arrange the acoustic transfer functions from the $l$-th loudspeaker to all $M$ control points on each circle in column vectors $\mathbf{g}^{z}_{c,l}(\omega) = [G(\vec{x}^{\,z}_{c,1} \,|\, \vec{y}_l, \omega), \ldots, G(\vec{x}^{\,z}_{c,M} \,|\, \vec{y}_l, \omega)]^T$, where the superscript $z \in \{B, D\}$ indicates the bright or dark zone, and the subscript $c \in \{\mathrm{in}, \mathrm{out}\}$ refers to the inner or outer circle of control points. Using this notation, the transfer functions from the entire set of loudspeakers to all control points can be conveniently captured in

$$ \mathbf{G}(\omega) = \begin{bmatrix} \mathbf{g}^{B}_{\mathrm{out},1}(\omega), & \ldots, & \mathbf{g}^{B}_{\mathrm{out},N_L}(\omega) \\ \mathbf{g}^{B}_{\mathrm{in},1}(\omega), & \ldots, & \mathbf{g}^{B}_{\mathrm{in},N_L}(\omega) \\ \mathbf{g}^{D}_{\mathrm{out},1}(\omega), & \ldots, & \mathbf{g}^{D}_{\mathrm{out},N_L}(\omega) \\ \mathbf{g}^{D}_{\mathrm{in},1}(\omega), & \ldots, & \mathbf{g}^{D}_{\mathrm{in},N_L}(\omega) \end{bmatrix}, \tag{9} $$

such that the sound pressure at the control points can be compactly written as

$$ \mathbf{p}(\omega) = \underbrace{\mathbf{G}(\omega) \, \mathbf{w}(\omega)}_{\mathbf{g}_{\mathrm{p}}(\omega)} \, S(\omega), \tag{10} $$

with $\mathbf{g}_{\mathrm{p}}$ representing the transfer function of the overall system w.r.t. the sound pressure. As only the radial component of the particle velocity vector is considered here, which is motivated in Sec. IV, the original difference matrix [22] used for computing the corresponding pressure differences simplifies to

$$ \mathbf{D}(\omega) = -\frac{1}{i \omega \rho \, \Delta R} \left[ \underbrace{\begin{matrix} -\mathbf{I} & \mathbf{I} \\ \mathbf{0} & \mathbf{0} \end{matrix}}_{\text{bright zone}} \;\; \underbrace{\begin{matrix} \mathbf{0} & \mathbf{0} \\ -\mathbf{I} & \mathbf{I} \end{matrix}}_{\text{dark zone}} \right], \tag{11} $$

where $\Delta R = \|\Delta\vec{x}_\mu\|_2 \;\forall\, \mu$ represents the absolute difference between radii $R_{\mathrm{in}}$ and $R_{\mathrm{out}}$ of the inner and the outer circle of control points, respectively.
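The structure of (9) and (11) can be illustrated with a short numerical sketch, which builds the stacked transfer matrix and the difference matrix and applies them to a set of prefilters. The zone centers, the loudspeaker array, the number of control points, the medium constants, and all function names are assumptions chosen for illustration only; the blocks of the difference matrix are taken to be $M \times M$ identity and zero matrices.

```python
import numpy as np

RHO = 1.2    # assumed density of air in kg/m^3
C = 343.0    # assumed speed of sound in m/s


def circle_points(center, radius, m):
    """m equally spaced points on a circle in the x-y plane (z = 0)."""
    phi = 2.0 * np.pi * np.arange(m) / m
    return np.column_stack([center[0] + radius * np.cos(phi),
                            center[1] + radius * np.sin(phi),
                            np.zeros(m)])


def green_matrix(points, sources, omega):
    """Eq. (1) evaluated for all control points (rows) and loudspeakers (columns)."""
    k = omega / C
    dist = np.linalg.norm(points[:, None, :] - sources[None, :, :], axis=-1)
    return np.exp(-1j * k * dist) / (4.0 * np.pi * dist)


def stacked_transfer_matrix(centers, r_in, r_out, m, sources, omega):
    """G of Eq. (9): block rows ordered bright-out, bright-in, dark-out, dark-in."""
    blocks = []
    for center in centers:                      # bright zone first, then dark zone
        for radius in (r_out, r_in):
            points = circle_points(center, radius, m)
            blocks.append(green_matrix(points, sources, omega))
    return np.vstack(blocks)


def difference_matrix(m, delta_r, omega):
    """D of Eq. (11), mapping stacked pressures to radial velocity estimates."""
    eye, zero = np.eye(m), np.zeros((m, m))
    pattern = np.block([[-eye, eye, zero, zero],
                        [zero, zero, -eye, eye]])
    return -pattern / (1j * omega * RHO * delta_r)


omega = 2.0 * np.pi * 500.0
M = 6                                            # control points per circle
r_in, r_out = 0.275, 0.3
delta_r = r_out - r_in
sources = circle_points((0.0, 0.0), 2.0, 8)      # hypothetical 8-element array
G = stacked_transfer_matrix([(0.6, 0.0), (-0.6, 0.0)], r_in, r_out, M, sources, omega)
D = difference_matrix(M, delta_r, omega)
w = np.ones(8, dtype=complex)                    # arbitrary prefilter values
p = G @ w                                        # Eq. (10) with S = 1
v_rad = D @ p                                    # radial velocity estimates
```

Since the control points of each pair share the same angle, $\|\Delta\vec{x}_\mu\|_2$ equals `delta_r` for every pair, and the first `M` entries of `v_rad` coincide with (6) evaluated on the bright-zone contour.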
In (11), $\mathbf{I}$ and $\mathbf{0}$ denote an identity and a zero matrix, respectively, of dimensions $M \times M$. The resulting radial components of the particle velocity vector along the contours of the bright and dark zone are then given by

$$ \mathbf{v}_{\mathrm{rad}}(\omega) = \underbrace{\mathbf{D}(\omega) \, \mathbf{G}(\omega) \, \mathbf{w}(\omega)}_{\mathbf{g}_{\mathrm{vel}}(\omega)} \, S(\omega), \tag{12} $$

where $\mathbf{g}_{\mathrm{vel}}$ represents the transfer function of the overall system w.r.t. the radial component of the particle velocity vector. Using (12) and (10), a weighted least squares optimization criterion for both quantities can be formulated,

$$ \min_{\mathbf{w}(\omega)} \left\{ \kappa(\omega) \left\| \mathbf{G}(\omega) \mathbf{w}(\omega) - \mathbf{h}^{\mathrm{p}}_{\mathrm{des}}(\omega) \right\|_2^2 + \left( 1 - \kappa(\omega) \right) \left\| \mathbf{D}(\omega) \mathbf{G}(\omega) \mathbf{w}(\omega) - \mathbf{h}^{\mathrm{vel}}_{\mathrm{des}}(\omega) \right\|_2^2 \right\}, \tag{13} $$

where $\mathbf{h}^{\mathrm{p}}_{\mathrm{des}}$ and $\mathbf{h}^{\mathrm{vel}}_{\mathrm{des}}$ represent the desired pressure and velocity transfer functions, respectively, and $\kappa(\omega) \in [0, 1]$ is used to adjust the relative weight of each quantity. Similar to the original JPVM approach [22], the optimization problem (13) can be reformulated using stacked matrices such that a common least squares problem is obtained, which can then be solved in a closed form, e.g., using a Moore–Penrose pseudoinverse with Tikhonov regularization [28]. For the sake of brevity, these steps are omitted here. Note that the cost function for JPVM+ does not contain the tangential component of the particle velocity vector, implying that the optimization problem has a lower complexity compared to JPVM.

IV. ANALYSIS OF THE IMPACT OF THE PARTICLE VELOCITY VECTOR COMPONENTS ON JPVM

In this section, we want to analyze how the individual components of the particle velocity vector affect the behavior and performance of JPVM. For this purpose, the sound pressure is formulated in the modal domain, where important fundamental relations are discussed in Sec. IV-A.
Similar to the sound pressure, a modal representation of the radial and tangential component of the particle velocity vector is introduced in Sec. IV-B and Sec. IV-C, respectively. Finally, a representative example is given in Sec. IV-D which illustrates how well the different quantities on the contour are suited to control the sound field in the interior.

A. Modal Representation of Sound Fields Evoked by Point Sources in a 2D Plane

The loudspeakers in this work are modeled as point sources, which is a simple yet reasonable approximation of real loudspeakers. This implies that, even though the previous considerations are limited to a 2D plane, the sound fields are in fact three-dimensional, where any point $\vec{x}$ in the 3D space can be addressed by its azimuth angle $\phi$, colatitude angle $\theta$, and radius $r$. It is well known that arbitrary 3D sound fields within a source-free region can be described as [27]

$$ P(r, \theta, \phi, \omega) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \alpha_{mn}(\omega) \, j_n(kr) \, Y_n^m(\theta, \phi), \tag{14} $$

where

$$ Y_n^m(\theta, \phi) = \underbrace{\sqrt{\frac{(2n+1)}{4\pi} \, \frac{(n-|m|)!}{(n+|m|)!}}}_{b_{mn}} \, P_n^{|m|}(\cos\theta) \, e^{+im\phi} \tag{15} $$

are the spherical harmonics of order $n$ and degree $m$, with $P_n^m$ being the associated Legendre functions, $j_n$ is the $n$-th order spherical Bessel function, and $\alpha_{mn}$ are the modal weights. For a point source with frequency response $A(\omega)$ located at radius $r_0$ and direction $(\theta_0, \phi_0)$, the modal weights are given by [35]

$$ \alpha^{\mathrm{PS}}_{mn}(\omega) = -i \, A(\omega) \, k \, h_n^{(2)}(k r_0) \, b_{mn} \, P_n^{|m|}(\cos\theta_0) \, e^{-im\phi_0}, \tag{16} $$

where $h_n^{(2)}$ denotes the $n$-th order spherical Hankel function of the second kind. Note that (14) with modal weights according to (16) and $A(\omega) = 1 \;\forall\, \omega$ is equivalent to the 3D Green's function (1).
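The stated equivalence can be made explicit with a short worked step. The sketch below assumes an observation radius $r < r_0$ (so that the observation point lies in the source-free region around the origin) and writes $\vec{y}$ for the source position, a symbol introduced here only for this illustration. It uses the fact that, according to (15), the complex conjugate of $Y_n^m(\theta_0, \phi_0)$ is $b_{mn} P_n^{|m|}(\cos\theta_0) e^{-im\phi_0}$. Inserting (16) with $A(\omega) = 1$ into (14) gives

```latex
\begin{align}
P(r,\theta,\phi,\omega)
  &= \sum_{n=0}^{\infty}\sum_{m=-n}^{n}
     \Bigl[-i k\, h_n^{(2)}(k r_0)\, b_{mn} P_n^{|m|}(\cos\theta_0)\, e^{-i m \phi_0}\Bigr]\,
     j_n(k r)\, Y_n^m(\theta,\phi) \\
  &= -i k \sum_{n=0}^{\infty} j_n(k r)\, h_n^{(2)}(k r_0)
     \sum_{m=-n}^{n} Y_n^m(\theta,\phi)\,\bigl[Y_n^m(\theta_0,\phi_0)\bigr]^{*} \\
  &= \frac{1}{4\pi}\cdot\frac{e^{-i k \|\vec{y}-\vec{x}\|_2}}{\|\vec{y}-\vec{x}\|_2},
     \qquad r < r_0 ,
\end{align}
```

where the last step is the standard spherical harmonics expansion of the free-field Green's function for the time-dependency $e^{i\omega t}$ used in (5).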
As we assume in this work that the loudspeakers and observation points lie in the x-y plane, i.e., $\theta_0 = \pi/2$ and $\theta = \pi/2$, we omit the colatitude angles $\theta$ and $\theta_0$ in the notation for brevity. Note that $\cos\theta = \cos\theta_0 = 0$ in this case, so that the associated Legendre functions in (15) and (16) are evaluated at zero. Using the above modal representation, the sound pressure evoked by a single loudspeaker, which is modeled as an ideal point source here, can be expressed as

$$ P(r, \phi, \omega) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \alpha^{\mathrm{PS}}_{mn}(\omega) \, j_n(kr) \, b_{mn} \, P_n^{|m|}(0) \, e^{+im\phi}. \tag{17} $$

On the other hand, the sound pressure at the circular contour around a local listening area can be represented by a Fourier series. More precisely, we express the sound pressure at the outer circle of control points with radius $R_{\mathrm{out}}$ as

$$ P(R_{\mathrm{out}}, \phi, \omega) = \sum_{\mu=-\infty}^{\infty} a_\mu(R_{\mathrm{out}}, \omega) \, e^{i\mu\phi}, \tag{18} $$

where $a_\mu$ are the Fourier coefficients. The coefficients $a_\mu$ obtained for a single point source can be computed by equating (18) with (17) and evaluating the inner product with the complex harmonics, i.e.,

$$ a_\mu(R_{\mathrm{out}}, \omega) = \frac{1}{2\pi} \int_0^{2\pi} \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \alpha^{\mathrm{PS}}_{mn}(\omega) \, j_n(k R_{\mathrm{out}}) \, b_{mn} \, P_n^{|m|}(0) \, e^{+im\phi} \, e^{-i\mu\phi} \, d\phi. \tag{19} $$

Due to the orthogonality of complex harmonics, the integral may only be non-zero for $\mu = m$ such that the Fourier coefficients are given by

$$ a_m(R_{\mathrm{out}}, \omega) = \sum_{n=|m|}^{\infty} \alpha^{\mathrm{PS}}_{mn}(\omega) \, \underbrace{j_n(k R_{\mathrm{out}}) \, b_{mn} \, P_n^{|m|}(0)}_{\gamma_{mn}(k, R_{\mathrm{out}})}. \tag{20} $$

Note that the summation over $n$ in (20) must be limited to $n \geq |m|$, since $|m|$ of an associated Legendre function cannot be larger than $n$. This is equivalent to the limitation of the sum over $m$ from $-n, \ldots, n$ in (14). From (20), it can be seen that a particular Fourier coefficient $a_m$ corresponds to a weighted sum of the modal weights $\alpha^{\mathrm{PS}}_{mn}$ for a given degree $m$ and all orders $n$, where the individual weights are given by a set of scalar factors $\gamma_{mn}$.
This set of frequency-dependent factors $\gamma_{mn}$ can be interpreted as a Multiple-Input Single-Output (MISO) system, which maps all modal weights for a particular index $m$ onto a Fourier coefficient representing a corresponding quantity on the contour (here: sound pressure), as illustrated in Fig. 4. Thus, a particular Fourier coefficient $a_m$ obtained at the output of such a MISO system describes how well the weighted sum of all modes of degree $m$ can be observed when evaluating the sound pressure on the contour. In case the mapping due to $\gamma_{mn}$ results in a very small absolute value of $a_m$, a given set of modal weights $\alpha^{\mathrm{PS}}_{mn}$ thus appears on the contour as strongly attenuated.

To illustrate the mapping performed by (20), the Fourier coefficient $a_0$ is plotted in Fig. 5 for $R_{\mathrm{out}} = 0.3$ m and a point source located at $r_0 = 2.5$ m, where the source direction $\phi_0$ is irrelevant here due to $m = 0$. As expected, the magnitude of $a_m$ is strongly frequency-dependent, and the weighted sum of all modes for a given $m$ cannot be observed well at certain frequencies. Interestingly, the envelope/magnitude of $a_0$ very closely follows the zeroth-order cylindrical Bessel function $J_0$ rather than the spherical Bessel function $j_0$, even though the sound field is resulting from a point source. This behavior can be observed not only for the coefficient $a_0$, but for $a_m$ in general. It should be noted, however, that the cylindrical Bessel functions exhibit zero transitions, whereas the magnitude of $a_m$ does not take on the value zero for the corresponding arguments.

The previous considerations describe the mapping from given modal weights $\alpha^{\mathrm{PS}}_{mn}$ onto the Fourier coefficients $a_m$ representing the sound pressure on the contour around a local listening area.
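The Fourier coefficients in (18) can also be estimated numerically, without evaluating the modal sums, by sampling the point-source pressure (1) on the contour and approximating the inner product in (19) with a discrete Fourier transform. The following is a minimal sketch under stated assumptions: the speed of sound, the frequency of 1 kHz, the number of samples, and the function names are illustrative choices, while $R_{\mathrm{out}} = 0.3$ m and $r_0 = 2.5$ m follow the setup of Fig. 5.

```python
import numpy as np

C = 343.0  # assumed speed of sound in m/s


def contour_pressure(radius, phi, r0, phi0, freq):
    """Pressure of a unit point source, Eq. (1), on a circle in the x-y plane."""
    k = 2.0 * np.pi * freq / C
    # distance between (radius, phi) and (r0, phi0) via the law of cosines
    dist = np.sqrt(radius**2 + r0**2 - 2.0 * radius * r0 * np.cos(phi - phi0))
    return np.exp(-1j * k * dist) / (4.0 * np.pi * dist)


def fourier_coefficients(radius, r0, phi0, freq, n_samples=256):
    """Approximate a_mu of Eq. (18) by a Riemann sum of the integral in Eq. (19).

    Returns a dict {mu: a_mu} for mu = -n_samples // 2, ..., n_samples // 2 - 1.
    """
    phi = 2.0 * np.pi * np.arange(n_samples) / n_samples
    a = np.fft.fft(contour_pressure(radius, phi, r0, phi0, freq)) / n_samples
    mu = np.fft.fftfreq(n_samples, d=1.0 / n_samples).astype(int)
    return dict(zip(mu.tolist(), a))


# Same source distance, two different source directions.
a_ref = fourier_coefficients(0.3, 2.5, 0.0, 1000.0)
a_rot = fourier_coefficients(0.3, 2.5, np.pi / 3.0, 1000.0)
```

In line with (16) and (20), changing the source direction only multiplies $a_m$ by $e^{-im\phi_0}$, so `a_rot[m]` agrees with `a_ref[m] * np.exp(-1j * m * np.pi / 3)` and the coefficient for $m = 0$ does not depend on $\phi_0$.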
If, on the other hand, the sound ﬁeld shall be controlled b y ev okin g the sound pressure on the contour , the inv erse mapping comes into play . From a system-theoretical 14 + + + + P S f r a g r e p l a c e m e n t s α PS m | m | α PS m | m + 1 | α PS m ∞ γ m | m | γ m | m + 1 | γ m ∞ . . . . . . . . . a m deter min ed by source position (up to scaling ) deter min ed by obser vation positions MISO sys tem FIG. 4: Schematic ill ustration of the mappi ng of the modal w eights α PS mn , with n = | m | , . . . , ∞ , onto the Fourier coeﬃcients a m representing t h e sou nd pressure on the contour around a local li stening area. For brevity , the frequency -dependency i s omit ted here. 0 1 2 3 4 5 6 7 8 − 0 . 04 − 0 . 02 0 0 . 02 0 . 04 f in kHz a 0 ( ω ) R e { a 0 ( ω ) } Im { a 0 ( ω ) } j 0 ( k R out ) J 0 ( k R out ) FIG. 5: Fourier coeﬃcient a 0 of the sound pressure P at R out = 0 . 3 m as resulting from a point source located at r 0 = 2 . 5 m in the x - y pl ane. 15 point of view , thi s may seem problem atic, s ince it suggests the in v ersio n of a MISO sys tem. Ho w ev er , the factors γ mn of this MISO system only depend on the posit ions of the control points and are thus known. Further more, the m odal coeﬃcients α PS mn are dictated by the ph y sics of a p o int sou rce and its position . Th i s i mplies that, f or t h e giv en setup with a point source located in th e x - y plane, there i s o n ly a single possi ble set o f modal coeﬃcients (up to a scaling due to A ( ω ) ) f or a given s o u nd pressure o n the contour . Only i n case a m wa s equal to zero, in ﬁnitely many sets of m odal coeﬃcients w o uld be possible, which is ref er red to as the nonuniqueness problem [36]. 
Nevertheless, the inverse mapping may be ill-conditioned [37] and involve a strong amplification, such that low absolute values of the sound pressure on the contour may correspond to large values of the sound pressure in the interior of the local listening area. Accordingly, the sound pressure on the contour must then be evoked very precisely in order to reproduce the desired sound field properly, and even small errors on the contour may lead to a substantial deterioration in the interior. The observability of the modes on the contour can therefore be regarded as a robustness indicator: The larger the values of $a_m$ are, the less the sound field in the interior of the zone is affected by reproduction errors on the contour.

Similarly to the sound pressure itself, the radial and tangential pressure differences on the contours around the local listening areas may also be represented in terms of Fourier series, as shown below. The expressions thus obtained provide insights into the behavior of JPVM and serve as a basis for the modifications underlying the improved version JPVM+.

B. The Radial Component of the Particle Velocity Vector

Analogously to the above, let us again consider a single loudspeaker located in the x-y plane. Using the spherical harmonics representation of Eqs. (14) to (16) with $\theta = \pi/2$, i.e., $\cos\theta = 0$, the radial pressure difference in the numerator of (6) for a certain angle $\phi$ on the contour can be expressed as

$$ \begin{aligned} \Delta P_{\mathrm{rad}}(R_{\mathrm{in}}, R_{\mathrm{out}}, \phi, \omega) &= P(R_{\mathrm{in}}, \phi, \omega) - P(R_{\mathrm{out}}, \phi, \omega) \\ &= \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \alpha^{\mathrm{PS}}_{mn}(\omega) \, \underbrace{\left( j_n(k R_{\mathrm{in}}) - j_n(k R_{\mathrm{out}}) \right) b_{mn} \, P_n^{|m|}(0)}_{\gamma_{\mathrm{rad},mn}(R_{\mathrm{in}}, R_{\mathrm{out}}, \omega)} \, e^{im\phi} \\ &= \sum_{m=-\infty}^{\infty} \underbrace{\sum_{n=|m|}^{\infty} \alpha^{\mathrm{PS}}_{mn}(\omega) \, \gamma_{\mathrm{rad},mn}(R_{\mathrm{in}}, R_{\mathrm{out}}, \omega)}_{\Delta a_{\mathrm{rad},m}(R_{\mathrm{in}}, R_{\mathrm{out}}, \omega)} \, e^{im\phi}, \end{aligned} \tag{21} $$

where the order of the summations over $m$ and $n$ is exchanged in the last step and the limits are adapted accordingly.
Similarly to $\gamma_{mn}$ in (20), the set of frequency-dependent factors $\gamma_{\mathrm{rad},mn}$ in (21) also describes a MISO system. The difference to (20) is that the modal weights $\alpha^{\mathrm{PS}}_{mn}$ are now mapped onto Fourier coefficients $\Delta a_{\mathrm{rad},m}$ which represent the radial pressure difference on the contour rather than the sound pressure itself. Accordingly, the output $\Delta a_{\mathrm{rad},m}$ of the system describes how well the weighted sum of all modes of degree $m$ can be observed on the contour when evaluating the radial pressure difference. In turn, it also tells us how robustly a desired sound field can be reproduced by evoking the radial pressure difference on the contour. Before discussing the behavior of the radial pressure difference, we first want to express the tangential pressure difference, which is also optimized in the original version of JPVM [22], in the same way.

C. The Tangential Component of the Particle Velocity Vector

Expressing the tangential pressure difference in the numerator of (8) by means of the spherical harmonics representation with $\theta = \pi/2$, i.e., $\cos\theta = 0$, yields

$$ \begin{aligned} \Delta P_{\mathrm{tan}}(R_{\mathrm{out}}, \phi, \omega) &= P(R_{\mathrm{out}}, \phi + \Delta\phi, \omega) - P(R_{\mathrm{out}}, \phi, \omega) \\ &= \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \alpha^{\mathrm{PS}}_{mn}(\omega) \, \underbrace{j_n(k R_{\mathrm{out}}) \, b_{mn} \, P_n^{|m|}(0) \left( e^{im\Delta\phi} - 1 \right)}_{\gamma_{\mathrm{tan},mn}(R_{\mathrm{out}}, \Delta\phi, \omega)} \, e^{im\phi} \\ &= \sum_{m=-\infty}^{\infty} \underbrace{\sum_{n=|m|}^{\infty} \alpha^{\mathrm{PS}}_{mn}(\omega) \, \gamma_{\mathrm{tan},mn}(R_{\mathrm{out}}, \Delta\phi, \omega)}_{\Delta a_{\mathrm{tan},m}(R_{\mathrm{out}}, \Delta\phi, \omega)} \, e^{im\phi}, \end{aligned} \tag{22} $$

where $\Delta\phi$ denotes the angle between two neighboring control points on the circle with radius $R_{\mathrm{out}}$. The frequency-dependent gain factors $\gamma_{\mathrm{tan},mn}$ represent another MISO system, which maps the modal coefficients $\alpha^{\mathrm{PS}}_{mn}$ onto Fourier coefficients $\Delta a_{\mathrm{tan},m}$ representing the tangential pressure difference along the contour. Again, these coefficients indicate how well a sound field can be controlled by evoking the tangential pressure difference on the contour.
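Since $\gamma_{\mathrm{tan},mn}$ in (22) differs from $\gamma_{mn}$ in (20) only by the factor $e^{im\Delta\phi} - 1$, the resulting coefficients can be checked numerically by sampling the point-source pressure (1) on the contour. The sketch below is illustrative only: the speed of sound, the frequency of 2 kHz, the source position, the number of samples, and the choice of $\Delta\phi$ (an arc length $R_{\mathrm{out}} \Delta\phi$ of 2.5 cm) are assumptions, as are the function names.

```python
import numpy as np

C = 343.0  # assumed speed of sound in m/s


def contour_pressure(radius, phi, r0, phi0, freq):
    """Pressure of a unit point source, Eq. (1), on a circle in the x-y plane."""
    k = 2.0 * np.pi * freq / C
    dist = np.sqrt(radius**2 + r0**2 - 2.0 * radius * r0 * np.cos(phi - phi0))
    return np.exp(-1j * k * dist) / (4.0 * np.pi * dist)


def coefficient(values, m):
    """m-th Fourier coefficient of values sampled uniformly on [0, 2*pi)."""
    n = len(values)
    phi = 2.0 * np.pi * np.arange(n) / n
    return np.mean(values * np.exp(-1j * m * phi))


r_out, r0, phi0, freq = 0.3, 2.5, np.pi, 2000.0   # illustrative values
delta_phi = 0.025 / r_out        # assumption: arc length R_out * delta_phi = 2.5 cm
phi = 2.0 * np.pi * np.arange(256) / 256

p = contour_pressure(r_out, phi, r0, phi0, freq)
p_shifted = contour_pressure(r_out, phi + delta_phi, r0, phi0, freq)

m = 1
a_m = coefficient(p, m)                           # pressure component, Eq. (20)
delta_a_tan = coefficient(p_shifted - p, m)       # tangential difference, Eq. (22)
scale = np.exp(1j * m * delta_phi) - 1.0          # factor contained in gamma_tan
```

For these values, `delta_a_tan` matches `a_m * scale`, and the magnitude of `scale` equals $2\,|\sin(m\Delta\phi/2)|$, which is roughly $m\Delta\phi$ for small arguments.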
It can be seen directly from (22) that the tangential pressure difference component $\Delta a_{\mathrm{tan},m}$ is a scaled version of the sound pressure component $a_m$, where the magnitude of the scaling factor $\exp(im\Delta\phi) - 1$ is much smaller than one for small $\Delta\phi$ and low degrees $m$. This implies that the sound pressure itself can typically be better observed than the tangential pressure difference.

A more detailed discussion and comparison of the different mappings of modal coefficients due to $\gamma_{mn}$, $\gamma_{\mathrm{rad},mn}$, and $\gamma_{\mathrm{tan},mn}$ onto the corresponding Fourier coefficients representing the sound pressure on the contour, the radial pressure difference, and the tangential pressure difference, respectively, is provided below.

D. Controllability of the Sound Pressure and its Radial and Tangential Difference

As an example, let us consider a point source located at $r_0 = 2.5$ m and $\phi_0 = \pi$, where the circles accommodating the control points are of radii $R_{\mathrm{in}} = 0.275$ m and $R_{\mathrm{out}} = 0.3$ m, and the observation angle is $\phi = 0^\circ$. The angular difference $\Delta\phi$ is chosen such that the absolute angular distance between two control points on the outer circle is identical to the absolute radial distance $\Delta R = 2.5$ cm. Figure 6 shows the resulting Fourier coefficients $a_m$, $\Delta a_{\mathrm{rad},m}$, and $\Delta a_{\mathrm{tan},m}$ for $m = 1$, which represent the respective component of the sound pressure, the radial pressure difference, and the tangential pressure difference on the contour. It can be seen that the sound pressure component $a_1$, whose magnitude behaves similarly to the magnitude of the Bessel function $J_1$, cannot be observed well on the contour for certain frequencies. Conversely, reproducing a sound field by controlling the sound pressure on the contour is prone to reproduction errors for these frequencies.
Especially in a multizone scenario, reproduction errors inevitably occur as the optimization problem is typically ill-conditioned, such that a regularization is required in order to limit the gains of the loudspeaker prefilters. Therefore, a low performance must be expected for these frequencies. A similar reasoning holds for the radial pressure difference $\Delta a_{\mathrm{rad},m}$, which also exhibits low values at some frequencies, as can be seen in Fig. 6 for $m = 1$. However, the frequencies at which the local minima occur are shifted relative to the ones of the sound pressure itself. As a consequence, the radial pressure difference generally has large values if the pressure itself is low, and vice versa, such that at least one quantity can be used to robustly excite the sound field. Only for a few frequencies beyond 5 kHz do the local minima of the sound pressure component and the radial pressure difference lie closely together.

FIG. 6: Magnitudes of the observed Fourier coefficients representing the sound pressure components for $m = 1$ on the contour as well as the radial and tangential pressure differences. [Plot over $f$ in kHz (0 to 8 kHz), with $f_1$ marked. Curves: $|a_m(R_{\mathrm{out}}, \omega)|$ (pressure at $R_{\mathrm{out}}$), $|a_m(R_{\mathrm{in}}, \omega)|$ (pressure at $R_{\mathrm{in}}$), $|\Delta a_{\mathrm{rad},m}(R_{\mathrm{out}}, R_{\mathrm{in}}, \omega)|$ (radial difference), $|\Delta a_{\mathrm{tan},m}(R_{\mathrm{out}}, \Delta\phi, \omega)|$ (tangential difference).]

It is worth noting here that the location of the local minima of the radial pressure difference can be controlled by the distance $\Delta R$ between the two circles of control points. If $\Delta R \to 0$, as an extreme case, the difference becomes proportional to the spatial derivative with its extremal values being at those frequencies where the magnitude of the sound pressure component has its steepest slope, i.e., in the vicinity of the local minima.
Even though this may seem to be the best choice, the absolute values of the radial pressure difference $\Delta a_{\mathrm{rad},m}$ would then be very small for all frequencies, which again implies a low robustness against reproduction errors. Therefore, a compromise needs to be found between the absolute values of $\Delta a_{\mathrm{rad},m}$ and the locations of the local minima by choosing proper values for $\Delta R$.

Finally, the magnitude of the Fourier coefficient $\Delta a_{\mathrm{tan},m}$ describing the tangential pressure difference on the contour is also plotted in Fig. 6 for $m = 1$. In contrast to the radial pressure difference, the tangential pressure difference defined in (22) typically is very small. The main reason is that the scaling factor $\exp(i m \Delta\phi) - 1$ generally takes on very low absolute values, unless very high degrees $m$ or a large spacing $\Delta\phi$ are considered. Since $\Delta\phi$ must be chosen sufficiently small in order to approximate the spatial derivative well, the magnitude of the scaling factor will generally be much smaller than one. The components $\Delta a_{\mathrm{tan},m}$ representing the tangential pressure difference are therefore also much smaller than the components $a_m$ for the pressure itself. Moreover, the local minima of $\Delta a_{\mathrm{tan},m}$ coincide with the local minima of $a_m$, such that an optimization of the tangential pressure difference does not provide any benefit over the sound pressure itself. In fact, evoking the tangential pressure difference may even degrade the reproduction performance, as shown in Sec. V-C.

The analysis of the above example shows that it is beneficial to simultaneously control the sound pressure and the radial pressure difference. However, one may ask whether it is also sufficient to optimize only the sound pressure itself on two different radii, which should also (implicitly) control the radial pressure difference. The answer to that question can also be found in Fig.
6: For a frequency of $f_1 = 728$ Hz, as an example, the magnitude of $a_1$ representing sound pressure components for $m = 1$ on the contour is equally small for both radii $R_{\mathrm{in}}$ and $R_{\mathrm{out}}$, i.e., a local minimum of $\max\{|a_1(R_{\mathrm{in}})|, |a_1(R_{\mathrm{out}})|\}$ is obtained. In fact, the PM approach exhibits a local performance minimum for frequencies around $f_1$, as will be shown by simulation results in Sec. V-B. An explicit optimization of the radial pressure difference, on the other hand, does not suffer from this kind of performance drop, because the magnitude of the radial pressure difference component $\Delta a_{\mathrm{rad},1}$ is significantly larger than the corresponding magnitudes of the individual sound pressure components for the considered frequency. In other words, the performance of the PM approach is dictated by two pressure components of the same magnitude, whereas the JPVM/JPVM+ approach can also exploit and optimize a quantity whose absolute value is significantly larger due to the phase differences between the individual pressure components. Therefore, an explicit optimization of the radial pressure difference is more robust to errors than an implicit optimization. Nevertheless, the optimization of the sound pressure $a_m$ on two circles with different radii rather than a single circle is still beneficial for frequencies where the absolute value of the radial pressure difference $\Delta a_{\mathrm{rad},m}$ is smaller than the absolute values of both individual sound pressures themselves.

V. SIMULATIONS

The reproduction performance of PM, JPVM, and JPVM+ is now evaluated for a simulated free-field environment, where the simulation setup and the utilized performance measures are explained in Sec. V-A. Afterwards, a comparison of the three approaches is conducted in Sec. V-B, before evaluation results are shown in Sec.
V-C which illustrate how the optimization of the tangential pressure difference affects the reproduction performance. The impact of an explicit optimization of the radial component of the pressure difference is shown in Sec. V-D by means of synthesized sound fields, which also demonstrate the broadband behavior of JPVM+. Finally, it is investigated in Sec. V-E how inherent noise of the microphones, which are required to capture the Room Impulse Responses (RIRs) in practice, affects the reproduction performance.

A. Simulation Setup

For a validation of the analytical findings presented above, we simulate a 70-element rectangular loudspeaker array of dimensions 3.95 m × 3 m with an inter-element spacing of 9.6 cm, where the loudspeakers are modeled as ideal point sources. This setup allows for a direct comparison with the results of the original JPVM paper [22]. Two local listening areas are considered, whose centers are separated by a distance of 1 m along the $y$-axis and placed symmetrically w.r.t. the array center, as illustrated in Fig. 7. Each zone has a diameter of 0.6 m (implying that $R_{\mathrm{out}} = 0.3$ m), i.e., the edges of the bright zone $\mathcal{R}_{\mathrm{B}}$ and the dark zone $\mathcal{R}_{\mathrm{D}}$ are separated by 0.4 m. The radius of the inner circle of control points is chosen as $R_{\mathrm{in}} = 0.275$ m, which amounts to a radial spacing of $\Delta R = 2.5$ cm.

To allow for a fair performance comparison, the number of utilized control points is identical for all approaches and amounts to 48 per zone. That is, 24 pairs of control points are uniformly distributed around each local listening area when applying the improved version JPVM+ and PM. In case of the original version of JPVM, which requires more control points on the outer circle, 16 L-shaped groups consisting of three control points each (cf. Fig. 3) are utilized for each zone, where both the radial and tangential spacing is 2.5 cm.
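The pairwise control-point layout used for PM and JPVM+ can be sketched as follows. This is a minimal illustration of the stated geometry only: the starting angle, the choice of the array center as coordinate origin, and the function name `control_point_pairs` are assumptions, and the L-shaped groups of the original JPVM are not covered.

```python
import math


def control_point_pairs(center, r_in=0.275, r_out=0.3, n_pairs=24):
    """Return n_pairs (inner, outer) control-point pairs, uniformly spaced in
    angle around a circular zone centered at `center` (x, y), all in meters."""
    cx, cy = center
    pairs = []
    for i in range(n_pairs):
        phi = 2.0 * math.pi * i / n_pairs  # assumed start at phi = 0
        c, s = math.cos(phi), math.sin(phi)
        inner = (cx + r_in * c, cy + r_in * s)
        outer = (cx + r_out * c, cy + r_out * s)
        pairs.append((inner, outer))
    return pairs


# Zone centers 1 m apart along y, symmetric about the array center (origin);
# the bright zone is the upper one.
bright = control_point_pairs((0.0, 0.5))
dark = control_point_pairs((0.0, -0.5))
n_points_per_zone = 2 * len(bright)
print(n_points_per_zone)
```

Each pair lies on a common ray through the zone center, so the two points of a pair are separated by exactly the radial spacing $\Delta R = R_{\mathrm{out}} - R_{\mathrm{in}}$.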
The relative weight $\kappa$ for the sound pressure and particle velocity in (13) is set to 0.04, and the minimization problem is solved using the Moore–Penrose pseudoinverse, where the regularization parameter is iteratively increased until the Loudspeaker Weight Energy (LWE) [29] falls below the specified upper limit of $10/N_{\mathrm{L}}$. As shown in the appendix, the chosen constraint on the LWE is comparable to a White Noise Gain (WNG) of $10 \log_{10}(0.1\, N_{\mathrm{L}})$. For comparison, a WNG of $10 \log_{10}(N_{\mathrm{L}})$ corresponds to a delay-and-sum beamformer [38], which is the most robust beamformer; its WNG is 10 dB larger than the one obtained here.

The target sound field in the upper, bright zone is given by a plane wave originating from direction $\phi_{\mathrm{src}} = -50^\circ$. This constitutes an especially difficult scenario as the effective distance between the zones as 'observed' from the source direction $\phi_{\mathrm{src}}$ is close to zero. In other words, the acoustic energy needs to be focused very sharply and its level needs to drop rapidly in the lateral direction in order not to travel through the dark zone. Note that the plane-wave front defining the target sound field in the bright zone does not have a magnitude of one, but it is scaled such that it corresponds to the magnitude which would be obtained with a point source located at a distance of $\bar{r} \approx 2$ m from the center of $\mathcal{R}_{\mathrm{B}}$, where $\bar{r}$ is the average distance from all loudspeakers to the zone center. If this scaling were not applied, the prefilters would need to compensate for the distance-dependent attenuation of the spherical waves emitted by the loudspeakers and, thus, have an inappropriately high energy. Furthermore, the above choice allows for directly relating the LWE to the WNG [39] (see Appendix). All simulations are conducted using a sampling frequency of $f_{\mathrm{s}} = 8$ kHz, and a filter length $L = 256$ is chosen for the loudspeaker prefilters.
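The idea of raising the regularization until the LWE limit is met can be sketched at a single frequency as follows. This is an illustration under stated assumptions, not the implementation used for the simulations: it uses Tikhonov-regularized normal equations in place of a regularized pseudoinverse routine, a random matrix as a stand-in for the transfer matrix, and a hypothetical starting value and growth factor for the regularization weight. It relies on NumPy.

```python
import numpy as np


def solve_with_lwe_limit(G, h_des, lwe_limit, lam0=1e-8, growth=2.0, max_iter=200):
    """Tikhonov-regularized least squares; the regularization weight is raised
    until the weight energy ||w||^2 does not exceed lwe_limit."""
    n_l = G.shape[1]
    gram = G.conj().T @ G
    rhs = G.conj().T @ h_des
    lam = lam0
    for _ in range(max_iter):
        w = np.linalg.solve(gram + lam * np.eye(n_l), rhs)
        lwe = float(np.real(np.vdot(w, w)))  # vdot conjugates its first argument
        if lwe <= lwe_limit:
            return w, lam, lwe
        lam *= growth
    raise RuntimeError("LWE limit not reached within max_iter iterations")


rng = np.random.default_rng(0)
N_L, N_CP = 70, 96  # loudspeakers; control points (48 per zone, two zones)
# Random stand-in for the transfer matrix, NOT the free-field model of the paper
G = rng.standard_normal((N_CP, N_L)) + 1j * rng.standard_normal((N_CP, N_L))
# Toy target: unit pressure at bright-zone points, zero at dark-zone points
h_des = np.concatenate([np.ones(48), np.zeros(48)]).astype(complex)

w, lam, lwe = solve_with_lwe_limit(G, h_des, lwe_limit=10.0 / N_L)
wng_db = 10.0 * np.log10(1.0 / lwe)        # WNG is approximately 1 / LWE (see Appendix)
wng_floor_db = 10.0 * np.log10(0.1 * N_L)  # value implied by the LWE limit
print(lam, lwe, wng_db, wng_floor_db)
```

With $N_{\mathrm{L}} = 70$ the limit $10/N_{\mathrm{L}}$ translates into a WNG of at least $10 \log_{10}(7)$ dB, which is 10 dB below the $10 \log_{10}(70)$ dB of a delay-and-sum beamformer.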
As a first objective performance measure for the different reproduction algorithms, the Mean Squared Error (MSE) of the overall system transfer function w.r.t. the sound pressure is evaluated at $N_{\mathrm{grid}} = 441$ grid points in each zone,

\[
\mathrm{MSE}_{\{\mathrm{B},\mathrm{D}\}}(\omega) = \frac{1}{N_{\mathrm{grid}}} \sum_{q=1}^{N_{\mathrm{grid}}} \left| H_{\mathrm{des}}\!\left(\vec{x}_q^{\{\mathrm{B},\mathrm{D}\}}, \omega\right) - \mathbf{g}^{\mathrm{T}}\!\left(\vec{x}_q^{\{\mathrm{B},\mathrm{D}\}}, \omega\right) \mathbf{w}(\omega) \right|^2,
\tag{23}
\]

where $\vec{x}_q^{\mathrm{B}} \in \mathcal{R}_{\mathrm{B}}$, $\vec{x}_q^{\mathrm{D}} \in \mathcal{R}_{\mathrm{D}}$, and the $x$- and $y$-spacing between the evaluation grid positions is 2 cm. Furthermore, we evaluate the average level difference $\Delta L$ between the bright zone and the dark zone,

\[
\Delta L(\omega) = 10 \log_{10}\!\left( \frac{E_{\mathrm{B}}(\omega)}{E_{\mathrm{D}}(\omega)} \right) \mathrm{dB},
\tag{24}
\]

where the average energies for $\mathcal{R}_{\mathrm{B}}$ and $\mathcal{R}_{\mathrm{D}}$ are computed as

\[
E_{\{\mathrm{B},\mathrm{D}\}}(\omega) = \frac{1}{N_{\mathrm{grid}}} \sum_{q=1}^{N_{\mathrm{grid}}} \left| \mathbf{g}^{\mathrm{T}}\!\left(\vec{x}_q^{\{\mathrm{B},\mathrm{D}\}}, \omega\right) \mathbf{w}(\omega) \right|^2
\tag{25}
\]

at the same grid positions.

FIG. 7: Schematic illustration of the reproduction setup with a plane-wave front originating from direction $\phi_{\mathrm{src}}$ relative to the center of the bright zone $\mathcal{R}_{\mathrm{B}}$. [Sketch labels: array dimensions 3.95 m × 3 m; zones $\mathcal{R}_{\mathrm{D}}$ and $\mathcal{R}_{\mathrm{B}}$ of diameter 0.6 m, centers 1 m apart, edges 0.4 m apart; axes $x$ and $y$.]

B. Performance Comparison for PM, JPVM, and JPVM+

As a first investigation, we want to compare the performance obtained with PM [8], the original JPVM approach [22], and the modified version JPVM+ presented here. As mentioned above, L-shaped groups of control points are used for JPVM, whereas the control points are arranged in pairs in case of PM and JPVM+. Even though this implies that the setup is not identical for all approaches, the number of utilized control points is the same such that the comparison is still fair. The MSE inside the local listening areas obtained for the three approaches is shown as a function of frequency in Fig. 8(a).
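For reference, the performance measures (23) to (25) can be sketched for a single frequency as shown below. The reproduced values stand for the products $\mathbf{g}^{\mathrm{T}} \mathbf{w}$ at the grid points; the toy numbers are made up for illustration and are not simulation data, and the function names are hypothetical.

```python
import math


def mse(h_des, h_rep):
    """Mean squared error between desired and reproduced transfer functions, cf. (23)."""
    return sum(abs(d - r) ** 2 for d, r in zip(h_des, h_rep)) / len(h_des)


def mean_energy(h_rep):
    """Average energy over the grid points of one zone, cf. (25)."""
    return sum(abs(r) ** 2 for r in h_rep) / len(h_rep)


def level_difference_db(h_bright, h_dark):
    """Level difference between bright and dark zone in dB, cf. (24)."""
    return 10.0 * math.log10(mean_energy(h_bright) / mean_energy(h_dark))


# Toy values at one frequency and two grid points per zone (illustrative only)
desired_bright = [1.0 + 0.0j, 0.0 + 1.0j]
reproduced_bright = [0.9 + 0.0j, 0.0 + 1.1j]
reproduced_dark = [0.1 + 0.0j, 0.0 - 0.1j]

mse_bright = mse(desired_bright, reproduced_bright)
delta_l = level_difference_db(reproduced_bright, reproduced_dark)
print(10.0 * math.log10(mse_bright), delta_l)
```

Evaluating these functions per frequency bin and plotting the results in dB gives curves of the kind discussed in this section.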
As can be seen in Fig. 8(a), the performance of PM oscillates very strongly in the frequency range between about 250 Hz and 1250 Hz, which is attributed to the ill-conditioning stemming from the low observability of certain modes at particular frequencies (cf. Sec. IV). For comparison purposes, the frequencies $f_m$ corresponding to the respective first local minimum of the magnitude of the Fourier coefficient $a_m$ are indicated with arrows for $m = 0, \ldots, 3$. These frequencies match the local MSE maxima very well. When optimizing the entire particle velocity vector in addition to the sound pressure, as is the case in JPVM, the oscillation of the MSE curve can be significantly reduced, which implies a much more uniform performance and suggests less coloration of the perceived signal. However, the MSE can be reduced even further for almost all frequencies when utilizing JPVM+. This is due to the fact that the optimization of the tangential pressure difference in JPVM is very prone to (even tiny) errors, which cannot be avoided in practice.

Especially in the bright zone, the performance towards higher frequencies clearly decreases for all approaches. Additional investigations showed that this can be mitigated by reducing the spacing between the loudspeakers, which leads to a higher spatial aliasing frequency, or by relaxing the constraint on the LWE. However, higher values of the LWE imply a lower robustness against errors typically occurring in practice, such as transducer noise, positioning errors, or variations of the properties of real loudspeakers.

FIG. 8: MSE inside (a) and at the control points around (b) the bright zone $\mathcal{R}_{\mathrm{B}}$ and dark zone $\mathcal{R}_{\mathrm{D}}$ for PM (dotted line), JPVM (dash-dotted line), and JPVM+ (solid line). [Plots of MSE in dB over $f$ in kHz (0 to 4 kHz), with $f_0$ to $f_3$ marked; panels: (a) interior of local listening areas, (b) control points on contour.]

For completeness, the MSE for the control points themselves (i.e., the contour) is also shown in Fig. 8(b). Note that the MSE values are again computed according to (23). That is, the MSE only captures the sound pressure and not the particle velocity vector. As expected, the MSE is lowest in case of PM, where all degrees of freedom are used to optimize the sound pressure at the control points. Interestingly, the MSE at the control points has local minima at frequencies where the MSE inside the local listening area exhibits local maxima. This again indicates the problem that a low observability of the sound pressures on the contours, which form the basis for the optimization problem of PM, may lead to large reproduction errors in the interior of the zone. Evoking the sound pressure on the contour is thus not sufficient to infer the sound field inside the contour. In case of JPVM and JPVM+, the error of the sound pressure at the control points for frequencies below about 1.5 kHz is larger compared to PM, as some degrees of freedom are used to optimize the particle velocity (vector). Nevertheless, they result in a better performance inside the listening areas, as shown in Fig. 8(a).

FIG. 9: Level difference $\Delta L$ between the bright zone $\mathcal{R}_{\mathrm{B}}$ and the dark zone $\mathcal{R}_{\mathrm{D}}$ for PM (dotted line), JPVM (dash-dotted line), and JPVM+ (solid line). [Plot of $\Delta L$ in dB over $f$ in kHz (0 to 4 kHz).]

Figure 8(a) may suggest that the benefit of JPVM+ over PM is rather low as the absolute MSE values in both zones are very low for all approaches.
However, it is not sufficient to only assess the MSE, but it is also necessary to evaluate the achieved level difference $\Delta L$ between the two local listening areas, which is illustrated in Fig. 9. Again, the PM approach results in strong oscillations of the performance measure, where the level difference drops from 20 dB to almost 10 dB when increasing the frequency from 250 Hz to 450 Hz. This implies that the acoustic scene to be synthesized in the bright zone does not leak into the dark zone equally strongly for all frequencies, but certain frequencies produce more leakage than others. The level difference achieved with JPVM+, in contrast, is much more uniform and also clearly larger in the entire frequency range. This indicates that not only the overall energy in the dark zone is lower in case of JPVM+, but the arriving signals also have less coloration. The original JPVM approach ranges mostly between PM and JPVM+, and even though the curve is comparably smooth, the absolute level differences in certain frequency ranges are even lower than those of PM.

C. Impact of the Tangential Component of the Particle Velocity Vector

As the control points for JPVM and JPVM+ need to be arranged differently, we now want to evaluate a single setup and investigate in an isolated manner how the optimization of the tangential particle velocity vector component impairs the reproduction performance. For this purpose, we again consider the already evaluated setup for JPVM with 16 L-shaped groups of control points being distributed on the contour around each zone. The resulting frequency-dependent MSE obtained with JPVM is again shown in Fig. 10, which also contains the MSE obtained when excluding the tangential component of the particle velocity vector, i.e., when optimizing the sound pressure at all available control points and the radial component only.
Note that the latter case does not correspond to JPVM+, as the control points are still arranged in L-shaped groups, and the number of control points on the outer circle is twice as large as for the inner circle. As can be seen in Fig. 10, the MSE is almost consistently lower if the tangential component is not optimized. Only for very few frequencies, a negligible performance degradation can be observed. These results confirm the insights obtained by the theoretical analysis presented in Sections IV-C and IV-D, and they show that the tangential component of the particle velocity vector should not be optimized, as doing so may generally reduce the reproduction performance.

For completeness, the MSE obtained when excluding the radial component, i.e., when only optimizing the sound pressure at all available control points and the tangential component only, is also plotted in Fig. 10. This is to show that the MSE reduction obtained when neglecting the tangential component does not stem from the reduced complexity of the resulting optimization problem, and it furthermore confirms the efficacy of optimizing the radial component.

FIG. 10: MSE inside the bright zone $\mathcal{R}_{\mathrm{B}}$ and dark zone $\mathcal{R}_{\mathrm{D}}$ for the original JPVM approach (dash-dotted line), the JPVM approach without the optimization of the tangential component of the particle velocity vector (solid line), and the JPVM approach without the optimization of the radial component of the particle velocity vector (dashed line). [Plot of MSE in dB over $f$ in kHz (0 to 4 kHz).]

D. Impact of the Radial Component of the Particle Velocity Vector

In addition to the objective performance measure evaluated above, we now want to illustrate the benefit of explicitly optimizing the radial component of the particle velocity vector by means of visualizations of the sound fields synthesized by PM and JPVM+. Note that the illustrated sound fields are normalized in order to allow for a better assessment. To demonstrate the broadband behavior of JPVM+, a von Hann impulse synthesized as a plane-wave front is shown in Fig. 11 for two time instants. The time instants are chosen such that a virtual plane perpendicular to the propagation direction of the wave front is at the center of the respective local listening area. It can be seen that the acoustic wave travels nicely around the lower, dark zone, where only the edge of the wave front leaks slightly into the interior of the zone (a). At the center of the upper, bright zone (b), a plane-wave front of desired orientation is synthesized. We do not show the corresponding plots of PM here, as the differences for these two time instants are rather hard to see when a short impulse is reproduced. Instead, the sound fields are illustrated for a frequency of 450 Hz and 700 Hz in Fig. 12. These frequencies are close to $f_0$ and $f_1$, which correspond to local minima of the magnitudes of the Fourier coefficients $a_0$ and $a_1$, respectively, representing the respective component of the sound pressure on the contour (see (20)).

FIG. 11: Reproduction of a plane-wave front for two different time instants using JPVM+. [Sound field plots over $x$ in m and $y$ in m, color scale from −20 dB to 0 dB; panels: (a) wave front at center of dark zone, (b) wave front at center of bright zone.]
Especially in the dark zone, it can be seen that the modes for $m = 0$ (a) and $m = 1$ (b) are erroneously excited in the interior of the zones when using PM. That is, a large sound energy is present in the dark zone at these frequencies, whereas the desired wave front in the bright zone is clearly distorted. The JPVM+ approach, on the other hand, only slightly excites the modes for $m = 0$ in the dark zone (see Fig. 12(c)), and the modes for $m = 1$ cannot be observed at all (see Fig. 12(d)). Similarly, the wave front in the bright zone is much less distorted for both frequencies.

FIG. 12: Reproduction of a plane-wave front of frequency 450 Hz and 700 Hz using PM and JPVM+. [Sound field plots over $x$ in m and $y$ in m, color scale from −20 dB to 0 dB; panels: (a) PM: 450 Hz, (b) PM: 700 Hz, (c) JPVM+: 450 Hz, (d) JPVM+: 700 Hz.]

E. Sensor Noise

All previous simulations are based on the free-field assumption and it is assumed that the acoustic transfer functions are perfectly known. In practice, however, the reproduction system will typically be installed in a closed room and the sound waves emitted by the loudspeakers undergo reflections. In order to compensate for these undesired reflections, the acoustic properties of the room need to be captured, possibly in an adaptive manner. The topic of listening room equalization [40–44] is not in the scope of this work and we would like to refer to the literature here [45]. However, we still want to investigate how the utilization of real microphones for identifying the RIRs at the control points affects the reproduction performance.
In addition to the inherent noise of microphones, RIR measurements typically also suffer from positioning errors and mismatches between the characteristics of the individual microphones. According to the work by Cox et al. [39] and Teutsch [46], these imperfections have a similar effect as spatially white noise. To assess the impact on the performance of PM, JPVM, and JPVM+, we therefore add spatially uncorrelated white Gaussian noise sequences to the clean free-field RIRs corresponding to $\mathbf{G}$ in (13), where different Signal-to-Noise Ratios (SNRs) are considered. The free-field RIRs are of length 128 samples and the SNRs are determined based on the energies of the clean RIRs and the noise sequences.

The broadband reproduction error (MSE) in the bright zone and the broadband level difference $\Delta L$ obtained from averaging (23) and (24), respectively, are listed in Table I for the three approaches. The table contains results for SNR values of 10 dB, 20 dB, 30 dB, and 60 dB, where only frequencies above 100 Hz are considered for evaluation. As expected, decreasing the SNR results in an increase of the MSE values and a reduction of the level difference. In case of PM, the MSE in the bright zone degrades by less than 0.5 dB when the SNR is reduced from 60 dB to 10 dB. Due to the exploitation of the pressure differences, both JPVM and JPVM+ are more susceptible to sensor noise, where the MSE in the bright zone increases by more than 1 dB for both approaches. Nevertheless, the absolute MSE values are always lowest when utilizing JPVM+. The impact of sensor noise on the level difference $\Delta L$ is most pronounced in case of JPVM+, where a reduction of the SNR from 60 dB to 10 dB results in a degradation by more than 2.5 dB. A slightly lower reduction of $\Delta L$ can be observed for JPVM, whereas the decrease for PM is below 1 dB.
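The noise injection described above can be sketched as follows. This is a minimal illustration assuming that the SNR is defined per RIR as the ratio of the clean RIR energy to the noise sequence energy; the toy impulse response, the seed, and the function name `add_noise_at_snr` are hypothetical. Drawing a fresh noise sequence for every RIR from the same generator yields mutually uncorrelated sequences.

```python
import math
import random


def add_noise_at_snr(rir, snr_db, rng):
    """Add white Gaussian noise to one RIR so that the energy ratio between the
    clean RIR and the noise sequence equals snr_db."""
    noise = [rng.gauss(0.0, 1.0) for _ in rir]
    e_sig = sum(x * x for x in rir)
    e_noise = sum(n * n for n in noise)
    # Scale the noise so that e_sig / (gain^2 * e_noise) = 10^(snr_db / 10)
    gain = math.sqrt(e_sig / (e_noise * 10.0 ** (snr_db / 10.0)))
    return [x + gain * n for x, n in zip(rir, noise)]


rng = random.Random(1)
# Toy RIR of 128 samples: a single attenuated, delayed impulse (illustrative only)
clean = [0.0] * 128
clean[20] = 0.04
noisy = add_noise_at_snr(clean, 20.0, rng)

e_clean = sum(x * x for x in clean)
e_diff = sum((a - b) ** 2 for a, b in zip(noisy, clean))
measured_snr_db = 10.0 * math.log10(e_clean / e_diff)
print(measured_snr_db)
```

Because the noise is rescaled after it has been drawn, the target SNR is met for each realization rather than only on average.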
However, JPVM+ achieves the highest absolute level differences throughout the entire SNR range. These results indicate that sensor noise should not be a severe problem in practice. It is worth noting that, although the average gain in level difference provided by JPVM+ relative to PM is only about 2 dB at most, the curves in Figures 8 and 9 indicate that JPVM+ introduces much less coloration, which is an important aspect when it comes to the perceived reproduction quality.

TABLE I: Impact of sensor noise on the reproduction accuracy in the bright zone (MSE) and the level difference $\Delta L$ between the bright zone and the dark zone when using PM, JPVM, and JPVM+.

| SNR in dB | PM: ΔL | PM: MSE | JPVM: ΔL | JPVM: MSE | JPVM+: ΔL | JPVM+: MSE |
| 10 | 12.3 | -34.3 | 10.9 | -33.7 | 12.7 | -34.6 |
| 20 | 13.0 | -34.7 | 12.7 | -34.9 | 14.8 | -35.8 |
| 30 | 13.1 | -34.7 | 12.9 | -35.2 | 15.2 | -36.1 |
| 60 | 13.1 | -34.8 | 13.0 | -35.2 | 15.3 | -36.2 |

VI. CONCLUSION

In this paper, a recently proposed method for multizone sound reproduction, referred to as JPVM, is analyzed and further developed to yield the improved JPVM+ method. The core idea of JPVM is to independently control the sound field in different zones by jointly optimizing the sound pressure and particle velocity vector on the surrounding contours. The analysis of JPVM presented in this work utilizes spherical harmonics to describe the sound field and provides fundamental insights into its behavior. First of all, it illustrates that the optimization of the tangential component of the particle velocity vector is very prone to errors, and it may even degrade the reproduction performance. Therefore, JPVM+, an improved version of JPVM, is proposed which only considers the radial component of the particle velocity vector. As a result, the computational complexity is reduced while the reproduction performance is increased, as is verified by simulation results.
The simulations also reveal that the error of the sound pressure on the contour is not a reliable measure for inferring the error in the interior. Finally, it is shown that sensor noise does not have a significant impact on the reproduction performance of JPVM+ despite relying on pressure differences.

ACKNOWLEDGMENTS

The authors would like to express their sincere gratitude to Gary Elko, Thushara Abhayapala, Rudolf Rabenstein, Filippo Fazi, and Buye Xu for their helpful suggestions and fruitful discussions.

APPENDIX: RELATION BETWEEN WHITE NOISE GAIN AND LOUDSPEAKER WEIGHT ENERGY

In this section, we want to relate the Loudspeaker Weight Energy (LWE) [29] to the White Noise Gain (WNG) [39], which is a well-known and commonly used robustness measure in beamforming. The WNG is defined as the Signal-to-Noise Ratio (SNR) obtained for a particular 'target direction' in the presence of spatially white transducer noise of unit variance. In the context of multizone sound rendering, not only the direction, but also the distance is crucial, such that the target direction becomes a 'target position'. In fact, it is only meaningful to compute the WNG for target positions $\vec{x}^{\mathrm{B}}$ in the bright zone $\mathcal{R}_{\mathrm{B}}$, as the desired signal in the dark zone $\mathcal{R}_{\mathrm{D}}$ is typically zero which, in the ideal case, would imply an SNR of $-\infty$ dB.

To establish a relation between the WNG and the LWE, let us consider the optimization problem for pressure matching (4) with an additional constraint on the LWE,

\[
\min_{\mathbf{w}(\omega)} \left\| \mathbf{G}(\omega)\, \mathbf{w}(\omega) - \mathbf{h}^{\mathrm{p}}_{\mathrm{des}}(\omega) \right\|_2^2 \quad \text{s.t.} \quad \left\| \mathbf{w}(\omega) \right\|_2^2 \leq \alpha,
\tag{26}
\]

where $\alpha$ is the upper bound for the LWE.
The target sound field in the bright zone is chosen as

\[
H_{\mathrm{des}}\!\left(\vec{x}^{\mathrm{B}}, \omega\right) = \frac{1}{4\pi \left\| \vec{y} - \vec{x}^{\mathrm{B}} \right\|_2}\, e^{-i k \left\| \vec{y} - \vec{x}^{\mathrm{B}} \right\|_2},
\tag{27}
\]

which corresponds to the sound pressure that would be obtained with a single loudspeaker located at position $\vec{y}$. For simplicity, we assume in the following that the bright zone is located at the center of a circular loudspeaker array of radius $r_0$. Furthermore, it is assumed that $\|\vec{y}\|_2 = r_0$ and that the bright zone is sufficiently small relative to the array radius $r_0$, i.e., the distance-dependent attenuation of the spherical waves emitted by the loudspeakers is approximately identical for all $\vec{x}^{\mathrm{B}}$. Given that the optimization problem in (26) is solved reasonably well, the overall transfer function of the reproduction system w.r.t. the sound pressure at the target positions in the bright zone may be approximated as

\[
\mathbf{g}^{\mathrm{T}}\!\left(\vec{x}^{\mathrm{B}}, \omega\right) \mathbf{w}(\omega) \approx \frac{1}{4\pi r_0}\, e^{-i k \left\| \vec{y} - \vec{x}^{\mathrm{B}} \right\|_2}.
\tag{28}
\]

Note that, due to the assumption that the zone size is small relative to $r_0$, the level within the bright zone is approximately constant.

To specify the WNG, we also need to determine the noise signal $N(\vec{x}^{\mathrm{B}}, \omega)$ at the target position $\vec{x}^{\mathrm{B}}$ as resulting from spatially white transducer noise. Denoting the spatially white noise signal originating from the $l$-th loudspeaker as $N_l(\omega)$, $l = 1, \ldots, N_{\mathrm{L}}$, the resulting noise signal $N(\vec{x}^{\mathrm{B}}, \omega)$ is given by (cf. (2))

\[
N\!\left(\vec{x}^{\mathrm{B}}, \omega\right) = \sum_{l=1}^{N_{\mathrm{L}}} G\!\left(\vec{x}^{\mathrm{B}} \,\middle|\, \vec{y}_l, \omega\right) W_l(\omega)\, N_l(\omega) = \mathbf{w}^{\mathrm{T}}(\omega)\, \mathbf{n}\!\left(\vec{x}^{\mathrm{B}}, \omega\right),
\tag{29}
\]

where $\mathbf{n}(\vec{x}^{\mathrm{B}}, \omega) = \left[ G(\vec{x}^{\mathrm{B}} | \vec{y}_1, \omega)\, N_1(\omega), \ldots, G(\vec{x}^{\mathrm{B}} | \vec{y}_{N_{\mathrm{L}}}, \omega)\, N_{N_{\mathrm{L}}}(\omega) \right]^{\mathrm{T}}$ is a vector capturing the contributions of each loudspeaker to the noise field at $\vec{x}^{\mathrm{B}}$.
According to its definition, the WNG can be expressed as

\[
\mathrm{WNG}\!\left(\vec{x}^{\mathrm{B}}, \omega\right)
= \frac{\left| \mathbf{g}^{\mathrm{T}}\!\left(\vec{x}^{\mathrm{B}}, \omega\right) \mathbf{w}(\omega) \right|^2}{\mathcal{E}\!\left\{ N\!\left(\vec{x}^{\mathrm{B}}, \omega\right) N^{*}\!\left(\vec{x}^{\mathrm{B}}, \omega\right) \right\}}
\approx \frac{\dfrac{1}{(4\pi r_0)^2}}{\mathbf{w}^{\mathrm{T}}(\omega)\, \underbrace{\mathcal{E}\!\left\{ \mathbf{n}\!\left(\vec{x}^{\mathrm{B}}, \omega\right) \mathbf{n}^{\mathrm{H}}\!\left(\vec{x}^{\mathrm{B}}, \omega\right) \right\}}_{\mathbf{R}(\vec{x}^{\mathrm{B}}, \omega)}\, \mathbf{w}^{*}(\omega)},
\tag{30}
\]

where $\mathcal{E}\{\cdot\}$ is the expectation operator and the superscript $(\cdot)^{*}$ denotes complex conjugation. As the noise signals $N_l$ are spatially white, the correlation matrix $\mathbf{R}$ is diagonal and contains the squared magnitudes of the individual Green's functions. Due to the assumption that these magnitudes are approximately equal for all target positions, the correlation matrix can be approximated as

\[
\mathbf{R}\!\left(\vec{x}^{\mathrm{B}}, \omega\right) \approx \frac{1}{(4\pi r_0)^2}\, \mathbf{I}_{N_{\mathrm{L}}},
\tag{31}
\]

with $\mathbf{I}_{N_{\mathrm{L}}}$ being an identity matrix of dimensions $N_{\mathrm{L}} \times N_{\mathrm{L}}$. Inserting (31) into (30) finally yields

\[
\mathrm{WNG}\!\left(\vec{x}^{\mathrm{B}}, \omega\right) \approx \frac{\dfrac{1}{(4\pi r_0)^2}}{\dfrac{1}{(4\pi r_0)^2}\, \mathbf{w}^{\mathrm{T}}(\omega)\, \mathbf{w}^{*}(\omega)} = \frac{1}{\left\| \mathbf{w}(\omega) \right\|_2^2}.
\tag{32}
\]

That is, the WNG can be approximated by the inverse of the LWE. Note that (32) is only valid if the target sound field for the bright zone is specified such that the distance-dependent attenuation resulting from the Green's functions is considered. If the target sound field in (27) were defined with a magnitude of one, the WNG in (32) would be increased by a factor of $(4\pi r_0)^2$. It shall finally be noted that, even though the distances between all loudspeakers and target positions are not identical in practice, (32) can still be used to obtain an approximate relation between the WNG and the LWE.

[1] A. J. Berkhout, D. de Vries, and P. Vogel, “Acoustic control by wave field synthesis,” J. Acoust. Soc. Am. (JASA), vol. 93, no. 5, pp. 2764–2778, 1993.
[2] S. Spors, R. Rabenstein, and J. Ahrens, “The Theory of Wave Field Synthesis Revisited,” in 124th Audio Eng. Soc. (AES) Conv., (Amsterdam), May 2008. Paper Number: 7358.
[3] J. Ahrens, R. Rabenstein, and S.
Spors, “Sound ﬁeld synthe sis f or audio present ation, ” Acous t. T oday , vo l. 10, no. 2, pp. 15–25, 2014. [4] M. A. Gerzon, “ Ambisoni cs in m ultic hanne l broa dcas ting and video, ” J. Audio Eng. Soc. (J AES) , vol . 33, no. 11, pp. 859–8 71, 1985. [5] D. B . W ard and T . D. Abha yap ala, “Repro ductio n of a plane-w av e sound ﬁeld using an ar ray of loudspe ak ers, ” IEEE T r ans. Speec h Audio Proces s. , vo l. 9, pp. 697–707, Sep 2001. [6] J. Ahrens and S. S pors, “ Anal ytical driving functio ns f or highe r order ambisonic s, ” in Pr oc. IEEE Int. C onf. Acou stic s, Speec h, Signal Proces s. (ICASSP) , pp. 373–376, IEEE, 2008. [7] J.- W . C hoi and Y .-H. K im, “Generation of an acoust icall y br ight zone with an illuminated region using multiple sources, ” J. Acou st. Soc. Am. (J ASA) , vo l. 111, no. 4, pp. 1695–170 0, 2002. [8] M. Po letti, “ An in ves tigati on of 2-D multizone surround sound sys tems, ” in Pro c. 125th Audio Eng. Soc. (AES) Conv . , Oct. 2008. [9] M. S hin, S . Q . Lee, F . M. Faz i, P . A. Nelson, D. Kim, S . W ang, K . H. Park, and J. Seo, “Maximizat ion of acous tic ener gy diﬀeren ce betw een two spa ces, ” J. Acous t. Soc. Am . (J ASA) , v ol. 128, no. 1, pp. 121–131, 2010. [10] J. Ahrens and S. Spors, “ An anal ytical approac h to local sound ﬁ eld synthesi s using linear arra y s of louds peak ers, ” in Proc. IEEE Int. Conf. Ac ous tics, Speec h, Signal Proces s. (ICASSP) , pp. 65–68, IE E E, Ma y 2011. 37 [11] S. Spors and J. A hrens, “Local sound ﬁ eld synth esis by vir tual secondary sources, ” in Proc. 40t h A udio E ng. Soc. (AES) Conf.: Spatial Audio—Sense the Sound of Space , pp. 6–3, 2010. [12] Y . J. W u and T . D . Abha y apala , “Spatial multizone sound ﬁeld reprod uction : Theory and design , ” IEEE T rans . Audio, Speec h, Languag e P r ocess. , v ol. 19, no. 6, pp. 1711–172 0, 2011. [13] W . Jin, W . B. Klei jn, and D. 
Virette, “Multizone soundﬁeld reprodu ction using orthogonal basis e xpansion, ” in Proc . IEEE Int. C onf. Acous tics, Speec h, Signal Proces s. (ICASSP) , pp. 311–31 5, IEEE, 2013. [14] N. Radmanesh and I. S . Burnett, “Generat ion of isolated wideband sound ﬁelds using a combine d tw o-stag e lass o-ls algorithm, ” IEE E T r ans. Audio , Speec h, Languag e Pr ocess . , v ol. 21, pp. 378–387 , Feb. 2013. [15] P . C oleman, P . Jacks on, M. O lik, and J. A . Pedersen , “Optimizing the planarity of sound zones, ” in Proc. 52nd Audio Eng. Soc. (AES) Conf.: Sound F ield Contr ol—Engin eering and P er cep tion , 2013. [16] K. Helwani , S. Spors, and H. Buchne r , “The synthesi s of sound ﬁgures, ” Multidimensio nal Sy stems and Signal Proces sing , v ol. 25, no. 2, pp. 379–4 03, 2014. [17] W . Jin and W . B. Klei jn, “Multizone soundﬁeld reproduction in re v erberant rooms using compress ed sensin g techniq ues, ” in P r oc. IEEE Int. Conf. Acou stic s, Speec h, Signal Proces s. (ICASSP) , pp. 4728–47 32, IEEE , 2014. [18] Y . Cai, M. W u, and J. Y ang, “Sound reproductio n in person al audio sy ste ms using the leas t-sq uares appro ach with acoustic contras t control const raint, ” J. Acous t. Soc. Am. (J A SA) , v ol. 135, no. 2, pp. 734–741, 2014. [19] M. A. Poletti and F . M. Fazi , “ An approac h to gen eratin g two zones of silence with applicati on to persona l sound sy stems, ” J. Acou st. Soc. Am. (J ASA) , v ol. 137, no. 2, pp. 598–605, 2015 . [20] T . Betlehem, W . Zhang , M. A. Poletti , and T . D. Abha y apala, “Pe rsonal Sound Z ones: Deliv er ing interf ace-fre e audio to m ultiple listeners, ” IEEE Signal P roc. Mag. , v ol. 32, pp. 81– 91, March 2015. 38 [21] M. Buerg er , C. Hofmann, and W . Ke llermann, “The Impact of Loudspeak er A r ray Imper f ec- tions and Re v erberation on the Perf ormance of a Multizone Sound Field Synthesis Sy stem, ” in Proc . 3r d Int. Conf. Spati al A udio (ICSA) , S ept. 2015. [22] M. Buerg er , R. Maas, H. W . 
L öllmann, and W . Kellermann, “Multizone sound ﬁeld syn- thesis based on the joint optimizatio n of the sound pressure and particle velo city v ector on closed contour s, ” in Proc. IEEE W or kshop Application s Signal Proce ss. to Audio, Acous tics (W A SP AA) , Oct. 2015. [23] M. A. Pol etti and F . M. F azi, “Gene ration of half-spac e sou nd ﬁelds with applic ation to person al sound sy ste ms, ” J. Acous t. Soc. Am. (J ASA) , vol . 139, no. 3, pp. 1294–13 02, 2016. [24] F . Winter , N. Hahn, and S . Spors, “ Time-d omain realisation of model-based rendering f or 2.5 d local wa v e ﬁeld synthesis using spatial bandwidth-limit ation, ” in Proc. 25th E ur . Signal Pro cess. Conf. (EUSIPCO) , pp. 688–692 , IEEE, 2017. [25] N. Hahn, F . Winte r , and S. Spors, “Synthes is of a spatiall y band-limite d plane wa v e in the time-domai n using wa ve ﬁ eld synthes is, ” in Pro c. 25th Eur . Signal Pro cess. Conf. (EUSIPCO) , pp. 673–67 7, IEEE, 2017. [26] S. Spors, J. Ahrens, and K. Helwan i, “Local sound ﬁ eld synthes is b y vir tual acous tic scattering and time-rev ersal, ” in Proc. 131st A udio E ng. Soc. (A E S) C onv . , Oct 2011. [27] E. G. Williams, F ourier A cous tics: So und Radiation and N earﬁeld Acous tical Hologr aphy . A cademic Press, 1999. [28] G. H. Golub and C . F . V . Loan, Matrix Computation s . John s Hopkins Univ ersity P ress, third ed., 1996. [29] T . B etlehem and C. Withers , “Sound ﬁeld reprodu ction with ener gy cons traint on loudspeak er w eight s, ” IE E E T r ans. Audio, Speec h, L angua g e Proc ess. , v ol. 20, pp. 2388–239 2, O ct. 2012. [30] G. N. Lilis and D . Ange losante , “Sound ﬁeld repro ductio n using the lasso, ” IEEE T ra ns. Audio, Speec h, Languag e Pro cess. , v ol. 11, pp. 1902–191 2, No v . 2010. [31] L. L. Beranek, Acous tics . McGra w-Hill , 1954. 39 [32] P . G. Crav en and M. A. Gerzon, “Coincident m icroph one simulation co ve ring three dimen- sional space and yieldin g va rious directiona l outputs, ” Aug 1977. 
US Paten t 4,042,779. [33] E. B enjamin and T . Chen, “The Nativ e B-Format Microphon e, ” in Pro c. 119th Audio Eng. Soc. (AES) Conv . , Oct 2005. [34] H. T eutsch and G. W . Elk o, “First- and second-or der adapt iv e diﬀerential microphone arra y s, ” in Proc . Int. W orks hop Acous tic Ec ho Noi se Contr ol (IW AENC) , pp. 35–38, 2001. [35] D. Colton and R. Kress, Inv erse Acou stic and Electr omagnetic Scatteri ng Theory . Spr ing er , 2012. [36] F . M. Fazi and P . A. Nelson , “Nonu niq ueness of the solution of the sound ﬁeld reproduct ion proble m w ith boundary pressure control, ” Act a Acus tica united w ith Acu stic a , vo l. 98, no. 1, pp. 1–14, 2012. [37] F . M. Fazi and P . A. Nelson, “T he ill-conditi oning problem in sound ﬁeld recons tr uction, ” in Pro c. 123r d A udio E ng. Soc. (AES) C onv . , Oct 2007. [38] H. L . V . T rees, Dete ction, Estimatio n, and M odulation Theory, Optimum Array P r ocessi ng (P art IV) . John Wile y & S ons, 2002. [39] H. C o x, R. M. Zeskind, and T . Kooi j, “Practical super g ain, ” IEEE T r ans. Acou st., Speec h, Signa l Pr ocess. , v ol. 34, no. 3, pp. 393–398, 1986. [40] S. Goetze, M. Kalling er , A. Mertins, and K. D. Kamme y er , “Multi-c hanne l liste ning-r oom compens ation using a decoup led ﬁltered-x lms algor ithm, ” in Pr oc. 42nd A silomar C onf. Signa ls, Sy st., C omput. , pp. 811– 815, Oct 2008. [41] M. Schneid er and W . Kel lermann, “ A daptiv e list ening room equaliz ation using a scalable ﬁltering structure in thew a v e domain, ” in P roc. IEE E Int. Conf. Aco usti cs, Speec h, Signal Pro cess. (ICASSP) , pp. 13–16, March 2012. [42] M. Schneide r and W . Kel lermann, “Iterati v e dft-domain inv erse ﬁlter determ inatio n f or adapti v e listen ing room equa lizatio n, ” in Proc. Int. W orksho p Acous tic Ec ho Noise Contr ol (IW A ENC) , pp. 1–4, Sept 2012. 40 [43] D. S. T alag ala, W . Z hang, and T . D. 
Ab ha y apala, “Eﬃcient multi-ch annel adaptiv e room compens ation f or spatial soundﬁeld reproduct ion using a m odal decompositi on, ” IEEE T r ans. Audio, Speec h, Languag e Pro cess. , v ol. 22, pp. 1522–153 2, Oct 2014. [44] C. H ofmann, M. Guenther , M. Buerg er , and W . Kellermann, “Higher -order listeni ng room compens ation with additive compensation signals, ” in P roc. IEEE Int. Conf. A cous tics, Speec h, Signa l Pr ocess. (ICASSP) , March 2016. [45] M. Buerg er , C. Hofmann, C. Frank enbac h, and W . Kelle rmann, “Multizone sound reprod uc- tion in rev erberan t en vironments using an iterati v e least-s qua res ﬁ lter design method with a spatio temporal weig hting functio n, ” in Proc. IEEE W orkshop Applicati ons Signal Proces s. to Audio, Acou stic s (W ASP AA) , Oct. 2017. [46] H. T eutsc h, Modal A rr ay Signal Pr ocessi ng: Princip les and A pplicat ions of Acous tic W ave - ﬁeld Decompositio n (Lectur e No tes in Control and Inf orm ation Scienc es) . Spr ing er , 2007. 41
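As a numerical sanity check of the relation in (32), the sketch below compares the WNG evaluated directly from (30) with the inverse of the squared weight norm $1 / \left\| \mathbf{w}(\omega) \right\|_2^2$. Several assumptions are made purely for illustration: the loudspeaker weights are obtained from a plain minimum-norm least-squares pressure-matching fit (a stand-in, not the JPVM optimization of the paper), the noise signals $N_l$ are taken to have unit variance, and the geometry, frequency, and function names (`greens_function`, `exact_wng`) are hypothetical.

```python
import numpy as np

def greens_function(x, y, k):
    """Free-field Green's function exp(-i k d) / (4 pi d) with d = ||y - x||_2."""
    d = np.linalg.norm(np.asarray(y) - np.asarray(x), axis=-1)
    return np.exp(-1j * k * d) / (4.0 * np.pi * d)

def exact_wng(G, w):
    """WNG as in (30), assuming unit-variance spatially white noise N_l.

    G has shape (M, N_L) with G[m, l] = G(x_B,m | y_l); w has shape (N_L,).
    With white noise, R is diagonal, so w^T R w^* = sum_l |G_ml|^2 |w_l|^2.
    """
    signal_power = np.abs(G @ w) ** 2
    noise_power = (np.abs(G) ** 2) @ (np.abs(w) ** 2)
    return signal_power / noise_power

# Hypothetical setup (not taken from the paper).
c, f = 343.0, 1000.0
k = 2.0 * np.pi * f / c
r0, n_ls, zone_radius = 1.5, 40, 0.1

phi = 2.0 * np.pi * np.arange(n_ls) / n_ls
ls_pos = r0 * np.stack([np.cos(phi), np.sin(phi)], axis=1)        # (n_ls, 2)

# Target positions: the zone center plus 12 points on the zone contour.
psi = 2.0 * np.pi * np.arange(12) / 12
ring = zone_radius * np.stack([np.cos(psi), np.sin(psi)], axis=1)
targets = np.vstack([np.zeros((1, 2)), ring])                     # (13, 2)

# Virtual source on the array radius, between two loudspeakers (||y|| = r0).
phi_y = 0.3
y = r0 * np.array([np.cos(phi_y), np.sin(phi_y)])

G = greens_function(targets[:, None, :], ls_pos[None, :, :], k)   # (13, 40)
h_des = greens_function(targets, y, k)                            # target field (27)

# Stand-in for the optimization: minimum-norm least-squares pressure matching.
w, *_ = np.linalg.lstsq(G, h_des, rcond=None)

wng = exact_wng(G, w)
inverse_lwe = 1.0 / np.linalg.norm(w) ** 2
ratio = wng / inverse_lwe

print("WNG / (1 / ||w||^2) at the zone center:", ratio[0])
print("range over all target positions:", ratio.min(), ratio.max())
```

At the zone center all loudspeakers are exactly at distance $r_0$, so the ratio is expected to be one up to numerical precision. At the off-center positions the ratio deviates from one only through the small differences in distance, which is the sense in which (32) remains an approximate relation in practice.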
