Decompounding on compact Lie groups

Salem Said (1), Christian Lageman (2), Nicolas Le Bihan (1) and Jonathan H. Manton (3)

(1): GIPSA-Lab / CNRS, Grenoble, France; (2): Department of Electrical Engineering and Computer Science, Universite de Liege, Belgium; (3): Department of Electrical and Electronic Engineering, The University of Melbourne, Australia.

Salem.said@gipsa-lab.grenoble-inp.fr, christian.lageman@montefiore.ulg.ac.be, nicolas.le-bihan@gipsa-lab.grenoble-inp.fr, jmanton@unimelb.edu.au

Abstract

Noncommutative harmonic analysis is used to solve a nonparametric estimation problem stated in terms of compound Poisson processes on compact Lie groups. This problem of decompounding is a generalization of a similar classical problem. The proposed solution is based on a characteristic function method. The treated problem is important to recent models of the physical inverse problem of multiple scattering.

1 Introduction

This paper studies the following nonparametric estimation problem. Let $(X_n)_{n \geq 1}$ be i.i.d. $G$-valued random variables for some group $G$, and let $e$ denote the identity element of $G$. For example, $G$ might be the group of $3 \times 3$ orthogonal matrices, in which case each $X_n$ would be a random $3 \times 3$ orthogonal matrix and $e$ would be the $3 \times 3$ identity matrix. The process

$$Y(t) = \prod_{n=0}^{N(t)} X_n, \qquad X_0 = e,$$

where $N = (N(t))_{t \geq 0}$ is a Poisson process with parameter $\lambda > 0$, is called a $G$-valued compound Poisson process. If $G$ is not commutative, the above products are taken to be ordered from left to right, and $Y(t)$ is called a left compound Poisson process. It is assumed that the random variables $X_n$ and $N(t)$ are independent of each other and, for simplicity, that the Poisson parameter $\lambda$ is known. The general problem is to estimate the distribution of the $X_n$ given partial observations of one or more realisations of the compound Poisson process $Y(t)$.
Of specific interest is the case when multiple realisations of $Y(T)$ are available, for some fixed time instant $T > 0$.

The real numbers form a group, with addition being the group operation. Choosing $G$ to be this group results in the ordinary compound Poisson process $y(t) = \sum_{n=0}^{N(t)} x_n$, where $x_0 = 0$ and the $x_n$ for $n \geq 1$ are real-valued i.i.d. random variables. Estimating the distribution of the $x_n$ is known as decompounding and has been well studied [1, 2]. In the present paper, decompounding techniques are generalised to the case when $G$ is a noncommutative group. This generalisation is non-trivial and requires ideas from noncommutative harmonic analysis. Although group-valued compound Poisson processes were introduced by Applebaum in [3], the corresponding decompounding problem has not been addressed in generality before.

This paper contributes to the relatively recent trend of applying noncommutative harmonic analysis (i.e. harmonic analysis on groups) to estimation and inverse problems. It addresses a nonparametric estimation problem stated in terms of compound Poisson processes on compact Lie groups. We refer to this as the problem of decompounding on compact Lie groups, since it directly generalizes the classical problem of decompounding for scalar processes. This generalization is mathematically natural and is motivated by the physical inverse problem of multiple scattering. In particular, this paper also contributes to the modelling of multiple scattering using compound Poisson processes.

Compound Poisson processes model the accumulation of rare events. As such, scalar compound Poisson processes are important tools in queuing and traffic problems and in risk theory. The classical problem of decompounding arises in the context of these processes. A functional approach to this problem is given by Buchmann and Grübel [1].
A characteristic function method is studied by van Es et al. [2]. The applications of decompounding in queuing problems and risk theory are referenced in [1]. We generalize this problem by considering decompounding on compact Lie groups. We approach this new problem by using noncommutative harmonic analysis to generalize the above-mentioned method of [2].

The potential that noncommutative harmonic analysis holds for engineering problems is well illustrated in the book of Chirikjian and Kyatkin [4]. Its importance to nonparametric estimation stems from the fact that it leads to a successful generalization of the concept of characteristic function in probability. In mathematical research, this generalization was pioneered by Grenander [5] and extensively developed by Heyer [6]. It has received special attention in the engineering community. See Yazici [7] and the papers by Kim et al. [8, 9, 10, 11].

The paper is organized as follows. Section 2 sets down the necessary background in harmonic analysis and characteristic functions on compact Lie groups. Section 3 introduces compound Poisson processes on compact Lie groups. In Section 4 we state the decompounding problem for these processes and present our approach based on noncommutative harmonic analysis. In Section 5 we propose a model for multiple scattering based on compound Poisson processes on the rotation group $SO(3)$. Within this model, decompounding appears as a physical inverse problem. We apply our approach as described in Section 4 to this problem using numerical simulations.

2 Characteristic functions on compact Lie groups

Characteristic functions of scalar and vector-valued random variables are defined using the usual Fourier transform. Their extension to random variables with values in compact Lie groups relies on the tools of harmonic analysis on these groups.
Our presentation of characteristic functions is adapted from [5, 12]. Harmonic analysis on compact Lie groups is presented in more detail in recent papers [8, 7]. More thorough classical references include [13, 14].

Let $G$ be a compact connected Lie group with identity $e$. We denote by $\mu$ the biinvariant normalized Haar measure on $G$. The Hilbert spaces of square integrable (with respect to $\mu$) complex- and real-valued functions on $G$ are denoted $L^2(G, \mathbb{C})$ and $L^2(G, \mathbb{R})$.

A representation of $G$ is a continuous homomorphism $\pi : G \to GL(V)$ with $V$ a complex Hilbert space and $GL(V)$ the group of invertible bounded linear maps of $V$. It is called irreducible if any $G$-invariant subspace of $V$ is trivial, i.e. equals $\{0\}$ or $V$. Two representations $\pi_i : G \to GL(V_i)$, $i = 1, 2$, are called equivalent if there exists an invertible bounded linear map $L : V_1 \to V_2$ such that $L \circ \pi_1(g) = \pi_2(g) \circ L$ for all $g \in G$. Using this relation, the set of irreducible representations of $G$ is partitioned into equivalence classes.

The central result of harmonic analysis on compact groups is the Peter-Weyl theorem. For the current context, it can be stated as follows. Let $\mathrm{Irr}(G)$ be the set of equivalence classes of irreducible representations of $G$. $\mathrm{Irr}(G)$ is a countable set. If $\delta \in \mathrm{Irr}(G)$ then we have the two following facts. All representations of the class $\delta$ have the same finite dimension $d_\delta$. There exists in this class a unitary representation $U^\delta$. Choosing one such representation, we can suppose that $U^\delta : G \to U(\mathbb{C}^{d_\delta})$, with $U(\mathbb{C}^{d_\delta})$ the group of unitary $d_\delta \times d_\delta$ matrices. We distinguish the unit representation $\delta_0 \in \mathrm{Irr}(G)$, where $U^{\delta_0}(g) = 1$ for all $g \in G$. With this choice being fixed, we can state the Peter-Weyl theorem.

Theorem 1 (Peter-Weyl). The functions $d_\delta^{1/2} U^\delta_{ij}$ taken for $\delta \in \mathrm{Irr}(G)$ and $i, j = 1, \ldots, d_\delta$ form an orthonormal basis of $L^2(G, \mathbb{C})$.
Note that $U^\delta_{ij}$ is the usual notation for the matrix elements of $U^\delta$. For all $f \in L^2(G, \mathbb{C})$ the theorem gives the Fourier pair

$$A_\delta = \int_G f(g)\, U^\delta(g)^\dagger \, d\mu(g) \tag{1}$$

$$f(g) = \sum_{\delta \in \mathrm{Irr}(G)} d_\delta \operatorname{tr}\left(A_\delta U^\delta(g)\right) \tag{2}$$

where $\dagger$ denotes the Hermitian conjugate and $\operatorname{tr}$ the trace. The Fourier series (2) converges in $L^2(G, \mathbb{C})$.

Consider the example $G = S^1$. It is possible to make the identification $\delta = 0, \pm 1, \pm 2, \ldots$. Then $U^\delta(z) = z^\delta$ for $z \in S^1$. Writing $z = e^{i\theta}$ for some $\theta \in [0, 2\pi]$, this gives the classical Fourier expansion of periodic functions.

We consider random objects, and in particular $G$-valued random variables, defined on a suitable probability space $(\Omega, \mathcal{A}, P)$. When referring to the probability density of such a random variable $X$, we mean a probability density $p_X \in L^2(G, \mathbb{R})$ with respect to $\mu$. The characteristic function of a $G$-valued random variable is defined as follows. Compare to [5].

Definition 1. Let $X$ be a $G$-valued random variable. The characteristic function of $X$ is the map $\phi_X$ given by

$$\delta \mapsto \phi_X(\delta) = E\left(U^\delta(X)\right), \qquad \delta \in \mathrm{Irr}(G)$$

Here $E$ stands for expectation on the underlying probability space. For all $\delta \in \mathrm{Irr}(G)$, the expectation in the definition is finite since $U^\delta$ has unitary values. When $X$ has a probability density $p_X$, its characteristic function gives, up to Hermitian conjugation, the Fourier coefficients of $p_X$ as in (1). We have

$$\phi_X(\delta) = E\left(U^\delta(X)\right) = \int_G p_X(g)\, U^\delta(g) \, d\mu(g), \qquad \delta \in \mathrm{Irr}(G)$$

Proposition 1 below recalls the relation between characteristic functions and the concepts of convolution and convergence in distribution. It is a generalization of classical properties for scalar random variables. Recall that a sequence $(X_n)_{n \geq 1}$ of $G$-valued random variables is said to converge in distribution to a random variable $X$ if for all real-valued continuous functions $f$ on $G$ we have

$$\lim_n E(f(X_n)) = E(f(X))$$

The proof of Proposition 1 is straightforward.
See [5].

Proposition 1. The following two properties hold.
1. Let $X$ and $Y$ be independent $G$-valued random variables and let $Z = XY$. We have for all $\delta \in \mathrm{Irr}(G)$ that $\phi_Z(\delta) = \phi_X(\delta)\phi_Y(\delta)$.
2. A sequence $(X_n)_{n \geq 1}$ of $G$-valued random variables converges in distribution to a random variable $X$ iff for all $\delta \in \mathrm{Irr}(G)$ we have $\lim_n \phi_{X_n}(\delta) = \phi_X(\delta)$.

In order to solve our estimation problem in Section 4 we will require random variables to have certain symmetry properties. We deal with these properties here. The following analysis draws on Liao [12, 15].

We will say that a $G$-valued random variable $X$ is inverse invariant if $X \overset{d}{=} X^{-1}$. We will say that it is conjugate invariant if for all $k \in G$ we have $X \overset{d}{=} kXk^{-1}$. As usual, $\overset{d}{=}$ denotes equality in distribution. Proposition 2 below characterizes these two symmetry properties in terms of characteristic functions. It will be important to remember that for any two $G$-valued random variables $X$ and $Y$ we have $X \overset{d}{=} Y$ iff $\phi_X = \phi_Y$. This results from the completeness of the basis given by the $U^\delta$, as stated in the Peter-Weyl theorem [5].

Proposition 2. The following properties hold.
1. $X$ is inverse invariant iff for all $\delta \in \mathrm{Irr}(G)$ we have that $\phi_X(\delta)$ is Hermitian.
2. Let $X$ be inverse invariant. If $X_1, \ldots, X_n$ are independent copies of $X$ then the product $X_1 \cdots X_n$ is inverse invariant.
3. $X$ is conjugate invariant iff for all $\delta \in \mathrm{Irr}(G)$ we have that $\phi_X(\delta) = a_\delta I_{d_\delta}$, where $a_\delta \in \mathbb{C}$ and $I_{d_\delta}$ is the $d_\delta \times d_\delta$ identity matrix.
4. If $X$ and $Y$ are independent and conjugate invariant then $XY$ is conjugate invariant.
5. $X$ is conjugate invariant iff for all $G$-valued random variables $Y$ independent of $X$ we have $XY \overset{d}{=} YX$.

Proof. 1.
Note that for all $\delta \in \mathrm{Irr}(G)$ we have, by the homomorphism property of $U^\delta$ and the fact that it has unitary values,

$$\phi_{X^{-1}}(\delta) = E\left(U^\delta(X^{-1})\right) = E\left(U^\delta(X)\right)^\dagger = \phi_X(\delta)^\dagger$$

2. This follows from 1 of Proposition 2 and 1 of Proposition 1, since the powers of a Hermitian matrix are Hermitian.

3. Note that for all $k \in G$ we have that $X \overset{d}{=} kXk^{-1}$ iff for all $\delta \in \mathrm{Irr}(G)$

$$E\left(U^\delta(X)\right) = E\left(U^\delta(kXk^{-1})\right) = U^\delta(k)\, E\left(U^\delta(X)\right) U^\delta(k)^\dagger$$

Identifying $\phi_X$ on both sides, this becomes

$$\phi_X(\delta) = U^\delta(k)\, \phi_X(\delta)\, U^\delta(k)^\dagger$$

If this relation holds for all $k \in G$ then $\phi_X(\delta)$ is a multiple of $I_{d_\delta}$. This follows by Schur's lemma [13].

4. This follows from 3 of Proposition 2 and 1 of Proposition 1.

5. The if part follows by setting $Y = k \in G$ for arbitrary $k$. The only if part follows from 3 of Proposition 2 and 1 of Proposition 1.

Item 1 of Proposition 2 motivates a practical recipe for generating inverse invariant random variables from general random variables. Let $X$ and $Y$ be $G$-valued random variables. Suppose $X$ and $Y$ are independent with $Y \overset{d}{=} X^{-1}$. It can be verified by 1 of Proposition 2 that $XY \overset{d}{=} YX$ and that both these products are inverse invariant. In practice, if we have generated $X$ then we can immediately generate $Y$ as above. In this way an inverse invariant $XY$ or $YX$ is generated from $X$.

3 Compound Poisson Processes

Compound Poisson processes on groups naturally generalize scalar compound Poisson processes. They were introduced by Applebaum in [3]. Let us start by recalling the definition of scalar compound Poisson processes. Let $N = (N(t))_{t \geq 0}$ be a Poisson process with parameter $\lambda > 0$. Suppose $(x_n)_{n \geq 1}$ are i.i.d. $\mathbb{R}$-valued random variables. Suppose the family $(x_n)_{n \geq 1}$ is itself independent of $N$.
The following process $y$ is said to be a compound Poisson process

$$y(t) = \sum_{n=0}^{N(t)} x_n$$

$G$-valued compound Poisson processes are defined by analogy with this formula. We continue with the process $N$. Let $(X_n)_{n \geq 1}$ be i.i.d. $G$-valued random variables and suppose as before that the family $(X_n)_{n \geq 1}$ is independent of $N$. The following process $Y$ is said to be a $G$-valued left compound Poisson process

$$Y(t) = \prod_{n=0}^{N(t)} X_n$$

We understand that products are ordered from left to right. It is possible to obtain a right compound Poisson process by considering $Y(t)^{-1}$ instead. Thus the two concepts are equivalent. See [12, 3].

Before going on, we make the following remark on the above definition of compound Poisson processes. This definition was stated for $G$ a compact connected Lie group. The topological and manifold structure of $G$ is not necessary for the definition, which can be stated in its above form for any group with a measurable space structure. The compact connected group structure of $G$ allows us to use the Peter-Weyl theorem and characteristic functions. The Lie group structure allows the introduction of Brownian noise in Section 4.

We wish to summarize the symmetry properties of the random variables $Y(t)$ for $t \geq 0$. Note first that for all $t \geq 0$, $Y(t)$ does not have a probability density. Indeed, for all $t \geq 0$ we have $P(Y(t) = e) \geq P(N(t) = 0) = e^{-\lambda t}$. It follows that $Y(t)$ has an atom at $e$. In the absence of a probability density, we study $Y(t)$ for $t \geq 0$ using its characteristic function. This is given in the following Proposition 3, which immediately generalizes the well-known formula for scalar compound Poisson processes. This proposition follows [12, 3].

Proposition 3. For all $t \geq 0$ the characteristic function $\phi_{Y(t)}$ of $Y(t)$ is given by

$$\phi_{Y(t)}(\delta) = \exp\left(\lambda t \left(\phi_X(\delta) - I_{d_\delta}\right)\right) \tag{3}$$

for $\delta \in \mathrm{Irr}(G)$, where $\phi_X \equiv \phi_{X_1}$.
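Formula (3) lends itself to a quick Monte Carlo sanity check. The sketch below is only an illustration under stated assumptions, not part of the method of this paper: it takes $G = SO(3)$, uses the defining $3 \times 3$ representation (which is irreducible) in place of $U^\delta$, and draws the jumps $X_n$ from an arbitrary, hypothetical law (a random axis biased towards the z axis and a Gaussian angle). The helper names `rotation`, `sample_jump`, `sample_Y` and `expm_series` are ours.

```python
import numpy as np

def rotation(axis, angle):
    """Rodrigues formula: rotation by `angle` about the unit vector `axis`."""
    x, y, z = axis
    K = np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def sample_jump(rng):
    """Hypothetical jump law for X_n (an assumption made for this example)."""
    axis = rng.normal(size=3) + np.array([0.0, 0.0, 1.0])
    axis /= np.linalg.norm(axis)
    return rotation(axis, rng.normal(loc=0.3, scale=0.5))

def sample_Y(rng, lam, t):
    """One draw of Y(t) = X_1 X_2 ... X_N(t), multiplied from left to right."""
    Y = np.eye(3)
    for _ in range(rng.poisson(lam * t)):
        Y = Y @ sample_jump(rng)
    return Y

def expm_series(A, terms=60):
    """Truncated power series for exp(A); adequate for small matrices of moderate norm."""
    out = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for n in range(1, terms):
        term = term @ A / n
        out = out + term
    return out

rng = np.random.default_rng(0)
lam, t, n_samples = 1.5, 1.0, 20000

# Empirical characteristic functions of X and Y(t) in the chosen representation.
phi_X = np.mean([sample_jump(rng) for _ in range(n_samples)], axis=0)
phi_Y_empirical = np.mean([sample_Y(rng, lam, t) for _ in range(n_samples)], axis=0)

# Right-hand side of (3), evaluated with the empirical phi_X.
phi_Y_formula = expm_series(lam * t * (phi_X - np.eye(3)))

max_gap = np.max(np.abs(phi_Y_empirical - phi_Y_formula))
print("largest entrywise gap:", max_gap)
```

The gap printed at the end should be of the order of the Monte Carlo error. Because the jump law here is neither inverse invariant nor conjugate invariant, the check also exercises the left-to-right ordering of the products.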
Proof. Let $t \geq 0$. $\phi_{Y(t)}$ can be calculated by conditioning over the values of $N(t)$. Using the independence of $N$ and $(X_n)_{n \geq 1}$ we have for $\delta \in \mathrm{Irr}(G)$

$$\phi_{Y(t)}(\delta) = e^{-\lambda t} \sum_{n \geq 0} \frac{(\lambda t)^n}{n!}\, E\left(\prod_{m=0}^{n} U^\delta(X_m)\right)$$

Using the fact that the $(X_n)_{n \geq 1}$ are i.i.d. it is possible to replace

$$E\left(\prod_{m=0}^{n} U^\delta(X_m)\right) = \prod_{m=0}^{n} E\left(U^\delta(X_m)\right) = \phi_X(\delta)^n$$

The proposition follows by rearranging the sum.

Combining Propositions 3 and 2 we have the following proposition. It states that for all $t \geq 0$ the symmetry properties of $Y(t)$ are the same as those of the $X_n$.

Proposition 4. For all $t \geq 0$ we have
1. If $X_1$ is inverse invariant then so is $Y(t)$.
2. If $X_1$ is conjugate invariant then so is $Y(t)$.

We end this section with Proposition 5. It gives a property of uniformization of the distribution of $Y(t)$ as $t \uparrow \infty$. This is similar to the behavior of the products $X_1 \cdots X_n$ for $n \uparrow \infty$, see [5]. For a more general version of Proposition 5 see [12, 15].

We say that a $G$-valued random variable $X$ is supported by a measurable subset $S$ of $G$ if $P(X \in S) = 1$. If $X$ and $X'$ are $G$-valued random variables with $X \overset{d}{=} X'$ then $X$ is supported by $S$ iff $X'$ is supported by $S$. In Proposition 5, $U$ is a $G$-valued random variable with probability density identically equal to 1. That is, $U$ is uniformly distributed on $G$.

Proposition 5. If $X_1$ is not supported by any closed proper subgroup $S$ of $G$ or coset $gS$, $g \in G$, of such a subgroup then $Y(t)$ converges in distribution to $U$ as $t \uparrow \infty$.

Proof. Under the conditions of the proposition we have for all $\delta \neq \delta_0$ that the eigenvalues of $\phi_X(\delta)$ are all $< 1$ in modulus [5]. It follows that the eigenvalues of $\phi_X(\delta) - I_{d_\delta}$ all have negative real parts. Thus when $\delta \neq \delta_0$ we have by (3) that $\phi_{Y(t)}(\delta) \to 0$ as $t \uparrow \infty$. Moreover, it is immediate that $\phi_{Y(t)}(\delta_0) = 1$ for $t \geq 0$. We conclude using 2 of Proposition 1.
Note that [13]

$$\phi_U(\delta) = \int_G U^\delta(g)\, d\mu(g) = 0, \qquad \delta \neq \delta_0$$

and $\phi_U(\delta_0) = 1$ trivially.

4 Decompounding

In the existing literature, decompounding refers to a set of nonparametric estimation problems involving scalar compound Poisson processes [1, 2]. In this section we consider the generalization of these problems to compound Poisson processes on compact Lie groups. The new problems can be stated in the notation of Section 3. We refer to them also as decompounding problems. As in the scalar case, they consist in estimating the common probability density (supposed to exist) of the random variables $X_n$ from observations of the process $Y$. The unknown common probability density of the $X_n$ will be denoted $p$. We are unaware of any work on similar problems for vector-valued compound Poisson processes. Our consideration of compact Lie groups is motivated by the applications presented in Section 5.

4.1 Typology of decompounding problems

Several decompounding problems can be stated, depending on the nature of the observations made of $Y$ [2]. Decompounding is performed from high frequency observations if an individual trajectory of the process $Y$ is observed over time intervals $[0, T]$ where $T \uparrow \infty$. It is performed from low frequency observations if i.i.d. observations are made of the random variable $Y(T)$ for a fixed $T \geq 0$. Decompounding from high and from low frequency observations lead to different difficulties.

For high frequency observations, the problem is greatly simplified if the assumption is made that $X_n$ does not take the value $e$, for any $n \geq 1$. With probability 1, a trajectory of $N$ has infinitely many jumps over $t \geq 0$. Under the assumption we have made, all these jumps correspond to jumps of $Y$, which we do observe. The jumps of $Y$ then give i.i.d. observations of $X_1$, and the average time between these jumps is $1/\lambda$.
In particular, it is important for high frequency observations to take the limit $T \uparrow \infty$.

Low frequency observations do not give direct access to $\lambda$. In scalar decompounding from low frequency observations, $\lambda$ is often assumed to be known [1, 2]. In the context of a compact group $G$, Proposition 5 leads to a difficulty that does not appear in scalar decompounding. Under the conditions of this proposition, if low frequency observations are made at a sufficiently large time $T$ then these observations will be approximately uniformly distributed on $G$ and will retain almost no memory of the random variables $X_n$.

A third, intermediate type of observations is possible. It is possible to make observations of an individual trajectory of $Y$ at regular time intervals $T, 2T, \ldots$. This is in fact equivalent to low frequency observations. Remember that $N$ is a Lévy process, i.e. has independent stationary increments. Moreover, the $(X_n)_{n \geq 1}$ are i.i.d. Using this, it is possible to prove that the $G$-valued random variables

$$Y(T),\; Y(T)^{-1}Y(2T),\; Y(2T)^{-1}Y(3T),\; \ldots$$

are i.i.d. Thus our observations are i.i.d. observations of $Y(T)$. This remark refers to the fact that $Y$ is a left Lévy process in $G$ [12]. We do not develop this here.

4.2 Noise model for low frequency observations

We will consider decompounding from low frequency observations. $T \geq 0$ is fixed and i.i.d. observations $(Z_n)_{n \geq 1}$ of a noisy version $Z$ of $Y(T)$ are available. $Z$ is given by $Y(T)$ corrupted by multiplicative noise. We have the noise model

$$Z = M\, Y(T) \tag{4}$$

where $M$ is independent of $Y$. By 1 of Proposition 1 we have for the characteristic function of $Z$

$$\phi_Z = \phi_M\, \phi_{Y(T)}$$

The noise model is equivalent to having an initial value $Y(0) = M$ with a general distribution. We consider the case of Brownian noise.
The characteristic function of $M$ is then given by [12, 8]

$$\phi_M(\delta) = \exp\left(-\frac{\lambda_\delta \sigma^2}{2}\right) I_{d_\delta}$$

where $\sigma^2$ is a variance parameter and for $\delta \in \mathrm{Irr}(G)$ the constant $\lambda_\delta$ is the corresponding eigenvalue of the Laplace-Beltrami operator. In particular, $\lambda_{\delta_0} = 0$ and $\lambda_\delta > 0$ for $\delta \neq \delta_0$. It is clear from 3 of Proposition 2 that $M$ is conjugate invariant. It follows by 5 of Proposition 2 that, as far as the distribution of $Z$ is concerned, left and right multiplication of $Y(T)$ by the noise $M$ are equivalent.

It is possible to construct a $G$-valued process $\zeta$ such that $Z \overset{d}{=} \zeta(T)$. The corresponding construction is well known in the theory of group-valued Lévy processes and is referred to as interlacing [3, 12]. Here we only state this construction. Let $W$ be a Brownian motion on $G$, independent of $N$ and with variance parameter $\bar{\sigma}^2$. This is a process with continuous paths and independent stationary increments. Moreover, $W(0) = e$ and for $\delta \in \mathrm{Irr}(G)$

$$\phi_{W(t)}(\delta) = \exp\left(-\frac{\lambda_\delta \bar{\sigma}^2}{2}\, t\right) I_{d_\delta}$$

Let $T_0 = 0$ and suppose $(T_n)_{n \geq 1}$ are the jump times of $N$. The interlaced process $\zeta$ is defined as follows. We have $\zeta(0) = e$. For $t > 0$ and $n \geq 1$ we have

$$\zeta(t) = \zeta(T_{n-1})\, W(T_{n-1})^{-1}\, W(t) \quad \text{on } \{T_{n-1} \leq t < T_n\}$$

where the following formula holds at each time $T_n$ (here $\zeta(T_n-)$ denotes the left limit at $T_n$)

$$\zeta(T_n) = \zeta(T_n-)\, X_n$$

This definition is sufficient, since $T_n \uparrow \infty$ almost surely. The term interlacing comes from the fact that the trajectories of $\zeta$ are obtained by introducing the jumps of $Y$ into the trajectories of $W$ as these jumps occur. The trajectories of $W$ are thus interlaced with the jumps of $Y$.

For $t \geq 0$ the characteristic function of $\zeta(t)$ is given by

$$\phi_{\zeta(t)}(\delta) = \exp\left(t\lambda\, \phi_X(\delta) - t\, I_{d_\delta}\left(\lambda + \frac{\lambda_\delta \bar{\sigma}^2}{2}\right)\right) \tag{5}$$

for $\delta \in \mathrm{Irr}(G)$. It follows that we have $Z \overset{d}{=} \zeta(T)$ if $T\bar{\sigma}^2 = \sigma^2$.
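The claim that $Z \overset{d}{=} \zeta(T)$ can be checked directly from (5). The short derivation below is a sketch that uses only formulas already stated (Proposition 3 and the expressions for $\phi_M$ and $\phi_{W(t)}$), together with the fact that a scalar multiple of the identity commutes with every matrix, so the exponential factorizes.

```latex
\begin{align*}
\phi_{\zeta(t)}(\delta)
  &= \exp\!\left( t\lambda\,\phi_X(\delta)
       - t\left(\lambda + \tfrac{\lambda_\delta \bar{\sigma}^2}{2}\right) I_{d_\delta} \right) \\
  &= \exp\!\left( -\tfrac{\lambda_\delta \bar{\sigma}^2}{2}\, t \right)
     \exp\!\big( \lambda t\,(\phi_X(\delta) - I_{d_\delta}) \big) \\
  &= \phi_{W(t)}(\delta)\,\phi_{Y(t)}(\delta).
\end{align*}
% Setting t = T and using T \bar{\sigma}^2 = \sigma^2:
\[
\phi_{\zeta(T)}(\delta)
  = \exp\!\left( -\tfrac{\lambda_\delta \sigma^2}{2} \right) \phi_{Y(T)}(\delta)
  = \phi_M(\delta)\,\phi_{Y(T)}(\delta)
  = \phi_Z(\delta).
\]
```

Since two $G$-valued random variables with the same characteristic function are equal in distribution, this gives $Z \overset{d}{=} \zeta(T)$.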
Although we do not deal with the case of high frequency observations, we would like to end this subsection with a remark on the role of noise in this case. The trajectories of the interlaced process $\zeta$ are noisy versions of the trajectories of $Y$. However, these trajectories have the same jumps as the trajectories of $Y$. In this sense, high frequency observations are unaltered by noise.

4.3 A characteristic function method

We present a characteristic function method for decompounding from low frequency observations. This method extends a similar one considered in [2]. In carrying out this extension, we are guided by the properties of characteristic functions on $G$ presented in Section 2.

Our observations $(Z_n)_{n \geq 1}$ and noise model (4) were described in 4.2. We aim to estimate the common density $p$ of the $X_n$. A characteristic function method consists in constructing nonparametric estimates for $p$ from parametric estimates for its Fourier coefficients $\phi_X(\delta)$, given for $\delta \in \mathrm{Irr}(G)$. See [8].

We suppose that $\lambda$ and $\sigma^2$ are known. Setting $t = T$ in equation (5), with $T\bar{\sigma}^2 = \sigma^2$, it can be rewritten as follows

$$\phi_Z(\delta) = \exp\left(T\lambda\, \phi_X(\delta) - T\bar{\lambda}\, I_{d_\delta}\right), \qquad \delta \in \mathrm{Irr}(G) \tag{6}$$

where $\bar{\lambda} = \lambda + \lambda_\delta \sigma^2 / (2T)$ is a constant determined by $\lambda$ and $\sigma^2$ (for given $\delta$ and $T$). We refer to this transformation $\phi_X \mapsto \phi_Z$ as the compounding transformation. Decompounding will involve local inversion of the compounding transformation. This is clearly related to inversion of the matrix exponential in a neighborhood of $\phi_Z(\delta)$ for all $\delta \in \mathrm{Irr}(G)$. Rather than deal with this problem in general, we make the following simplifying hypothesis.

Hypothesis: $X_1$ is inverse invariant.

For all $\delta \in \mathrm{Irr}(G)$ we have, by applying 1 of Proposition 2 and (6) to this hypothesis, that $\phi_Z(\delta)$ is Hermitian positive definite. Denote by $\mathrm{Log}$ the unique Hermitian matrix logarithm of a Hermitian positive definite matrix. We can now express the inverse of the compounding transformation.
From equation (6) it follows that

$$\phi_X(\delta) = \frac{1}{T\lambda}\, \mathrm{Log}\left[\phi_Z(\delta)\right] + \left(\bar{\lambda}/\lambda\right) I_{d_\delta}, \qquad \delta \in \mathrm{Irr}(G) \tag{7}$$

Let $\delta \in \mathrm{Irr}(G)$. It follows from Definition 1 that empirical estimates of $\phi_Z(\delta)$ based on the observations $(Z_n)_{n \geq 1}$ are unbiased and consistent. This is a simple consequence of the strong law of large numbers. See for example [16]. In order to estimate $\phi_X(\delta)$ using (7) it is then important to ensure that the empirical estimates of $\phi_Z(\delta)$ are asymptotically Hermitian positive definite.

We start by defining the empirical estimates $\hat{\phi}^n_Z(\delta)$ for $\delta \in \mathrm{Irr}(G)$ and $n \geq 1$

$$\hat{\phi}^n_Z(\delta) = \frac{1}{2n} \sum_{m=1}^{n} \left(U^\delta(Z_m) + U^\delta(Z_m)^\dagger\right)$$

Hermitian symmetrization of empirical estimates is necessary for the application of (7). Since it is a projection operator, this symmetrization moreover contributes to a faster convergence of the $\hat{\phi}^n_Z(\delta)$ to $\phi_Z(\delta)$.

Continuous dependence of the spectrum of a matrix on its coefficients is a classical result in matrix analysis. Several more or less sophisticated versions of this result exist [17]. For a remarkably straightforward statement see [18]. For a complex matrix $C$ we will denote by $\lambda(C)$ its spectrum. For each $\delta \in \mathrm{Irr}(G)$ and $n \geq 1$ define the event $R^n_\delta$ by

$$R^n_\delta = \left\{ \lambda\left(\hat{\phi}^n_Z(\delta)\right) \subset\, ]0, \infty[ \right\}$$

For $\delta \in \mathrm{Irr}(G)$, the sequence $(R^n_\delta)_{n \geq 1}$ controls the convergence of the spectra of the empirical estimates $\hat{\phi}^n_Z(\delta)$. In particular,

$$P\left(\cup_{n \geq 1} \cap_{m \geq n} R^m_\delta\right) = \lim_n P\left(\cap_{m \geq n} R^m_\delta\right) = 1$$

Using the events $R^n_\delta$ we can write down well-defined estimates of $\phi_X$. These are denoted $\hat{\phi}^n_X(\delta)$ for $\delta \in \mathrm{Irr}(G)$ and $n \geq 1$

$$\hat{\phi}^n_X(\delta) = 0 \quad \text{on } \Omega - R^n_\delta$$

$$\hat{\phi}^n_X(\delta) = \frac{1}{T\lambda}\, \mathrm{Log}\left[\hat{\phi}^n_Z(\delta)\right] + \left(\bar{\lambda}/\lambda\right) I_{d_\delta} \quad \text{on } R^n_\delta$$

This expression gives our parametric estimates for the Fourier coefficients of $p$. We use them to construct nonparametric estimates based on an expression of the form (2).
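The parametric estimate defined above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions rather than the implementation used for the simulations of Section 5: the function names `herm_fun`, `phi_Z_hat` and `phi_X_hat` are ours, the argument `lam_bar` stands for the constant $\bar{\lambda}$ of (6), and the input `U_samples` is assumed to hold the matrices $U^\delta(Z_m)$ already evaluated for one fixed $\delta$. The Hermitian logarithm is computed through an eigendecomposition.

```python
import numpy as np

def herm_fun(H, fun):
    """Apply a real function to a Hermitian matrix through its eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * fun(w)) @ V.conj().T

def phi_Z_hat(U_samples):
    """Hermitian-symmetrized empirical mean of the matrices U^delta(Z_m)."""
    U = np.asarray(U_samples)
    return 0.5 * np.mean(U + np.conj(np.swapaxes(U, -1, -2)), axis=0)

def phi_X_hat(phi_Z, T, lam, lam_bar):
    """Estimate (7); returns the zero matrix when the spectrum is not in ]0, inf[."""
    d = phi_Z.shape[0]
    if np.min(np.linalg.eigvalsh(phi_Z)) <= 0.0:
        return np.zeros((d, d), dtype=complex)
    return herm_fun(phi_Z, np.log) / (T * lam) + (lam_bar / lam) * np.eye(d)

# Round trip on a hypothetical 2x2 Hermitian phi_X:
# compound with (6), then invert with (7).
T, lam, lam_bar = 1.0, 2.0, 2.4
phi_X = np.array([[0.5, 0.1j], [-0.1j, 0.3]])
phi_Z = herm_fun(T * lam * phi_X - T * lam_bar * np.eye(2), np.exp)
recovered = phi_X_hat(phi_Z, T, lam, lam_bar)
print(np.max(np.abs(recovered - phi_X)))
```

The round trip at the end only checks that (7) inverts (6) for an exactly known $\phi_Z(\delta)$. With real data, `phi_Z` would be replaced by `phi_Z_hat(U_samples)`, and the zero branch corresponds to the complement of the event $R^n_\delta$.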
Let $(\Gamma_l)_{l \geq 1}$ be an increasing sequence of finite subsets $\Gamma_l \subset \mathrm{Irr}(G)$ with the limit $\cup_{l \geq 1} \Gamma_l = \mathrm{Irr}(G) - \{\delta_0\}$. Let $K \geq 0$ and for each $\delta \in \mathrm{Irr}(G)$ let

$$f_\delta = d_\delta\, e^{-K\lambda_\delta}$$

For $n \geq 1$ and $l \geq 1$ our nonparametric estimate $\hat{p}^n_l$ is given by

$$\hat{p}^n_l(g) = 1 + \sum_{\delta \in \Gamma_l} f_\delta \operatorname{tr}\left(\hat{\phi}^n_X(\delta)\, U^\delta(g)^\dagger\right), \qquad g \in G \tag{8}$$

The subscript $l \geq 1$ corresponds to a cutoff or smoothing parameter. Indeed, infinitely many representations are excluded from the sum over $\Gamma_l$. A more complete expression of this fact appears in [8]. When $K > 0$ the coefficients $f_\delta$ form a convolution mask ensuring that the estimates $\hat{p}^n_l$ can be taken to converge to a smooth probability density. We make this more precise in 4.4.

It is usual to rewrite expressions similar to (8) in terms of a group invariant kernel. See [8, 9]. Such a transformation is not possible here due to the indirect nature of our observations. This is in particular related to the more involved form of the $\hat{\phi}^n_X(\delta)$ as given above.

4.4 Convergence of parametric and nonparametric estimates

Here we discuss the convergence of the parametric and nonparametric estimates given in 4.3. Our argument is presented in the form of Propositions 6 and 7 below. Proposition 6 gives the consistency of the parametric estimates $\hat{\phi}^n_X(\delta)$. Proposition 7 states a subsequent result for the nonparametric estimates $\hat{p}^n_l$.

For Proposition 6 we will need inequalities (9) and (10). These express stability results for the eigenvalues of Hermitian matrices and for the Hermitian matrix function $\mathrm{Log}$. Let $A$ and $B$ be Hermitian $d \times d$ matrices, for some $d \geq 1$. For $1 \leq i \leq d$ let $\alpha_i$ and $\beta_i$ be the eigenvalues of $A$ and $B$ respectively. Suppose they are arranged in nondecreasing order. We have

$$\sum_{i=1}^{d} (\beta_i - \alpha_i)^2 \leq |B - A|^2 \tag{9}$$

where $|\cdot|$ is the Euclidean matrix norm. This inequality is known as the Wielandt-Hoffman theorem. In [17], it is stated for $A$ and $B$ real symmetric.
The general case of Hermitian $A$ and $B$ can be obtained from this statement using a canonical realification isomorphism.

Suppose $A$ and $B$ are positive definite. For our purpose it is suitable to assume both $\lambda(A)$ and $\lambda(B)$ are contained in an interval $[k, 1]$ for some $k > 0$. Under this assumption we have the following Lipschitz property

$$|\mathrm{Log}(B) - \mathrm{Log}(A)| \leq \sqrt{d}\, k^{-2}\, |B - A| \tag{10}$$

In order to obtain (10) it is possible to start by expressing $\mathrm{Log}(A)$ as follows

$$\mathrm{Log}(A) = \int_0^1 (A - I_d)\left[t(A - I_d) + I_d\right]^{-1} dt$$

This expression results from a similar one for the real logarithm applied to each eigenvalue of $A$. Subtracting the same expression for $\mathrm{Log}(B)$, (10) follows by simple calculations.

Proposition 6. For all $\delta \in \mathrm{Irr}(G)$ we have the limit in probability $\lim_n \hat{\phi}^n_X(\delta) = \phi_X(\delta)$.

Proof. We only need to consider $\delta \neq \delta_0$. Indeed, $\hat{\phi}^n_X(\delta_0) = \phi_X(\delta_0) = 1$ for all $n \geq 1$. Let $\delta \neq \delta_0$. For all $n \geq 1$ we have

$$|\hat{\phi}^n_Z(\delta)|_{op} \leq \frac{1}{2n} \sum_{m=1}^{n} \left(|U^\delta(Z_m)|_{op} + |U^\delta(Z_m)^\dagger|_{op}\right) = 1$$

where $|\cdot|_{op}$ is the operator matrix norm. Passing to the limit, we have the same inequality for $\phi_Z(\delta)$. It follows that all eigenvalues of $\hat{\phi}^n_Z(\delta)$ or $\phi_Z(\delta)$ are $\leq 1$.

Since $\phi_Z(\delta)$ is positive definite, there exists $k_\delta > 0$ such that $\lambda(\phi_Z(\delta)) \subset [k_\delta, 1]$. For $n \geq 1$, denote by $\tilde{R}^n_\delta$ the event

$$\tilde{R}^n_\delta = \left\{ \lambda\left(\hat{\phi}^n_Z(\delta)\right) \subset [k_\delta/2, 1] \right\}$$

From inequality (9) we have

$$P(\Omega - \tilde{R}^n_\delta) \leq P\left(|\hat{\phi}^n_Z(\delta) - \phi_Z(\delta)| > k_\delta/2\right)$$

Since $\tilde{R}^n_\delta \subset R^n_\delta$, it follows from inequality (10) that

$$P\left(\{|\hat{\phi}^n_X(\delta) - \phi_X(\delta)| > \varepsilon\} \cap \tilde{R}^n_\delta\right) \leq P\left(|\hat{\phi}^n_Z(\delta) - \phi_Z(\delta)| > k_\delta^2\, \varepsilon / L\right)$$

for all $\varepsilon > 0$, where $L = 4\sqrt{d_\delta}/(T\lambda)$. The proof can be completed by a usual application of Chebyshev's inequality,

$$P\left(|\hat{\phi}^n_X(\delta) - \phi_X(\delta)| > \varepsilon\right) \leq \left(\frac{8 + 2L^2/\varepsilon^2}{n}\right) \left(\frac{\sqrt{d_\delta}}{k_\delta^2}\right)^2 \tag{11}$$

for all $\varepsilon > 0$.

Proposition 7 relies on Proposition 6 and the Peter-Weyl theorem.
It implies the existence of sequences $(\hat{p}_k)_{k \geq 1}$ of nonparametric estimates given by (8), converging to $p$ in probability in $L^2(G, \mathbb{C})$ with any prescribed rate of convergence. Convergence in probability in $L^2(G, \mathbb{C})$ means that the following limit in probability holds

$$\lim_k \|\hat{p}_k - p\| = 0$$

where $\|\cdot\|$ is the $L^2(G, \mathbb{C})$ norm. It is clear from (8) that for all $k \geq 1$ we have $\hat{p}_k \in L^2(G, \mathbb{C})$. In order to obtain nonparametric estimators in $L^2(G, \mathbb{R})$ converging to $p$ in the same sense, it is enough to consider the real parts of the $\hat{p}_k$. The following proof of Proposition 7 implicitly uses Plancherel's formula as in [8].

Proposition 7. Putting $K = 0$ in (8), we have the limit in probability

$$\lim_l \lim_n \|\hat{p}^n_l - p\| = 0$$

Proof. For $l \geq 1$ let $p_l \in L^2(G, \mathbb{C})$ be given by

$$p_l(g) = 1 + \sum_{\delta \in \Gamma_l} d_\delta \operatorname{tr}\left(\phi_X(\delta)\, U^\delta(g)^\dagger\right)$$

for $g \in G$. By the Peter-Weyl theorem, $\lim_l \|p_l - p\| = 0$. By (8) and Proposition 6 we have $\lim_n \|\hat{p}^n_l - p_l\| = 0$ in probability for all $l \geq 1$. The proposition follows by observing that

$$\|\hat{p}^n_l - p\|^2 = \|\hat{p}^n_l - p_l\|^2 + \|p_l - p\|^2 \tag{12}$$

for all $n, l \geq 1$.

Proposition 6 obtained convergence in probability of the parametric estimates $\hat{\phi}^n_X(\delta)$ for all $\delta \in \mathrm{Irr}(G)$. These parametric estimates depend only on the observations. In particular, they can be evaluated without any a priori knowledge of $p$. By introducing such knowledge, it is possible to define parametric estimates $\tilde{\phi}^n_X(\delta)$ converging in the square mean to the same limits $\phi_X(\delta)$. For $\delta \in \mathrm{Irr}(G)$ and $n \geq 1$ the $\tilde{\phi}^n_X(\delta)$ are given by

$$\tilde{\phi}^n_X(\delta) = 0 \quad \text{on } \Omega - \tilde{R}^n_\delta$$

$$\tilde{\phi}^n_X(\delta) = \frac{1}{T\lambda}\, \mathrm{Log}\left[\hat{\phi}^n_Z(\delta)\right] + \left(\bar{\lambda}/\lambda\right) I_{d_\delta} \quad \text{on } \tilde{R}^n_\delta$$

where the events $\tilde{R}^n_\delta$ are as in the proof of Proposition 6 and the constants $k_\delta$ necessary for their definition are assumed to be known a priori.
As in (8), we can define nonparametric estimates $\tilde p^n_l$ where for $n, l \ge 1$

$$\tilde p^n_l(g) = 1 + \sum_{\delta \in \Gamma_l} f_\delta\, \mathrm{tr}\left( \tilde\phi^n_X(\delta)\, U^\delta(g)^\dagger \right), \qquad g \in G$$

For all $\delta \in \mathrm{Irr}(G)$ and $n \ge 1$ we have

$$E\,|\tilde\phi^n_X(\delta) - \phi_X(\delta)|^2 \le \frac{L'}{n} \left( \frac{d_\delta}{k_\delta^2} \right)^2 \tag{13}$$

where $L'$ is a constant depending on the product $T\lambda$. This follows by a reasoning similar to the proof of Proposition 6. Moreover, for all $n, l \ge 1$ we have, after putting $K = 0$,

$$E\,\|\tilde p^n_l - p\|^2 \le \frac{L'}{n} \sum_{\delta \in \Gamma_l} \left( d_\delta^3 / k_\delta^4 \right) + \|p_l - p\|^2 \tag{14}$$

for the functions $p_l$ defined in the proof of Proposition 7. This follows from Plancherel's formula in (12).

We have characterized the convergence of parametric estimates using (11) and (13) and the convergence of nonparametric estimates using (12) and (14). We make the following remarks on these formulae.

Inequalities (11) and (13) only give gross bounds for the rate of convergence of parametric estimates. The quality of these bounds improves when the constants $k_\delta$ are greater, i.e. closer to the value 1. This is equivalent to the $L^2(G, \mathbb{R})$ distance between $p$ and the uniform density being greater. This last point can be appreciated in relation to the example of Figure 3 in Section 5.3.

(12) and (14) describe the convergence of nonparametric estimates in a way similar to the one used in [8]. Indeed, the nonparametric estimation error is decomposed into two terms. One is given by the parametric estimation error and the other depends only on $p$. This second term is given by the convergence of the Fourier series of $p$, which is determined by the smoothness properties of $p$. We note the two following differences with [8], both related to the indirect nature of our observations. First, the first and second terms in (14) cannot be identified as the "variance" and "bias" of $\tilde p^n_l$.
Second, (14) characterizes the nonparametric estimation error as depending on the whole spectrum of $p$, through the constants $k_\delta$, rather than just its smoothness properties.

We finally return to the role of the parameter $K$ introduced in (8). For simplicity, we have put $K = 0$ for Proposition 7 and inequality (14). Let $K > 0$. The following function $p_K \in L^2(G, \mathbb{R})$ is an infinitely differentiable probability density [12, 8]

$$p_K(g) = 1 + \sum_{\delta \ne \delta_0} f_\delta\, \mathrm{tr}\left( A_\delta\, U^\delta(g)^\dagger \right) \tag{15}$$

Using the same $K$ in (8) and proceeding as for Proposition 7 it is possible to obtain the limit in probability

$$\lim_l \lim_n \|\hat p^n_l - p_K\| = 0$$

A similar limit also holds for the $\tilde p^n_l$. Note that in addition to being smooth, $p_K$ can be chosen arbitrarily close to $p$ in $L^2(G, \mathbb{R})$ for $K > 0$ small enough.

5 Decompounding on $SO(3)$ and multiple scattering

This section fulfills two goals. First, it summarizes recent use of compound Poisson processes on the rotation group $SO(3)$ in the modelling of multiple scattering and introduces decompounding on $SO(3)$ as a physical inverse problem. Second, it illustrates the characteristic function method presented in Section 4.3 by applying it to a numerical example of decompounding on $SO(3)$. Nonparametric estimation on the rotation group $SO(3)$ has received special attention [11, 9]. It is important to many concrete applications and constitutes a privileged starting point for generalization to compact groups.

5.1 The compound Poisson model for multiple scattering

Many experimental and applied settings aim to infer the properties of complex media, e.g. geophysical or biological media, by considering multiple scattering of mechanical or electromagnetic waves by these media. Inference problems arising in this way are formulated as physical inverse problems within the framework of various approximations of the exact equations of radiative transfer. See [19, 20, 21].
A compound Poisson model for the direct problem of multiple scattering was considered by Ning et al. [22]. It is based on an $\mathbb{R}$-valued compound Poisson process. Consideration of compound Poisson processes on $SO(3)$ leads to a model of multiple scattering which is sufficiently precise as well as amenable to statistical treatment. This model extends the validity of the small angles approximation of radiative transfer. It also allows the formulation of the physical inverse problem of multiple scattering as a statistical nonparametric estimation problem.

We give an example expanding the above discussion. The development of Section 3 is converted into the terminology of radiative transfer, see [23]. Certain usual results in harmonic analysis on $SO(3)$ are here referred to freely. They are set down in a precise form in Section 5.2.

A scalar plane wave is perpendicularly incident upon a plane parallel multiple scattering layer of thickness $H$. The velocity of the wave in the layer is normalized so that we have $\tau = \ell$ for the mean free time $\tau$ and mean free path $\ell$. Coordinates and time origin are chosen so that the wave enters the layer at time 0 with direction of propagation $s(0) = (0, 0, 1)$. After time $t$ in the layer this direction of propagation becomes $s(t) = (s_1(t), s_2(t), s_3(t))$. This is considered to be a random variable with values on the unit sphere $S^2 \subset \mathbb{R}^3$. The distribution of the random variable $s(H)$ is denoted $I_H$. It is identified with the normalized angular pattern of intensity transmitted by the layer. We return below to the validity of this identification.

The interaction of the wave with the layer takes place in the form of a succession of scattering events. These are understood as interactions of the wave with individual scatterers present at random locations throughout the layer.
The random number of scattering events up to time $0 \le t \le H$ will be denoted $N(t)$. Suppose the $n$th scattering event takes place at the time $0 \le T_n \le H$. This affects the direction of propagation as follows

$$s(T_n) = s(T_n-)\, X_n \tag{16}$$

Here $X_n$ is a random variable with values in $SO(3)$. It is identified with a random orthogonal matrix. Formula (16) is understood as a matrix equality where $s(T_n)$ and $s(T_n-)$ are row vectors. From (16) and the definition of $N(t)$ we can write for $0 \le t \le H$

$$s(t) = s(0) \left( \prod_{n=0}^{N(t)} X_n \right) \tag{17}$$

A certain number of standard physical hypotheses can be substituted into (17). This will allow for the random product therein to be exhibited as a conjugate invariant compound Poisson process on $SO(3)$.

Under the condition $\ell \ll H$ it is possible to make the hypothesis that the time between successive scattering events has an exponential distribution [21]. This allows us to model $N(t)$ as a Poisson process with parameter $1/\ell$. Moreover, we suppose the scatterers identical and scattering events independent. This amounts to taking the $SO(3)$-valued random variables $X_n$ to be i.i.d. If the additional assumption is accepted that the number of scattering events is independent of the whole outcome of these events, then formula (17) can be rewritten

$$s(t) = s(0)\, Y(t), \qquad 0 \le t \le H \tag{18}$$

where $Y$ is a (left) compound Poisson process on $SO(3)$ with parameter $1/\ell$. It is usual to assume that the random variables $X_n$ have a common probability density $p$. For consistency with Section 4 we mention that $p$ is a square integrable probability density with respect to the Haar measure of $SO(3)$. In the theory of radiative transfer, $p$ is known as the phase function of the layer [23].

In order to simplify the Fourier series of $p$ to a Legendre series (22) we make use of the physical hypothesis of statistical isotropy.
This implies that scattering events in the layer as given by (16) are symmetric around the direction of propagation $s(T_n-)$. Statistical isotropy is a valid assumption in many concrete situations. It is verified by analytical models such as Gaussian and Henyey-Greenstein phase functions, commonly used to describe scattering in geophysical and biological media [24].

Under the hypothesis of statistical isotropy the phase function $p$ is a zonal function in the sense made precise in Section 5.2. It admits a Legendre series (22) wherein the coefficients $a_\delta$ for $\delta \in \mathbb{N}$ are said to form the associated power spectrum of heterogeneities [23]. If $p$ is the Henyey-Greenstein phase function then the power spectrum of heterogeneities is given by $a_\delta = g^\delta$ for $\delta \in \mathbb{N}$ and $p$ can be expressed in the closed form [24, 25]

$$p(\cos\theta) = \frac{1 - g^2}{(1 + g^2 - 2g\cos\theta)^{3/2}} \tag{19}$$

In this formula the variable $\theta \in [0, \pi]$ refers to the scattering angle from an individual scatterer. It is given a mathematical definition in formula (22) of Section 5.2. The parameter $g \in [0, 1)$ is called the anisotropy or asymmetry parameter. It can be shown to give the average cosine of the scattering angle $\theta$. For the scattering of light waves by water clouds and blood we have respectively $g = 0.85$ and $g = 0.95$, see [25].

Proposition 3 of Section 3 can be used to give the angular pattern of transmitted intensity $I_H$ in terms of the power spectrum of heterogeneities. This is expressed in the following equation (20), which relates the directly observable outcome of multiple scattering in the layer to the constitutive microscopic properties of the layer, typically quite difficult to ascertain directly.
Substituting into Proposition 3 the definition of the process $Y$ of (18) and using the Legendre series (22) of $p$ we have

$$\frac{I_H(\theta)}{2\pi} = \sum_{\delta \ge 0} (2\delta + 1)\, e^{\frac{H}{\ell}(a_\delta - 1)} \int_0^\theta P_\delta(\cos\xi) \sin\xi \, d\xi \tag{20}$$

for the ratio $I_H(\theta)$ of intensity transmitted within a pencil of angle $2\theta$ around $s(0)$.

Equation (20) is well known in the small angles approximation of radiative transfer where it is derived under the assumption of strong forward scattering [23]. Mathematically, this translates into a phase function $p$ with a sharp peak around $\theta = 0$. Our probabilistic development of equation (20) does not explicitly make this assumption. However, the identification of $I_H$ with the angular pattern of transmitted intensity implicitly requires all the intensity of the wave entering the layer to be transmitted. This precludes an important deviation between $s(0)$ and $s(H)$.

Equation (20) is an interesting starting point for the formulation of the physical inverse problem of multiple scattering. Supposing a situation where this equation holds, being able to invert it implies access to the power spectrum of heterogeneities or alternatively the phase function from direct intensity measurements. This implies inference of physical parameters such as the parameter $g$ of the Henyey-Greenstein phase function or determination of microscopic properties such as the shape of individual scatterers [25].

Our use of compound Poisson processes on $SO(3)$ to model multiple scattering leads to the probabilistic counterpart (18) of equation (20). In relation to (18), the physical inverse problem inherent to equation (20) is reformulated as a statistical estimation problem. This appears as the problem of decompounding on $SO(3)$ or some related parametric estimation problem. A crucial difference between the two approaches is that they proceed from different types of data.
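As a numerical aside, the relation $a_\delta = g^\delta$ for the Henyey-Greenstein phase function (19) and the coefficient-wise inversion of (20) can be checked with a few lines of Python. This is only a sketch under stated assumptions: the function names are illustrative and not taken from the paper, the Legendre coefficients are computed from (23) by Gauss-Legendre quadrature after the change of variable $\mu = \cos\theta$, and the inversion assumes that the exponential factors $c_\delta = e^{\frac{H}{\ell}(a_\delta - 1)}$ of (20) have already been extracted from the measured intensity, a step that is not shown here.

```python
import numpy as np
from numpy.polynomial import legendre


def hg_phase(mu, g):
    """Henyey-Greenstein phase function (19) as a function of mu = cos(theta)."""
    return (1 - g**2) / (1 + g**2 - 2 * g * mu) ** 1.5


def legendre_coefficient(p, delta, nodes=200):
    """Legendre coefficient (23): a_delta = 1/2 * int_{-1}^{1} p(mu) P_delta(mu) dmu."""
    mu, weights = legendre.leggauss(nodes)   # Gauss-Legendre nodes and weights on [-1, 1]
    basis = np.zeros(delta + 1)
    basis[delta] = 1.0                       # coefficient vector selecting P_delta
    return 0.5 * np.sum(weights * p(mu) * legendre.legval(mu, basis))


def compounded_coefficient(a, depth_over_mfp):
    """Exponential factor of (20): c_delta = exp((H / l) * (a_delta - 1))."""
    return np.exp(depth_over_mfp * (np.asarray(a) - 1.0))


def invert_coefficient(c, depth_over_mfp):
    """Coefficient-wise inverse of the map above: a_delta = 1 + (l / H) * log(c_delta)."""
    return 1.0 + np.log(np.asarray(c)) / depth_over_mfp


if __name__ == "__main__":
    g = 0.85                                 # value quoted in the text for water clouds
    a = np.array([legendre_coefficient(lambda mu: hg_phase(mu, g), d) for d in range(6)])
    print(np.max(np.abs(a - g ** np.arange(6))))   # quadrature error against g**delta
    c = compounded_coefficient(a, 3.0)             # H / l = 3 is an arbitrary choice
    print(invert_coefficient(c[1], 3.0))           # a_1, which is the mean cosine g
```

The quadrature converges more slowly as $g$ approaches 1, since the peak of (19) at $\theta = 0$ sharpens, so values such as $g = 0.99$ would need more nodes than the default used here.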
Suppose the distribution of $s(0)$ is known and symmetric around $(0, 0, 1)$ (this is the case in many experimental settings). Instead of carrying out measurements of transmitted intensity, it is possible to make observations of $s(H)$. Under the hypothesis of statistical isotropy these observations of $s(H)$ are equivalent to observations of $Y(H)$. If our objective is to estimate the phase function $p$ then we have to deal with decompounding on $SO(3)$ from low frequency observations of $Y$. In many cases, we could be interested in the power spectrum of heterogeneities or some related physical parameters. We then have to deal with a parametric estimation problem.

5.2 Harmonic analysis on $SO(3)$

We here make a short digression on harmonic analysis on $SO(3)$ in order to clarify the references made to this subject in Section 5.1 and to prepare for Section 5.3. $SO(3)$ is often used as the archetype of a compact connected Lie group. Essentially, we will specialize the Peter-Weyl theorem as stated in Section 2 to the case $G = SO(3)$. For the following see [9] or the more detailed account in [4].

We use the notation of Section 2. In particular, $\mu$ denotes the Haar measure of $SO(3)$. It is possible to identify $\mathrm{Irr}(SO(3)) = \mathbb{N}$ so that $d_\delta = 2\delta + 1$ for each $\delta \in \mathrm{Irr}(SO(3))$. With this identification, the most common choice of functions $U^\delta : SO(3) \to SU(d_\delta)$ can be given in analytical form using the parameterization of $SO(3)$ by Euler angles.

The $ZYZ$ Euler angles $\varphi, \psi \in [0, 2\pi]$ and $\theta \in [0, \pi]$ are well defined coordinates only on a subset of $SO(3)$. This is however a dense subset in the Euclidean topology of $SO(3)$ and has Haar measure equal to 1. Let $p : SO(3) \to \mathbb{C}$. If $p$ is continuous or $p \in L^2(SO(3), \mathbb{C})$ it follows that $p$ can be identified with a function of the Euler angles $p \equiv p(\varphi, \theta, \psi)$.
The chosen functions $U^\delta$ are extended by continuity from the following expression for their matrix elements

$$U^\delta_{ab}(\varphi, \theta, \psi) = e^{-ia\varphi}\, d^\delta_{ab}(\cos\theta)\, e^{-ib\psi} \tag{21}$$

for $\delta \in \mathrm{Irr}(SO(3))$ and $-\delta \le a, b \le \delta$. The notation $d^\delta_{ab}$ is used for the real-valued Wigner d-functions, which can be given in terms of the Jacobi polynomials. For $\delta \in \mathrm{Irr}(SO(3))$ we have $d^\delta_{00} = P_\delta$, the Legendre polynomial of order $\delta$. The Haar measure $\mu$ is expressed in the coordinates $(\varphi, \theta, \psi)$ as follows

$$d\mu(\varphi, \theta, \psi) = \frac{1}{8\pi^2} \sin\theta \, d\varphi\, d\theta\, d\psi$$

Suppose a function $p \in L^2(SO(3), \mathbb{C})$ is expressed in the form $p(\varphi, \theta, \psi)$. In order to obtain its Fourier coefficients, it is enough to substitute the above expressions for the functions $U^\delta$ and $\mu$ into formula (1). This formula then reduces to a triple integral. By the Peter-Weyl theorem, the Fourier coefficients of $p$ give rise to a Fourier series approximating $p$ in $L^2(SO(3), \mathbb{C})$.

The class of zonal functions on $SO(3)$ arises in relation to the hypothesis of statistical isotropy mentioned in Section 5.1. We will say that a function $p \in L^2(SO(3), \mathbb{C})$ is zonal if $p \equiv p(\theta)$, that is, if the expression of $p$ in the coordinates $(\varphi, \theta, \psi)$ depends only on $\theta$. Zonal functions form a closed subspace of $L^2(SO(3), \mathbb{C})$. If $p$ is a zonal function then its Fourier series reduces to a Legendre series

$$p(\theta) = \sum_{\delta \ge 0} (2\delta + 1)\, a_\delta\, P_\delta(\cos\theta) \tag{22}$$

where for $\delta \ge 0$ the Legendre coefficient $a_\delta$ is given by

$$a_\delta = \frac{1}{2} \int_0^\pi p(\theta)\, P_\delta(\cos\theta) \sin\theta \, d\theta \tag{23}$$

Identities (22) and (23) can be found as follows. Let $p$ be a zonal function. For $\delta \in \mathrm{Irr}(SO(3))$ let $A_\delta$ be the Fourier coefficients of $p$ obtained by substitution into (1). The matrix elements of each $A_\delta$ are denoted $A^{ab}_\delta$ for $-\delta \le a, b \le \delta$.
For all $\delta, a, b$ as above, it follows from (1) that $A^{ab}_\delta$ is given by

$$\frac{1}{8\pi^2} \int_0^{2\pi} \int_0^\pi \int_0^{2\pi} e^{ib\varphi}\, p(\theta)\, d^\delta_{ba}(\cos\theta)\, e^{ia\psi} \sin\theta \, d\varphi\, d\theta\, d\psi$$

Thus for all $\delta \in \mathrm{Irr}(SO(3))$ we have that $A^{ab}_\delta \ne 0$ only if $a = b = 0$. In other words the matrix $A_\delta$ contains at most one nonzero element. This is the diagonal element $A^{00}_\delta = a_\delta$ given by identity (23). Identity (22) follows by constructing the Fourier series of $p$ as in (2).

5.3 Numerical simulations

Here we will illustrate the characteristic function method of Section 4.3 by applying it to a numerical example of decompounding on $SO(3)$. Within this example we will consider a parametric estimation problem related to a physical inverse problem as described in Section 5.1.

Our example is of a compound Poisson process $Y$ on $SO(3)$. As in Section 5.1, $SO(3)$-valued random variables are identified with random orthogonal matrices. For $t \ge 0$,

$$Y(t) = \prod_{n=0}^{N(t)} X_n$$

where the Poisson process $N$ has parameter $\lambda = 0.3$ and the random variables $X_n$ have a common probability density $p$ given by expression (19). Four values will be considered for the parameter $g$ in this expression: $0.85$, $0.9$, $0.95$ and $0.99$.

We will put $T = 10$. We simulate a number $n$ of i.i.d. observations of $Y(T)$. The following values of $n$ are used: $500$, $5000$ and $50000$. Note that on average the number $N(T)$ of factors involved in the random product $Y(T)$ is equal to 3.

Before going on, we confirm that the method of Section 4.3 can be applied for this example, in other words, that the $X_n$ with the proposed density $p$ are inverse invariant. This follows from the development after identities (22) and (23). Indeed, the matrices $A_\delta$ obtained for $p$ are diagonal with exactly one nonzero diagonal element $a_\delta = g^\delta$. Since $g$ is real we have that $A_\delta$ is Hermitian for all $\delta \in \mathrm{Irr}(SO(3))$. Inverse invariance follows by item 1 of Proposition 2.

We will present three sets of figures.
Figure 1 is concerned with the compounding transformation of $p$. Figure 2 illustrates the influence of $n$ on parametric and nonparametric estimation errors. Figure 3 studies the influence of $g$ on the nonparametric estimation error for fixed $n$. For Figures 1 and 2 we have $g = 0.9$. For Figures 1 and 3 we have $n = 50000$. We now comment on each of these figures.

[Figure 1: Compounding transformation of $p$ (histograms). (a) Histogram of $\cos\theta$ under density $p$. (b) Histogram of $\cos\theta$ under the distribution of $Y(T)$. Horizontal axis: $\cos\theta$; vertical axis: relative frequency.]

Figure 1 illustrates the relation between the distribution of the $X_n$ as given by the density $p$ and the distribution of $Y(T)$. Both these distributions are studied using histograms. The histogram in Figure 1(a) is for the cosine of the Euler angle $\theta \in [0, \pi]$ associated with the random variable $X_1$. The histogram in Figure 1(b) is for the cosine of $\theta$ associated with $Y(T)$.

Figure 1 is concerned with the direct compounding transformation rather than the inverse decompounding transformation. It is meant to show the histogram in Figure 1(b) as a function of the one in Figure 1(a). As expected, the histogram in Figure 1(b) appears as a wider version of the one in Figure 1(a). This corresponds to the content of Proposition 5 of Section 3. Note also that the dominant value in Figure 1(b) has moved away from $\theta = 0$.

For Figure 2, the observations made of $Y(T)$ are used to carry out the decompounding approach of Section 4.3. Parametric and nonparametric estimation errors are given graphically for different values of $n$. Figure 2(a) compares the estimated Legendre coefficients of $p$ to their theoretical values $a_\delta = g^\delta$ for $\delta \ge 0$. In Figure 2(b), a priori knowledge of the analytical form of the $a_\delta$ is supposed. This is used to estimate $g$.
A different parametric estimate is obtained from each estimated Legendre coefficient. In Figures 2(a) and 2(b) theoretical values are represented by a solid line.

In Figure 2(a) we have the estimated first $l = 31$ Legendre coefficients for each value of $n$. Let us denote these coefficients $\hat a^n_\delta$ for $0 \le \delta \le l - 1$ and the corresponding value of $n$. They can be used to evaluate a nonparametric estimate of $p$ as in formula (8). This is done by substituting them into a truncated Legendre series (22). We obtain the nonparametric estimate of $p$, which we denote $\hat p^n_l$,

$$\hat p^n_l(\theta) = 1 + \sum_{\delta = 1}^{l-1} (2\delta + 1)\, \hat a^n_\delta\, P_\delta(\cos\theta)$$

where for all values of $n$ we have that $\hat a^n_0 = a_0 = 1$. Depending on $n$, the random nonparametric estimation error from $\hat p^n_l$ is given by a sum over $\delta$ [...]

[...] Conditionally on $\{N(T) > 0\}$, the distribution of $Y(T)$ is a mixture of distributions with Henyey-Greenstein density. More precisely, for all $n > 0$ we have the conditional probability density for the Euler angle $\theta$ associated with $Y(T)$

$$p(\theta \mid N(T) = n) = \frac{1 - g^{2n}}{(1 + g^{2n} - 2g^n\cos\theta)^{3/2}}$$

In particular, in the limit $g \uparrow 1$ we have that $Y(T)$ is almost surely equal to the identity matrix. Conditionally on $\{N(T) > 0\}$, we have in the limit $g \downarrow 0$ that $Y(T)$ is uniformly distributed on $SO(3)$. Let us note that in our example $P(N(T) > 0) = 1 - e^{-\lambda T} \simeq 0.95$.

Figure 3 can be understood in light of the above discussion. For greater values of $g$, observations of $Y(T)$ are concentrated near the identity matrix. This leads to fast convergence of our estimates for the Legendre coefficients of $p$. For smaller values of $g$, observations of $Y(T)$ are more dispersed and the convergence of estimates is slower. In the limit $g \downarrow 0$ the observations are close to uniformly distributed on $SO(3)$ and our approach breaks down due to numerical problems.
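The experiment of this section can be imitated with the following Python sketch. It is not the code behind the figures and it does not implement the matrix-valued estimator (8). It assumes the scalar specialization suggested by (20) for zonal densities, namely $E[P_\delta(\cos\theta_{Y(T)})] = e^{T\lambda(a_\delta - 1)}$, so that $a_\delta$ is estimated by $1 + \log(\hat c_\delta)/(T\lambda)$ with $\hat c_\delta$ the empirical mean of $P_\delta(\cos\theta)$ over the observations. All function names are hypothetical, the Henyey-Greenstein angle is drawn by inverting the cumulative distribution function of (19), and the two remaining Euler angles are taken uniform on $[0, 2\pi]$ as for a zonal density.

```python
import numpy as np
from numpy.polynomial import legendre


def sample_hg_cos(g, size, rng):
    """Draw cos(theta) from the Henyey-Greenstein density (19) by inverting its CDF."""
    u = rng.uniform(size=size)
    return (1 + g**2 - ((1 - g**2) / (1 - g + 2 * g * u)) ** 2) / (2 * g)


def zyz_rotation(phi, cos_theta, psi):
    """Batch of rotation matrices Rz(phi) Ry(theta) Rz(psi), shape (m, 3, 3)."""
    ct = cos_theta
    st = np.sqrt(np.clip(1 - ct**2, 0.0, None))
    cp, sp = np.cos(phi), np.sin(phi)
    cs, ss = np.cos(psi), np.sin(psi)
    rot = np.empty((len(ct), 3, 3))
    rot[:, 0, 0] = cp * ct * cs - sp * ss
    rot[:, 0, 1] = -cp * ct * ss - sp * cs
    rot[:, 0, 2] = cp * st
    rot[:, 1, 0] = sp * ct * cs + cp * ss
    rot[:, 1, 1] = -sp * ct * ss + cp * cs
    rot[:, 1, 2] = sp * st
    rot[:, 2, 0] = -st * cs
    rot[:, 2, 1] = st * ss
    rot[:, 2, 2] = ct
    return rot


def simulate_compound_poisson(g, lam, T, n, rng):
    """n i.i.d. draws of Y(T) = X_1 X_2 ... X_N(T) as an (n, 3, 3) array."""
    counts = rng.poisson(lam * T, size=n)        # N(T) for each observation
    y = np.tile(np.eye(3), (n, 1, 1))            # empty product is the identity
    for k in range(counts.max()):
        active = np.nonzero(counts > k)[0]       # observations with more than k factors
        m = len(active)
        x = zyz_rotation(rng.uniform(0, 2 * np.pi, m),
                         sample_hg_cos(g, m, rng),
                         rng.uniform(0, 2 * np.pi, m))
        y[active] = y[active] @ x                # multiply on the right by the next factor
    return y


def estimate_legendre_coefficients(y, lam, T, l):
    """Scalar decompounding: a_hat[delta] = 1 + log(mean P_delta(cos theta)) / (T lam)."""
    cos_theta = y[:, 2, 2]                       # (3, 3) entry of a ZYZ rotation is cos(theta)
    a_hat = np.ones(l)
    for delta in range(1, l):
        basis = np.zeros(delta + 1)
        basis[delta] = 1.0
        c_hat = legendre.legval(cos_theta, basis).mean()
        # The logarithm is only defined for a positive empirical mean.
        a_hat[delta] = 1.0 + np.log(c_hat) / (lam * T) if c_hat > 0 else np.nan
    return a_hat


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    g, lam, T, n, l = 0.9, 0.3, 10.0, 5000, 6
    y = simulate_compound_poisson(g, lam, T, n, rng)
    a_hat = estimate_legendre_coefficients(y, lam, T, l)
    a_true = g ** np.arange(l)
    print(np.column_stack([a_true, a_hat]))
    # Squared L2 error of the truncated Legendre series, restricted to delta < l.
    print(np.sum((2 * np.arange(l) + 1) * (a_hat - a_true) ** 2))
```

The guard on `c_hat` plays a role loosely analogous to the events $\tilde R^n_\delta$ of Section 4: the logarithm is unstable when the empirical characteristic function is close to zero, which happens for small $g$ or large $\delta$ and is consistent with the breakdown described above in the limit $g \downarrow 0$.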
6 Conclusion

Nonparametric estimation on compact Lie groups, especially using characteristic function methods, is by now a relatively familiar topic in relation to several engineering applications. It has received comprehensive treatment in the case where estimation is carried out directly from some stationary process, that is, from i.i.d. observations of a group-valued random variable. This paper has applied a characteristic function method to the problem of decompounding on compact Lie groups. For this problem, nonparametric estimation is required from indirect observations defined in terms of a nonstationary process.

A first approach to decompounding on compact Lie groups was given. It was guided by existing characteristic function methods for the classical problem of decompounding. These methods were transposed directly to the setting of harmonic analysis on compact Lie groups. Under a suitable symmetry hypothesis, treatment of the indirect nature of observations was simplified. The ensuing nonparametric estimation error was characterized as depending on the whole spectrum of the target density rather than just its smoothness class. In some aspects, our approach to decompounding on compact Lie groups might appear cursory. We hope however that it will attract attention to various problems of the statistics of nonstationary stochastic processes on groups.

This paper also discussed the importance of decompounding on $SO(3)$ to the physical inverse problem of multiple scattering. Under a probabilistic interpretation of the theory of radiative transfer, models based on compound Poisson processes on $SO(3)$ were found consistent with the results of the small angles approximation of radiative transfer. The possibility of reformulating physical inverse problems of multiple scattering as parametric or nonparametric statistical estimation problems was discussed.
The statistical nature of this new point of view seems desirable given the high complexity of multiple scattering situations. In practice, it might require considerably more elaborate measurements.

References

[1] B. Buchmann and R. Grübel, "Decompounding: An estimation problem for Poisson random sums," The annals of statistics, vol. 31, no. 4, pp. 1054–1074, 2003.
[2] B. van Es, S. Gugushvili, and P. Spreij, "A kernel type nonparametric density estimator for decompounding," Bernoulli, vol. 13, no. 3, pp. 672–694, 2007.
[3] D. Applebaum, "Compound Poisson processes and Lévy processes in groups and symmetric spaces," Journal of theoretical probability, vol. 13, no. 2, pp. 383–425, 2000.
[4] G. Chirikjian and A. Kyatkin, Engineering applications of noncommutative harmonic analysis. CRC Press, 2000.
[5] U. Grenander, Probabilities on algebraic structures. John Wiley & Sons Inc., 1963.
[6] H. Heyer, Probability measures on locally compact groups. Springer Verlag, 1977.
[7] B. Yazici, "Stochastic deconvolution over groups," IEEE transactions on information theory, vol. 50, no. 3, pp. 494–510, 2004.
[8] J.-Y. Koo and P. Kim, "Asymptotic minimax bounds for stochastic deconvolution over groups," IEEE transactions on information theory, vol. 54, no. 1, pp. 289–298, Jan. 2008.
[9] P. T. Kim and J.-Y. Koo, "Optimal spherical deconvolution," Journal of multivariate analysis, vol. 80, pp. 21–42, 2002.
[10] P. Kim and D. Richards, "Deconvolution density estimation on compact Lie groups," in Algebraic methods in statistics and probability. AMS, 2001, pp. 155–171.
[11] P. Kim, "Deconvolution density estimation on SO(N)," The annals of statistics, vol. 26, no. 3, pp. 1083–1102, 1998.
[12] M. Liao, Lévy processes on Lie groups. Cambridge University Press, 2004.
[13] T. Bröcker and T. tom Dieck, Representations of compact Lie groups. Springer, 1985.
[14] J. Duistermaat and J. Kolk, Lie groups. Springer Verlag, 2000.
[15] M. Liao, "Lévy processes and Fourier analysis on compact Lie groups," The annals of probability, vol. 32, no. 2, pp. 1553–1573, 2004.
[16] O. Kallenberg, Foundations of modern probability. Springer Verlag, 2002.
[17] C. Van Loan and G. Golub, Matrix computations. The Johns Hopkins University Press, 1989.
[18] J. Uherka and A. Sergott, "On the continuous dependence of the roots of a polynomial on its coefficients," The American mathematical monthly, vol. 84, no. 5, pp. 368–370, 1977.
[19] H. Sato and M. Fehler, Seismic wave propagation and scattering in the heterogeneous earth. Springer, 1998.
[20] R. Xu, Particle suspensions: Light scattering methods. Kluwer Academic Publishers, 2002.
[21] P. Sheng, Wave scattering, localization and mesoscopic phenomena. Academic Press, 1995.
[22] X. Ning, L. Papiez, and G. Sandinson, "Compound Poisson process method for the multiple scattering of charged particles," Physical Review E, vol. 52, no. 5, pp. 5621–5633, 1995.
[23] A. Ishimaru, Wave propagation and scattering in random media, Vol. 1, 2. Academic Press, 1978.
[24] L. Klimes, "Correlation functions of random media," Pure and applied geophysics, vol. 159, pp. 1811–1831, 2002.
[25] A. Kokhanovsky, "Small angle approximations of the radiative transfer theory," Journal of Physics D, vol. 30, pp. 2837–2840, 1997.
