Belief Propagation and Beyond for Particle Tracking
We describe a novel approach to statistical learning from particles tracked while moving in a random environment. The problem consists in inferring properties of the environment from recorded snapshots. We consider here the case of a fluid seeded wit…
Authors: Michael Chertkov, Lukas Kroc, Massimo Vergassola
Belief Pr opagation and Bey ond f or P article T racking Michael Chertkov T -13, Theoretical Division Los Alamos National Lab . Los Alamos, NM 87545, USA chertkov@lan l.gov Lukas Kroc Cornell University Ithaca, NY 14850 , USA kroc@cs.corn ell.edu Massimo V ergassola Institut Pasteur 75724 Paris Cedex 15, France massimo@past eur.fr Abstract W e describe a novel appro ach to statistical learnin g from p articles tracked while moving in a random environmen t. The problem consists in inferring properties of the environment from recorded snapshots. W e con sider here the case o f a fluid seeded with identical passi ve particles that diffuse and are advected by a flow . Our approa ch re sts on efficient algorithm s to estimate the weighted n umber of possi- ble m atchings amon g particles in two consecu ti ve sn apshots, the partitio n fun c- tion of the un derlying grap hical model. The par tition fun ction is then maximized over the mode l parameters, namely dif fusivity and velocity grad ient. A Belief Propagatio n (BP) sch eme is the bac kbone of our algorith m, providin g ac curate results for the flow parameters we want to learn. The BP estimate is addition - ally improved by incorpo rating Loop Series (LS) c ontributions. For the w eighted matching problem, LS is compactly expressed as a Cauchy inte gral, accurately estimated by a saddle point ap proxim ation. Nu merical experiments show that the quality o f o ur imp roved BP algorithm is comparable to the o ne o f a fully p oly- nomial ran domized approx imation scheme, based on the Markov Chain Monte Carlo ( MCMC) method, while the BP-based schem e is substantially faster than the MCMC scheme. 1 Intr oduction Graphical model app roaches to statistical lea rning and infere nce are wide spread in m any fields of science, ranging from m achine learnin g to bioinform atics, statistical physics and erro r-correction. Such applications often r equire e valuation of a weighted sum o ver an exponentially large number of configur ations — a formidab le # P -hard problem in the majority of cases. In this paper we focus on one such difficult p roblem, which occurs when t rackin g identical particles moving in a rando m environment. As long as particles are sufficiently d ilute, their tr acking in two consecutive frames is rather straig htforward. When the density of p articles and/o r th e acquisition time in crease, m any possible sets of traje ctories beco me statistically co mpatible with the acquired data an d m ultiple match ings of the particles in two consecutive snapshots are l ikely . Despite of th ese uncertainties, one expects that reliable estimates of the properties of the en v ironme nt sho uld still be possible if the num ber N of tracked particles is sufficiently large. T his is the prob lem that we want to address here. The na ture of th e m oving particles a nd their e n vironm ent ar e n ot subjec t to pa rticular restriction s, e.g. they might move acti vely , su ch as living organisms, or pa ssi vely . Here, we shall consider the case of a fluid seed ed with passi ve particles, a pr oblem arising in th e context of fluid mechanic s experiments. Given a statistical mode l of the fluid flow with unknown param eters, along with the positions o f N ind istinguishable p articles in two su bsequent snapshots, on e aims at pred icting the most probable values of the model par ameters. T his task is form ally stated in Section 2 as searching for the maximum of a weighted sum over all possible m atchings between par ticles in the two snap- 1 shots. The problem turns out to b e equi valent to compu ting the permanent of a non-n egati ve matrix, known to be a # P -co mplete problem [ 11 ]. The main con tribution of th is paper is a n efficient a nd accurate alg orithm of Belief P r opaga tion (BP) type for calculatin g the perman ent for the class of weight matric es arising from the particle tracking p roblem. Th e BP algorithm seeks a minimum of the Bethe Free Energy [ 13 ] for a suitable graph ical mo del. Th e graphical mo del is a fully connected bipartite grap h: no des ar e associated with the me asured particles, ed ges are weighted accord ing to the m odel o f the flo w tr ansporting th e p articles an d con straints enfo rce the condition that e xactly one edge per n ode is ac ti ve. It is k nown that BP gives the exact r esult for the max imum likelihoo d version o f the prob lem (fin ding a maximu m weight match ing) in spite of mu ltiple lo ops character- izing the grap hical model [ 2 ]. Th e BP algorithm for the matchin g problem is deriv ed and discussed in Section 3 . BP eq uations co uld be under stood as a re- parametrizatio n, or gau ge transform ation, of factor func- tions in the graph ical model [ 12 ]. Furthermo re, BP s olution s also provide an e xplicit representatio n of the exact partition f unction in ter ms of the so-called Loop Series [ 5 , 6 ]. Our main technical result is th e derivation of a co mpact expr ession and efficient appr o ximation for the Loop Series in the pr ob - lem o f weighted particle matching . This is do ne in Section 4 , wher e th e L oop Series is expr essed in terms of an 2 N -th or der m ixed deriv ati ve of an explicit fun ctional, reduced to 2 N -dimensional Cauchy integral and finally estimated by a saddle- point ap proxima tion. Section 5 describes empir- ical results dem onstrating the p erforma nce of bare BP and the sadd le-point impr oved BP in com- parison with a ( simplified) fully p olynom ial ran domized approx imation schem e for c omputing the permane nt [ 8 ]. Our improved BP achieves comp arable accu racy , with sig nificant gains in terms of speed. As the numb er o f particles tracked in experiments is typically large (o rder tens of thousands) we argue that our approach is both useful and promising for applications. 2 Particle tracking pr oblem An imp ortant part of modern experiments in fluid m echanics is based on tracking of p re-seeded particles by sophisticated optical method s [ 1 ]. If par ticles are sufficiently sma ll an d chosen of ap - propr iate (mass) density , their effect on the flo w is essentially negligible and one can safely a ssume that they are p assi vely tran sported by the flow . The (numb er) density o f p articles is usually r ather high and a single snapshot typically contains a large number of the m. The reason is that the smallest scales of the flow , which is generally turbulent, ought to be resolved. T wo decades i n a three dimen- sional flow requir e to follow at least o ne million , 10 2 × 3 , pa rticles. Furthermore, turbulence is qu ite effecti ve in rapidly transporting particles so that th e acquisition time between co nsecutive snapshots should be kept small. Modern cameras have impressive resolutions, in the o rder of ten s of thousand s frames per second, ye t the flow of information is huge: ∼ Gig abit/s to monito r a two-dimensional slice o f a (10 cm ) 3 experimental cell with a p ixel size of 0 . 1 mm and exposition time of 1 ms . Th is extremely high rate makes it impossible to pro cess d ata on the fly , unle ss very efficient algorithm s are developed. Previous po ints mo tiv ate the de velopment o f a novel set of alg orithmic tools for fast and efficient particle tr acking. One key eleme nt is inco rpora ting statis tical models of the environment where particles are transpo rted and tracked. For turbulent flows, modeling proceed s as fo llows. Consid er N particles fr om the same time fr ame, labeled by i = 1 , · · · , N and positione d at the set of points x i , su ch that the typical distance between neighbo ring particles is smaller then the viscous scale of the flow . Then, Lagran gian particles e volve acco rding to the set of stochastic equatio ns, ˙ ρ i = U + S ρ i + ξ i , whe re ρ i are p article displacements on a line (g eneralization to mu ltiple dimension s is straightfor ward) measured with respect to a reference po int; U and S are t he large-scale mean and gradient of the velocity field; ξ i ( t ) is the stochastic zero-m ean Gaussian Lange vin noise, describing molecular diffusi vity , defined by its corr elation function: h ξ i ( t 1 ) ξ j ( t 2 ) i = κδ ij δ ( t 1 − t 2 ) . Particles are indistinguishable and the m atching problem consists in assigning each particle from the original frame x i = ρ i (0) to particles in th e subsequ ent frame y i = ρ i (∆) . Even if the flo w par ameters, U and S , and the diffusion coefficient, κ , were known and frozen in time ( the latter is a rea sonable assumption provided the acquisition time ∆ is sufficiently small), the matching cannot be identified with ab solute cer tainty due to the stochastic nature of diffusion. The problem can be statistically modeled con sidering all p ossible particle m atchings ~ σ between two successiv e fram es and weigh ting 2 them accordin g to p ( ~ σ ) = F ( ~ σ ) · Y ( i,j ) p j i , p j i = exp − S σ j i ( y j − e S x i ) 2 κ (exp(2 S ) − 1) p π ( e 2 S − 1) /S , Z = X ~ σ p ( ~ σ ) . (1) Here, σ j i ∈ { 0 , 1 } is a Boolean variable indica ting absen ce/presence o f matchin g between x i and y j , the vector ~ σ = ( σ j i | i, j = 1 , · · · , N ) and F ( ~ σ ) = Q j δ ( P i σ j i , 1) Q i δ ( P j σ j i , 1) e nforces th e constraints fo r a perfec t matching (all par ticles match with exactly one p article in the other f rame). For simp licity , U = 0 (the drift comm on to all particles is subtracted ) an d time is rescaled to have ∆ = 1 . The p artition fun ction Z is the we ighted sum over all po ssible matching s and p ( ~ σ ) / Z is their normalized probab ility distribution. By construction, the partition function is the permanent of the N × N positi ve matrix, ˆ p = ( p j i | i, j = 1 , · · · , N ) , i.e. Z = per ( ˆ p ) . Our goals are : (1) For given parameters S, κ an d th e set of particle p ositions ~ x an d ~ y in two sub se- quent frame s, have an algorithm for finding (a) th e most pro bable matchin g, (b) marginal matching probab ilities for any two particles from different frames, which is eq uiv alent to computing th e par- tition function . (2) Learn and provide reliable estimates of the model parameters S, κ . Problem (1a) is solved b y the auctio n exact po lynom ial a lgorithm [ 3 ]. Con versely , p roblems (1b) and (2) belon g to th e # P -com plete class, i.e. are likely to be exponen tially comp lex, and we th en aim at developing an efficient and systematically imp rovable heuristics. Ou r app roach is based on the o bservation, made in [ 2 ], that a BP scheme equiv alent to the auction algo rithm can be formu lated for (1a), in spite of the underly ing fully connected bi-partite g raph with mu ltiple loops (see also [ 4 ]). Notice that problem (1 a) is th e Maximum -Likelihood v ersion o f (1b ). W e solve the p roblem (2) by taking the best po ssible e stimate for Z at given values of S and κ , an d then m aximizing the r esult over these parameters. W e observed empirically that estimates based o n Exp ectation Maximization (EM) a lgorithm [ 7 ] do not ensure accurate learning of the flow parameter s S and κ in some of the scen arios of interest, in pa rticular the one with dif fusion on ly . Con versely , BP gives accurate results in terms of the position of the max imum w .r .t the parameter s, altho ugh the estimate for Z is ofte n orders of magnitud es wrong. T o fu rther improve on this, we app ly , in Sec tion 4 , the general BP-based Loop Calculus approac h developed in [ 5 , 6 ] to th e perf ect match ing prob lem. This significantly improves the estimates o f Z , es pecially in difficult cases when uncertain ties in the matchings are significant. T o the best of our knowledge, particle tracking as a learning problem – no t to mentio n th e alg o- rithmic developments based on con temporar y inference method s presented below – is n ovel and it was not discussed pr eviously (see [ 9 ] for a su rvey o f algor ithms cu rrently u sed in flu id mechanics experiments). 3 Belief prop agation and Bethe free ener gy For a model with states ~ σ having weight p ( ~ σ ) ( as in ( 1 )), the con vex f unctional F { b ( ~ σ ) } ≡ X ~ σ b ( ~ σ ) ln b ( ~ σ ) p ( ~ σ ) (2) has a single min imum, at b ( ~ σ ) = p ( ~ σ ) / Z (und er the no rmalization co ndition P ~ σ b ( ~ σ ) = 1 ), and the corr espondin g value of the function al F is the fre e energy , − ln Z . As shown in [ 13 ], the Bethe free energy approximation and BP equations stem from ( 2 ) by considering an ansatz of the form b ( ~ σ ) ≈ Q i b i ( ~ σ i ) Q j b j ( ~ σ j ) Q ( i,j ) b j i ( σ j i ) . (3) V ector s ~ σ i ≡ { σ j i | j = 1 , · · · , N } an d ~ σ j ≡ { σ j i | i = 1 , · · · , N } are allowed to take any of the N possible values (0 , · · · , 0 , 1 , 0 , · · · , 0) with exactly on e nonzer o e ntry . Beliefs b j i ( σ j i ) , b i ( ~ σ i ) , b j ( ~ σ j ) satisfy fo r any i an d j the consistency relation for marginal pro babilities: b j i ( σ j i ) = P ~ σ i \ σ j i b i ( ~ σ i ) = P ~ σ j \ σ j i b j ( ~ σ j ) , where P ~ σ i \ σ j i denotes th e sum over all possible v alues o f th e vector ~ σ i keeping 3 fixed the value of the compo nent σ j i . Eq . ( 3 ) is exact for a tr ee and serves as an approx imation for graphs with loops, e.g. for the fully connected bi-partite graph of our matching problem. Beliefs, as approx imations fo r prob abilities, should also satisfy the norm alization con ditions: ∀ ( i, j ) , b j i (1) + b j i (0) = 1 . Using the no rmalization an d consistency conditio ns, we can express all beliefs via β j i ≡ b j i (1) an d obtain for the Bethe free energy and the normaliza tion conditions F B P { β } = X ( i,j ) β j i ln β j i p j i − (1 − β j i ) ln(1 − β j i ) ! , (4) ∀ i : X j β j i = 1 ; ∀ j : X i β j i = 1 . (5) A simple argumen t shows that con strained minima of ( 4 ) are either a perfect matching (belief s are all zeros and N of them are un ity) or they are attained in the in terior o f the domain . The latter is the case generally encounter ed in the situations of interest to us, i.e. where no statistically dominan t matching is present. Minima in th e in terior a re stationary po ints of F B P , co rrespond ing to the following set of equations: ∀ ( i, j ) : β j i (1 − β j i ) = p j i exp µ i + µ j , (6) where the 2 N Lagrangian multipliers µ (ch emical potentials) are determin ed by Eqs. ( 5 ). Note that the Bethe fre e energy is not con vex, an d thus multiple minima in the interior of the domain m ight be possible. Empiric ally , we never foun d more than one though. In the limit where only the Maximum Likelihood con figuration is of in terest, entro py terms are discard ed and ( 4 ) re duces to Linear Pro- grammin g, y ielding optimal integer solution in acco rdance with [ 2 , 4 ]. Another relev ant remark is that conv exity is re stored fo r a mo dified expression of th e free e nergy , where the minu s sign of the second term in Eq. ( 4 ) is r ev ersed. The latter expression follows from an integral representation for Z , appr oximated in a saddle-po int way . T his approx imation overestimates the diffusion coe fficient and we use its unique solu tion (easy to find n umerically) as initial con dition to the following iter ati ve version of Eqs. ( 5 , 6 ): ∀ ( i, j ) : β j i ( n + 1) = λβ j i ( n ) + (1 − λ ) p j i p j i + ( P k β j k ( n ) / 2 + P k β k i ( n ) / 2 − β j i ( n )) 2 / ( u i ( n ) v j ( n )) , (7 ) ∀ i : u i ( n + 1) = 1 − P j ( β j i ( n )) 2 P k p k i v k ( n ) , ∀ j : v j ( n + 1) = 1 − P i ( β j i ( n )) 2 P k p j k u i ( n ) , (8) where the arguments o f the β ’ s indicate the order of the iterations, u i = exp( µ i ) and v j = exp( µ j ) . The dam ping param eter λ (typically ch osen 0 . 4 ÷ 0 . 5 ) h elps with c onv ergence. T o ensure appro priate accuracy fo r solu tions with β ’ s close to zero or un ity we als o insert a n ormalization step after Eqs. ( 7 ) but prior to E qs. ( 8 ), m aking the following two transfor mations co nsequen tly , (a) ∀ ( i, j ) : β j i → β j i / P k β k i , and (b) ∀ ( i, j ) : β j i → β j i / P k β j k . Numerical e xperim ents sh ow that th is procedure conv erges to a stationary point of the Bethe free energy ( 4 ). 4 Loop series, Cauchy integral and saddle-point appr oxi mation As shown in [ 5 , 6 ] , the exact par tition function o f a generic gra phical model can be e xpressed in terms of a Loop Series ( LS), where each term of the series is compu ted explicitly using the BP solution. Ad apting this gen eral result to the matchin g pro blem, bulky yet straigh tforward a lgebra leads to the following exact expression for the partition functio n Z defined in Eq. ( 1 ): Z = Z BP ∗ z , z ≡ 1 + X C r C , r C = Y i ∈ C (1 − q i ) ! Y j ∈ C (1 − q j ) Y ( i,j ) ∈ C β j i 1 − β j i . (9) Here, the Beth e free e nergy F BP = − ln Z BP , the variables β a re in acc ordance with Eq s. ( 5 , 6 ), and C stands for an arbitrary gen eralized loop, defined as a subgraph of th e fully conn ected bi- partite gr aph with all its vertexes having degree of connectivity > 1 . T he q i (or q j ) in Eq. ( 9 ) is 4 the C -depend ent de gree of connectivity o f nodes, i.e. q i = P { j | ( i,j ) ∈ C } 1 a nd q j = P { i | ( i,j ) ∈ C } 1 . According to Eq. ( 9 ), lo ops with even/odd number of vertexes g iv e positive/ne gative contrib utions r C . Th erefore , the series is no t positiv e d efinite, which is also co nsistent with the fact that Z BP in general does not provid e a lo wer bound for the exact partition fun ction. ( In some special cases, e.g. for the model studied in [ 10 ], all terms in the series ar e known to be positi ve and thus Z BP ≤ Z .) In all cases of the weighted matching problem we ha ve e xperim ented with, we have empirically found that th e inequ ality Z BP < Z still holds. Let us finally notice an important special feature of the weighted m atching p roblem: for any generalized loop C , its individual con tribution | r C | ≤ 1 . Th e proof can be foun d in the Append ix. Eq. ( 9 ) allo ws for the follo wing compact representatio n in terms of 2 N -th order mixed local deriv a- ti ve of an explicit function of 2 N v ariables z = ∂ 2 N Z ( ρ 1 , · · · , ρ N , ρ 1 , · · · , ρ N ) ∂ ρ 1 · · · ∂ ρ N ∂ ρ 1 · · · ∂ ρ N ρ 1 = ··· = ρ N = ρ 1 ··· = ρ N =0 , (10) Z ( ~ ρ ) ≡ exp X i ρ i + X j ρ j Y ( i,j ) 1 + β j i (1 − β j i ) exp − ρ i − ρ j ! , (11) where ~ ρ = ( ρ 1 , · · · , ρ N , ρ 1 , · · · , ρ N ) are auxiliary variables. Howe ver , ca lculating th e 2 N -o rder mixed derivati ve exactly is a task of e xpon ential complexity and o ne w onder s whe ther the mixed deriv ativ e can be approx imated ef ficiently . Partial answer to this question is given b elow . Using the Cauchy in tegral representation for the first-ord er deriv a ti ve o f an analytic fu nction ( Z is analytic over ρ i , ρ j with finite real p arts), Eqs. ( 10 , 11 ) ca n be recast as the following conto ur integral: z = I Γ ρ exp ( −G ( ~ ρ )) Q i dρ i Q j dρ j (2 π i ) 2 N , G ( ~ ρ ) ≡ X i 2 ln ρ i + X j 2 ln ρ j − ln Z , (12) where Γ ρ is a direct product of 2 N close contours circ ling clockwise th e o rigin ~ ρ = ~ 0 , where deriv ativ es in ( 10 ) are to be comp uted. W e observe th at each integral over an i ndividual ρ variable in Eq. ( 12 ), say ρ i , ha s a pole at ρ i = 0 and essential singularities at ρ i = ±∞ . Notice also that G ( ~ ρ ) is a concave function o f ~ ρ in each one of the 2 2 N quadra nts, i.e. wher e co mponen ts of ~ ρ are finite, real and ha ve a definite sign. T herefor e, it is natural to shift the contour of integration to one of 2 2 N maxima of G ( ~ ρ ) , ∀ i : 2 i = 1 − X j 1 + 1 − β j i β j i e i + j ! − 1 , ∀ j : 2 j = 1 − X i 1 + 1 − β j i β j i e i + j ! − 1 , (13) and orient the con tour alon g th e direction of steepest descent from the saddle p oint. Once th e sign of each compon ent of ~ ρ is fixed, appr oximating the respe cti ve solu tion of E qs. ( 13 ) num erically is straightfor ward due to the concavity of G . W e shall enumer ate the v a rious maxima by the index s . The saddle-poin t appro ximation of Eq. ( 12 ), accoun ting for Gaussian integral co rrections ab out all the maxima of Eq. ( 13 ), yields z ≈ 2 2 N X s =1 exp −G ( ~ ( s ) ) (2 π ) N q det( ˆ Λ ( s ) ) ≡ 2 2 N X s =1 exp −G ( s ) sp ( ~ ( s ) ) , (14) where ˆ Λ ( s ) is the Hessian of G ( ~ ρ ) at th e saddle-poin ts. Correc tions to each ter m in Eq. ( 14 ) are measured in term s o f hig her-order te rms, the leadin g (four th-ord er) being estimated as G ( s ) 4 = − 1 8 P α,β ,γ ,ν Υ ( s ) αβ µν (( ˆ Λ ( s ) ) − 1 ) αβ (( ˆ Λ ( s ) ) − 1 ) µν , wher e Υ ( s ) αβ µν is the tensor of fo urth- order d eriv ati ves of G ( ~ ρ ) . The im proved a pprox imation for the par tition functio n becom es z ≈ P s exp( −G sp ( ~ ( s ) ) − G 4 ( ~ ( s ) )) . One reason to account for the fourth- order term is that the ra- tio |G 4 / G sp | gi ves a standard m easure of the saddle-poin t v alidity . In cases where the saddle- approx imation becomes asy mptotically exact, the ratio appr oaches zero. A weaker condition, |G 4 / G sp | < 1 , suffices for a heuristically reasonab le approximation . 5 0.5 1 1.5 2 −0.85 −0.8 −0.75 −0.7 −0.65 −0.6 Diffusivity, κ Log of permanent per particle, ln Z/N no advection, N=100 −F bp −F bp − G sp + −F bp − G sp + − G 4 + ln Z MC /N −1.1 −1.05 −1 −0.95 −0.9 −1.5 −1 −0.5 0 0.5 Stretching, S Log of permanent per particle, ln Z/N κ =1, N=100 −F bp −F bp − G sp + −F bp − G sp + − G 4 + ln Z MC /N Figure 1: Nu merical results comparing BP , Loops series improvement and MCMC. Left panel: diffusion only; rig ht pan el: diffusion and advection . Red cu rve are BP r esults, blue and g reen a re Loop series improvements, and black curve is MCMC. W e also expect that the typical o rder o f magn itude of the 2 2 N terms G ( s ) sp and G ( s ) 4 grows as O ( N ) . Therefo re, th e sum in s over th e saddle-poin ts migh t be dominated by a single term in the limit of large N . As discussed in the next Section, we find emp irically that such dom inant term indeed exists and happens to correspond to the vector ~ (+) having all its components p ositiv e. Th erefore, the sum over all the max ima, indexed b y s , can be simp lified by keeping only th e dominan t co ntribution correspo ndent to ~ (+) . The expression that we have em ployed in nu merical exper iments discussed in the next Section reads finally , z ≈ ex p −G sp ( ~ (+) ) − G 4 ( ~ (+) ) . 5 Numerical results In this Section we compare the accuracy of BP based approx imations fo r the pa rtition fun ction, Z , of the m odel ( 1 ) with MCMC simu lations. As br iefly stated in Section 2 , compu ting Z f or the weighted matching problem is e quiv alent to compu ting a perman ent of a non-n egati ve matrix, f or which a Fully Polynomia l Ran domized Approxima tion Scheme (FPRAS) exists based on the Markov Chain Monte Carlo method [ 8 ]. W e implemente d the basic idea of FPRAS, with some simplifications applicable to our problem. This algorithm was used to assess accuracy of our approxim ations, but is orders o f ma gnitude slower than the BP b ased ap proach es, and thu s no t app licable to large particle tracking problem s. T o study dep endence of Z on κ and S (see discussion of Section 2 ), we estimate Z at different values of κ and S and compare th e cu rves, sear ching for the maxim um with respect to these param eters. Our BP simulation s co nsist of the fo llowing steps for each value of κ or S . First, we find so lution of BP equation s running the num erical scheme describ ed in Eq s. ( 7 , 8 ) and calculating the r esulting F B P accordin g to Eq. ( 4 ) . Sec ond, we find the solution of the saddle-point Eqs. ( 13 ) in the v arious quadra nts, calcu late the cov ariance matrix ˆ Λ ( s ) for this s addle solution and thus estimate the leading saddle-po int correction G sp in accord ance with Eq. ( 14 ). Finally , we estimate th e respective fourth- order correction , G ( s ) 4 . W e compare respective contr ibutions to the partition fu nction associated with sad dle-poin ts with different choices of the signs. In all exper iments where the typic al overlap am ong particles is sig- nificantly smaller th an the total number of particles we find that the contribution with all sign s + dominates. Moreover, the gap separ ating the leading + co ntribution and other contributions is sig- nificant and grows linear ly w ith N . Th is allo ws us to ignor e all other saddle-p oint contributions but the + o ne. (In cases of modera te N we h av e made the exhaustiv e c omparison . I n gen eral, we compare d th e all + contribution with all − contribution, con tributions with a limited number of signs flipped a nd also with signs gener ated ran domly .) W e observe that th e saddle-point validity condition s holds reasonably well if N is suf ficiently large, at N = 1 00 the ratio |G 4 / G sp | is ty pi- cally 0 . 1 ÷ 0 . 4 . Since |G 4 / G sp | → 0 when S → + ∞ , we find tha t the saddle-po int is v ery accurate (possibly asymptotically exact) in this limit. 6 Results of numerical s imulation s for number of particles N = 1 00 are shown in Figure 1 . Th e plots show r esults for on e set of particle positions each, but we fou nd that variability b etween scenarios decreases with increasing number of particles, and one scenario is thus suf ficient to characterize the main trends. The left panel shows results for a situation with diffusion on ly , with ac tual d iffusi vity κ act = 1 . 0 used for generating th e particle po sitions. The x-axis spans values of th e governing parameter κ (pretend ed not to be known), and the y-axis shows configuration weight per particle (what is shown is ln Z/ N ). Th e black curve are results of the MCMC s imulation , which are close to the true v alues of Z . The cu rve indeed peaks aroun d the corr ect value κ = κ act = 1 . 0 , althou gh the max imum is rather shallow (the lower κ act the mo re prono unced is the maximum ). The re d curve cor respond s to r esults ob tained by BP only , an d shows th at BP se verely und erestimates the p artition f unction, although its maximum is at the right v alue. The remaining blue and green curves are the two saddle correction s discussed in Sectio n 4 1 . As can be seen , the sad dle correction s significantly imp rove the BP estimate. The time to comp ute each p oint in the plot u sing the BP b ased scheme is ≈ 5 sec , while the MCMC algorith m takes ≈ 10 mins . The right panel of Figure 1 shows resu lts for a scenario with both advection and dif fusion. The x-axis now spans values of the velocity gradien t S , and y-axis is aga in ln Z/ N . Th e actual velocity gradient used for generatin g the particles was S act = − 1 . 0 . The f our curves sh ow similar main trends: BP underestimates Z a nd loop co rrections provide very tight belt around the MCMC results. The main difference is that in the case when ad vection is present, the peak is very well pronou nced and all me thods gi ve extreme ly accu rate an swer for the velocity gradient S . In this ca se, the running times were again ≈ 5 sec fo r the BP scheme, but ≈ 5 hour s for MCMC per poin t. 6 Conclusions W e have presented new c omputatio nal tools for p article tracking , based on Belief Prop agation and Loop Series and compare d them to the Markov Chain Monte Carlo scheme for th e estimation of the per manent [ 8 ]. W e have sp ecifically con sidered tracking o f passiv e par ticles in fluid dyn amics experiments. Th e me thods are q uite general though an d applications to other track ing pr oblems, e.g. to self-pr opelling b iological o bjects, are possible and will be p ursued. Our long- term goal is to develop computational tools ef fective en ough to perm it on-the-fly processing of par ticle tracking frames. Algorithm s pr esented her e en sure an excellent accu racy and the BP-improved schem e is already orders of magnitud es f aster than the MCMC scheme. Ap pendix Here we prove the following important property of the loop series for perfect matching: Proposition 1. In th e loop calculus expansion o f the graph ical mod el for p erfect matching pr o b- lem ( 9 ), | r C | ≤ 1 fo r all generalized loops C . Pr oo f. W e re write expr ession ( 9 ) as: r C = Q i ∈ C ψ i ; C Q j ∈ C ψ j C , where ψ i ; C = (1 − q i ) Q { j | ( i,j ) ∈ C } q β j i / (1 − β j i ) , and an alogously fo r ψ j C . W e will use the fact th at in a fixed p oint of our BP scheme P i β j i = P j β j i = 1 . W e proceed by sho wing that | ψ i ; C | is max imized for β j 1 i = . . . = β j q i i = 1 /q i , an d that e ven for such b eliefs | ψ i ; C | ≤ 1 . The situatio n is analogo us for ψ j C . I t then follows that | r C | ≤ 1 as d esired. 1 W e find that the quality of saddle-approximation decreases with κ , when the numb er of “polarized” match- ings, i .e. those with β j i → 1 , in creases. Polarized beliefs, corresponden t to almost co mmitted matchings, d o not contribute significantly to the L oop Series yet their respecti ve contributions are misrepresented in the saddle- approximation. T o compensate for this cav eat of the saddle-approximation we decimated the original graph, reducing it to a subgraph w ith all particles in volv ed in “polarized” matchings ( and all edges associated wit h them) pruned out. For the nume rical experiments in Figure 1 , we used the polarization criterion, β j i > 0 . 01 . 7 Let u s fix a g eneralized lo op C and a n ode i with d egree q i ≥ 2 in C and seek values of β j 1 i = . . . = β j q i i ∈ [0 , 1 ] s.t. P q i k =1 β j k i ≤ 1 that maximize | ψ i ; C | . First, by ob serving that β / (1 − β ) is a growing function of β , we see that indeed P q i k =1 β j k i = 1 must be the case. Focusing only on the subexpression of ψ i,C with β in it, we hav e the fo llowing con strained o ptimization problem: max β Q j :( i,j ) ∈ C β j i 1 − β j i s.t. P q i k =1 β j k i = 1 . W e apply the method of Lagrangian M ultipliers, and arrive at th e following co nditions for extrema: ∀ i, j : ∂ ∂ β j i Y { j | ( i,j ) ∈ C } β j i 1 − β j i + λ 1 − X { j | ( i,j ) ∈ C } β j i = 1 (1 − β j i ) 2 Y j ′ 6 = j β j ′ i 1 − β j ′ i − λ = 0 T aking a ny two j 1 6 = j 2 , we d erive β j 2 i (1 − β j 2 i ) = β j 1 i (1 − β j 1 i ) from the above. Th is, be ing a quadra tic equ ation in β j 1 i , h as exactly two solutions, an d it is not difficult so see that they must be β j 1 i = β j 2 i or β j 1 i = 1 − β j 2 i . T hese are the conditions on β s fo r an extremum of the product. T o incorpora te the (1 − q i ) subexp ression of ψ i ; C , let us deal with the special case when q i = 2 separately: in this case, since β j 1 i + β j 2 i = 1 , we have that the whole produ ct we maximize is equal to 1 , and thus | ψ i ; C | = 1 . For q i > 2 : if, for an y p air j 1 , j 2 : β j 1 i = 1 − β j 2 i , then all oth er β s (other th an β j 1 i and β j 2 i ) mu st be 0 due to P q i k =1 β j k i = 1 . This means th at the prod uct is zer o, and therefor e ψ i ; C = 0 . I f, on th e other hand , β j 1 i = β j 2 i for all pair s j 1 , j 2 , we have that β j i = 1 /q i . In this situation, the qu estion is wheth er ( q i − 1 ) 1 − q i / 2 ≤ 1 , which is true for any q i ≥ 2 . In fact, we hav e shown that the expression ( q i − 1) 1 − q i / 2 is an upper bound on | ψ i ; C | fo r any node i in any generalized loop C . Refer ences [1] R.J. Adrian. Particle-i maging techniques from experimental fluid mechanics. Annual Review of Fluid Mecha nics , 23:261, 2005. [2] M. Bayati, D. Shah , and M. Sharma. Max-product for maximum weigh t matching: con ver gence, correct- ness and lp duality . IE EE T ransactions on Information Theory , 54:124 1–1251, 2008. [3] D.P . B ertsekas. Auction algo rithms for network flow problems: A tuto rial introd uction. Comput. Optimiz. Applic. , 1:7–66, 1992. [4] M. Chertko v . Exactness of belief propa gation for some graphical models wit h loops, 200 8. [5] M. Chertkov and V . Chernyak. Loop calculus i n statisti cal physics and information science. P hysical Review E , 73:065102 (R), 2006. [6] M. Ch ertkov and V . Chern yak. Loop series for discrete statistical models on graphs. Jo urnal of S tatistical Mecha nics , page P 06009, 2006 . [7] A. P . Dempster , N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the em algorithm. Journa l of the Royal Statistical Society . Series B (Methodolo gical) , 39(1):1–38, 1977. [8] M. Jerrum, A. Sinclair, a nd E. V igoda. A polynomial-time appro ximation algorithm for the permanen t of a matrix with nonne gativ e entries. J. A CM , 51(4):671 –697, 2004. [9] N. T . Ouelette, H. Xu, and E. Bo denschatz. A quantitati ve study of th ree-dimensional lagrangian tracking algorithm. Experiments in Fluids , 40:301, 2006. [10] E.B. Sudderth, M.J. W ainwright, and A.S. Willsky . Loop series and bethe variation al bounds in attractiv e graphical models. In Proce edings of NIPS 2007 , 2007. [11] L. G. V aliant. The complexity of computing the permanent. Theor etical Computer Science , 8:189–201, 1979. [12] M.J. W ainwright, T .S . Jaakkola, and A. S. W illsky . T ree-based reparametrization framew ork for approx- imate estimation on graphs with cycles. Information Theory , IE EE T ransa ctions on , 49(5):1120–11 46, 2003. [13] J. S. Y edidia, W . T . Freeman, and Y . W eiss. Constructing free-energy approximations and generalized belief propagation algorithms. Information Theory , IEEE T ransactions on , 51(7):2282– 2312, 2005. 8
Original Paper
Loading high-quality paper...
Comments & Academic Discussion
Loading comments...
Leave a Comment