Duality, Ancestral and Diffusion Processes in Models with Selection

The ancestral selection graph in population genetics was introduced by KroneNeuhauser (1997) as an analogue of the coalescent genealogy of a sample of genes from a neutrally evolving population. The number of particles in this graph, followed backwar…

Authors: Shuhei Mano

Duality, Ancestral and Diffusion Processes in Models with Selection
Dualit y , Ancestral and Diffusion Pro cesses in Mo dels with Selection ∗ Shuhei Mano † Dep a rtment of Sta tistics, University of Oxfor d , OX1 3TG, Unite d Kingdom Gr aduate Scho ol of N atur al Scie nc es, N agoya City Unive rsity, Nagoya 467-8501, Jap an Jap an Scie nc e and T e chnolo gy A gency, 4-1-8 Honcho, Kawaguchi 332-0012, Jap an ∗ P art of the r esearc h for this pap er was carried out while the author visited the De- partmen t of Statistic s at Univ ersit y of Oxford. † The auth or w as supp orted in part by Grants-in-Aids for Scient ific Researc h from the Ministry of Ed ucation, Culture, S p orts, Science and T ec hn ology of Japan. Address for corresp ondence: Graduate Sc h o ol of Natural Sciences, Nago ya Cit y Universit y , 1 Y amanohata, Mizuho-c ho, Mizuho-ku, Nago ya 467-8 501, J apan; E-mail: mano@nsc .nagoya-c u.ac.jp ; T el/F ax: +81-(0)5 2-872-58 21 1 2 abstract The a ncestral selection g raph in p opulation genetics w as intro d uced by Krone and Neuhauser (1997) as an analogue of the coalescen t genealogy of a sample of genes from a n eu trally ev olving p opulation. The num b er of particles in this graph, follo wed bac kw ards in time, is a bir th and d eath p ro cess with quadratic death and linear birth rates. In this p ap er an explicit form of the pr ob ab ility distribu tion of th e num b er of p articles is obtained by using the densit y of th e allele fr equency in t he co rresp onding diffusion mo del obtained b y Kim ura (1955). It is sho wn that t he p ro cess of fixation of the allele in the diffusion m o del corresp onds to con vergence of th e ancestral pro cess to its stationary measure. T he time to fi xation of t he allele conditional on fixation is stu d ied in terms of the ancestral pro cess. Keyw ords: Ancestral Pro cess, Diffusion Mo del, An cestral Selection Grap h , Coalescen t Mo del, Birth and Death Pro cess 3 1. Int roduction In p opulation genetics, the genealogy of a sa mple of genes pla ys an imp ortan t role in a probabilistic description of the sample. Consider a discrete-time W right -Fisher mo del of a p opulation consisting of 2 N neutral genes. If w e measure time in un its of 2 N genera- tions and let N tend to infin it y , then the W r ight-Fisher pro cess con ve rges to a d iffusion pro cess. The genealogy of a sample of n genes, follo w ed bac kwards in time, is describ ed b y the coalesc ent pro cess (Kingman, 1982b). The con vergence is r obust und er a n umb er of differen t mo dels (e.g. Moran mo del). Let { a n ( t ); t ≥ 0 } b e the num b er of ancestors of a sample of n genes at time t in the p ast. T hen the size pro cess { a n ( t ); t ≥ 0 } is a death pro cess with death rate i ( i − 1) / 2 wh en the size is i . The size pro cess will b e referred to as the ancestral pro cess. T h e distribution of a n ( t ) is known (see Griffiths (1979), T a v ar ´ e (1984)) and (1.1) P [ a n ( t ) = i ] = n X k = i ( − 1) k − i (2 k − 1)( i ) k − 1 [ n ] k i !( k − i )!( n ) k ρ 0 k ( t ) , i = 1 , 2 , ..., n, where ρ 0 k ( t ) := exp {− k ( k − 1) t/ 2 } , [ n ] k = n ( n − 1) · · · ( n − k + 1) and ( n ) k = n ( n + 1) · · · ( n + k − 1). Th e total v ariation norm b et ween a n ( t ) and a n ( ∞ ) = δ 1 has a simp le form (1.2) k a n ( t ) , δ 1 k var = 1 − P [ a n ( t ) = 1] . There is a first time W 0 n, 1 suc h that a n ( W 0 n, 1 ) = 1, which is the time to the m ost r ecent common ancestor. The density of W 0 n, 1 follo ws (1.3) P [ W 0 n, 1 ≤ t ] = P [ a n ( t ) = 1] = n X k =1 ( − 1) k − 1 (2 k − 1)[ n ] k ( n ) k ρ 0 k ( t ) . A b ound f or P [ a n ( t ) = 1] is known (see Kingman (198 2b ), T a v ar´ e (1984)) and (1.4) e − t ≤ 1 − P [ a n ( t ) = 1] ≤ 3 n − 1 n + 1 e − t , n = 2 , 3 , ... The ancestral selectio n graph in tro duced by Kr one an d Neuhauser (1997) is an analogue of the coalescen t genealogy . Assume that a p air of allelic typ es A 1 and A 2 are segregating in a p opulation, and the selectiv e adv an tage of a typ e A 1 gene ov er a t yp e A 2 gene is 4 s ( > 0). Let N → ∞ while c = N s is held constan t. The elemen ts are referr ed to as particles. Let b n ( t ) b e the num b er of edges, or ancestral particles, in a cross section of the ancestral selection grap h for a sample of n genes at time t in the p ast. In the ancestral selection graph, coa lescence o ccur s at rate α i = i ( i − 1) / 2 and branc hing o ccurs at rate β i = 2 ci when the size is i . Th en the ancestral pro cess { b n ( t ); t ≥ 0 } is a bir th and death pro cess with r ates β i and α i . A particle is called real if it is a p art of the real genealo gy of the sample, otherwise the p article is called virtual. If tw o particles r eac h a coalescing p oint , the resulting particle is real if and only if at least one of the tw o particles is real, otherwise the resulting particle is virtual. If a real particle reac h es a b r anc hing p oin t, it splits in to a real p article and into a virtual particle. If a virtual particle reac h es a b ranc hing p oint , it splits in to t wo virtual particles. If a t yp e A 2 particle reac h es a branching p oin t, it splits in to t w o t yp e A 2 particles. If a t yp e A 1 particle reac hes a b ranc hing p oin t, it splits in to t wo p articles, where at least one of the tw o particles is t yp e A 1 . Because the death rates are quadr atic wh ile the the bir th r ates are only linear, there is a first time W c n, 1 suc h that b n ( W c n, 1 ) = 1. Krone and Neuhauser (1997) consider stopping the p ro cess at this time, since the genetic composition of the samp le is determined b y then. Th ey ca lled the ancestral particle at the time the ultimate ancestor. In the case of no mutatio n, the real genealog y em b edded in an ance stral selection graph is the same as in t he neutral pro cess (Krone and Neuh au s er (1997); Th eorem 3.12). Th us the ancestral p ro cess of the real particles can b e describ ed by the neutral process { a n ( t ); t ≥ 0 } . In this article, we discuss prop erties of th e ancestral p r o cess { b n ( t ); t ≥ 0 } withou t mutatio n whic h is not stopp ed up on r eac hin g the ultimate ancestor. F earn head (2002) h as studied a pro cess wh ich is not stopp ed up on reac hing the ultimate ancestor. He ident ifies the stationary distribu tion of th e pro cess and uses the distribution to characte rize the substitution pro cess to the common ancestor. Kim ura (19 55 ) stu d ied the densit y of the allel e fr equency by analyzing th e d iffusion pro cess to which the W righ t-Fisher mo d el with directional selection con ve rges. Let x p ( t ) 5 b e the frequency of the allele A 1 at time t forward in a p opulation in which the initial frequency is p . Then the Kolmogoro v forward equation for th e diffusion pro cess { x p ( t ); t ≥ 0 } on (0 , 1) is (1.5) ∂ φ ∂ t = 1 2 ∂ 2 ∂ x 2 { x (1 − x ) φ } − 2 c ∂ ∂ x { x (1 − x ) φ } , with the in itial condition φ ( p, x ; 0) = δ ( x − p ). The so lution is (1.6) φ ( p, x ; t ) = 2(1 − r 2 ) e c ( r − 1) e 2 cx ∞ X k =0 V (1) 1 k ( c, r ) V (1) 1 k ( c, z ) N 1 k ρ c k +2 ( t ) , where r = 1 − 2 p, z = 1 − 2 x, ρ c k +2 ( t ) := exp( − λ k t ) , k = 0 , 1 , 2 , ... and − λ k (0 < λ 0 < λ 1 < · · · ) a re the eigen v alues of th e generator. A plot of λ 0 is give n in Figure 1 of Kim ur a (1955). V (1) 1 k ( c, z ) is the oblate s p heroidal w a ve function (see App endix): (1.7) V (1) 1 k ( c, z ) = X l ≥ 0 ′ f k l ( c ) T 1 l ( z ) , where T 1 l ( z ) is the Gegen bauer function (may also b e denoted by C 3 2 l ( z )) and th e s umma- tion is o ver ev en v alues of l if k is ev en , o dd v alues of l if k is o d d . N 1 k is the normalizati on constan t of V (1) 1 k ( c, z ). Th e p robabilit y mass a t the exit b oundaries are (1.8) f ( p, 1; t ) = 2(1 − r 2 ) e c ( r +1) ∞ X k =0 V (1) 1 k ( c, r ) V (1) 1 k ( c, − 1) 2 λ k N 1 k (1 − ρ c k +2 ( t )) , and (1.9) f ( p, 0; t ) = 2(1 − r 2 ) e c ( r − 1) ∞ X k =0 V (1) 1 k ( c, r ) V (1) 1 k ( c, 1) 2 λ k N 1 k (1 − ρ c k +2 ( t )) . In Section 2, the an cestral pr o cess { b n ( t ); t ≥ 0 } without absorbing states is stu d ied. An explicit form of t he probabilit y distribu tion of b n ( t ) is o btained by u sing the momen t dualit y b et w een the ancestral pr o cess and the W righ t-Fisher d iffusion with dir ectional selection. It is also observ ed that the distribu tion of th e ancestral pro cess con v erges to a stationary distribu tion. In Section 3, the r ate of con v ergence is discussed. I n con tr ast to the neutr al pro cess, th e final rates of conv ergence are giv en by the largest eigen v alue for all the states. Bound s for the prob ab ility that b n ( t ) is at the state 1 are obtained b y an 6 elemen tary martingale argum en t, whic h corresp onds to the b ounds (1.4) for the n eutral pro cess. In S ection 4, the ancestral p ro cess with absorbing states is considered. It is sho wn that the first passage times of the ancestral pro cess { b n ( t ); t ≥ 0 } at the states 1 , 2 , ..., n − 1 are larger than that in the neutral pr o cess for all the states. By killing the mo d ifi ed pro cess, in w hic h the state 1 is the absorb in g state, the form of the joint probabilit y generating function of b n ( t ) and the n um b er of br anc hing ev ent s is obtained. By using this form ula, the exp ectatio n of the total length of the edges in the an cestral selection graph is obtained. In Section 5, the ancestral pr o cess of the wh ole p opulation { b ∞ ( t ); t ≥ 0 } is s tudied. It is sho w n that the p ro cess of fixation of the all ele in the diffusion model corresp ond s to con v ergence of the a ncestral pro cess to its stationary mea sure. The t ime to fixation of an allele co nd itional on fi x ation is studied in terms of th e ancestral pro cess. I t is sh o w n that the densit y of th e time to fixation of a sin gle m utan t ge ne conditional on fixation is giv en b y the probability of the whole p opu lation b eing descended from a single real ancestral particle, r egardless of the allelic type. In the neutral pro cess, th e densit y of th e waiting time u n til the ancestral pro cess hits the state 1 and the density of the cond itional fixation time are giv en by th e probabilit y that the ancestral pro cess is at the state 1. The p rop erty do es not h old in the pro cess with sel ection. 2. Number o f ancestra l p ar ticles In this section, we will obtain an equation that relates the momen ts of the W righ t- Fisher d iffusion with directional selectio n to a Marko v p r o cess that s p ecifies the num b er of particles (real a nd virtu al) that are presen t in the ance stral selection graph. T o deriv e this resu lt, w e w ill exploit the concept of dualit y from the theory of Mark o v p ro cesses (Ethier and Kurtz (1986); Section 4.4). If X = { X t ; t ≥ 0 } and Y = { Y t ; t ≥ 0 } are Mark o v pr o cesses with v alues in sets Z X and Z Y , resp ectiv ely , then X and Y are said to b e dual w ith resp ect to a fun ction f ( x, y ) if the identit y (2.1) E x [ f ( X t , y )] = E y [ f ( x, Y t )] 7 holds for ev ery x ∈ Z X and y ∈ Z Y . Dualit y is a u seful concept b ecause it allo ws us to use our kn owledge of one p ro cess to learn ab out the other. Although there is no general pro cedure f or iden tifying dual pr o cesses, dualit y can sometimes b e deduced using simp le generator calculations. Sp ecifically , if G x is the infinitesimal generator of the pro cess X and G y is the infin itesimal generator of the pr o cess Y , then the du alit y r elationship sho wn in (2. 1 ) will b e s atisfied if the iden tit y (2.2) G x f ( x, y ) = G y f ( x, y ) holds for all x and y . Here w e think G x f ( x, y ) as acting on th e x -v ariable of the function f ( x, y ) for eac h fixed v alue of y . T o apply these results to the W right- Fisher diffusion with selection, it will b e necessary to consider the frequency y q ( t ) = 1 − x p ( t )( q = 1 − p ) of the less fi t allele, wh ic h is itself go verned b y a W righ t-Fisher diffusion with generator: (2.3) G y f ( y ) = 1 2 y (1 − y ) f ′′ ( y ) − 2 cy (1 − y ) f ′ ( y ) . Notice that the selection co efficient is n egativ e in this case ( c ≥ 0). If we define the function f ( y , n ) = y n , then a simple calculat ion shows that G y f ( y , n ) =  n 2  [ f ( y , n − 1) − f ( y , n )] + 2 cn [ f ( y , n + 1) − f ( y , n )] = G n f ( y , n ) , (2.4) where G n is the op erator defined by (2.5) G n f ( n ) =  n 2  [ f ( n − 1) − f ( n )] + 2 cn [ f ( n + 1) − f ( n )] . In other wo rds, G n is the infin itesimal generator of the birth and death pro cess { b n ( t ); t ≥ 0 } wh ic h k eeps trac k of the n u m b er of a ncestral p articles in th e ancestral selection graph. Because f ( y , n ) is b ounded, we can use a resu lt of Ethier and Kurtz (1986) (Ch apter 4, Corollary 4.13) to dedu ce that the W right- Fisher diffusion { y q ( t ); t ≥ 0 } and the birth and death pro cess b n ( t ) are dual with resp ect to the function f ( y , n ): 8 Theorem 2.1. (2.6) E [ q b n ( t ) ] = E [( y q ( t )) n ] , n = 1 , 2 , ... Because t he righ t-hand side of this identi t y in vo lv es momen ts of the process { y q ( t ); t ≥ 0 } , the pro cess { b n ( t ); t ≥ 0 } is said to b e a momen t dual for { y q ( t ); t ≥ 0 } . Th e exis- tence of momen t duals of W righ t-Fisher diffusions with p olynomial co efficien ts w as first sho wn by Shiga (1981), and the explicit d escription of d ualit y b et ween th e b irth and death pro cess and the W righ t-Fisher diffusion with directi onal selection w as discussed by A threy a and Sw art (2005). The dualit y can b e pro v ed in terms of the ancestral selection graph (A threya a nd Swart (2005)): Pr o of. P artition an ancestral selection graph G into disconnected su b graphs G i , i = 1 , 2 , ... Let E t b e the edges, or the ancestral p articles, of a cross sectio n of G tak en at time t bac kw ard. Then, b n ( t ) = |E t | . Each E 0 ∩ G i consists only of t yp e A 2 particles if and only if E t ∩ G i consists only of t yp e A 2 particles, since at least one type A 1 particle surviv es from time t to 0 if E t ∩ G i con tain t yp e A 1 particles. Here th e ancestral selection graph is view ed forward in time. If a type A 1 particle reac hes a coalesce nce p oin t, the num b er of t yp e A 1 particles increase by 1. If a typ e A 1 particle reac hes a b ranc hing p oin t and meets another particl e, the resulting p article is alw a ys t yp e A 1 . T h us, a sample consists only of t yp e A 2 particles if and only if the ancestral particles at time t consists only of type A 2 particles.  If a samp le conta ins typ e A 1 particles, then for n = 0 , 1 , ... ; m = 1 , 2 , ... , (2.7) E [( x p ( t )) m ( y q ( t )) n ] = E [(1 − y q ( t )) m ( y q ( t )) n ] = m X i =0 ( − 1) i  m i  E [ q b i + n ( t ) ] , with the conv entio n b 0 ( t ) = 0. In particular, (2.8) E [ x p ( t )] = E [1 − y q ( t )] = E [1 − q b 1 ( t ) ] = ∞ X k =1 P [ b 1 ( t ) = k ] k X i =1  k i  p i q k − i . 9 The expression follo ws im m ediately from the distrib ution of th e n umber of particles in E t . Since the ancestral selection graph is ir reducible, a typ e A 1 particle is sampled if and only if E t con tains at least one t yp e A 1 particle. The fir st su m mation is ov er |E t | and the second summation is o v er the n um b er of t yp e A 1 particles. By the Itˆ o formula, we hav e a system of different ial equations for the momen ts of y q ( t ) (2.9) dξ n dt = − ( α n + β n ) ξ n + α n ξ n − 1 + β n ξ n +1 , n = 1 , 2 , ... where ξ n = E [( y q ( t )) n ]. The Kolmogoro v bac kw ard equ ation for the an cestral p ro cess { b n ( t ); t ≥ 0 } without absorbing state s is also giv en by (2. 9), where ξ n = P [ b n ( t ) = i ]. Th us, (2.9) with ξ n = E [ q b n ( t ) ] holds . T he isomorphism of th ese equations is a consequence of (2.6). A realizati on of the ancestral selection graph consists of tw o disconn ected subgraphs em b edded in a diag ram of a sample path of x p ( t ) can be seen in Figure 1. The ab s cissa is the forwa rd time in terv al (0 , t ) and ordinate is x p ( t ). Th ic k lines repr esen t the real geneal- ogy . Lines u sed by t yp e A 2 particles are d otted. Th e graph con tributes to E [( x p ( t )) 3 y q ( t )] and b 4 ( t ) = 4. I f an ancestral selection graph consists of the upp er subgraph only , it con tributes to E [ y q ( t )] = E [ q b 1 ( t ) ] a nd b 1 ( t ) = 2. Using an in tegral transf orm b y the Gegen b auer fu nction ( Erd´ elyi et al. (19 54)) for l = 0 , 1 , ... ; n = 1 , 2 , ... ; i = 0 , 1 , ... , (2.10) Z 1 − 1 T 1 l ( z )(1 + z ) n (1 − z ) i dz = 2 n + i i !( l + 1)( l + 2) ( n + 1) i +1 3 F 2 ( − l, l + 3 , i + 1; 2 , i + n + 2; 1) , where 3 F 2 ( − l, l + 3 , i + 1; 2 , i + n + 2; 1) is th e generalized hyp ergeometric fun ction, and with an iden tity (5.2), it is p ossible to obtai n an explicit expression for the probabilit y 10 generating function o f b n ( t ), and we hav e E [ q b n ( t ) ] = E [ y q ( t ) n ] = Z 1 0 (1 − x ) n φ ( x, p ; t ) dx + f ( p, 0; t ) = e 4 cq − 1 e 4 c − 1 + 2(1 − r 2 ) e c ( r − 1) ∞ X k =0 V (1) 1 k ( c, r ) N 1 k ( F k n ( c ) − V (1) 1 k ( c, 1) 2 λ k ) ρ c k +2 ( t ) = ∞ X i =1 P [ b n ( t ) = i ] q i , (2.11) where F k n ( c ) := X l ≥ 0 ′ f k l ( c ) ∞ X i =0 (2 c ) i ( n + 1) i +1 l X j =0 ( − l ) j ( l + 1) j +2 ( i + 1) j 2 · j !( j + 1)!( i + n + 2) j , r = 1 − 2 p a nd f ( p, 0; t ) is the pr obabilit y mass at 0 giv en in ( 1.9). If n = ∞ , F k ∞ ( c ) = 0 and f ( p, 0; t ) gives the probabilit y generating function. Using a p o wer s eries exp ansion in q of the Gegen b auer function (2.12) T 1 l ( r ) = ( − 1) l l X i =0 ( − l ) i ( l + 1) i +2 2 · i !( i + 1)! q i , l = 0 , 1 , .. w e obtain an explicit expression for th e probabilit y distrib ution of b n ( t ): (2.13) P [ b n ( t ) = 1] = π 1 + 8 e − 2 c ∞ X k =0 V (1) 1 k ( c, − 1) N 1 k ( F k n ( c ) − V (1) 1 k ( c, 1) 2 λ k ) ρ c k +2 ( t ) , and (2.14) P [ b n ( t ) = i ] = π i +8 e − 2 c ∞ X k =0 G k i ( c ) N 1 k ( F k n ( c ) − V (1) 1 k ( c, 1) 2 λ k ) ρ c k +2 ( t ) , i = 2 , 3 , ..., where G k i ( c ) := X l ≥ i − 1 ′ f k l ( c )( − 1) l ( − l ) i − 1 ( l + 1) i +1 2( i − 1)! i ! + i − 1 X j =1 (2 c ) j − 1 (2 c − j ) j · ( j − 1)! X l ≥ i − j − 1 ′ f k l ( c )( − 1) l ( − l ) i − j − 1 ( l + 1) i − j +1 2( i − j − 1)!( i − j )! , 11 and π i are giv en in (3.1). Note that there are finite p robabilities at the states n + 1 , n + 2 , ... . The exp ected num b er of ancestral particles is E [ b n ( t )] = π 1 e 4 c + 8 e − 2 c ∞ X k =0 V (1) 1 k ( c, − 1) N 1 k ( F k n ( c ) − V (1) 1 k ( c, 1) 2 λ k ) ρ c k +2 ( t ) +8 e − 2 c ∞ X i =2 i ∞ X k =0 G k i ( c ) N 1 k ( F k n ( c ) − V (1) 1 k ( c, 1) 2 λ k ) ρ c k +2 ( t ) , (2.15) and the fal ling factorial momen ts are (2.16) E [[ b n ( t )] i ] = i ! π i e 4 c +8 e − 2 c ∞ X j = i [ j ] i ∞ X k =0 G k j ( c ) N 1 k ( F k n ( c ) − V (1) 1 k ( c, 1) 2 λ k ) ρ c k +2 ( t ) , i = 2 , 3 , ... F or small c , th e p robabilit y distribution is app r o ximately (2.17) P [ b n ( t ) = 1] = P [ a n ( t ) = 1] − 2 c +2 c n +1 X k =2 ( − 1) k (2 k − 1)  [ n ] k ( n ) k + k ( k − 1)[ n ] k − 1 ( n ) k +1  ρ 0 k ( t )+ O ( c 2 ) , and for i = 2 , 3 , ... , P [ b n ( t ) = i ] = P [ a n ( t ) = i ] + 2 cδ i, 2 − 2 c n +1 X k = i ( − 1) k − i (2 k − 1)( i ) k − 1 i !( k − i )!  k ( k − 1)[ n ] k − 1 ( n ) k +1 + ( k 2 − k + 2 i − 2)[ n ] k ( k − i + 1)( k + i − 2)( n ) k  ρ 0 k ( t ) + 2 c ( i − 1) i − 1 [ n ] i − 1 ( i − 1)!( n ) i − 1 ρ 0 i − 1 ( t ) + O ( c 2 ) , (2.18) with a con ven tion [ n ] n +1 = 0, where a n ( t ) is the num b er of ancestors of a s amp le of n neutral genes at the time t in the past. It is p ossib le to obtain the s olution of (2 .9 ) as a p erturbation series in 2 c , wh ere the series is rep resen ted b y eige nv alues of the neutral pro cess. Let (2.19) ξ n = ξ (0) n + 2 cξ (1) n + (2 c ) 2 ξ (2) n + · · · , n = 1 , 2 , ... Denote th e rate m atrix of the neutral pr o cess { a n ( t ); t ≥ 0 } b y Q 0 = ( q 0 ,ij ), where q 0 ,i +1 ,i = α i +1 , q 0 ,ii = − α i for i = 1 , 2 , ... and other eleme nts are zero. Let the Laplace transform of ξ ( i ) n ( t ) b e ν ( i ) n ( λ ). It is straigh tforwa rd to sho w that (2.20) ν ( i ) = { ( Q 0 − λE ) − 1 C } i ( λE − Q 0 ) − 1 ξ (0) (0) , i = 1 , 2 , ... 12 where C = ( c ij ) is giv en by c ii = i, c i,i +1 = − i for i = 1 , 2 , ... and other elemen ts are zero. Note that the in v erse Laplace transform of the eleme nt in the n -th ro w and i -t h column of the mat rix { ( Q 0 − λE ) − 1 C } j ( λE − Q 0 ) − 1 giv es the j -th order co efficien ts in 2 c of P [ b n ( t ) = i ]. Let r n ( t ) b e the n umb er of b ranc hing ev en ts in the time in terv al (0 , t ) in the ancestral selection graph of a sample o f n g enes, where r n (0) = 0. T he join t probabilit y generating functions of b n ( t ) and r n ( t ) satisfy a system of differen tial equation (2.21) dξ n dt = − ( α n + v β n ) ξ n + α n ξ n − 1 + v β n ξ n +1 − (1 − v ) β n ξ n , n = 1 , 2 , .. with the in itial condition ξ n (0) = q n , wher e ξ n = E [ q b n ( t ) v r n ( t ) ]. The solution is giv en b y killing of the modifi ed pr o cess { ˜ b n ( t ); t ≥ 0 } in wh ic h the selection coefficient is v c , and w e ha v e (2.22) E [ q b n ( t ) v r n ( t ) ] = E  q ˜ b n ( t ) exp  − 2 c (1 − v ) Z t 0 ˜ b n ( u ) du  . By setting v = 0 in the last expression, w e obtain the identit y (2.23) E [ q b n ( t ) , r n ( t ) = 0] = E  q a n ( t ) exp  − 2 c Z t 0 a n ( u ) du  , whic h shows the P oisson natur e of the b ranc hing in the an cestral pro cess { b n ( t ); t ≥ 0 } . The in tegral is th e total length of th e edges in the neutr al genealog y without branc hing in (0 , t ). In p articular, P [ b 1 ( t ) = 1 , r ( t ) = 0] = e − 2 ct . 3. Conve rg ence Standard results on birth and d eath pro cesses (see, e.g., Karlin and T a ylor (1975)) giv es the stationary measure of the ancestral pro cess { b n ( t ); t ≥ 0 } . It is str aightforw ard to obtai n the stationary measure (3.1) π i := P [ b n ( ∞ ) = i ] = (4 c ) i i !( e 4 c − 1) , i = 1 , 2 , ..., 13 whic h is the zero-truncated Poisson distribution. S ince the an cestral pro cess of the real particles is the neutr al p ro cess { a n ( t ); t ≥ 0 } and the num b er of real particles b ecomes 1 in finite ti me, π 1 is the p r obabilit y that there are n o virtual particles. It is clear from (2.1 3 ) and (2.14) that for i = 1 , 2 ... , (3.2) π i − P [ b n ( t ) = i ] = O ( ρ c 2 ( t )) , t → ∞ . F or small c , th e final r ates of co nv ergence are approxima tely (3.3) lim t →∞ ( ρ c 2 ( t )) − 1 { π 1 − P [ b n ( t ) = 1] } = 3 e − 2 c  n − 1 n + 1 − 4 c ( n + 1)( n + 2)  − 2 ce − 2 c δ n, 1 + O ( c 2 ) , (3.4) lim t →∞ ( ρ c 2 ( t )) − 1 { π 2 − P [ b n ( t ) = 2] } = − 3 e − 2 c  n − 1 n + 1 − 2 n ( n − 1) c n + 2  − 2 ce − 2 c δ n, 1 + O ( c 2 ) , and (3.5) lim t →∞ ( ρ c 2 ( t )) − 1 { π i − P [ b n ( t ) = i ] } = − 3 e − 2 c n − 1 n + 1 (2 c ) i − 2 ( i − 2)! + O ( c i − 1 ) , i = 3 , 4 , ... In con trast to the neutral pro cess, th e final r ates of con v ergence are giv en b y the largest eigen v alue for all th e state s. In the neutral p ro cess, w e ha v e (3.6) lim t →∞ ( ρ 0 i ( t )) − 1 P [ a n ( t ) = i ] = ( i ) i [ n ] i i !( n ) i , i = 1 , 2 , ..., n. The tot al v ariation norm has no simple form a s in (1.2). A simp le argumen t giv es a b oun d for P [ b n ( t ) = 1]. The even t that the num b er of ancestral particles is 1 is a su bset of the ev en t that the num b er of real particle is 1, and w e ha v e (3.7) P [ b n ( t ) = 1] ≤ P [ a n ( t ) = 1] , n = 1 , 2 , ... An elemen tary argument on a martinga le giv es explicit b ou n ds for P [ b n ( t ) = 1], whic h corresp onds to th e b ounds (1.4) in the n eutral process. Let η ( n ; c ) satisfy a recur sion (3.8) ( λ 0 − α n − β n ) η ( n ; c ) + α n η ( n − 1; c ) + β n η ( n + 1; c ) = 0 , n = 1 , 2 , ... 14 with the b oun dary co nd ition η (1; c ) = − 2 c . Since η is an eigenv ector of the transition probabilit y matrix of the ancest ral pr o cess { b n ( t ); t ≥ 0 } , η ( b n ( t ); c )( ρ c 2 ( t )) − 1 is a martin- gale to the ancestral pro cess (se e, e.g., Karlin and T a ylor (1975)). Then, (3.9) E [ η ( b n ( t ); c )] = η ( n ; c ) ρ c 2 ( t ) . Although the explicit f orm of η ( n ; c ) is n ot a v ailable, it is p ossible to obtain an asymptotic form. Because o f (3.10) η ( n ; c ) η ( n − 1; c ) → 1 + 2 λ 0 n ( n − 1) + O ( n − 3 ) , n → ∞ , w e deduce the asymptotic fo rm η ( n ; c ) ≈ η ( ∞ ; c )(1 − 2 λ 0 /n ), where η ( ∞ ; c ) is a function of c . Although the explicit form of η ( ∞ ; c ) is not a v ailable, it is ve ry close to 3 exp ( − c ) (See Figure 2). F or small c , η ( n ; c ) can be expanded in to a p o w er series i n c : (3.11) η ( n ; c ) = 3 n − 1 n + 1  1 − n 2 + n + 2 ( n − 1)( n + 2) c + 4(3 n 2 + 10 n + 18) 25( n + 2)( n + 3) c 2  + O ( c 3 ) , n = 2 , 3 , ... Note that η ( ∞ ; c ) is not exactly equal to 3 exp ( − c ). Lemma 3.1. η ( n ; c ) is monotonic al ly incr e asing in n . Pr o of. Denote th e rate matrix of the ancestral pro cess { b n ( t ); t ≥ 0 } b y Q c = ( q c,ij ), where q c,i +1 ,i = α i +1 , q c,ii = − ( α i + β i ) , q c,i,i +1 = β i for i = 1 , 2 , ... and other elemen ts are zero. η is an eigen v ector of an oscillatory matrix E + Q c (2 N ) − 1 whic h b elongs to the second largest eigen v alue 1 − λ 0 (2 N ) − 1 . An eigenv ector of an osc illatory matrix wh ic h b elongs to the second largest eigen v alue has exactly one v ariation of sign in the co ord inates (see, Gan tmac h er (1959), pp . 105). Assume η ( i ; c ) > 0 , i ≥ L and η ( i ; c ) ≤ 0 , 1 ≤ i ≤ L − 1. 15 Supp ose l ≥ L − 1. By an induction w e deduce from (3.8) that η ( l + 1; c ) − η ( l ; c ) = α l β l { η ( l ; c ) − η ( l − 1; c ) } − λ 0 η ( l ; c ) = ( l − 1)! (4 c ) l − 1 ( η (2; c ) − η (1; c ) − λ 0 π 2 l X i =2 η ( i ; c ) π i ) . By taking t = ∞ in (3.9) it follo ws that (3.12) ∞ X i =1 η ( i ; c ) π i = 0 . Th us, (3.13) η ( l + 1; c ) − η ( l ; c ) = λ 0 8 c 2 π l − 1 ∞ X i = l +1 η ( i ; c ) π i > 0 . Next, supp ose 2 ≤ l ≤ L − 2. W e ha ve (3.14) η ( l + 1; c ) − η ( l ; c ) = ( l − 1)! (4 c ) l − 1 ( η (2; c ) − η (1; c ) − λ 0 π 2 l X i =2 η ( i ; c ) π i ) > 0 . Finally , η (2; c ) − η (1; c ) = λ 0 > 0.  F rom Lemma 3.1 it follo ws that P [ b n ( t ) = 1] η (1; c ) + P [ b n ( t ) > 1] η (2; c ) ≤ E [ η ( b n ( t ); c )] ≤ P [ b n ( t ) = 1] η (1; c ) + P [ b n ( t ) > 1] η ( ∞ ; c ) , (3.15) here w e note th at there are finite pr ob ab ilities at the s tates larger than n . Th en, from (3.9) w e ha ve th e f ollo wing b ounds: Theorem 3.2. If η ( n ; c ) satisfies the r e cursion (3.8), then (3.16) η ( n ; c ) e − λ 0 t + 2 c η ( ∞ ; c ) + 2 c ≤ 1 − P [ b n ( t ) = 1] ≤ η ( n ; c ) e − λ 0 t + 2 c λ 0 , n = 1 , 2 , ... Figure 3 sho ws the b ound s w h en c = 0 , 0 . 1 , 1 and 8 for n = 10. When c > 0, in contrast to the neutral case, the states larger than 1 ha v e fi n ite probabilities in the stationary measure (3.1) and the upp er and lo wer b ound s do not conv erge to the same v alue. Figure 3b sho ws that the up p er b ound is sharp for small v alues of c and loose for inte rmediate 16 v alues of c (Figure 3c), and that b oth the up p er and lo wer b ounds conv erge to 1 as c tends to infinit y . 4. First p ass age times Let (4.1) W c n,i := inf { t ≥ 0; b n ( t ) = i } , i = 1 , 2 , ... and { b 1 n ( t ); W c n, 1 ≥ t ≥ 0 } b e a mod ified pro cess, where th er e is an absorb ing state at 1, or the ultimate ancestor. The mo dified p ro cess is the same as that in tro du ced for ancestral recom bination graph, where 4 c is replaced by the recom bination parameter ρ (Griffiths (1991)). Theorems 1, 2, 3 in Griffiths (1991) hold f or the mo dified pr o cess. The mo dified pr o cess was stud ied b y Krone and N euhaus er (1997). Here, m o dified pro cesses { b i n ( t ); W c n,i ≥ t ≥ 0 } , wher e th ere is an ab s orbing state at i = 1 , 2 , ..., n − 1, are stud ied to discuss the first p assage times of th e ancestral pro cess { b n ( t ); t ≥ 0 } at the states 1 , 2 , ..., n − 1. It is p ossible to sho w that the exp ected first passage times of the ancestral pro cess { b n ( t ); t ≥ 0 } at the states 1 , 2 , ..., n − 1 are larger than those in the neutral p ro cess { a n ( t ); t ≥ 0 } . E [ W c n, 1 ] is give n in Krone a nd Neuhauser (1997). Theorem 4.1. L et (4.2) W 0 n,i := inf { t ≥ 0; a n ( t ) = i } , i = 1 , 2 , ... Then, (4.3) E [ W c n,i ] = 2 n − 1 X k = i ∞ X j =0 (4 c ) j ( k ) j +2 > E [ W 0 n,i ] , i = 1 , 2 , ..., n − 1 , wher e W c n, 1 is the time to the ultimate anc estor. 17 Pr o of. The theorem follo ws from stand ard r esults o n birth and death pro cesses (see, e.g., Karlin and T a ylor (1975)). The mo dified pro cesses { b i n ( t ); W c n,i ≥ t ≥ 0 } hit the states i = 1 , 2 , ..., n − 1 in finite ti me with probabilit y one, since (4.4) ∞ X m = i m +1 Y k = i +1 α k β k = (4 c ) i − 1 ( i − 1)! ∞ X m = i m ! (4 c ) m = ∞ , i = 1 , 2 , ..., n − 1 . F rom the Kolmogoro v bac kward equation f or the mo d ified pro cess { b i n ( t ); W c n,i ≥ t ≥ 0 } , whic h is (2.9) for n = i + 1 , i + 2 , ... with ξ n = P [ b i n ( t ) = i ] and the b oun dary condition ξ i = δ ( t ), the exp ected first passage times satisfy a recursion for i = 1 , 2 , ..., n − 1 (4.5) ( α n + β n ) ζ ( n ) − α n ζ ( n − 1) − β n ζ ( n + 1) = 1 , n = i + 1 , i + 2 , ..., n − 2 with the b oun d ary cond ition ζ ( i ) = 0, wh ere ζ ( n ) = E [ W c n,i ]. It is straigh tforwa rd to s olv e the recursion a nd obtain (4.6) E [ W c n,i ] = ∞ X m = i γ m + n − 2 X j = i j +1 Y k = i +1 α k β k ∞ X l = j +1 γ l = 2 n − 1 X k = i ∞ X j =0 (4 c ) j ( k ) j +2 , i = 1 , 2 , ..., n − 2 , and (4.7) E [ W c n,n − 1 ] = ∞ X m = i γ m = 2 ∞ X j =0 (4 c ) j ( k ) j +2 , where γ i = 1 α i +1 = 2 i ( i + 1) , γ m = β i +1 β i +2 · · · β m α i +1 α i +2 · · · α m α m +1 = 2(4 c ) m − i ( i ) m − i +2 , m = i + 1 , i + 2 , ... It is clear from (4.6) and (4.7) that (4.8) E [ W c n,i ] > 2 n − 1 X k = i 1 k ( k + 1) = 2  1 i − 1 n  = E [ W 0 n,i ] , i = 1 , 2 , ..., n − 1 .  As c → ∞ , for i = 1 , 2 , ..., n − 1, (4.9) E [ W c n,i ] = 2 n − 1 X k = i 1 k ( k + 1) ∞ X j =0  4 c k +2  j Q j − 1 l =0  1 + l k +2  > 2 n − 1 X k = i e 4 c k +2 k ( k + 1) → ∞ . 18 Corollary 4.2. F or the whole p opulatio n ( n = ∞ ), the exp e cte d first p assage times ar e (4.10) E [ W c ∞ ,i ] = 2 ∞ X j =0 (4 c ) j ( j + 1)( i ) j +1 , i = 1 , 2 , ... Pr o of. It follo ws immediately from an identit y (4.11) ∞ X k = i 1 ( k ) j +2 = 1 ( j + 1)( i ) j +1 , j = 0 , 1 , ...  It is straight forward to obtain higher moments of the first passage times of the ancestral pro cess { b n ( t ); t ≥ 0 } at the states 1 , 2 , ..., n − 1 in the same manner. T he second momen ts E [( W c n,i ) 2 ] sa tisfy a recursion (4.12) ( α n + β n ) ζ ( n ) − α n ζ ( n − 1) − β n ζ ( n + 1) = 2 E [ W c n,i ] , n = i + 1 , i + 2 , ..., n − 2 with the b ound ary condition ζ ( i ) = 0, wher e ζ ( n ) = E [( W c n,i ) 2 ]. Ho wev er, there is no simple form for the d ensit y as in (1.3). The Laplace transf orm of the first passage times of the ancestral pro cess satisfy a recursion for i = 1 , 2 , ..., n − 1 (4.13) ( λ + α n + β n ) ζ ( n ) − α n ζ ( n − 1) − β n ζ ( n + 1) = 0 , n = i + 1 , i + 2 , ..., n − 2 with the b oundary co ndition ζ ( i ) = 1, where ζ ( n ) = E [ e − λW c n,i ]. The join t p r obabilit y generating fun ction of b 1 n ( t ) and r n ( t ) satisfies a system of differ- en tial equation (2 .21) with ξ n = E [ q b 1 n ( t ) v r n ( t ) ]. By ta king t = ∞ , w e h a v e (4.14) 0 = − ( α n + β n ) ξ n + α n ξ n − 1 + v β n ξ n +1 , n = 1 , 2 , .., with the b ound ary condition ξ 1 = 1, where ξ n = E [ v r n ( ∞ ) ]. T he form of the probability generating function o f r ( ∞ ) is (4.15) E [ v r n ( ∞ ) ] = E  exp  − 2 c (1 − v ) Z W vc n, 1 0 ˜ b 1 n ( u ) du  , 19 while the explicit form of the p robabilit y generating function is giv en b y Theorem 5.1 in Ethier and Griffith s (1 990), where ρ is replaced b y 4 c , and we hav e (4.16) E [ s r n ( ∞ ) ] = R n ( v ) R 1 ( v ) , where R n ( v ) = Z 1 0 x 4 c (1 − v ) − 1 (1 − x ) n − 1 e − 4 cv (1 − x ) dx = ( n − 1)! (4 c (1 − v )) n 1 F 1 ( n ; 4 c (1 − v ) + n ; − 4 cv ) = ∞ X i =0 ( n + i − 1)!( − 4 cv ) i (4 c (1 − v ) + n ) n + i i ! . (4.15) pro vides a w a y to compu te th e exp ectation of the total length of the edges in the ancestral selecti on graph in the ti me in terv al (0 , W c n, 1 ), and we ha v e (4.17) E  Z W c n, 1 0 b 1 n ( u ) du  = 1 2 c E [ r n ( ∞ )] = ∞ X k =1 (4 c ) k − 1 n − 1 X m =1 1 ( m ) k . It is p ossible to obtain the p r obabilit y that the mo difi ed process { b 1 n ( t ); W c n, 1 ≥ t ≥ 1 } hits the states n + 1 , n + 2 , ... Theorem 4.3. L et z (1) = 0 , z (2) = 1 , and (4.18) z ( j ) = 1 + α 2 β 2 + α 2 α 3 α 2 β 3 + · · · + α 2 α 3 · · · α j − 1 β 2 β 3 · · · β j − 1 = j − 2 X k =0 k ! (4 c ) k , j = 3 , 4 , ... Then, the pr ob ability that the mo difie d pr o c ess { b 1 n ( t ); W c n, 1 ≥ t ≥ 1 } hits the states m = n + 1 , n + 2 , ... is (4.19) P [ W c n, 1 > W c n,m ] = z ( n ) z ( m ) . Pr o of. The theorem follo ws from stand ard results on birth and death pro cesses (see, Karlin and T a ylor (1975), pp. 323). It is straight forward to show that z ( b 1 n ( t )) is a mar- tingale for th e mo dified pr o cess. m in { W c n, 1 , W c n,m } is a Mark o v time with resp ect to the mo dified pro cess. W e apply th e optional sampling theorem t o conclude t hat (4.20) z ( n ) = E [ b 1 n (min { W c n, 1 , W c n,m } )] = P [ W c n, 1 > W c n,m ] z ( m ) , m = n + 1 , n + 2 , ... 20  Remark 4.4. F or smal l c , P [ W c n, 1 < W c n,m ] c an b e exp ande d into a p ower series in c . (4.21) P [ W c n, 1 > W c n,m ] = (4 c ) m − n [ m − 2] m − n + O ( c m − n +1 ) , m = n + 1 , n + 2 , ... Figure 4 sho ws the hitting probabilit y for n = 10 and 50 as a function of c . It can b e seen that the hitting probabilit y gro w s qu ic k ly around a critical v alue of c . F or n = 50 and m = 51, the probabilit y gro ws linearly un til c is smaller th an 4 . 5 (See Remark 4.4). Then it a pp r oac hes 1 quic kly . 5. time to f ixa tion In studying ev olutionary pro cesses from the standp oin t of p opu lation genetics, the pr ob- abilit y and the time to fixation of a mutan t gene pla y imp ortan t r oles. The exp ected time to fixation of a m utan t gene conditional on fixation was obtai ned by Kim u r a and Ohta (1969). F urth er m ore, Ew ens (1973) and Maruyama and Kimura (1974) sho wed that the exp ected length of time whic h it tak es for an allele to increase f requency from q to y ( > q ) on the wa y to fixation is equal to the exp ected length of time which the same allele tak es when its fr equ ency decrease from y to q on the wa y to extinction. The time-rev ersibilit y prop erty is equiv alen t to the prop erty th at the density of the exp ected s o jour n time do es not dep end on the sign of the selection coefficient, whic h w as shown b y Maruyama (1972). While these results are well kno wn, their interpretatio n in terms of the ancestral pro cess of the wh ole popu lation { b ∞ ( t ); t ≥ 0 } are in teresting. The fixatio n p robabilit y w as obtai ned by solving the Kolmogoro v bac kward equation for the W r igh t-Fisher diffusion w ith dir ectional selection { x p ( t ); t ≥ 0 } (Kim ura (1957)). The fixation p robabilit y of th e allele A 1 is (5.1) u 1 ( p ) = 1 − e − 4 cp 1 − e − 4 c , 21 and the fixati on probabilit y of the allele A 2 is 1 − u 1 ( p ). It follo ws from (1.9) that (5.2) 1 − u 1 ( p ) = 2(1 − r 2 ) e c ( r − 1) ∞ X k =0 V (1) 1 k ( c, r ) V (1) 1 k ( c, 1) 2 λ k . It is p ossible to obtain the fixation pr obabilit y from the stationary measure of the ancestral pro cess (3.1). If the allele A 2 fixes in a p op u lation, the ancestral particles of the whole p opulation in in finite time bac kw ards consist of t yp e A 2 particles only , and w e ha v e (5.3) E [ q b ∞ ( ∞ ) ] = ∞ X i =1 π i q i = e 4 cq − 1 e 4 c − 1 = 1 − u 1 ( p ) . The r elationship b et w een t he fixation p robabilit y and the probability generating function of the stationary measure is a sp ecial case of Th eorem 2 (f ) in Athrey a and Sw art (2005). The densit y of time to fixation of the allel e A 2 conditional on fixation h as a genealo gical in terpretation. Let (5.4) T c 0 := inf { t ≥ 0; y q ( t ) = 1 } . Then, it fo llo ws from the expression (5.5) P [ T c 0 < t | T c 0 < ∞ ] = E [ q b ∞ ( t ) ] 1 − u 1 ( p ) = P ∞ i =1 P [ b ∞ ( t ) = i ] q i P ∞ i =1 π i q i that th e pro cess of fixation of the allele in a diffu sion mo del, in which the left hand side con v erges to one as t → ∞ , corresp ond s to conv ergence of th e distribution of the ancestral pro cess P [ b ∞ ( t ) = i ] t o its statio nary measur e π i as t → ∞ . The exp ected time to fixation of the allele A 2 conditional on fixation was obtained by solving the K olmogoro v bac kward equation (Kimura a nd Ohta (1969), Maruy ama (1972)), and (5.6) E [ T c 0 | T c 0 < ∞ ] = Z 1 0 Φ( q , y ) dy , 22 where Φ ( q , y ) is the d ensit y of the exp ected so journ time of th e allele A 2 at fr equency y in the p ath sta rting from frequency q and going to fixation, and Φ( q , y ) = S ( y ) S (1 − y ) 2 cy (1 − y ) S (1) , y > q , = S ( y ) 2 cy (1 − y )  S (1 − y ) S (1) − S ( q − y ) S ( q )  , y < q , (5.7) and S ( y ) = exp(4 cy ) − 1. Then, (5.8) E [ T c 0 | T c 0 < ∞ ] = Z 1 0 S ( y ) S (1 − y ) 2 cy (1 − y ) S (1) dy − Z q 0 S ( y ) S ( q − y ) 2 cy (1 − y ) S ( q ) dy , where Z 1 0 S ( y ) S (1 − y ) 2 cy (1 − y ) S (1) dy = π 1 8 c 2 ∞ X i =1 ∞ X j =1 (4 c ) i + j i ! j ! Z 1 0 y i − 1 (1 − y ) j − 1 dy = π 1 2 c ∞ X k =1 (4 c ) k ( k + 1)! k X i =1 1 i ( k − i + 1) = 4 π 1 ∞ X k =0 H k +1 (4 c ) k ( k + 2)! , H k = 1 + 1 / 2 + · · · + 1 /k , and Z q 0 S ( y ) S ( q − y ) 2 cy (1 − y ) S ( q ) dy = 1 2 cS ( q ) ∞ X i =1 ∞ X j =1 (4 c ) i + j i ! j ! Z q 0 y i − 1 ( q − y ) j 1 − y dy = 1 2 cS ( q ) ∞ X i =1 ∞ X j =1 (4 cq ) i + j i ( i + j )! 2 F 1 (1 , i, i + j + 1; q ) = 1 2 cS ( q ) ∞ X i =1 ∞ X j =1 (4 cq ) i + j i ( i + j )! ∞ X k =0 ( i ) k q k ( i + j + 1) k . It is p ossible to obtain the exp ected time to fixation of the allele A 2 conditional on fixation from the distribution b ∞ ( t ), and we ha v e E [ T c 0 | T c 0 < ∞ ] = 1 P ∞ i =1 π i q i Z ∞ 0 t d dt " ∞ X i =1 { P [ b ∞ ( t ) = i ] − π i } q i # dt = 1 P ∞ i =1 π i q i ∞ X i =1 q i Z ∞ 0 { P [ b ∞ ( t ) = i ] − π i } dt. (5.9) 23 F rom the t wo e xpressions (5.8) and ( 5.9), an iden tit y at q = 0 follo ws immediately . (5.10) ∞ X k =0 V (1) 1 k ( c, − 1) V (1) 1 k ( c, 1) λ 2 k N 1 k = e 2 c π 2 1 ∞ X k =0 H k +1 (4 c ) k ( k + 2)! . It is str aigh tforward to obtain similar identi ties by comparing (5.8) and (5.9) in eac h p o wer of q . Moreo v er, explicit expression for the higher moments of the time to fixa- tion, conditional on fixation, are a v ailable (Maruy ama, 1972), and they pro duce similar iden tities. The densit y of the time to fixation of a single m utant ge ne conditional on fixation has in teresting prop erties. Let (5.11) T c 1 := inf { t ≥ 0; x p ( t ) = 1 } . Then, from a time-rev ersibilit y argum en t on the cond itional d iffusion pr o cess (Ewe ns (1973), Maruy ama and Kim ura (1974)), w e ha v e (5.12) lim q → 0 P [ T c 0 < t | T c 0 < ∞ ] = lim p → 0 P [ T c 1 < t | T c 1 < ∞ ] = P [ b ∞ ( t ) = 1] π 1 . The same densit y holds for a mutan t gene of allele A 1 and a mutan t gene of alle le A 2 . This p r op ert y has an in tuitiv e genealog ical interpretatio n. The conditional d ensit y is giv en b y the probability of the whole p opu lation b eing descended from a single real ancestral particle. Since there is no v ariation in th e p opulation, selection cannot ha v e an effect on it and consequently , the conditional d ensit y should n ot dep end o n the allelic type. Theorem 3.2 give s b oun ds for the densit y of the time to fixation of a single m utant gene conditional on th e fixation, and (5.13) 1 π 1 − η ( ∞ ; c ) e − λ 0 t + 2 c π 1 λ 0 ≤ lim q → 0 P [ T c 0 < t | T c 0 < ∞ ] ≤ 1 π 1 − η ( ∞ ; c ) e − λ 0 t + 2 c π 1 ( η ( ∞ ; c ) + 2 c ) . The b ound s are useless wh en c is large, sin ce the upp er and lo w er b ounds do not conv erge as c → ∞ . When c is small, the lo w er b ound is sharp. Figure 5a s ho ws the b ounds when c = 0 . 1. Fi gure 5b giv es the conditional den sit y an d the deriv ativ e of the lo wer 24 b ound π 1 − 1 η ( ∞ ; c ) exp( − λ 0 t ). It can b e seen that the ta il of the conditional density is w ell c haracterised b y the largest eigen v alue. It is w orth noting that the iden tit y (5.12) gives the follo wing ident it y in the distribu tion b n ( t ). Its in terpr etation in terms of the ancestral pro cess { b n ( t ); t ≥ 0 } is unclear. Remark 5.1. (5.12) gives (5.14) P [ b ∞ ( t ) = 1] = lim n →∞ e − 4 c n X k =1 ( − 1) k +1  n k  E [ b k ( t )] . Pr o of. (5.12) is equiv alent to (5.15) lim p → 0 f ( p, 1; t ) u 1 ( p ) = lim q → 0 f ( p, 0; t ) 1 − u 1 ( p ) = P [ b ∞ ( t ) = 1] π 1 , where lim p → 0 f ( p, 1; t ) u 1 ( p ) = lim p → 0 lim n →∞ E [ x p ( t ) n ] u 1 ( p ) = lim p → 0 lim n →∞ E [(1 − y q ( t )) n ] u 1 ( p ) = lim p → 0 lim n →∞ n X k =0 ( − 1) k  n k  E [ q b k ( t ) ] u 1 ( p ) = lim n →∞ e − 4 c n X k =1 ( − 1) k +1  n k  E [ b k ( t )] π 1 . (5.16)  In t he neutral W r igh t-Fisher diffusion, the densit y of time to fixation of a mutan t gene conditional on fi xation follo ws (5.17) lim q → 0 P [ T 0 0 < t | T 0 0 < ∞ ] = P [ a ∞ ( t ) = 1] , where T 0 0 is the time to fixation of a mutan t gene in the neutral W r igh t-Fisher diffusion. F rom (5.8), the exp ected time to fixation of a mutan t gene conditional on fixation has a simple form (5.18) lim q → 0 E [ T c 0 | T c 0 < ∞ ] = 4 π 1 ∞ X j =0 H j +1 (4 c ) j ( j + 2)! < lim q → 0 E [ T 0 0 | T 0 0 < ∞ ] = 2 , where the inequalit y holds from the follo wing lemma: 25 Lemma 5.2. The density of exp e cte d sojourn time of the al lele A 2 at fr e quency y in the p ath starting fr om fr e quency 0 and going to fixation satisfies (5.19) S ( y ) S (1 − y ) 2 cy (1 − y ) S (1) < 2 , 0 < y < 1 . Pr o of. The inequalit y is equiv alen t to (5.20) e 4 cy − 1 y e 4 c (1 − y ) − 1 1 − y < 4 c ( e 4 c − 1) , or (5.21) ∞ X i =0 i X j =0 (4 c ) i y j (1 − y ) i − j ( j + 1)!( i − j + 1)! < ∞ X i =0 (4 c ) i ( i + 1)! . The inequalit y follo ws from an inequalit y (5.22) i X j =0 y j (1 − y ) i − j ( j + 1)!( i − j + 1)! < 1 ( i + 1)! i X j =0 i ! y j (1 − y ) i − j j !( i − j )! = 1 ( i + 1)! , i = 0 , 1 , ...  As c b ecomes large, P [ b ∞ ( t ) = 1] decreases, while th e exp ected fixation time of a mutan t gene conditional on fixation decreases. It is straigh tforwa rd to show that the inequalit y for the exp ected fixation time (5.18) is e quiv alen t to an inequalit y (5.23) Z ∞ 0  P [ b ∞ ( t ) = 1] π 1 − P [ a ∞ ( t ) = 1]  dt > 0 . In the neutral pro cess, b oth the densit y of the w aiting time until the ancestral pro cess hits the state 1 and the densit y of th e conditional fixation time are given b y th e probabilit y that the a ncestral pro cess is a t the sta te 1 (1.3,5.1 7 ). It follo ws that (5.24) E [ W 0 ∞ , 1 ] = lim q → 0 E [ T 0 0 | T 0 0 < ∞ ] = Z ∞ 0 { P [ a ∞ ( t ) = 1] − 1 } dt = 2 . In con trast, in the pro cesses w ith directio nal selection, w e ha ve (5.25) E [ W c ∞ , 1 ] = 2 ∞ X j =0 (4 c ) j ( j + 1)( j + 1)! > 2 , 26 while (5.26) lim q → 0 E [ T c 0 | T c 0 < ∞ ] = Z ∞ 0  P [ b ∞ ( t ) = 1] π 1 − 1  dt = 4 π 1 ∞ X j =0 H j +1 (4 c ) j ( j + 2)! < 2 . 6. discu ssion In this article, the ancestral pro cess { b n ( t ); t ≥ 0 } , sp ecifying the num b er of b ranc hes in the a ncestral selec tion graph, was inv estigated b y e xploiting the moment dualit y b et w een this p ro cess and the W right-Fi sher diffusion w ith directional selecti on. An explicit formula for the p robabilit y distr ib ution of b n ( t ) w as d er ived. Although this expression cannot b e giv en in closed form, since it in v olv es eigen v alues and coefficient s wh ich are determined b y an intrac table three-term recursion relation, it is p ossible to expand the probabilit y distribution as a p erturbation series in 2 c . This expression is giv en in a closed form for eac h o rder of th e p erturbation and is accurate w hen c is small. Bounds for the probability that b n ( t ) is a t the state 1 is obtained. When c is small, one of the b ounds is sharp . T he densit y of time to fixation of a single mutan t ge ne cond itional on fixation wa s sh o wn to b e giv en b y the probabilit y of the whole p opulation b eing descended from a single real particle, regardless of the allelic type. Thus, the b ounds for the pr obabilit y that b n ( t ) is at the state 1 giv e b ounds for the density of th e conditional hitting time. It w as sho wn that the tail of the conditional hitting time is we ll c haracterised b y the largest eigenv alue. The probabilit y that the pro cess hits states larger than the initial state b efore the pr o cess hits the state 1 w as obtained. According to the formula, th e n umb er of branches in the ancestral selection graph gro ws rapidly when c is larger th an a critical v alue. On e of the difficulties of sim ulating the ancestral select ion graph is kee ping trac k of large num b ers of branc hes when c is large. Sp ecifically , when c is larger than the critical v alue, enormous n umb er of br anc hes emerge and it will b e difficult to sim ulate th e ancestral selec tion graph. If a sample consists only of typ e A 2 particles, the probabilit y distribution of the ancestral particles, all of which are A 2 , is b n ( t ) (see T heorem 2.1). If a sample con tains t yp e A 1 particles, the j oin t probability distribution of the num b er of the A 1 particles and 27 the num b er of the A 2 particles is interesting. Ho w ev er, it s eems that the expression of the momen ts in the W righ t-Fisher diffus ion (2.7) do es not giv e an y insigh ts int o the join t pr obabilit y distribution, except in the case wh en a sample consists of a single A 1 particle. Th e time-rev ersibilit y argumen t of the cond itional diffusion p ro cess giv es an iden tit y , whose interpretatio n in terms of the ancestral pro cess is unclear (see Remark 5.1). The in terpretation of the time-reversibilit y in terms of the ancestral p ro cess needs further in vestig ation. In this article, the fixation pro cess of a mutan t gene in the W right -Fisher diffusion with directional sele ction, whic h has b een well studied, w as interpreted in terms of t he con ver- gence of the ancestral p r o cess to its stationary measure. This approac h migh t b e u seful for other p opu lation genetica l mo dels for which le ss is kno w n ab out the fixation p ro cess. Examples includ e diffusion p ro cesses with frequency-dep endent selec tion, with multiple selected allel es, and of spatially- structur ed p opulations (A threya a nd Swa rt, 2005). Appendix A. The o bla te spheroid al w a ve function The oblate spheroidal wa ve fu nction V (1) 1 k ( c, z ) can b e represented b y expansions of the form (Stratto n et al. (194 1 )) (A.1) V (1) 1 k ( c, z ) = X l ≥ 0 ′ f k l ( c ) T 1 l ( z ) , k = 0 , 1 , ... This notation was used in Kimura (1955). It w as d enoted by V (1) 1 k ( − ic, z ) in Stratton et a l. (1941) and (1 − z 2 ) 1 2 S 1 k + 1 ( c, z ) in Flammer (1957). F rom the orthogonal p rop erties of the Gegen b auer function it is sho wn that (A.2) Z 1 − 1 (1 − z 2 ) V (1) 1 k ( c, z ) V (1) 1 l ( c, z ) dz = δ k ,l N 1 k , where N 1 k = 2 X l ≥ 0 ′ ( l + 1)( l + 2) (2 l + 3) ( f k l ( c )) 2 . 28 Note that (A.3) V (1) 1 k ( c, 1) = 1 2 X l ≥ 0 ′ ( l + 1)( l + 2) f k l ( c ) , V (1) 1 k ( c, − 1) = ( − 1) k V (1) 1 k ( c, 1) . The coefficien ts f k l ( c ) satisfy a three-term recursion in the form (A.4) A l +2 f k l +2 ( c ) + B l f k l ( c ) + C l − 2 f k l − 2 ( c ) = 0 , where A l = − ( l + 1)( l + 2) (2 l + 1)(2 l + 3) , B l = l ( l + 3) − b k c 2 − 2 l 2 + 6 l + 1 (2 l + 1)(2 l + 5) , C l = − ( l + 1)( l + 2) (2 l + 3)(2 l + 5) , and b k = 2 λ k − 2 − c 2 . f k l ( c ) = 0 f or o dd l if k is ev en and for ev en l if k is o dd. (A.4) can b e dev elop ed as a con tin ued fraction. f k l f k l +2 = − A l +2 B l − C l − 2 A l B l − 2 − · · · C 2 A 4 B 2 − A 2 B 0 l = 0 , 2 , ... − A l +2 B l − C l − 2 A l B l − 2 − · · · C 3 A 5 B 3 − A 3 B 1 l = 1 , 3 , ... (A.5) and (A.6) f k l +2 f k l = − C l B l +2 − A l +4 C l +2 B l +4 − · · · , l = 0 , 1 , .. b k is determined by the condition that the reciprocal of the r atio f l /f l +2 b y (A.5) must equal the v alue of f l +2 /f l obtained from (A.6). Then, the contin ued fractions pro vide a w a y to compute arb itrary coefficient . F or small c , th e eigenv alue can b e e xpand ed in to a p o w er series i n c . (A.7) λ k = ( k + 1)( k + 2) 2 + ( k + 1)( k + 2) (2 k + 1)(2 k + 5) c 2 + O ( c 4 ) . If w e set f k k ( c ) = 1, then (A.8) f k k +2 ( c ) = ( k + 1)( k + 2) 2(2 k + 3)(2 k + 5) 2 c 2 + O ( c 4 ) , f k k − 2 ( c ) = − ( k + 1)( k + 2) 2(2 k + 1) 2 (2 k + 3) c 2 + O ( c 4 ) , and other c o efficien ts are zero up to O ( c 4 ). 29 Ac kno w ledgmen ts. Th e author is grate ful for useful discussion with J otun Hein, h is group, Rob ert C . Griffiths and Ja y E. T a ylor. He is also grateful for v aluable comments and suggestio ns by anonymous referee s. Referen ces A threy a, S. R. and Sw art, J . M. (2005) Branc hing-coalescing particle systems. Pr ob. The- ory R elat. Fields 131 3 76–414. Erd´ elyi, A. Ed . (1954). T ables of Inte gr al T r ansforms, V ol. II . McGra w-Hill, New Y ork. Ethier, S . N. and Griffiths, R. C. (1990). On the t wo -lo cus samplin g distribution. J. Math. Biol. 29 131–159. Ethier, S. N. and Kurtz, T. G. (1986) Markov Pr o c esses: Char acterization and Conver- genc e. Jonh Wile y & Sons, Ne w Y ork. Ew ens, W. J. (1973). Conditional diffusion pro cess in p opulation genetics. The or. Pop. Biol. 4 2 1–30. F earnhead, P . (200 2) The common ancestor at a nonneutral lo cus. J. Appl. Pr ob. 39 38–54. Flammer, C. (1 957). Spher oidal Wave F unctions . S tanford Univ ersity P ress, Stanford. Gan tmac h er, F. R. (1 959). Matrix The ory, V ol. II . Chelsea, New Y ork. Griffiths, R. C. (1979). Exact sampling distrib utions f r om the infinite neutral allel e mo del. A dv. Appl. Pr ob. 11 326 –354. Griffiths, R. C. (1980). Line of descen t in the diffu sion appro ximation of neutral W righ t- Fisher models. The or. Pop. Biol. 17 3 7–50. Griffiths, R. C. (1991). T he t w o-lo cus ancestral graph, in “Selected Pro ceedings of the Symp osiu m on Applied Probabilit y , Sh effield, 1989” (I. V. Basa wa and R. L. T a ylor, Eds.), pp. 100–117, IMS Lecture Not es–Monograph Series, V ol. 18, Institute o f Mathe- matical Statisti cs. Karlin, S. and T aylo r, H. M. (1975 ). A First Course in Sto chastic Pr o c esses, 2nd. e d . Academic Press, S an Die go. 30 Kim ura, M. (1955). Sto c hastic pro cess a nd distribution of gene f requencies under natural selection. Cold Spring H arb or Symp osia on Quantitative Biolo gy 20 33–53. Kim ura, M. (1957 ). Some pr ob lems of sto chastic pro cesses in genetics. Ann. M ath. Stat. 28 88 2–901. Kim ura, M. and Oh ta, T. (1969) The a ve rage num b er of generations unti l fixation of a m utan t gene in a finite p opulation. Genetics 61 763–7 71. Kingman, J. F. C. (19 82). On the genealogy of large p opulations. J. Appl. Pr ob ab. 19 27–43 . Kingman, J. F. C. (1982 ). Th e c oalescen t. Sto chastic Pr o c e ss. Appl. 13 235–2 48. Krone, S. M. and Neuhaus er, C. (1997 ). Ancestral pro cess with selection. The or. Pop. Biol. 51 210–237. Maruy ama, T. (1972). Th e a ve rage n umber and the v ariance of generations at particular gene frequency in the cours e of fixation of a m u tan t gene in a finite p opulation. Genet. R es. Camb. 19 109– 113. Maruy ama, T. and Kim ura, M. (1974). A n ote on the s p eed of gene fr equency changes in rev erse directions in a finite p opulation. Evolution 28 161–1 63. Shiga, T. (1981 ) Diffusion pro cesses in p opulation genetics. J. Math. Kyoto U niv. 21 133–1 51. Stratton, J. A., Morse, P . M. Chu, L. J. and Hunter, R. A. (19 41). El liptic Cylinder and Spher oidal Wave F unctions . John Wiley and Sons, New Y ork. T av ar ´ e, S. (1984). Line-of-descen t and genealo gical pro cesses, and their applications in p opulation genetic s. The or. Pop. Biol. 26 119– 164. 31 Figure 1. A realiza tion of the ancestral selection graph em b edd ed in a diagram of a sample path of x p ( t ). Figure 2. η ( ∞ ; c ) (dots) and 3 exp ( − c ) (line). Figure 3. 1 − P [ b 10 ( t ) = 1] (line) and the upp er and the lo wer b ound s giv en by Theorem 3.2 (dotted lin es). Figure 4. T he probabilit y th at the ancestral pro cess { b n ( t ); t ≥ 0 } h its states m ( > n ) b efore the process hits sta te 1 (ultimat e ancestor). Figure 5. (a) The cum ulativ e pr obabilit y of the d en sit y of time to fixation of a single m utan t gene conditional o n the fixation (5.12) (line) and the upp er and t he lo w er b ounds giv en b y (5.13) (dotted lines). (b) The d ensit y of the conditional fixation time (line) and π − 1 1 η ( ∞ ; c ) exp( − λ 0 t ) (dot ted line). 32 x p 0 1 t Figure 1. 33 0 0.5 1 1.5 2 2.5 3 0 1 2 3 4 5 6 7 8 3exp(-c) P S f r a g r e p l a c e m e n t s c η ( ∞ , c ) Figure 2. 34 0 0.5 1 1.5 2 2.5 0 2 4 6 8 10 (a) c=0 P S f r a g r e p l a c e m e n t s t 0 0.5 1 1.5 2 2.5 0 2 4 6 8 10 (b) c=0.1 P S f r a g r e p l a c e m e n t s t 0 0.5 1 1.5 2 2.5 0 2 4 6 8 10 (c) c=1 P S f r a g r e p l a c e m e n t s t 0 0.5 1 1.5 2 2.5 0 2 4 6 8 10 (d) c=8 P S f r a g r e p l a c e m e n t s t Figure 3. 35 0 0.2 0.4 0.6 0.8 1 0 2 4 6 8 10 (a) n=10 m=11 m=15 m=20 P S f r a g r e p l a c e m e n t s c 0 0.2 0.4 0.6 0.8 1 0 2 4 6 8 10 (b) n=50 m=51 m=55 m=60 P S f r a g r e p l a c e m e n t s c c Figure 4. 36 -2.5 -2 -1.5 -1 -0.5 0 0.5 1 0 2 4 6 8 10 (a) P S f r a g r e p l a c e m e n t s t 0 1 2 3 4 5 6 0 2 4 6 8 10 (b) P S f r a g r e p l a c e m e n t s t t Figure 5.

Original Paper

Loading high-quality paper...

Comments & Academic Discussion

Loading comments...

Leave a Comment