Evolutionary dynamics of tumor progression with random fitness values

Most human tumors result from the accumulation of multiple genetic and epigenetic alterations in a single cell. Mutations that confer a fitness advantage to the cell are known as driver mutations and are causally related to tumorigenesis. Other mutat…

Authors: Rick Durrett, Jasmine Foo, Kevin Leder

Rick Durrett^{1,∗}, Jasmine Foo^{2,†}, Kevin Leder^{2,‡}, John Mayberry^{1,§,¶}, and Franziska Michor^{2,‖}

1 Department of Mathematics, Cornell University, Ithaca, NY 14853
2 Computational Biology Program, Memorial Sloan-Kettering Cancer Center, New York, NY 10065

October 26, 2021

Abstract

Most human tumors result from the accumulation of multiple genetic and epigenetic alterations in a single cell. Mutations that confer a fitness advantage to the cell are known as driver mutations and are causally related to tumorigenesis. Other mutations, however, do not change the phenotype of the cell or even decrease cellular fitness. While much experimental effort is being devoted to the identification of the different functional effects of individual mutations, mathematical modeling of tumor progression generally considers constant fitness increments as mutations are accumulated. In this paper we study a mathematical model of tumor progression with random fitness increments. We analyze a multi-type branching process in which cells accumulate mutations whose fitness effects are chosen from a distribution. We determine the effect of the fitness distribution on the growth kinetics of the tumor. This work contributes to a quantitative understanding of the accumulation of mutations leading to cancer phenotypes.

Keywords: cancer evolution, branching process, fitness distribution, beneficial fitness effects, mutational landscape

∗ Partially supported by NSF grant DMS 0704996 from the probability program.
† Partially supported by NIH grant R01CA138234.
‡ Partially supported by NIH grant U54CA143798.
§ Corresponding Author. Email: jm858@cornell.edu, Tel.: +1 607 255 8262, Fax: +1 607 255 7149.
¶ Partially supported by NSF RTG grant DMS 0739164.
‖ Partially supported by NIH grants R01CA138234 and U54CA143798, a Leon Levy Foundation Young Investigator Award, and a Gerstner Young Investigator Award.

1 Introduction

Tumors result from an evolutionary process occurring within a tissue (Nowell, 1976). From an evolutionary point of view, tumors can be considered as collections of cells that accumulate genetic and epigenetic alterations. The phenotypic changes that these alterations confer to cells are subjected to the selection pressures within the tissue and lead to adaptations such as the evolution of more aggressive cell types, the emergence of resistance, induction of angiogenesis, evasion of the immune system, and colonization of distant organs with metastatic growth. Advantageous heritable alterations can cause a rapid expansion of the cell clone harboring such changes, since these cells are capable of outcompeting cells that have not evolved similar adaptations. The investigation of the dynamics of cell growth, the speed of accumulating mutations, and the distribution of different cell types at various timepoints during tumorigenesis is important for an understanding of the natural history of tumors. Further, such knowledge aids in the prognosis of newly diagnosed tumors, since the presence of cell clones with aggressive phenotypes leads to less optimistic predictions for tumor progression. Finally, a knowledge of the composition of tumors allows for the choice of optimum therapeutic interventions, as tumors harboring pre-existing resistant clones should be treated differently than drug-sensitive cell populations.
Mathematical models have led to many important insights into the dynamics of tumor progression and the evolution of resistance (Goldie and Coldman, 1983 and 1984; Bodmer and Tomlinson, 1995; Coldman and Murray, 2000; Knudson, 2001; Maley and Forrest, 2001; Michor et al., 2004; Iwasa et al., 2005; Komarova and Wodarz, 2005; Michor et al., 2006; Michor and Iwasa, 2006; Frank 2007; Wodarz and Komarova, 2007). These mathematical models generally fall into one of two classes: (i) constant population size models, and (ii) models describing exponentially growing populations. Many theoretical investigations of exponentially growing populations employ multi-type branching process models (e.g., Iwasa et al., 2006; Haeno et al., 2007; Durrett and Moseley, 2009), while others use population genetic models for homogeneously mixing exponentially growing populations (e.g., Beerenwinkel et al., 2007; Durrett and Mayberry, 2009). In this paper, we focus on branching process models. In these models, cells with $i \ge 0$ mutations are denoted as type-$i$ cells, and $Z_i(t)$ specifies the number of type-$i$ cells at time $t$. Type-$i$ cells die at rate $b_i$, give birth to one new type-$i$ cell at rate $a_i$, and give birth to one new type-$(i+1)$ cell at rate $u_{i+1}$. In an alternate version, mutations occur with probability $\mu_{i+1}$ during birth events, which occur at rate $\alpha_i$. These two versions are equivalent provided $u_{i+1} = \alpha_i \mu_{i+1}$ and $a_i = \alpha_i (1 - \mu_{i+1})$. However, the relationship between the parameters must be kept in mind when comparing results between different formulations of the model. One biologically unrealistic aspect of this model as presented in the literature is that all type-$i$ cells are assumed to have the same birth and death rates.
This assumption describes situations during tumorigenesis in which the order of mutations is predetermined, i.e. the genetic changes can only be accumulated in a particular sequence and all other combinations of mutations lead to lethality. Furthermore, in this interpretation of the model, there cannot be any variability in phenotype among cells with the same number of mutations. In many situations arising in biology, however, there is marked heterogeneity in phenotype even if, genetically, the cells are identical (Elowitz et al., 2002; Becskei et al., 2005; Kaern et al., 2005; Feinerman et al., 2008). This variability may be driven by stochasticity in gene expression or in post-transcriptional or post-translational modifications.

In this paper, we modify the branching process model so that mutations alter cell birth rates by a random amount. An important consideration for this endeavor is the choice of the mutational fitness distribution. The exponential distribution has become the preferred candidate in theoretical studies of the genetics of adaptation. The first theoretical justification of this choice was given by Gillespie (1983, 1984), who argued that if the number of possible alleles is large and the current allele is close to the top of the rank ordering in fitness values, then extreme value theory should provide insight into the distribution of the fitness values of mutations. For many distributions including the normal, Gamma, and lognormal distributions, the maximum of $n$ independent draws, when properly scaled, converges to the Gumbel or double exponential distribution, $\Lambda(x) = \exp(-e^{-x})$. In the biological literature, it is generally noted that this class of distributions only excludes exotic distributions like the Cauchy distribution, which has no moments. However, in reality, it eliminates all distributions with $P(X > x) \sim C x^{-\alpha}$.
For distributions in the domain of attraction of the Gumbel distribution, if $Y_1 > Y_2 > \cdots > Y_k$ are the $k$ largest observations in a sample of size $n$, then there is a sequence of constants $b_n$ so that the spacings $Z_i = i(Y_i - Y_{i+1})/b_n$ converge to independent exponentials with mean 1, see e.g., Weissman (1978). Following up on Gillespie's work, Orr (2003) added the observation that in this setting, the fitness increases due to beneficial mutations have the same distribution as $Z_1$, independent of the rank $i$ of the wild type cell.

To infer the distribution of fitness effects of newly emerged beneficial mutations, several experimental studies were performed; for examples, see Imhoff and Schlotterer (2001), Sanjuan et al. (2004), and Kassen and Bataillon (2006). The data from these experiments are generally consistent with an exponential distribution of fitness effects. However, there is an experimental caveat that cannot be neglected (Rozen et al., 2002): if only those mutations are considered that reach 100% frequency in the population, then the exponential distribution is multiplied by the fixation probability. By this operation, a distribution with a mode at a positive value develops. In a study of a quasi-empirical model of RNA evolution in which fitness was based on secondary structures, Cowperthwaite et al. (2005) found that fitnesses of randomly selected genotypes appeared to follow a Gumbel-type distribution. They also discovered that the fitness distribution of beneficial mutations appeared exponential only when the vast majority of small-effect mutations were ignored. Furthermore, it was determined that the distribution of beneficial mutations depends on the fitness of the parental genotype (Cowperthwaite et al., 2005; MacLean and Buckling, 2009).
However, since the exceptions to this conclusion arise when the fitness of the wild type cell is low, these findings do not contradict the picture based on extreme value theory.

In contrast to the evidence above, recent work of Rokyta et al. (2008) has shown that in two sets of beneficial mutations arising in the bacteriophage ID11 and in the phage φ6 (for which the mutations were identified by sequencing), beneficial fitness effects are not exponential. Using a statistical method developed by Biesal et al. (2007), they tested the null hypothesis that the fitness distribution has an exponential tail. They found that the null hypothesis could be rejected in favor of a distribution with a right truncated tail. Their data also violated the common assumption that small-effect mutations greatly outnumber those of large effect, as they were consistent with a uniform distribution of beneficial effects. A possible explanation for the bounded fitness distribution may be found in the culture conditions utilized in the experiments: they evolved ID11 on E. coli at an elevated temperature (37°C instead of 33°C). There may be a limited number of mutations that will enable ID11 to survive in increased temperatures. The latter situation may be similar to scenarios arising during tumorigenesis, where, in order to develop resistance to a drug or to progress to a more aggressive stage, the conformation of a particular protein must be changed or a certain regulatory network must be disrupted. If there is a finite, but large, number of possible beneficial mutations, then it is convenient to use a continuous distribution as an approximation.

In this paper, we consider both bounded distributions and unbounded distributions for the fitness advance and derive asymptotic results for the number of type-$k$ individuals at time $t$.
We determine the effects of the fitness distribution on the growth kinetics of the population, and investigate the rates of expansion for both bounded and unbounded fitness distributions. This model provides a framework to investigate the accumulation of mutations with random fitness effects. The remainder of this section is dedicated to statements and discussion of our main results. Proofs of these results can be found in Sections 2-5.

1.1 Bounded distributions

Let us consider a multi-type branching process in which type-$i$ cells have accumulated $i \ge 0$ advantageous mutations. Suppose the initial population consists entirely of type-0 cells that give birth at rate $a_0$ to new type-0 cells, die at rate $b_0 < a_0$, and give birth to new type-1 cells at rate $u_1$. The parameters $a_0$, $b_0$, and $u_1$ denote the birth rate, death rate, and mutation rate for type-0 cells. To simplify computations, we will approximate the number of type-0 cells by $Z_0(t) = V_0 e^{\lambda_0 t}$, where $\lambda_0 = a_0 - b_0 > 0$. If the initial cell population $Z_0(0) = V_0 \gg 1/\lambda_0$, then the branching process giving the number of 0's is almost deterministic and this approximation is accurate.

When a new type-1 cell is born, we choose $x > 0$ according to a continuous probability distribution $\nu$. The new type-1 cell and its descendants then have birth rate $a_0 + x$, death rate $b_0$, and mutation rate $u_2$. In general, type-$k$ cells with birth rate $a$ mutate to type-$(k+1)$ cells at rate $u_{k+1}$, and when a mutation occurs, the new type-$(k+1)$ cell and its descendants have an increased birth rate $a + x$ where $x > 0$ is drawn according to $\nu$. We let $Z_k(t)$ denote the total number of type-$k$ cells in the population at time $t$. When we refer to the $k$th generation of mutants, we mean the set of all type-$k$ cells.
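The first-generation dynamics described above can be explored numerically. The following Python sketch is illustrative only and is not the authors' simulation code. It assumes, purely for illustration, that $\nu$ is uniform on an interval $[0, b]$, it ignores further mutation to type 2 (that is, it takes $u_2 = 0$), and it draws each lineage size from the standard closed-form law of a linear birth-death process instead of simulating individual events. All function names are hypothetical.

```python
import math
import random


def mutation_times(t, v0, lam0, u1, rng):
    """Times in [0, t] of type-1 mutations, arriving at rate u1 * v0 * exp(lam0 * s)."""
    total = u1 * v0 * (math.exp(lam0 * t) - 1.0) / lam0  # cumulative intensity at time t
    times = []
    y = rng.expovariate(1.0)  # arrivals of a unit-rate Poisson process
    while y < total:
        # Invert the cumulative intensity to map the arrival back to real time.
        times.append(math.log1p(lam0 * y / (u1 * v0)) / lam0)
        y += rng.expovariate(1.0)
    return times


def lineage_size(birth, death, r, rng):
    """Size at age r of a linear birth-death process started from one cell.

    Uses the classical result: extinct with probability p_extinct, otherwise
    geometric with parameter 1 - beta. Requires birth > death >= 0.
    """
    if r <= 0.0:
        return 1
    e = math.exp((birth - death) * r)
    p_extinct = death * (e - 1.0) / (birth * e - death)
    if rng.random() < p_extinct:
        return 0
    beta = birth * (e - 1.0) / (birth * e - death)
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return 1 + int(math.log(u) / math.log(beta))


def sample_z1(t, v0, a0, b0, u1, b, rng):
    """One draw of Z_1(t) with fitness advances uniform on [0, b] (an assumption)."""
    lam0 = a0 - b0
    total = 0
    for s in mutation_times(t, v0, lam0, u1, rng):
        x = rng.uniform(0.0, b)  # random fitness advance of this mutant
        total += lineage_size(a0 + x, b0, t - s, rng)
    return total
```

Calling `sample_z1(30.0, 100.0, 0.2, 0.1, 1e-3, 0.05, random.Random(0))` returns one realization of the type-1 population size under these assumptions; repeating the call with different seeds gives an empirical distribution that can be compared with the limit theorems stated in this section.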
We begin by considering situations in which the distribution of the increase in the birth rate is concentrated on $[0, b]$. In particular, suppose that $\nu$ has density $g$ with support in $[0, b]$ and assume that $g$ satisfies:

$$(\ast)\qquad g \text{ is continuous at } b, \quad g(b) > 0, \quad g(x) \le G \text{ for } x \in [0, b].$$

Our first result describes the mean number of first generation mutants at time $t$, $E Z_1(t)$.

Theorem 1. If $(\ast)$ holds, then

$$E Z_1(t) \sim V_0 u_1 \frac{g(b)}{bt} e^{(\lambda_0 + b)t}$$

where $a(t) \sim b(t)$ means $a(t)/b(t) \to 1$.

The next result shows that the actual growth rate of type-1 cells is slower than the mean. Here, and in what follows, we use $\Rightarrow$ to indicate convergence in distribution.

Theorem 2. If $(\ast)$ holds and $p = b/\lambda_0$, then for $\theta \ge 0$,

$$E \exp\left(-\theta t^{1+p} e^{-(\lambda_0+b)t} Z_1(t)\right) \to \exp\left(-V_0 u_1 \theta^{\lambda_0/(\lambda_0+b)} c_1(\lambda_0, b)\right), \tag{1.1}$$

where $c_1(\lambda_0, b)$ is an explicit constant whose value will be given in (3.8). In particular, we have

$$t^{1+p} e^{-(\lambda_0+b)t} Z_1(t) \Rightarrow V_1,$$

where $V_1$ has Laplace transform given by the righthand side of (1.1).

Theorem 2 is similar to Theorem 3 in Durrett and Moseley (2009), which assumes a deterministic fitness distribution so that all type-1 cells have growth rate $\lambda_1 = \lambda_0 + b$. There, the asymptotic growth rate of the first generation is $\exp(\lambda_1 t)$. In contrast, the continuous fitness distribution we consider here has the effect of slowing down the growth rate of the first generation by the polynomial factor $t^{1+p}$. To explain this difference, we note that the calculation of the mean given in Section 3 shows that the dominant contribution to $Z_1(t)$ comes from growth rates $x = b - O(1/t)$. However, mutations with this growth rate are unlikely until the number of type-0 cells is $O(t)$, i.e., roughly at time $r_1 = (1/\lambda_0) \log t$.
Thus at time $t$, the number of type-1 cells will be roughly $\exp((\lambda_0 + b)(t - r_1)) = \exp((\lambda_0 + b)t)/t^{1+p}$.

To prove Theorem 2, we look at mutations as a point process in $[0, t] \times [0, b]$: there is a point at $(s, x)$ if there was a mutant with birth rate $a_0 + x$ at time $s$. This allows us to derive the following explicit expression for the Laplace transform of $Z_1(t)$:

$$E\left(e^{-\theta Z_1(t)}\right) = \exp\left(-u_1 \int_0^b dx\, g(x) \int_0^t ds\, V_0 e^{\lambda_0 s}\left(1 - \tilde\phi_{x,t-s}(\theta)\right)\right)$$

where $\tilde\phi_{x,r}(\theta) = E e^{-\theta \tilde Z^x_r}$ and $\tilde Z^x_r$ is a continuous-time branching process with birth rate $a_0 + x$, death rate $b_0$, and initial population $\tilde Z^x_0 = 1$. In Figure 1, we compare the exact Laplace transform of $t^{1+p} \exp(-(\lambda_0 + b)t) Z_1(t)$ with the results of simulations and the limiting Laplace transform from Theorem 2, illustrating the convergence as $t \to \infty$.

Notice that the Laplace transform of $V_1$ has the form $\exp(C\theta^{\alpha})$ where $\alpha = \lambda_0/(\lambda_0 + b)$, which implies that $P(V_1 > v) \sim v^{-\alpha}$ as $v \to \infty$ (see, for example, the argument in Section 3 of Durrett and Moseley (2009)). To gain some insight into how this limit comes about, we give a second proof of the convergence that tells us the limit is the sum of points in a nonhomogeneous Poisson process. Each point in the limiting process represents the contribution of a different mutant lineage to $Z_1(t)$.

Theorem 3. $V_1 = \lim_{t\to\infty} t^{1+p} e^{-(\lambda_0+b)t} Z_1(t)$ is the sum of the points of a Poisson process on $(0, \infty)$ with mean measure

$$\mu(z, \infty) = A_1(\lambda_0, b)\, u_1 V_0\, z^{-\lambda_0/(\lambda_0+b)}.$$

A similar result can be obtained for deterministic fitness distributions, see the Corollary to Theorem 3 in Durrett and Moseley (2009). However, the new result shows that the point process limit is not an artifact of assuming that all first generation mutants have the same growth rate.
Even when the fitness advances are random, different mutant lines contribute to the limit. This result is consistent with observations of Maley et al. (2006) and Shah et al. (2009) that tumors contain cells with different mutational haplotypes. Theorem 3 also gives quantitative predictions about the relative contribution of different mutations to the total population. These implications will be explored further in a follow-up paper currently in progress.

With the behavior of the first generation analyzed, we are ready to proceed to the study of further generations. The computation of the mean is straightforward.

Theorem 4. If $(\ast)$ holds, then

$$E Z_k(t) \sim \frac{V_0 \cdot u_1 \cdots u_k \cdot g(b)^k}{t^k b^k k!} e^{(\lambda_0 + kb)t}.$$

As in the $k = 1$ case, the mean involves a polynomial correction to the exponential growth and again does not give the correct growth rate for the number of type-$k$ cells. To state the correct limit theorem describing the growth rate of $Z_k(t)$, we will define $p_k$ and $u_{1,k}$ by

$$k + p_k = \sum_{j=0}^{k-1} \frac{\lambda_0 + kb}{\lambda_0 + jb} \qquad\text{and}\qquad u_{1,k} = \prod_{j=1}^{k} u_j^{\lambda_0/(\lambda_0 + (j-1)b)}$$

for all $k \ge 1$.

Theorem 5. If $(\ast)$ holds, then for $\theta \ge 0$

$$E \exp\left(-\theta t^{k+p_k} e^{-(\lambda_0+kb)t} Z_k(t)\right) \to \exp\left(-c_k(\lambda_0, b) V_0 u_{1,k} \theta^{\lambda_0/(\lambda_0+kb)}\right).$$

In particular, $t^{k+p_k} e^{-(\lambda_0+kb)t} Z_k(t) \Rightarrow V_k$.

We prove this result by looking at the mutations to type-1 individuals as a three dimensional Poisson point process: there is a point at $(s, x, v)$ if there was a type-1 mutant with birth rate $a_0 + x$ at time $s$ and the number of its type-1 descendants at time $t$, $Z^{s,x}_1(t)$, has $e^{-(\lambda_0+x)(t-s)} Z^{s,x}_1(t) \to v$ with $v > 0$. To study $Z_k(t)$ we will let $Z^{s,x,v}_k(t)$ be the type-$k$ descendants at time $t$ of the type-1 mutant at $(s, x, v)$.
$Z^{s,x,v}_k$ is the same as a process in which the initial type (here type-1 cells) behaves like $v e^{(\lambda_0+x)(t-s)}$ instead of $Z_0(t) = V_0 e^{\lambda_0 t}$, so the result can be proved by induction.

To explain the form of the result we consider the case $k = 2$. Breaking things down according to the times and the sizes of the mutational changes, we have

$$E Z_2(t) = \int_0^b dx_1\, g(x_1) \int_0^b dx_2\, g(x_2) \int_0^t ds_1 \int_{s_1}^t ds_2\, V_0 e^{\lambda_0 s_1} u_1 e^{(\lambda_0+x_1)(s_2-s_1)} u_2 e^{(\lambda_0+x_1+x_2)(t-s_2)}.$$

As in the result for $Z_1(t)$, the dominant contribution comes from $x_1, x_2 = b - O(1/t)$ and, as in the discussion preceding the statement of Theorem 2, the time of the first mutation to $b - O(1/t)$ is $\approx r_1 = (\log t)/\lambda_0$. The descendants of this mutation grow at exponential rate $\lambda_0 + b - O(1/t)$, so the time of the first mutation to $2b - O(1/t)$ is $\approx r_2 = r_1 + (\log t)/(\lambda_0 + b)$. Noticing that

$$\exp((\lambda_0 + 2b)(t - r_2)) = \exp((\lambda_0 + 2b)t)\, t^{-(\lambda_0+2b)/\lambda_0 - (\lambda_0+2b)/(\lambda_0+b)}$$

tells us what to guess for the polynomial term: $t^{-(2+p_2)}$ where

$$2 + p_2 = \frac{\lambda_0 + 2b}{\lambda_0} + \frac{\lambda_0 + 2b}{\lambda_0 + b}.$$

In Figure 2, we compare the asymptotic Laplace transform from Theorem 5 with the results of simulations in the case $k = 2$. To explain the slow convergence to the limit, we note that if we take account of the mutation rates $u_1, u_2$ in the heuristic from the previous paragraph (which becomes important when $u_1, u_2$ are small), then the first time we see a type-1 cell with growth rate $b - O(1/t)$ will not occur until time $\lambda_0^{-1} \log(t/u_1)$, when the type-0 cells reach $O(t/u_1)$, and so the first type-2 cell with growth rate $2b - O(1/t)$ will not be born until time

$$r = \lambda_0^{-1} \log(t/u_1) + (\lambda_0 + b)^{-1} \log(t/u_2),$$

when the descendants of the type-1 cells with growth rate $b - O(1/t)$ reach size $O(t/u_2)$. When $u_1 = u_2 = 10^{-3}$, $\lambda_0 = 0.1$, and $b = 0.01$, $r \approx 223$. The mutations created at this point will need some time to grow and become dominant in the population. It would be interesting to compare simulations at time 300, but we have not been able to do this due to the large number of different growth rates in generation 1.

1.2 Unbounded distributions

Let us now consider situations in which the fitness distribution is unbounded. Suppose that the fitness increase follows a generalized Frechet distribution,

$$P(X > x) = x^{\beta} e^{-\gamma x^{\alpha}} \tag{1.2}$$

for some positive $\gamma$, $\alpha$ and any $\beta \in \mathbb{R}$. There is a two-fold purpose for considering such distributions. First, if i.i.d. random variables $\zeta_1, \ldots, \zeta_n$ have a power law tail, i.e. $P(\zeta_i > y) \sim c y^{-\alpha}$ as $y \to \infty$, then their maxima and the spacings between order statistics converge to a limit of the form (1.2) with $\beta = 0$. Second, this choice allows us to consider the gamma$(\beta + 1, \gamma)$ distribution, which has $\alpha = 1$, and the normal distribution, which asymptotically has this form with $\alpha = 2$, $\beta = -1$.

To analyze this situation, we will again take a Poisson process viewpoint and look at the contribution from a mutation at time $s$ with increased growth rate $x$. A mutation that increases the growth rate by $x$ at time $s$ will, if it does not die out, grow to $e^{(\lambda_0+x)(t-s)} \zeta$ at time $t$ where $\zeta$ has an exponential distribution. The growth rate $(\lambda_0 + x)(t - s) \ge z$ when

$$x \ge \frac{z}{t-s} - \lambda_0.$$

Therefore,

$$\mu(z, \infty) \equiv E(\#\text{ mutations with } (\lambda_0 + x)(t - s) \ge z) = V_0 u_1 \int_0^t \left(\frac{z}{t-s} - \lambda_0\right)^{\beta} e^{\lambda_0 s} \exp\left(-\gamma\left(\frac{z}{t-s} - \lambda_0\right)^{\alpha}\right) ds = V_0 u_1 \int_0^t \left(\frac{z}{t-s} - \lambda_0\right)^{\beta} \exp(\phi(s, z))\, ds$$

where

$$\phi(s, z) = \lambda_0 s - \gamma\left(\frac{z}{t-s} - \lambda_0\right)^{\alpha}. \tag{1.3}$$

The size of this integral can be found by maximizing the exponent $\phi$ over $s$ for fixed $z$.
Since

$$\frac{\partial \phi}{\partial s}(s, z) = \lambda_0 - \alpha\gamma\left(\frac{z}{t-s} - \lambda_0\right)^{\alpha-1} \frac{z}{(t-s)^2} \tag{1.4}$$

and

$$\frac{\partial^2 \phi}{\partial s^2}(s, z) = -\alpha(\alpha-1)\gamma\left(\frac{z}{t-s} - \lambda_0\right)^{\alpha-2} \frac{z^2}{(t-s)^4} - \alpha\gamma\left(\frac{z}{t-s} - \lambda_0\right)^{\alpha-1} \frac{2z}{(t-s)^3} \tag{1.5}$$

we can see that $\partial^2\phi/\partial s^2(s, z) < 0$ when $\alpha z > \lambda_0(t - s)$, so that for all $z$ in this range, $\phi(s, z)$ is concave as a function of $s$ and achieves its maximum at a unique value $s_z$. When $\alpha = 1$, it is easy to set (1.4) to 0 and solve for $s_z$. This in turn leads to an asymptotic formula for $\mu(z, \infty)$ and allows us to derive the following limit theorem for $Z_1(t)$.

Theorem 6. Suppose $\alpha = 1$ and let $c_0 = \lambda_0/4\gamma$. Then $t^{-2} \log Z_1(t) \to c_0$ and

$$\frac{1}{t}\left[\log Z_1(t) - c_0 t^2\left(1 + \frac{(2\beta+1)\log t}{\lambda_0 t}\right)\right] \Rightarrow y^*$$

where $y^*$ is the rightmost point in the point process with intensity given by

$$(2c_0)^{\beta} (\pi/\lambda_0)^{1/2} V_0 u_1 \exp(\gamma\lambda_0 - \lambda_0 y/2c_0). \tag{1.6}$$

When $\alpha \ne 1$, solving for $s_z$ becomes more difficult, but we are still able to prove the following limit theorem for $Z_1(t)$.

Theorem 7. Suppose $\alpha > 1$ is an integer. There exist explicitly calculable constants $c_k = c_k(\alpha, \gamma)$, $0 \le k < \alpha$, and $\kappa = \kappa(\beta, \alpha, \gamma)$ so that $t^{-(\alpha+1)/\alpha} \log Z_1(t) \to c_0$ and

$$\frac{1}{t^{1/\alpha}}\left[\log Z_1(t) - c_0 t^{(\alpha+1)/\alpha}\left(1 + \sum_{1 \le k < \alpha} c_k t^{-k/\alpha} + \kappa \frac{\log t}{t}\right)\right] \Rightarrow y^*$$

where $y^*$ is the rightmost particle in a point process with explicitly calculable intensity.

The complicated form of the result is due to the fact that the fluctuations are only of order $t^{1/\alpha}$, so we have to be very precise in locating the maximum. The explicit formulas for the constants and the intensity of the point process are given in (5.12) and (5.13). With more work this result could be proved for a general $\alpha > 1$, but we have not tried to do this or prove Conjecture 1 below because the super-exponential growth rates in the unbounded case are too fast to be realistic.
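To make the role of the exponent $\phi$ in (1.3) concrete, the Python sketch below works through the case $\alpha = 1$. It is a leading-order heuristic under stated assumptions and not the proof of Theorem 6 given in Section 5: it ignores the prefactor $(z/(t-s) - \lambda_0)^{\beta}$ and the logarithmic correction, and it locates the largest growth rate by asking where the maximized exponent equals zero. Setting (1.4) to zero with $\alpha = 1$ gives $t - s_z = \sqrt{\gamma z/\lambda_0}$, so that $\phi(s_z, z) = \lambda_0 t + \gamma\lambda_0 - 2\sqrt{\gamma\lambda_0 z}$, which vanishes at $z = \lambda_0 (t + \gamma)^2/(4\gamma) \sim c_0 t^2$. The grid search is included as an independent numerical check and also applies when $\alpha > 1$. All function names are hypothetical.

```python
import math


def phi(s, z, t, lam0, gamma, alpha=1.0):
    """Exponent phi(s, z) from (1.3)."""
    return lam0 * s - gamma * (z / (t - s) - lam0) ** alpha


def argmax_s_alpha1(z, t, lam0, gamma):
    """Stationary point s_z of phi in s for alpha = 1, from setting (1.4) to zero."""
    return t - math.sqrt(gamma * z / lam0)


def max_phi_alpha1(z, t, lam0, gamma):
    """Value of phi at s_z when alpha = 1."""
    return lam0 * t + gamma * lam0 - 2.0 * math.sqrt(gamma * lam0 * z)


def critical_z_alpha1(t, lam0, gamma):
    """Growth rate z at which the maximized exponent vanishes (heuristic)."""
    return lam0 * (t + gamma) ** 2 / (4.0 * gamma)


def argmax_s_grid(z, t, lam0, gamma, alpha=1.0, n=20000):
    """Brute-force maximization of phi over a grid of s in [0, t)."""
    best_s, best_val = 0.0, -math.inf
    for i in range(n):
        s = t * i / n
        if z / (t - s) < lam0:  # would correspond to x < 0, outside the model
            continue
        val = phi(s, z, t, lam0, gamma, alpha)
        if val > best_val:
            best_s, best_val = s, val
    return best_s, best_val
```

For example, with $t = 200$, $\lambda_0 = 0.1$ and $\gamma = 2$, `critical_z_alpha1` and `argmax_s_alpha1` place the maximizing mutation time near $s = 99$, close to $t/2$, and the ratio `critical_z_alpha1(t, lam0, gamma) / t**2` approaches $c_0 = \lambda_0/4\gamma$ as $t$ grows.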
We conclude this section with two comments. First, the proof of Theorem 7 shows that in contrast to the bounded case, in the unbounded case, most type-1 individuals are descendants of a single mutant. Second, the proof shows that the mutant with the largest growth rate is born at time $s \sim t/(\alpha + 1)$ (see Remark 1 at the end of Section 5) and has growth rate $z = O(t^{(\alpha+1)/\alpha})$. The intuition behind this is that since the type-0 cells have growth rate $e^{\lambda_0 s}$ and the distribution of the increase in fitness has tail $\approx e^{-\gamma x^{\alpha}}$, the largest advance $x$ attained by time $t$ should occur when $s = O(t)$ and satisfy

$$e^{C\lambda_0 t} e^{-\gamma x^{\alpha}} = O(1) \qquad\text{or}\qquad x = O(t^{1/\alpha}).$$

The growth rate of its family is then $(\lambda_0 + x)(t - s) = O(t^{(\alpha+1)/\alpha})$. Since the type-1 cells grow at exponential rate $c_1 t^{(\alpha+1)/\alpha}$, if we apply this same reasoning to type-2 mutants, then the largest additional fitness advance $x$ attained by type-2 individuals should satisfy

$$e^{c_1 t^{(\alpha+1)/\alpha}} e^{-\gamma x^{\alpha}} = O(1) \qquad\text{or}\qquad x = O(t^{1/\alpha + 1/\alpha^2}),$$

and the growth rate of its family will be $O(t^{1 + 1/\alpha + 1/\alpha^2})$. Extrapolating from the first two generations, we make the following

Conjecture 1. Let $q(k) = \sum_{j=0}^{k} \alpha^{-j}$. As $t \to \infty$,

$$\frac{1}{t^{q(k)}} \log Z_k(t) \to c_k.$$

Note that in the case of the exponential distribution, $q(k) = k + 1$.

The rest of the paper is organized as follows. Sections 2-5 are devoted to proofs of our main results. After some preliminary notation and definitions in Section 2, Theorems 1-3 are proved in Section 3, Theorems 4-5 in Section 4, and Theorems 6-7 in Section 5. We conclude with a discussion of our results in Section 6.

2 Preliminaries

This section contains some preliminary notation and definitions which we will need for the proofs of our main results.
We denote by $N(t)$ the points in a two dimensional Poisson process on $[0, t] \times [0, \infty)$ with mean measure

$$V_0 e^{\lambda_0 s}\, ds\, \nu(dx)$$

where in Sections 3-4, $\nu(dx) = g(x)\,dx$ with $g$ satisfying $(\ast)$ and in Section 5, $\nu$ has tail $\nu(x, \infty) = x^{\beta} e^{-\gamma x^{\alpha}}$. In other words, we have a point at $(s, x)$ if there was a mutant with birth rate $a_0 + x$ at time $s$. Define a collection of independent birth/death branching processes $Z^{s,x}_1(t)$ indexed by $(s, x) \in N(t)$ with $Z^{s,x}_1(s) = 1$, individual birth rate $a_0 + x$, and death rate $b_0$. $Z^{s,x}_1(t)$ is the contribution of the mutation at $(s, x)$ and

$$Z_1(t) = \sum_{(s,x) \in N(t)} Z^{s,x}_1(t).$$

It is well known that

$$e^{-(\lambda_0+x)(t-s)} Z^{s,x}_1(t) \to \frac{b_0}{a_0 + x}\, \delta_0 + \frac{\lambda_0 + x}{a_0 + x}\, \zeta$$

where $\zeta \sim \exp((\lambda_0 + x)/(a_0 + x))$ (see, for example, equation (1) in Durrett and Moseley (2009)). In several results, we shall make use of the three dimensional Poisson process $M(t)$ on $[0, t] \times [0, \infty) \times (0, \infty)$ with intensity

$$V_0 e^{\lambda_0 s}\, \nu(dx) \left(\frac{\lambda_0 + x}{a_0 + x}\right)^2 e^{-v(\lambda_0+x)/(a_0+x)}\, dv.$$

In words, $(s, x, v) \in M(t)$ if there was a mutant with birth rate $a_0 + x$ at time $s$ and the number of its descendants at time $t$, $Z^{s,x}_1(t)$, has $Z^{s,x}_1(t) \sim v e^{(\lambda_0+x)(t-s)}$. It is also convenient to define the mapping $z : [0, \infty) \times [0, t] \to [0, \infty)$ which maps a point $(s, x) \in N(t)$ to the growth rate of the induced branching process if it survives:

$$z(s, x) = (\lambda_0 + x)(t - s)$$

and let

$$\mu(A) = E\,|\{(s, x) \in N(t) : z(s, x) \in A\}|$$

for $A \subset [0, \infty)$.

We shall use $C$ to denote a generic constant whose value may change from line to line. We write $f(t) \sim g(t)$ if $f(t)/g(t) \to 1$ as $t \to \infty$ and $f(t) = o(g(t))$ if $f(t)/g(t) \to 0$. $f(t) \gg (\ll)\ g(t)$ means that $f(t)/g(t) \to \infty$ (resp. 0) as $t \to \infty$, and $f(t) = O(g(t))$ means $|f(t)| \le C g(t)$ for all $t > 0$.
We also shall use the notation $f(t) \simeq g(t)$ if $f(t) = g(t) + o(1)$ as $t \to \infty$.

3 Bounded distributions, $Z_1$

In this section, we prove Theorems 1-3.

Proof of Theorem 1. Mutations to 1's occur at rate $u_1 V_0 e^{\lambda_0 s}$, so

$$E Z_1(t) = u_1 \int_0^t \int_0^b e^{(t-s)(\lambda_0+x)} g(x)\, dx\, V_0 e^{\lambda_0 s}\, ds = u_1 V_0 e^{\lambda_0 t} \int_0^b dx\, g(x) \int_0^t e^{(t-s)x}\, ds \tag{3.1}$$

$$= u_1 V_0 e^{\lambda_0 t} \int_0^b dx\, g(x)\, \frac{e^{tx} - 1}{x}.$$

We begin by showing that the contribution from $x \in [0, b - (1+k)(\log t)/t]$ can be ignored for any $k \in [0, \infty)$. The Mean Value theorem implies that

$$\frac{e^{tx} - 1}{x} \le t e^{tx}. \tag{3.2}$$

Using this and the fact that $\int_c^d t e^{tx}\, dx \le e^{td}$ for any $c < d$, we can see that

$$t^k e^{-bt} \int_0^{b - (1+k)(\log t)/t} dx\, g(x)\, \frac{e^{tx} - 1}{x} \le G t^k e^{-(1+k)\log t} \to 0. \tag{3.3}$$

To handle the other piece of the integral, we take $k = 1$ and note that

$$\int_{b - (2\log t)/t}^{b} dx\, g(x)\, \frac{e^{tx} - 1}{x} \sim \frac{g(b)}{b}\, e^{bt} \int_{b - (2\log t)/t}^{b} e^{t(x-b)}\, dx.$$

After changing variables $y = (b - x)t$, $dx = -dy/t$, the last integral

$$= \frac{1}{t} \int_0^{2\log t} e^{-y}\, dy \sim 1/t,$$

which proves the result.

The above proof tells us that the dominant contribution to the 1's comes from mutations with fitness increase $x \ge b_t = b - (2\log t)/t$. To describe the times at which the dominant contributions occur, let $S(t) = (2/b) \log\log t$. Then the contribution to the mean from $x \in [b_t, b]$ and $s \ge S(t)$ is, by (3.1),

$$\le G u_1 V_0 e^{(\lambda_0+b)t}\, \frac{2\log t}{t} \int_{S(t)}^{\infty} e^{-s b_t}\, ds \le G u_1 V_0 e^{(\lambda_0+b)t}\, \frac{2\log t}{t\, b_t}\, e^{-b_t S(t)}.$$

Since $b_t S(t) \ge 2\log\log t$, this quantity is $o(t^{-1} e^{(\lambda_0+b)t})$. In words, the dominant contribution to the mean comes from points close to $(0, b)$ or more precisely from $[0, (2/b)\log\log t] \times [b - (2\log t)/t, b]$.

Proof of Theorem 2. It suffices to prove (1.1).
The computation in (3.3) with $k = 2 + p$ implies that the contribution from mutations with $x \le b_t = b - (3 + p)(\log t)/t$ can be ignored. Therefore, we have

$$E \exp(-\theta Z_1(t)) \simeq E(\exp(-\theta Z_1(t)); A_t)$$

where $A_t = \{(s, x) \in N(t) : x > b_t\}$. By Lemma 2 of Durrett and Moseley (2009), we have

$$E\left(e^{-\theta Z_1(t)}; A_t\right) = \exp\left(-u_1 \int_{b_t}^{b} dx\, g(x) \int_0^t ds\, V_0 e^{\lambda_0 s}\left(1 - \tilde\phi_{x,t-s}(\theta)\right)\right)$$

where $\tilde\phi_{x,r}(\theta) = E e^{-\theta \tilde Z^x_r}$ and $\tilde Z^x_r$ is a birth/death branching process with birth rate $a_0 + x$, death rate $b_0$, and initial population $\tilde Z^x_0 = 1$. Using

$$e^{-(\lambda_0+b)t} = e^{-(\lambda_0+x)(t-s)}\, e^{-(\lambda_0+x)s}\, e^{-(b-x)t} \tag{3.4}$$

we have

$$E\left(\exp\left(-\theta Z_1(t) e^{-t(\lambda_0+b)} t^{1+p}\right); A_t\right) = \exp\left(-u_1 V_0 \int_{b_t}^{b} dx\, g(x) \int_0^t ds\, e^{\lambda_0 s}\left\{1 - \tilde\phi_{x,t-s}\left(\theta e^{-(\lambda_0+x)(t-s)} e^{-(\lambda_0+x)s} e^{-(b-x)t} t^{1+p}\right)\right\}\right).$$

Changing variables $s = r_x + r$ where

$$r_x = \frac{1}{\lambda_0 + x} \log(t^{1+p})$$

on the inside integral, $y = (b - x)t$, $dy/t = -dx$ on the outside, and continuing to write $x$ as shorthand for $b - y/t$, the above

$$= \exp\left(-u_1 V_0 \int_0^{(3+p)\log t} \frac{dy}{t}\, g(x)\, t^{(1+p)\lambda_0/(\lambda_0+x)} \int_{-r_x}^{t - r_x} dr\, e^{\lambda_0 r}\left\{1 - \tilde\phi_{x,t-r-r_x}\left(\theta e^{-(\lambda_0+x)(t-r-r_x)} e^{-(\lambda_0+x)r} e^{-y}\right)\right\}\right). \tag{3.5}$$

Formula (20) in Durrett and Moseley (2009) implies that as $u \to \infty$,

$$1 - \tilde\phi_{x,u}\left(\theta e^{-(\lambda_0+x)u}\right) \to \frac{\lambda_0 + x}{a_0 + x} \cdot \frac{\theta}{\theta + \frac{\lambda_0 + x}{a_0 + x}} \tag{3.6}$$

and therefore, letting $t \to \infty$ and using $(1 + p)\lambda_0/(\lambda_0 + b) = 1$, we can see that the expression in (3.5)

$$\to \exp\left(-u_1 V_0\, g(b) \int_0^{\infty} dy\, \frac{\lambda_0 + b}{a_0 + b} \int_{-\infty}^{\infty} dr\, e^{\lambda_0 r}\, \frac{\theta e^{-(\lambda_0+b)r} e^{-y}}{\theta e^{-(\lambda_0+b)r} e^{-y} + \frac{\lambda_0 + b}{a_0 + b}}\right).$$
Changing variables $r = \frac{1}{\lambda_0+b}\bigl\{q + \log[\theta e^{-y}(a_0+b)/(\lambda_0+b)]\bigr\}$, $dr = dq/(\lambda_0+b)$ gives
$$= \exp\left(-u_1 V_0\, g(b)\, \theta^{\lambda_0/(\lambda_0+b)} \left(\frac{\lambda_0+b}{a_0+b}\right)^{b/(\lambda_0+b)} \int_0^\infty dy\, e^{-y\lambda_0/(\lambda_0+b)} \int_{-\infty}^{\infty} \frac{dq}{\lambda_0+b}\, e^{q\lambda_0/(\lambda_0+b)}\, \frac{e^{-q}}{e^{-q}+1}\right)$$
To simplify the first integral we note that
$$\int_0^\infty dy\, e^{-y\lambda_0/(\lambda_0+b)} = \frac{\lambda_0+b}{\lambda_0}$$
For the second integral, we prove

Lemma 1. If $0 < c < 1$,
$$\int_{-\infty}^{\infty} dq\, e^{qc}\, \frac{e^{-q}}{e^{-q}+1} = \Gamma(c)\Gamma(1-c) \tag{3.7}$$

Proof. We can rewrite the integral as
$$\int_{-\infty}^{\infty} dq\, e^{qc} \int_0^\infty dx\, e^{-x} e^{-q} \exp(-e^{-q}x)$$
so that after interchanging the order of integration and changing variables $w = e^{-q}x$, $dw = -dq\, e^{-q}x$, so that $w/x = e^{-q}$, $dw/x = -dq\, e^{-q}$, we have
$$= \int_0^\infty dx \int_0^\infty \frac{dw}{x}\, (w/x)^{-c} e^{-x} e^{-w} = \int_0^\infty dx\, x^{-1+c} e^{-x} \int_0^\infty dw\, w^{-c} e^{-w}$$
which is $= \Gamma(c)\Gamma(1-c)$.

Taking $c = \lambda_0/(\lambda_0+b)$ and letting
$$c_1(\lambda_0, b) = g(b)\, \frac{\lambda_0+b}{\lambda_0} \cdot \frac{1}{\lambda_0+b} \left(\frac{a_0+b}{\lambda_0+b}\right)^{-b/(\lambda_0+b)} \Gamma\bigl(\lambda_0/(\lambda_0+b)\bigr)\, \Gamma\bigl(1 - \lambda_0/(\lambda_0+b)\bigr) \tag{3.8}$$
we have proved Theorem 2.

Recall that we have assumed $Z_0(t) = V_0 e^{\lambda_0 t}$ is deterministic. This assumption can be relaxed to obtain the following generalization of Theorem 2, which is used in Section 4.

Lemma 2. Suppose that $Z_0(t)$ is a stochastic process with $Z_0(t) \sim e^{\lambda_0 t} V_0$ for some constant $V_0$ as $t \to \infty$. Then the conclusions of Theorem 2 remain valid.

To see why this is true, we can use a variant of Lemma 2 from Durrett and Moseley (2009) to conclude that
$$E\bigl(e^{-\theta Z_1(t)} \,\big|\, \mathcal{F}^0_t\bigr) = \exp\left(-u_1 \int_0^b dx\, g(x) \int_0^t ds\, Z_0(s) \bigl(1 - \tilde\phi_{x,t-s}(\theta)\bigr)\right),$$
where $\mathcal{F}^0_t$ is the $\sigma$-field generated by $Z_0(s)$ for $s \le t$. Therefore,
$$E\bigl(e^{-\theta Z_1(t)}\bigr) = E \exp\left(-u_1 \int_0^b dx\, g(x) \int_0^t ds\, Z_0(s) \bigl(1 - \tilde\phi_{x,t-s}(\theta)\bigr)\right).$$
Given $\varepsilon > 0$, we can choose $t_\varepsilon > 0$ so that
$$\left|\frac{Z_0(t)}{V_0 \exp(\lambda_0 t)} - 1\right| < \varepsilon$$
for all $t > t_\varepsilon$.
Since the contribution from $t \le t_\varepsilon$ will not affect the limit and the term inside the expectation is bounded, the rest of the proof can be completed in the same manner as the proof of Theorem 2.

We conclude this section with the

Proof of Theorem 3. Let $\mathcal{M}(t)$ be the three dimensional Poisson process defined in Section 2. Using (3.4), we see that in order for the contribution of $Z_1^{s,x}(t)$ to the limit of $t^{1+p} e^{-(\lambda_0+b)t} Z_1(t)$ to be $> z$ we need
$$v > z\, t^{-(1+p)} e^{(b-x)t} e^{(\lambda_0+x)s}$$
Therefore, the expected number of mutations that contribute more than $z$ to the limit is
$$u_1 V_0 \int_0^b dx\, g(x) \int_0^t ds\, e^{\lambda_0 s}\, \frac{\lambda_0+x}{a_0+x} \exp\left(-\frac{\lambda_0+x}{a_0+x} \cdot z\, t^{-(1+p)} e^{(b-x)t} e^{(\lambda_0+x)s}\right)$$
In order to turn the big exponential into $e^{-r}$ we change variables:
$$s = \frac{1}{\lambda_0+x} \log\left(\frac{r}{z\, t^{-(1+p)} e^{(b-x)t}\, \frac{\lambda_0+x}{a_0+x}}\right), \qquad ds = \frac{dr}{r(\lambda_0+x)}$$
to get
$$u_1 V_0 \int_0^b dx\, g(x)\, z^{-\lambda_0/(\lambda_0+x)} \left(\frac{\lambda_0+x}{a_0+x}\right)^{x/(\lambda_0+x)} \cdot t^{(1+p)\lambda_0/(\lambda_0+x)}\, e^{-(b-x)t\lambda_0/(\lambda_0+x)} \int_{\alpha(x,t)}^{\beta(x,t)} \frac{dr}{\lambda_0+x}\, r^{-x/(\lambda_0+x)} e^{-r}$$
where $\alpha(x,t) = z\, t^{-(1+p)} e^{(b-x)t} (\lambda_0+x)/(a_0+x)$ and $\beta(x,t) = \alpha(x,t)\, e^{(\lambda_0+x)t}$. As in the previous proof, the main contribution comes from $x \in [b_t, b]$, so when we change variables $y = (b-x)t$, $dx = -dy/t$, replace the $x$'s by $b$'s and use $1 = (1+p)\lambda_0/(\lambda_0+b)$ we convert the above into
$$g(b)\, z^{-\lambda_0/(\lambda_0+b)}\, \frac{u_1 V_0}{\lambda_0+b} \left(\frac{\lambda_0+b}{a_0+b}\right)^{b/(\lambda_0+b)} \int_0^\infty dy\, e^{-y\lambda_0/(\lambda_0+b)} \int_0^\infty r^{-b/(\lambda_0+b)} e^{-r}\,dr$$
Performing the integrals gives the result with
$$A_1(\lambda_0, b) = g(b)\, \frac{1}{\lambda_0} \left(\frac{\lambda_0+b}{a_0+b}\right)^{b/(\lambda_0+b)} \Gamma\bigl(\lambda_0/(\lambda_0+b)\bigr)$$

4 Bounded distributions, $Z_k$

We now move on to the proofs of Theorems 4 and 5. Recall that we have defined $p_k$ by the relation
$$k + p_k = \sum_{j=0}^{k-1} \frac{\lambda_0 + kb}{\lambda_0 + jb}.$$

Proof of Theorem 4.
Breaking things down according to the times and the sizes of the mutational changes we have
$$E Z_k(t) = \int_0^b dx_1\, g(x_1) \cdots \int_0^b dx_k\, g(x_k) \int_0^t ds_1 \cdots \int_{s_{k-1}}^t ds_k\, V_0 e^{\lambda_0 s_1}\, u_1 e^{(\lambda_0+x_1)(s_2-s_1)} \cdots u_k e^{(\lambda_0+x_1+\cdots+x_k)(t-s_k)}$$
$$= \int_0^b dx_1\, g(x_1) \cdots \int_0^b dx_k\, g(x_k) \int_0^t ds_1 \cdots \int_{s_{k-1}}^t ds_k\, V_0 u_1 \cdots u_k\, e^{\lambda_0 t} e^{x_1(t-s_1)} \cdots e^{x_k(t-s_k)}. \tag{4.1}$$
The first step is to show

Lemma 3. Let $b_t = b - (2k+1)(\log t)/t$. The contribution to $E Z_k(t)$ from points $(x_1, \ldots, x_k)$ with some $x_i \le b_t$ is $o(t^{-2k} e^{(\lambda_0+kb)t})$.

Proof. (3.2) implies that
$$\int_{s_{j-1}}^t ds_j\, e^{(x_j+\cdots+x_k)(t-s_j)} = \frac{e^{(x_j+\cdots+x_k)(t-s_{j-1})} - 1}{x_j+\cdots+x_k} \le t\, e^{(x_j+\cdots+x_k)(t-s_{j-1})}.$$
Applying this and working backwards in the above expression for $E Z_k(t)$, we get
$$E Z_k(t) \le t^k V_0 u_1 \cdots u_k \int_0^b dx_1\, g(x_1) \cdots \int_0^b dx_k\, g(x_k)\, e^{(\lambda_0+x_1+\cdots+x_k)t}$$
and the desired result follows.

With the Lemma established, when we work backwards
$$\int_{s_{j-1}}^t ds_j\, e^{(x_j+\cdots+x_k)(t-s_j)} = \frac{e^{(x_j+\cdots+x_k)(t-s_{j-1})} - 1}{x_j+\cdots+x_k} \sim \frac{e^{(x_j+\cdots+x_k)(t-s_{j-1})}}{(k-j+1)b}$$
From this and induction, we see that the contribution from points $(x_1, \ldots, x_k)$ with $x_i \in [b_t, b]$ for all $i$ is
$$\sim \frac{V_0 u_1 \cdots u_k}{b^k k!}\, g(b)^k \int_{b_t}^b dx_1 \cdots \int_{b_t}^b dx_k\, e^{(\lambda_0+x_1+\cdots+x_k)t}$$
Changing variables $y_i = t(b - x_i)$, the above
$$\sim \frac{V_0 u_1 \cdots u_k\, g(b)^k}{b^k t^k k!}\, e^{(\lambda_0+kb)t}$$
which proves the desired result.

In the proof of the last result, we showed that the dominant contribution comes from mutations with $x_i > b_t$. To prove our limit theorem we will also need a result regarding the times at which the mutations to the dominant types occur.

Lemma 4. Let $\alpha_k = \dfrac{2k+1}{kb}$.
The contribution to $E Z_k(t)$ from points with $s_1 \ge \alpha_k \log t$ is $o(t^{-2k} e^{(\lambda_0+kb)t})$.

Proof. Replacing the $x_i$'s in the exponents by $b$'s, we can see from (4.1) that the expected contribution from points with $s_1 \ge \alpha_k \log t$ is
$$\le b^k G^k V_0 u_1 \cdots u_k \int_{\alpha_k \log t}^t ds_1 \int_{s_1}^t ds_2 \cdots \int_{s_{k-1}}^t ds_k\, e^{\lambda_0 t} e^{b(t-s_1)} \cdots e^{b(t-s_k)}$$
$$\le C e^{\lambda_0 t} \int_{\alpha_k \log t}^t e^{kb(t-s_1)}\,ds_1 \le C e^{(\lambda_0+kb)t}\, t^{-\alpha_k kb}$$
and the desired result follows.

Recall that
$$k + p_k = \sum_{j=0}^{k-1} \frac{\lambda_0+kb}{\lambda_0+jb}.$$
For the induction used in the next proof, we will also need the corresponding quantity with $\lambda_0$ replaced by $\lambda_0 + x$ and $k$ by $k-1$,
$$k - 1 + p_{k-1}(x) = \sum_{j=0}^{k-2} \frac{\lambda_0 + x + (k-1)b}{\lambda_0 + x + jb}$$
which means
$$p_{k-1}(x) = \sum_{j=0}^{k-2} \frac{(k-1-j)b}{\lambda_0 + x + jb}$$
The limit will depend on the mutation rates through
$$u_{1,k} = \prod_{j=1}^{k} u_j^{\lambda_0/(\lambda_0+(j-1)b)}$$
Again we will need the corresponding quantity with $k-1$ terms
$$u_{2,k}(x) = \prod_{j=1}^{k-1} u_{j+1}^{(\lambda_0+x)/(\lambda_0+x+(j-1)b)}.$$
We shall write $u_{2,k} = u_{2,k}(b)$ and note that
$$u_{1,k} = u_1\, u_{2,k}^{\lambda_0/(\lambda_0+b)} \tag{4.2}$$

Proof of Theorem 5. We shall prove the result under the more general assumption that $Z_0(t) \sim V_0 e^{\lambda_0 t}$ for some constant $V_0$. The result then holds for $k = 1$ by Lemma 2. We shall prove the general result by induction on $k$. To this end, suppose the result holds for $k - 1$. Let $Z_k^{s,x,v}(t)$ be the type-$k$ descendants at time $t$ of the 1 mutant at $(s, x, v) \in \mathcal{M}(t)$.
Since $Z_1^{s,x}(t) \sim v e^{(\lambda_0+x)(t-s)}$ compared to $Z_0(t) \sim V_0 e^{\lambda_0 t}$, it follows from the induction hypothesis that
$$E \exp\left(-\theta (t-s)^{k-1+p_{k-1}(x)} e^{-(\lambda_0+x+(k-1)b)(t-s)} Z_k^{s,x,v}(t)\right) \to \exp\left(-c_{k-1}(\lambda_0+x, b)\, v\, u_{2,k}(x)\, \theta^{(\lambda_0+x)/(\lambda_0+x+(k-1)b)}\right) \tag{4.3}$$
Integrating over the contributions from the three-dimensional point process we have
$$E \exp(-\theta Z_k(t)) = \exp\left(-\int_0^b dx\, g(x) \int_0^t ds\, u_1 V_0 e^{\lambda_0 s} \int_0^\infty dv \left(\frac{\lambda_0+x}{a_0+x}\right)^2 \exp\left(-\frac{\lambda_0+x}{a_0+x}\, v\right) \bigl(1 - \phi^{k-1}_{x,v,t-s}(\theta)\bigr)\right)$$
where $\phi^{k-1}_{x,v,t-s}(\theta) = E \exp(-\theta Z_k^{0,x,v}(t-s))$. To prove the desired result we need to replace $\theta$ by $\theta t^{k+p_k} e^{-(\lambda_0+kb)t}$. Doing this with (4.3) in mind we have
$$E \exp\bigl(-\theta t^{k+p_k} e^{-(\lambda_0+kb)t} Z_k(t)\bigr) = \exp\Biggl(-\int_0^b dx\, g(x) \int_0^t ds\, u_1 V_0 e^{\lambda_0 s} \int_0^\infty dv \left(\frac{\lambda_0+x}{a_0+x}\right)^2 \exp\left(-\frac{\lambda_0+x}{a_0+x}\, v\right)$$
$$\times \Bigl[1 - \phi^{k-1}_{x,v,t-s}\bigl(\theta t^{k+p_k} e^{-(\lambda_0+x+(k-1)b)(t-s)} e^{-(b-x)t} e^{-(\lambda_0+x+b(k-1))s}\bigr)\Bigr]\Biggr)$$
By Lemmas 3 and 4, we can restrict attention to $x \in [b_t, b]$ and $s \le \alpha_k \log t$. The first restriction implies that all of the $x$'s except the one in $(b - x)$ can be set equal to $b$, and the second that we can replace $t$ by $t - s$.
Since $(k + p_k) - (k - 1 + p_{k-1}(b)) = (\lambda_0+kb)/\lambda_0$, the term in the exponential is
$$= -\int_{b_t}^{b} dx\, g(x) \int_0^{\alpha_k \log t} ds\, u_1 V_0 e^{\lambda_0 s} \int_0^\infty dv \left(\frac{\lambda_0+b}{a_0+b}\right)^2 \exp\left(-\frac{\lambda_0+b}{a_0+b}\, v\right)$$
$$\times \Bigl[1 - \phi^{k-1}_{x,v,t-s}\bigl(\theta (t-s)^{k-1+p_{k-1}(b)} e^{-(\lambda_0+kb)(t-s)}\, t^{(\lambda_0+kb)/\lambda_0} e^{-(b-x)t} e^{-(\lambda_0+kb)s}\bigr)\Bigr]$$
Changing variables $s = R(t) + r$ where $R(t) = (1/\lambda_0)\log t$, and $y = (b-x)t$, $dy = -t\,dx$, the above becomes
$$= -g(b) \int_0^{(2k+1)\log t} dy \int_0^\infty dv \left(\frac{\lambda_0+b}{a_0+b}\right)^2 \exp\left(-\frac{\lambda_0+b}{a_0+b}\, v\right) \int_{-R(t)}^{\alpha_k \log t - R(t)} dr\, u_1 V_0 e^{\lambda_0 r}$$
$$\times \Bigl[1 - \phi^{k-1}_{x,v,t-s}\bigl(\theta (t-s)^{k-1+p_{k-1}(b)} e^{-(\lambda_0+kb)(t-s)} e^{-y} e^{-(\lambda_0+kb)r}\bigr)\Bigr]$$
Using (4.3) now we have that the $1 - \phi$ term converges to
$$1 - \exp\left(-c_{k-1}(\lambda_0+b, b)\, v\, u_{2,k}\, [\theta e^{-y}]^{(\lambda_0+b)/(\lambda_0+kb)} e^{-(\lambda_0+b)r}\right)$$
To simplify the exponential we let
$$r = \frac{1}{\lambda_0+b}\bigl(q + Q(v,y)\bigr) \quad\text{where}\quad Q(v,y) = \log\left(c_{k-1}(\lambda_0+b, b)\, v\, u_{2,k}\, [\theta e^{-y}]^{(\lambda_0+b)/(\lambda_0+kb)}\right),$$
so that $dr = dq/(\lambda_0+b)$.
Plugging this into $e^{\lambda_0 r}$ results in
$$e^{q\lambda_0/(\lambda_0+b)} \bigl(c_{k-1}(\lambda_0+b, b)\, v\, u_{2,k}\bigr)^{\lambda_0/(\lambda_0+b)}\, \theta^{\lambda_0/(\lambda_0+kb)}\, e^{-y\lambda_0/(\lambda_0+kb)}$$
so the exponential converges to
$$-\frac{g(b)\, c_{k-1}(\lambda_0+b, b)^{\lambda_0/(\lambda_0+b)}}{\lambda_0+b}\, V_0 u_1 u_{2,k}^{\lambda_0/(\lambda_0+b)}\, \theta^{\lambda_0/(\lambda_0+kb)} \int_0^\infty dv \left(\frac{\lambda_0+b}{a_0+b}\right)^2 v^{\lambda_0/(\lambda_0+b)} \exp\left(-\frac{\lambda_0+b}{a_0+b}\, v\right)$$
$$\times \int_0^\infty dy\, e^{-y\lambda_0/(\lambda_0+kb)} \int_{-\infty}^{\infty} dq\, e^{q\lambda_0/(\lambda_0+b)} \bigl(1 - \exp(-e^{-q})\bigr)$$
where the factor $1/(\lambda_0+b)$ in front comes from $dr = dq/(\lambda_0+b)$. To clean this up, we note that letting $w = v(\lambda_0+b)/(a_0+b)$, $dw = dv\,(\lambda_0+b)/(a_0+b)$,
$$\int_0^\infty dv \left(\frac{\lambda_0+b}{a_0+b}\right)^2 v^{\lambda_0/(\lambda_0+b)} \exp\left(-\frac{\lambda_0+b}{a_0+b}\, v\right) = \left(\frac{a_0+b}{\lambda_0+b}\right)^{-1+\lambda_0/(\lambda_0+b)} \Gamma\bigl(1 + \lambda_0/(\lambda_0+b)\bigr) \tag{4.4}$$
The second integral is easy:
$$\int_0^\infty dy\, e^{-y\lambda_0/(\lambda_0+kb)} = \frac{\lambda_0+kb}{\lambda_0} \tag{4.5}$$
The third one looks unusual, but when we put $x = e^{-q}$, $dx = -e^{-q}\,dq$, or $dq = -dx/x$, it is
$$= \int_0^\infty x^{-1-\lambda_0/(\lambda_0+b)} (1 - e^{-x})\,dx$$
Integrating by parts with $f(x) = 1 - e^{-x}$, $g'(x) = x^{-1-\lambda_0/(\lambda_0+b)}$, $f'(x) = e^{-x}$, $g(x) = -x^{-\lambda_0/(\lambda_0+b)} (\lambda_0+b)/\lambda_0$ turns it into
$$\frac{\lambda_0+b}{\lambda_0}\, \Gamma\bigl(1 - \lambda_0/(\lambda_0+b)\bigr) \tag{4.6}$$
Putting this all together and using (4.2), we have
$$c_{k-1}(\lambda_0+b, b)^{\lambda_0/(\lambda_0+b)} \cdot g(b)\, \frac{\lambda_0+kb}{\lambda_0} \cdot V_0 u_{1,k}\, \theta^{\lambda_0/(\lambda_0+kb)} \cdot \frac{1}{\lambda_0} \left(\frac{a_0+b}{\lambda_0+b}\right)^{-1+\lambda_0/(\lambda_0+b)} \Gamma\bigl(1 + \lambda_0/(\lambda_0+b)\bigr)\, \Gamma\bigl(1 - \lambda_0/(\lambda_0+b)\bigr)$$
Setting $c_k(\lambda_0, b)$ equal to the quantity in the last display divided by $V_0 u_{1,k}\, \theta^{\lambda_0/(\lambda_0+kb)}$, we have proved the result.
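As a quick numerical sanity check on the exponents used in this proof, the following Python sketch evaluates $p_k$ from its defining relation and compares $(k + p_k) - (k - 1 + p_{k-1}(b))$ with $(\lambda_0+kb)/\lambda_0$. The function names and the parameter values ($\lambda_0 = 0.1$, $b = 0.01$) are illustrative assumptions and do not come from the paper.

```python
def k_plus_p(k, lam0, b):
    """Return k + p_k = sum_{j=0}^{k-1} (lam0 + k*b) / (lam0 + j*b)."""
    return sum((lam0 + k * b) / (lam0 + j * b) for j in range(k))


def p(k, lam0, b):
    """Exponent p_k of the polynomial correction (hypothetical helper name)."""
    return k_plus_p(k, lam0, b) - k


def exponent_gap(k, lam0, b):
    """(k + p_k) - (k - 1 + p_{k-1}(b)); p_{k-1}(x) replaces lam0 by lam0 + x."""
    return k_plus_p(k, lam0, b) - k_plus_p(k - 1, lam0 + b, b)


if __name__ == "__main__":
    lam0, b = 0.1, 0.01  # illustrative values only
    for k in range(1, 6):
        # Last two columns should agree up to floating-point error.
        print(k, p(k, lam0, b), exponent_gap(k, lam0, b), (lam0 + k * b) / lam0)
```

For $k = 1$ the sum has a single term, so $p_1 = b/\lambda_0$, which matches the relation $(1+p)\lambda_0/(\lambda_0+b) = 1$ used in the proof of Theorem 2.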
To work out an explicit formula for the constant and to compare with Durrett and Moseley (2009), it is useful to let $\lambda_j = \lambda_0 + jb$, $a_j = a_0 + jb$ and
$$c_{h,j} = \frac{1}{\lambda_{j-1}} \left(\frac{a_j}{\lambda_j}\right)^{-1+\lambda_{j-1}/\lambda_j} \Gamma(1 + \lambda_{j-1}/\lambda_j)\, \Gamma(1 - \lambda_{j-1}/\lambda_j)$$
From this we see that
$$c_k(\lambda_0, b) = c_{k-1}(\lambda_1, b)^{\lambda_0/\lambda_1}\, g(b)\, \frac{\lambda_k}{\lambda_0}\, c_{h,1} = c_{k-2}(\lambda_2, b)^{\lambda_0/\lambda_2} \cdot \left(g(b)\, \frac{\lambda_{k-1}}{\lambda_0}\, c_{h,2}\right)^{\lambda_0/\lambda_1} \cdot g(b)\, \frac{\lambda_k}{\lambda_0}\, c_{h,1}$$
and hence
$$c_k(\lambda_0, b) = \prod_{j=1}^{k} \left(g(b)\, \frac{\lambda_{k-j+1}}{\lambda_0}\, c_{h,j}\right)^{\lambda_0/\lambda_{j-1}}$$
In Durrett and Moseley (2009), if we let $\mathcal{F}_{k-1}$ be the $\sigma$-field generated by $Z_j(t)$ for $j \le k$ and all $t \ge 0$ then
$$E(e^{-\theta V_k} \mid \mathcal{F}_{k-1}) = \exp\bigl(-u_k V_{k-1}\, c_{h,k}\, \theta^{\lambda_{k-1}/\lambda_k}\bigr)$$
Iterating we have
$$E(e^{-\theta V_k} \mid \mathcal{F}_{k-2}) = E\bigl(\exp(-u_k V_{k-1}\, c_{h,k}\, \theta^{\lambda_{k-1}/\lambda_k}) \mid \mathcal{F}_{k-2}\bigr) = \exp\left(-u_{k-1}\, u_k^{\lambda_{k-2}/\lambda_{k-1}}\, V_{k-2}\, c_{h,k-1}\, c_{h,k}^{\lambda_{k-2}/\lambda_{k-1}}\, \theta^{\lambda_{k-2}/\lambda_k}\right)$$
and hence
$$E(e^{-\theta V_k} \mid V_0) = \exp\bigl(-c_{\theta,k}\, V_0\, u_{1,k}\, \theta^{\lambda_0/\lambda_k}\bigr)$$
where $c_{\theta,k} = \prod_{j=1}^{k} c_{h,j}^{\lambda_0/\lambda_{j-1}}$.

5 Proofs for unbounded distributions

In this Section, we prove Theorem 7. The first step is to show that, unlike in the case of bounded mutational advances, for unbounded distributions the main contribution to the limit is given by the descendants of a single mutation. The largest growth rate will come from $z = O(t^{(\alpha+1)/\alpha})$ so the next result is enough. Recall that the mean number of mutations with growth rate larger than $z$ is
$$\mu(z, \infty) = V_0 u_1 \int_0^t \left(\frac{z}{t-s} - \lambda_0\right)^{\beta} e^{\lambda_0 s} \exp\left(-\gamma \left(\frac{z}{t-s} - \lambda_0\right)^{\alpha}\right) ds = V_0 u_1 \int_0^t \left(\frac{z}{t-s} - \lambda_0\right)^{\beta} \exp(\phi(s,z))\,ds$$
where $\phi$ is as in (1.3).

Lemma 5. Let $\bar z > \lambda_0 t$. Then
$$E\left(\sum_{(s,x):\, z(s,x) \le \bar z} Z_1^{s,x}(t)\right) \le C V_0 u_1\, \bar z\, e^{\lambda_0 t + \bar z}$$
as $t \to \infty$.

Proof. The expected number of individuals produced by mutations with growth rates $\le \bar z$ is
$$V_0 u_1 \int_0^t \int_0^{\frac{\bar z}{t-s} - \lambda_0} e^{\lambda_0 s} \cdot y^{\beta} e^{-\gamma y^{\alpha}} \cdot e^{z(s,y)}\,dy\,ds.$$
Changing variables $y \mapsto u = z(s,y)$, that is $y = u/(t-s) - \lambda_0$, $dy = du/(t-s)$, and using Fubini's theorem to switch the order of integration, we can see that the above is
$$\le V_0 u_1 e^{\lambda_0 t + \bar z} \int_0^{\bar z} \int_0^t \left(\frac{u}{t-s} - \lambda_0\right)^{\beta} \exp\left(-\gamma \left(\frac{u}{t-s} - \lambda_0\right)^{\alpha}\right) \frac{ds}{t-s}\,du. \tag{5.1}$$
But then if we change variables $s \mapsto r = u/(t-s) - \lambda_0$, $dr = u\,ds/(t-s)^2$, we can see that the inner integral is
$$\le \int_{-\lambda_0}^{\infty} \frac{r^{\beta}}{r + \lambda_0}\, e^{-\gamma r^{\alpha}}\,dr \le C$$
yielding the desired bound.

To motivate the proof of the general result, we begin with the case when $\alpha = 1$.

Proof of Theorem 6. Since
$$Z_1(t) = \sum_{(s,x) \in \mathcal{N}(t)} Z_1^{s,x}(t) = \sum_{(s,x):\, z(s,x) \le z} Z_1^{s,x}(t) + \sum_{(s,x):\, z(s,x) > z} Z_1^{s,x}(t)$$
for any $z > 0$, we have
$$\frac{1}{t} \log Z_1(t) \sim \frac{1}{t} \left[\log\left(\sum_{(s,x):\, z(s,x) \le z} Z_1^{s,x}(t)\right) \vee \log\left(\sum_{(s,x):\, z(s,x) > z} Z_1^{s,x}(t)\right)\right]$$
as $t \to \infty$. Lemma 5 tells us that if there is a mutation with growth rate $z = O(t^2)$, then the contribution from mutations with growth rates smaller than $z - \varepsilon$ can be ignored, so it suffices to describe the distribution of the largest growth rates. We will show that
$$\mu(z, \infty) \to \begin{cases} 4 c_0^{\beta} (\pi/\lambda_0)^{1/2}\, V_0 u_1 \exp(\gamma\lambda_0 - 2\lambda_0 x/2c_0) & \text{if } z = c_0 t^2 \left(1 + \dfrac{(2\beta+1)\log t}{\lambda_0 t} + \dfrac{x}{c_0 t}\right) \\[2ex] 0 & \text{if } z \gg c_0 t^2 \left(1 + \dfrac{(2\beta+1)\log t}{\lambda_0 t}\right) \end{cases} \tag{5.2}$$
so that the largest growth rate is $O(t^2)$ and comes from the rightmost particle in the point process with intensity given by (1.6).

To prove (5.2), we first need to locate the maximum of $\phi$. Let $z > \lambda_0 t$ so that there exists a unique maximum $s_z$. Solving $\phi_s(s,z) = 0$ and using the expression for $\phi_s$ in (1.4) yields
$$s_z = t - a_0 z^{1/2} \quad\text{where}\quad a_0 = (\gamma/\lambda_0)^{1/2} = (4c_0)^{-1/2}$$
which leads to the expression
$$\phi(s_z, z) = \lambda_0 t - \lambda_0 (t - s_z) - \gamma \left(\frac{z}{t - s_z} - \lambda_0\right) = \lambda_0 t - \lambda_0 a_0 z^{1/2} - \gamma z^{1/2}/a_0 + \gamma\lambda_0 = \lambda_0 (t - 2 a_0 z^{1/2}) + \gamma\lambda_0.$$
(5.3)

If we take
$$z_x = c_0 t^2 \left(1 + \frac{\kappa \log t}{t} + \frac{x}{c_0 t}\right) = \left(\frac{t}{2a_0}\right)^2 \left(1 + \frac{\kappa \log t}{t} + \frac{4 a_0^2 x}{t}\right)$$
in (5.3) and use $(1+y)^{1/2} = 1 + y/2 + O(y^2)$, we obtain
$$\phi(s_{z_x}, z_x) = -\frac{\lambda_0 \kappa \log t}{2} - 2\lambda_0 a_0^2 x + \gamma\lambda_0 + o(1) \tag{5.4}$$
as $t \to \infty$. Furthermore, (1.5) implies that
$$\phi_{ss}(s_{z_x}, z_x) = -\frac{2\gamma z_x}{(t - s_{z_x})^3} = -\frac{2\gamma}{a_0^3 z_x^{1/2}} = -\frac{a}{t} + o(1)$$
$$\phi_{sss}(s_{z_x}, z_x) = -\frac{6\gamma z_x}{(t - s_{z_x})^4} = -\frac{6\gamma}{a_0^4 z_x} = -\frac{24\gamma}{a_0^2 t^2} + o(1)$$
as $t \to \infty$ with $a = 4\gamma/a_0^2$. Since $\phi_s(s_z, z) = 0$, taking a Taylor expansion around $s_z$ yields
$$\phi(s, z_x) = \phi(s_{z_x}, z_x) - \frac{a}{2t} (s - s_{z_x})^2 + g(s, z_x) \tag{5.5}$$
where $|g(s,z)| \le C |s - s_z|^3 / t^2$ for all $s$. Also note that letting
$$\psi(s,z) = \left(\frac{z}{t-s} - \lambda_0\right)^{\beta}$$
we have
$$\psi(s_{z_x}, z_x) = \left(\frac{z_x}{t - s_{z_x}} - \lambda_0\right)^{\beta} = z_x^{\beta/2}/a_0^{\beta} + o(z_x^{\beta/2}) = (2c_0)^{\beta} t^{\beta} + o(t^{\beta})$$
so that $\psi(s, z_x) = (2c_0)^{\beta} t^{\beta} + g_2(s, z_x)$ where $|g_2(s,z)|\, |s - s_z|^{-1} t^{-\beta} = o(1)$. Write
$$\int_0^t \psi(s, z_x) e^{\phi(s, z_x)}\,ds = \int_A \psi(s, z_x) e^{\phi(s, z_x)}\,ds + \int_{A^c} \psi(s, z_x) e^{\phi(s, z_x)}\,ds$$
where $A = \{s : |s - s_{z_x}| \le C (t \log t)^{1/2}\} \cap [0, t]$. Since concavity implies that for $s \in A^c$ and $C$ sufficiently large, we have
$$\exp(\phi(s, z_x)) \le \frac{1}{t^{2+\beta}} \exp(\phi(s_{z_x}, z_x))$$
the contribution of the second integral is negligible. After the change of variables $s = s_{z_x} + (t/a)^{1/2} r$, when $t$ is large, the first integral becomes
$$\int_A \psi(s, z_x) e^{\phi(s, z_x)}\,ds = \bigl((2c_0)^{\beta} t^{\beta} + o(1)\bigr)\, e^{\phi(s_{z_x}, z_x)} \int_{-C(\log t)^{1/2}}^{C(\log t)^{1/2}} e^{g(s, z_x)} e^{-r^2/2}\, (t/a)^{1/2}\,dr$$
and therefore, since $|g(s, z_x)| \le C (t \log t)^{3/2}/t^2$ when $s \in A$, we have
$$\mu(z_x, \infty) = V_0 u_1 \int_0^t \psi(s, z_x) e^{\phi(s, z_x)}\,ds \sim b V_0 u_1\, t^{\beta + 1/2}\, e^{\phi(s_{z_x}, z_x)} \tag{5.6}$$
where $b = (2c_0)^{\beta} \sqrt{2\pi/a} = (2c_0)^{\beta} (\pi/\lambda_0)^{1/2}$.
Since
$$\phi(s_{z_x}, z_x) = -\frac{\kappa\lambda_0 \log t}{2} - 2\lambda_0 a_0^2 x + \gamma\lambda_0$$
we can conclude that
$$\mu(z_x, \infty) \to \begin{cases} V_0 u_1 b \exp(\gamma\lambda_0 - 2\lambda_0 a_0^2 x) = V_0 u_1 b \exp(\gamma\lambda_0 - 2\lambda_0 x/2c_0) & \text{if } \kappa = \dfrac{2\beta+1}{\lambda_0} \\[2ex] 0 & \text{if } \kappa > \dfrac{2\beta+1}{\lambda_0} \end{cases}$$
which proves (5.2) since this argument remains true even if $\kappa = \kappa(t)$ and $\liminf \kappa(t) > \frac{2\beta+1}{\lambda_0}$.

When $\alpha \ne 1$, we no longer have an explicit formula for the maximum value $s_z$, which complicates the process of identifying the largest growth rate. We shall assume for convenience that $\alpha > 0$ is an integer.

Proof of Theorem 7. As in the proof of Theorem 6, it suffices to describe the distribution for the largest growth rates. Let $z > \lambda_0 t$ so the maximum $s_z$ exists. To find a useful expression for the value of $\phi(s_z, z)$, we write
$$\phi(s, z) = \lambda_0 t - \lambda_0 (t - s) - \gamma \left(\frac{z}{t-s} - \lambda_0\right)^{\alpha}.$$
Using the definition of $s_z$ as the solution to $\phi_s(s_z, z) = 0$ yields the condition that
$$(t - s_z)^{\alpha+1} = \frac{\alpha\gamma}{\lambda_0}\, z^{\alpha} \left(1 - \lambda_0 \frac{t - s_z}{z}\right)^{\alpha-1}$$
i.e.,
$$t - s_z = \left(\frac{\alpha\gamma}{\lambda_0}\right)^{1/(\alpha+1)} z^{\alpha/(\alpha+1)} \left(1 - \lambda_0 \frac{t - s_z}{z}\right)^{(\alpha-1)/(\alpha+1)}.$$
If we substitute the right side of this equation back in for $t - s_z$ in the parenthesis, then writing $a_0 = (\alpha\gamma/\lambda_0)^{1/(\alpha+1)}$, we have
$$t - s_z = a_0 z^{\alpha/(\alpha+1)} \left(1 - \lambda_0 a_0 z^{-1/(\alpha+1)} \left(1 - \frac{\lambda_0 (t - s_z)}{z}\right)^{\frac{\alpha-1}{\alpha+1}}\right)^{\frac{\alpha-1}{\alpha+1}}$$
$$= a_0 z^{\alpha/(\alpha+1)} \left[1 - \lambda_0 a_0 z^{-1/(\alpha+1)} \left(1 - \lambda_0 a_0 z^{-1/(\alpha+1)} \left(1 - \frac{\lambda_0 (t - s_z)}{z}\right)^{\frac{\alpha-1}{\alpha+1}}\right)^{\frac{\alpha-1}{\alpha+1}}\right]^{\frac{\alpha-1}{\alpha+1}}$$
We repeat this $\alpha$ times and then use the approximation $(1 - x)^n = 1 - nx + O(x^2)$ repeatedly with $n = (\alpha-1)/(\alpha+1)$ to obtain
$$t - s_z = z^{\alpha/(\alpha+1)} \left(\sum_{j=0}^{\alpha} a_j z^{-j/(\alpha+1)} + O(z^{-1})\right) \tag{5.7}$$
where
$$a_j = a_0 \left(\frac{\lambda_0 a_0 (\alpha-1)}{\alpha+1}\right)^{j}$$
for $j \ge 1$. The error term is $O(z^{-1})$ because
$$0 < (1 - \lambda_0 (t - s)/z) \le 1$$
for all $z > \lambda_0 t$ and $s \le t$.
Factoring out $a_0$ in (5.7) and using $(1 + x)^{-1} = \sum (-x)^j$ when $|x| < 1$, we have that
$$\frac{z}{t-s} - \lambda_0 = a_0^{-1} z^{1/(\alpha+1)} \Biggl(1 - \sum_{i_1=1}^{\alpha} a_0^{-1} a_{i_1} z^{-i_1/(\alpha+1)} + \sum_{i_1, i_2=1}^{\alpha} a_0^{-2} a_{i_1} a_{i_2} z^{-(i_1+i_2)/(\alpha+1)} - \cdots$$
$$+ (-1)^{\alpha} \sum_{i_1, \ldots, i_\alpha=1}^{\alpha} a_0^{-\alpha} \prod_{j=1}^{\alpha} a_{i_j}\, z^{-\sum_{j=1}^{\alpha} i_j/(\alpha+1)} + O(z^{-1})\Biggr) - \lambda_0 z^{1/(\alpha+1)} z^{-1/(\alpha+1)}$$
$$= z^{1/(\alpha+1)} \left(\sum_{j=0}^{\alpha} b_j z^{-j/(\alpha+1)} + O(z^{-1})\right) \tag{5.8}$$
for large $z$, where the $b_j$ are given by
$$b_0 = 1/a_0$$
$$b_1 = -a_1/a_0^2 - \lambda_0$$
$$b_2 = -(a_2 - a_1^2)/a_0^3$$
$$b_3 = -(a_4 - 2a_1 a_3 - a_2^2 - 3a_1^2 a_2 + a_1^4)/a_0^4$$
and in general,
$$b_i = \sum_{k=1}^{\alpha} \sum_{i_1, \ldots, i_k:\, i_1 + \cdots + i_k = i} (-a_0)^{-(k+1)} \prod_{j=1}^{k} a_{i_j}.$$
(5.8) implies that
$$-\gamma \left(\frac{z}{t-s} - \lambda_0\right)^{\alpha} = -\gamma z^{\alpha/(\alpha+1)} \Bigl[b_0^{\alpha} + \alpha b_0^{\alpha-1} b_1 z^{-1/(\alpha+1)} + \Bigl(\alpha b_0^{\alpha-1} b_2 + \tbinom{\alpha}{2} b_0^{\alpha-2} b_1^2\Bigr) z^{-2/(\alpha+1)} + \cdots$$
$$+ \bigl(\alpha b_0^{\alpha-1} b_\alpha + \cdots + b_1^{\alpha}\bigr) z^{-\alpha/(\alpha+1)} + O(z^{-1})\Bigr]$$
and therefore,
$$\phi(s_z, z) = \lambda_0 t - \lambda_0 (t - s_z) - \gamma \left(\frac{z}{t - s_z} - \lambda_0\right)^{\alpha} = \lambda_0 t + \sum_{j=0}^{\alpha} d_j z^{\frac{\alpha-j}{\alpha+1}} + O(z^{-1/(\alpha+1)}) \tag{5.9}$$
where the $d_j$ can be calculated explicitly, for example:
$$d_0 = -\lambda_0 a_0 - \gamma b_0^{\alpha}$$
$$d_1 = -\lambda_0 a_1 - \gamma \alpha b_0^{\alpha-1} b_1$$
$$d_2 = -\lambda_0 a_2 - \gamma \Bigl(\alpha b_0^{\alpha-1} b_2 + \tbinom{\alpha}{2} b_0^{\alpha-2} b_1^2\Bigr)$$
$$d_3 = -\lambda_0 a_3 - \gamma \Bigl(\alpha b_0^{\alpha-1} b_3 + \tbinom{\alpha}{2} b_0^{\alpha-2} b_1 b_2 + \tbinom{\alpha}{3} b_0^{\alpha-3} b_1^3\Bigr).$$
To figure out the distribution of the growth rate for the largest mutant, we let $c_0 = (-\lambda_0/d_0)^{(\alpha+1)/\alpha}$ and then search for $\kappa_j$, $j = 1, \ldots, \alpha-1$ and $\kappa$ so that plugging
$$z_x = c_0 t^{(\alpha+1)/\alpha} \left(1 + \sum_{j=1}^{\alpha-1} \kappa_j t^{-j/\alpha} + \frac{x}{c_0 t} + \frac{\kappa \log t}{t}\right)$$
into (5.9) yields
$$\phi(s_{z_x}, z_x) = k_1 - k_2 x - k_3 \log t \tag{5.10}$$
for some constants $k_1, k_2, k_3$. Substituting $z_x$ into (5.9) and writing $\kappa_0 = 1$, $\kappa_\alpha = x/c_0$ to ease the notation we obtain

$\displaystyle \phi(s_{z_x}, z_x) = \lambda_0 t + \sum_{j=0}^{\alpha} d_j \left(-\frac{\lambda_0 t}{d_0}\right)^{(\alpha-j)/\alpha} \left(\sum_{i=0}^{\alpha} \kappa_i t^{-i/\alpha} + \kappa t^{-1} \log t\right)$
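${}^{(\alpha-j)/(\alpha+1)} + O(t^{-1/\alpha})$.

Since $\lambda_0 t + d_0(-\lambda_0 t/d_0) = 0$, the first order term in this expansion is $t^{(\alpha-1)/\alpha}$ and after using the Taylor series expansion
$$(1+x)^p = 1 + px + p(p-1)x^2/2 + \cdots + p(p-1)\cdots(p-\alpha+1)\,x^{\alpha}/\alpha! + O(x^{\alpha+1})$$
we obtain
$$\phi(s_{z_x}, z_x) = \sum_{j=1}^{\alpha} \rho_j t^{(\alpha-j)/\alpha} + \rho \log t + O(t^{-1/\alpha} \log t) \tag{5.11}$$
where
$$\rho = d_0 \left(-\frac{\lambda_0}{d_0}\right) \left(\frac{\alpha}{\alpha+1}\right) \kappa = -\frac{\alpha\lambda_0}{\alpha+1}\, \kappa$$
$$\rho_1 = d_0 \left(-\frac{\lambda_0}{d_0}\right) \left(\frac{\alpha}{\alpha+1}\right) c_1 + d_1 \left(-\frac{\lambda_0}{d_0}\right)^{(\alpha-1)/\alpha}$$
$$\rho_2 = d_0 \left(-\frac{\lambda_0}{d_0}\right) \left[\frac{\alpha}{\alpha+1}\, c_2 + \frac{\alpha}{\alpha+1} \left(\frac{\alpha}{\alpha+1} - 1\right) c_1^2\right] + d_1 \left(-\frac{\lambda_0}{d_0}\right)^{(\alpha-1)/\alpha} \left(\frac{\alpha-1}{\alpha}\right) c_1 + d_2 \left(-\frac{\lambda_0}{d_0}\right)^{(\alpha-2)/\alpha}$$
and in general
$$\rho_j = \sum_{i=0}^{j} d_i \left(-\frac{\lambda_0}{d_0}\right)^{(\alpha-i)/\alpha} \sum_{k=1}^{j-i} \prod_{\ell=1}^{k} \left(\frac{\alpha-i}{\alpha+1} - \ell + 1\right) \kappa_{i_\ell}, \qquad j = 1, 2, \ldots, \alpha$$
where for each $i$ and $k$, in the inner product, $i_1, \ldots, i_k$ are always chosen to satisfy $i_1 + i_2 + \cdots + i_k = j - i$. Since $\rho_j$ depends only on $\kappa_i$, $i \le j$, then after noting that the coefficient of $\kappa_j$ in $\rho_j$ is $-\alpha\lambda_0/(\alpha+1)$, we can use forward substitution to solve the system $\rho_j = 0$, $j = 1, 2, \ldots, \alpha-1$ for $\kappa_j$ to obtain the recursive formulas
$$c_j \equiv \kappa_j = -\frac{\alpha+1}{\alpha\lambda_0} \left(\rho_j - \frac{-\alpha\lambda_0}{\alpha+1}\, \kappa_j\right) \tag{5.12}$$
for $j = 1, 2, \ldots, \alpha-1$. Setting $\rho = -k_3$ yields
$$\kappa = \frac{(\alpha+1) k_3}{\alpha\lambda_0}$$
and for this choice of $c_j$, $\kappa$, we obtain (5.10) with
$$k_2 = -\frac{\alpha}{\alpha+1} \cdot \frac{d_0}{c_0} \left(-\frac{\lambda_0}{d_0}\right) = \frac{\alpha\lambda_0}{(\alpha+1) c_0}$$
and $k_1 = -(\rho_\alpha - k_2 x)$. Since
$$\left(\frac{z_x}{t - s_{z_x}} - \lambda_0\right)^{\beta} = z_x^{\beta/(\alpha+1)}/a_0^{\beta} + o(z_x^{\beta/(\alpha+1)}) = \left(\frac{c_0^{1/(\alpha+1)}}{a_0}\right)^{\beta} t^{\beta/\alpha} + o(z_x^{\beta/(\alpha+1)})$$
choosing $k_3 = (2\beta/\alpha + 1)/2$ replaces (5.4) in the proof of Theorem 6.

Now substituting (5.7) and (5.8) in (1.5) yields
$$\phi_{ss}(s_z, z) = -\alpha(\alpha-1)\gamma\, z^{\frac{\alpha-2}{\alpha+1}} \left(\sum_{j=0}^{\alpha} b_j z^{-j/(\alpha+1)} + O(z^{-1})\right)^{\alpha-2} \times \frac{z^2}{z^{4\alpha/(\alpha+1)} \left(\sum_{j=0}^{\alpha} a_j z^{-j/(\alpha+1)} + O(z^{-1})\right)^{4}}$$
$\displaystyle \qquad -\, \alpha\gamma\, z^{\frac{\alpha-1}{\alpha+1}} \left(\sum_{j=0}^{\alpha} b_j z^{-j/(\alpha+1)} + O(z^{-1})\right)$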
$\displaystyle {}^{\alpha-1} \times \frac{2z}{z^{3\alpha/(\alpha+1)} \left(\sum_{j=0}^{\alpha} a_j z^{-j/(\alpha+1)} + O(z^{-1})\right)^{3}}$
$$= \bigl[-\alpha(\alpha-1)\gamma\, b_0^{\alpha-2}/a_0^4 - \alpha\gamma\, b_0^{\alpha-1}/a_0^3\bigr]\, z^{-\alpha/(\alpha+1)} + o(z^{-\alpha/(\alpha+1)})$$
$$= -\frac{\alpha^2 \gamma}{a_0^{\alpha+2}}\, z^{-\alpha/(\alpha+1)} + o(z^{-\alpha/(\alpha+1)})$$
where in the second to last line we have used the fact that $b_0 = a_0^{-1}$. When $z = z_x$, this becomes
$$\phi_{ss}(s_{z_x}, z_x) = -\frac{a}{t} + o(t^{-1})$$
where
$$a = \frac{\alpha^2 \gamma}{a_0^{\alpha+2}\, c_0^{\alpha/(\alpha+1)}}.$$
Since $\phi_s(s_z, z) = 0$ and a calculation similar to the one above shows that $\phi_{sss}(s_{z_x}, z_x) = O(t^{-2})$, we have
$$\phi(s, z_x) = \phi(s_{z_x}, z_x) - \frac{a}{2t} (s - s_{z_x})^2 + g(s, z_x)$$
where $|g(s,z)| \le C |s - s_z|^3/t^2$ for all $s$. This replaces (5.5) from the $\alpha = 1$ proof and the rest of the proof is the same. Note that the intensity for the limiting point process is given by
$$\left(\frac{c_0^{1/(\alpha+1)}}{a_0}\right)^{\beta} \sqrt{2\pi/a}\, \exp(k_1 - k_2 x). \tag{5.13}$$

Remark 1. From (5.7), we have
$$t - s_{z_x} \sim a_0 \bigl(c_0 t^{(\alpha+1)/\alpha}\bigr)^{\alpha/(\alpha+1)} = \frac{\alpha t}{\alpha+1}$$
which tells us that the time at which the mutant with largest growth rate is born is $\sim t/(\alpha+1)$.

6 Discussion

In this paper, we have analyzed a multi-type branching process model of tumor progression in which mutations increase the birth rates of cells by a random amount. We studied both bounded and unbounded distributions for the random fitness advances and calculated the asymptotic rate of expansion for the $k$th generation of mutants. In the bounded setting, we found that there are only two parameters of the distribution that affect the limiting growth rate of the $k$th generation (see Theorems 1, 2, 4, and 5): the upper bound for the support of the distribution and the value of its density at the upper bound. This is a rather intuitive result since one would expect that in the long run, the $k$th generation will be dominated by mutants with the maximum possible fitness.
In addition, we found that there is a polynomial correction to the exponential growth of the $k$th generation. This correction is not present in the case where the fitness advances are deterministic. We have discussed this point in further detail in Section 1.1 and after the proof of Theorem 5 in Section 4. Finally, we showed that the limiting population is descended from several different mutations (see Theorem 3).

In the unbounded setting, we assumed that the distribution of the fitness advance has the form $P(X > x) = x^{\beta} e^{-\gamma x^{\alpha}}$ where $\alpha$, $\beta$, and $\gamma$ are parameters. We found that the population of cells with a single mutation grows asymptotically at a super-exponential rate $\exp(t^{(\alpha+1)/\alpha})$ (see Theorems 6 and 7) and at large times, most of the first generation is derived from a single mutation (see Lemma 5). The super-exponential growth rate suggests that the exponential distribution, which is often used for the fitness advances of an organism due to natural selection, is not a good choice for modeling the mutational advances in the progression to cancer, where there is very little evidence for populations growing at a super-exponential rate.

These conclusions provide several interesting contributions to the existing literature on evolutionary models of cancer progression. First, our model generalizes previous multi-type branching models of tumor progression by allowing for random fitness advances as mutations are accumulated and provides a mathematical framework for further investigations into the role played by the fitness distribution of mutational advances in driving tumorigenesis. Second, we have discovered that bounded distributions lead to exponential growth whereas unbounded distributions lead to super-exponential growth.
This dichotomy might provide a new method for testing whether a tumor population has evolved with an unbounded distribution of mutational advances. Third, we observe that in the case of bounded distributions, the growth rate of the tumor is somewhat 'robust' with respect to the mutational fitness distribution and depends only on its upper endpoint. Finally, our calculations of the growth rates for the $k$th generation of mutants serve as a groundwork for studying the evolution and role of heterogeneity in tumorigenesis. These implications will be explored further in future work.

References

Becskei A., Kaufmann B.B., and van Oudenaarden A. (2005) Contributions of low molecule number and chromosomal positioning to stochastic gene expression. Nature Genetics 9, 937–944.

Beerenwinkel, N., Antal, T., Dingli, D., Traulsen, A., Kinzler, K.W., Velculescu, V.E., Vogelstein, B., and Nowak, M.A. (2007) Genetic progression and the waiting time to cancer. PLoS Computational Biology. 3, paper e225

Beisel, C.J., Rokyta, D.R., Wichman, H.A., and Joyce, P. (2007) Testing the extreme value domain of attraction for distributions of beneficial fitness effects. Genetics. 176, 2441–2449

Bodmer, W., and Tomlinson, I. (1995) Failure of programmed cell death and differentiation as causes of tumors: some simple mathematical models. Proc Natl Acad Sci USA 92, 11130–11134.

Coldman, A.J., and Murray, J.M. (2000) Optimal control for a stochastic model of cancer chemotherapy. Mathematical Biosciences 168, 187–200.

Cowperthwaite, M.C., Bull, J.J., and Meyers, L.A. (2005) Distributions of beneficial fitness effects in RNA. Genetics. 170, 1449–1457

Durrett, R., and Mayberry, J. (2009) Traveling waves of selective sweeps.

Durrett, R., and Moseley, S. (2009) Evolution of resistance and progression to disease during clonal expansion of cancer. Theor. Pop. Biol.
, to appear

Elowitz, M.B. et al. (2002) Stochastic gene expression in a single cell. Science 297, 1183–1186.

Feinerman, O. et al. (2008) Variability and robustness in T cell activation from regulated heterogeneity in protein levels. Science 321, 1081.

Frank, S.A. (2007) Dynamics of Cancer: Incidence, Inheritance and Evolution. Princeton Series in Evolutionary Biology.

Gillespie, J.H. (1983) A simple stochastic gene substitution model. Theor. Pop. Biol. 23, 202–215

Gillespie, J.H. (1984) Molecular evolution over the mutational landscape. Evolution. 38, 1116–1129

Goldie, J.H., and Coldman, A.J. (1983) Quantitative model for multiple levels of drug resistance in clinical tumors. Cancer Treatment Reports 67, 923–931.

Goldie, J.H., and Coldman, A.J. (1984) The genetic origin of drug resistance in neoplasms: implications for systemic therapy. Cancer Research 44, 3643–3653.

Haeno, H., Iwasa, Y., and Michor, F. (2007) The evolution of two mutations during clonal expansion. Genetics. 177, 2209–2221

Iwasa, Y., Michor, F., Komarova, N.L., and Nowak, M.A. (2005) Population genetics of tumor suppressor genes. J. Theor. Biol. 233, 15–23

Iwasa, Y., Nowak, M.A., and Michor, F. (2006) Evolution of resistance during clonal expansion. Genetics. 172, 2557–2566

Kassen, R., and Bataillon, T. (2006) Distribution of fitness effects among beneficial mutations before selection in experimental populations of bacteria. Nature Genetics. 38, 484–488

Kaern, M. et al. (2005) Stochasticity in gene expression: from theories to phenotypes. Nature Reviews Genetics 6, 451.

Knudson, A.D. (2001) Two genetic hits (more or less) to cancer. Nature Reviews Cancer. 1, 157–162

Komarova, N.L., and Wodarz, D. (2005) Drug resistance in cancer: principles of emergence and prevention. Proc Natl Acad Sci USA 102, 9714–9719.

Maley, C.C. et al.
(2006) Genetic clonal diversity predicts progression to esophageal adenocarcinoma. Nature Genetics. 38, 468–473

Maley, C.C., and Forrest (2001) Exploring the relationship between neutral and selective mutations in cancer. Artif Life 6, 325–345.

Michor, F., Iwasa, Y., and Nowak, M.A. (2004) Dynamics of cancer progression. Nature Reviews Cancer 4, 197–205

Michor, F., Nowak, M.A., and Iwasa, Y. (2006) Stochastic dynamics of metastasis formation. J Theor Biol 240, 521–530.

Michor, F., and Iwasa, Y. (2006) Dynamics of metastasis suppressor gene inactivation. J Theor Biol 241, 676–689.

Nowak M.A., Michor F, and Iwasa Y (2006) Genetic instability and clonal expansion. J Theor Biol 241, 26–32.

Nowell P.C. (1976) The clonal evolution of tumor cell populations. Science 194, 23–28.

Orr, H.A. (2003) The distribution of fitness effects among beneficial mutations. Genetics. 163, 1519–1526

Otto, S.P., and Jones, C.D. (2002) Detecting the undetected: Estimating the total number of loci underlying a quantitative trait. Genetics. 156, 2093–2107

Rokyta, D.R., Beisel, C.J., Joyce, P., Ferris, M.T., Burch, C.L., and Wichman, H.A. (2008) Beneficial fitness effects are not exponential in two viruses. J. Mol. Evol. 67, 368–376

Rozen, D.E., de Visser, J.A.G.M., and Gerrish, P.J. (2002) Fitness effects of fixed beneficial mutations in microbial populations. Current Biology. 12, 1040–1045

Sanjuán, R., Moya, A., and Elena, S.F. (2004) The distribution of fitness effects caused by single-nucleotide substitutions in an RNA virus. Proc. Natl Acad. Sci., USA. 101, 8396–8401

Shah, S.P., et al. (2009) Mutational evolution in a lobular breast tumour profiled at single nucleotide resolution. Nature. 461, 809–813

Weissman, I. (1978) Estimation of parameters and large quantiles based on the k largest observations. J. Amer. Stat. Assoc.
73, 812–815

Wodarz, D., and Komarova, N.L. (2007) Can loss of apoptosis protect against cancer? Trends Genet. 23, 232–237.

Figure 1: Plot of the exact Laplace transform (LT) for $t^{(1+p)} e^{-(\lambda_0+b)t} Z_1(t)$ at times $t = 60, 80, 100, 120$, the approximations from Monte Carlo (MC) simulations at the corresponding times, and the asymptotic Laplace transform from Theorem 2. Horizontal axis: $\theta \in [0, 1]$; vertical axis from 0 to 1. Legend: exact LT ($t = 60$), exact LT ($t = 80$), exact LT ($t = 100$), exact LT ($t = 120$), limiting LT ($t = \infty$), MC simulations. Parameter values: $a_0 = 0.2$, $b_0 = 0.1$, $b = 0.01$, and $u_1 = 10^{-3}$. $g$ is uniform on $[0, 0.01]$.

Figure 2: Plot of the approximations to the Laplace transform of $t^{2+p_2} e^{-(\lambda_0+2b)t} Z_2(t)$ from Monte Carlo (MC) simulations at times $t = 80, 100, 120$ along with the asymptotic Laplace transform from Theorem 5. Horizontal axis: $\theta \in [0, 0.5]$; vertical axis from 0 to 1. Legend: limiting LT ($t = \infty$), MC simulation ($t = 80$), MC simulation ($t = 100$), MC simulation ($t = 120$). Parameter values: $a_0 = 0.2$, $b_0 = 0.1$, $b = 0.01$, and $u_1 = u_2 = 10^{-3}$. $g$ is uniform on $[0, 0.01]$.
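The limiting curve shown in Figure 1 can be evaluated directly from the formula obtained in the proof of Theorem 2, $\exp(-u_1 V_0\, c_1(\lambda_0, b)\, \theta^{\lambda_0/(\lambda_0+b)})$ with $c_1$ as in (3.8). The Python sketch below is only an illustration under stated assumptions: it takes $\lambda_0 = a_0 - b_0$ and $g(b) = 1/b$ for the uniform density on $[0, b]$, and since the caption does not state $V_0$, the value used here is a placeholder. The function names are hypothetical.

```python
import math


def c1(lam0, a0, b, g_b):
    """Constant c_1(lam0, b) of (3.8); note (lam0+b)/lam0 * 1/(lam0+b) = 1/lam0."""
    c = lam0 / (lam0 + b)
    ratio = (a0 + b) / (lam0 + b)
    return g_b / lam0 * ratio ** (-b / (lam0 + b)) * math.gamma(c) * math.gamma(1.0 - c)


def limiting_lt(theta, V0, u1, lam0, a0, b, g_b):
    """Limiting Laplace transform exp(-u1 * V0 * c1 * theta^(lam0 / (lam0 + b)))."""
    c = lam0 / (lam0 + b)
    return math.exp(-u1 * V0 * c1(lam0, a0, b, g_b) * theta ** c)


if __name__ == "__main__":
    a0, b0, b, u1 = 0.2, 0.1, 0.01, 1e-3  # values from the Figure 1 caption
    lam0 = a0 - b0                         # assumption: net growth rate of type-0 cells
    g_b = 1.0 / b                          # uniform density on [0, b] evaluated at b
    V0 = 0.01                              # placeholder: not given in the caption
    for theta in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(theta, limiting_lt(theta, V0, u1, lam0, a0, b, g_b))
```

By Lemma 1 and the reflection formula, the Gamma product equals $\pi/\sin(\pi c)$ with $c = \lambda_0/(\lambda_0+b)$, which offers an independent check on `c1`.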
