Spatio-temporal Functional Regression on Paleo-ecological Data
The influence of climate on biodiversity is an important ecological question. Various theories try to link climate change to allelic richness and therefore to predict the impact of global warming on genetic diversity. We model the relationship betwee…
Authors: ** Liliane Bel*, Avner Bar‑Hen, Rahid Cheddadi
Spatio-temp oral F untional Regression on P aleo-eologial Data Liliane Bel ∗ , UMR 518 A gr oParisT e h/INRA,16, rue Claude Bernar d - 75231 Paris Ce dex 05 A vner Bar-Hen, Université R ené Des artes, MAP5, 45 rue des Saints Pèr es, 75270 Paris e dex 06 Ra hid Cheddadi, ISEM, ase p ostale 61, CNRS UMR 5554, 34095 Montp el lier, F r an e Rém y P etit, UMR 1202 INRA , 69 r oute d'A r ahon 33612 Cestas Ce dex, F r an e Abstrat The inuene of limate on bio div ersit y is an imp ortan t eologial question. V ari- ous theories try to link limate hange to alleli ri hness and therefore to predit the impat of global w arming on geneti div ersit y . W e mo del the relationship b e- t w een geneti div ersit y in the Europ ean b ee h forests and urv es of temp erature and preipitation reonstruted from p ollen databases. Our mo del links the geneti measure to the limate urv es through a linear funtional regression. The in tera- tion in limate v ariables is assumed to b e bilinear. Sine the data are georeferened, our metho dology aoun ts for the spatial dep endene among the observ ations. The pratial issues of these extensions are disussed. Key wor ds: F untional Data Analysis; Spatio-temp oral mo deling; Climate hange; Bio div ersit y Preprin t submitted to Elsevier No v em b er 18, 2018 1 In tro dution 1 Climate reords sho w that the earth has reorded a suession of p erio ds of 2 ma jor w arming and o oling at dieren t time windo ws and sales [ 5 , 12℄. During 3 the last p ost-glaial p erio d (18000 y ears b efore the presen t), Europ e reorded 4 a 15 ◦ C to 20 ◦ C w arming dep ending on the area. A t the same p erio d there w as 5 an expansion of all forest biomes and an up w ard mo v emen t of the tree-lines 6 that rea hed an altitude 300 m higher than to da y . Although there is a w ealth 7 of paleo data and detailed limate reonstrution for the Holo ene p erio d, w e 8 still la k some kno wledge as to ho w the w arming w as reorded and what the 9 v egetation feedba ks w ere that aeted lo al or regional past limates. V arious 10 theories try to link limate hange to alleli ri hness and therefore to predit 11 the impat of global w arming on geneti div ersit y . 12 In the reen t literature there ha v e b een a lot of theoretial results for regres- 13 sion mo dels with funtional data. Based on this framew ork, w e used a linear 14 funtional mo del to mo del the relationship b et w een geneti div ersit y in Euro- 15 p ean b ee h forests (represen ted b y a p ositiv e n um b er) and urv es of temp er- 16 ature and preipitation reonstruted from the past. The lassial funtional 17 regression mo del has b een extended in t w o w a ys to aoun t for our sp ei 18 problem. First, as the eets of temp erature and preipitation are far from 19 indep enden t w e inluded an in teration term in our mo del. This in teration 20 term app ears as a bilinear funtion of the t w o preditors. Seond, sine w e 21 ha v e spatial data there is dep endene among the observ ations. T o tak e in to 22 aoun t with dep endene the o v ariane matrix of the residuals is estimated 23 in a spatial framew ork and plugged in to generalized least-squares to estimate 24 the parameters of the mo del. The pratial diulties of these extensions will 25 b e disussed. 26 In Setion 2, w e presen t the geneti and limate data. The funtional regression 27 mo del is studied in Setion 3. Results are presen ted and disussed in Setion 4 28 ∗ Corresp onding author. Email addr ess: Liliane.Belagroparisteh.fr (Liliane Bel). 2 and onluding remarks are giv en in Setion 5. 1 2 Data 2 P ollen reords are imp ortan t pro xies for the reonstrution of limate param- 3 eters sine v ariations in the p ollen assem blages mainly resp ond to limate 4 hanges. Based on the fossil and surfae p ollen data from p ollen databases, 5 w e used mo dern analogue te hnique (MA T) to reonstrut limate v ariables. 6 Climate reonstrution is aomplished b y mat hing fossil biologial assem- 7 blages to reen tly dep osited (mo dern) p ollen assem blages for whi h limate 8 prop erties are kno wn. The relatedness of fossil and mo dern assem blages is usu- 9 ally measured using a distane metri that resales m ultidimensional sp eies 10 assem blages in to a single measure of dissimilarit y . The distane-metri metho d 11 is widely used among paleo eologists and paleo eanographers [8℄. T emp erature 12 and preipitation w ere reonstruted at 216 lo ations from the presen t ba k 13 to a v ariable date dep ending on a v ailable data. The p ollen dataset w as used 14 to reonstrut limate v ariables, throughout Europ e for the last 15 000 y ears 15 of the Quaternary . Due to the metho dology , ea h limate urv e is sampled at 16 irregular times for ea h lo ation. 17 Geneti div ersities w ere measured from v ariation at 12 p olymorphi isozyme 18 lo i in Europ ean b ee h ( F agus sylvati a L.) forests based on an extensiv e 19 sample of 389 p opulations distributed throughout the sp eies range. Based 20 on these data, v arious indies of div ersit y an b e omputed. They mainly 21 haraterize within or b et w een p opulation div ersit y . In this artile, w e fo us on 22 the H index, the probabilit y that t w o alleles sampled at random are dieren t. 23 This parameter is a go o d indiation of gene div ersit y [ 3℄. 24 The t w o datasets w ere olleted indep enden tly and their lo ations do not 25 oinide. 26 3 3 F untional Regression 1 The funtional linear regression mo del with funtional or real resp onse has 2 b een the fo us of v arious in v estigations [1, 6, 7, 11℄. W e w an t to estimate the 3 link b et w een the real random resp onse y i = d ( s i ) , the div ersit y at site s i and 4 ( θ i ( t ) , π i ( t )) t> 0 the temp erature and preipitation funtions at site s i . There 5 are t w o p oin ts to onsider for the mo deling: (i) funtional linear mo dels need 6 to b e extended to inorp orate in teration b et w een limate funtions; (ii) sine 7 w e ha v e spatial data, observ ations annot b e onsidered as indep enden t and 8 w e also need to extend funtional mo deling to aoun t for spatial orrelation. 9 W e assume that the temp erature and preipitation funtions are square in- tegrable random funtions dened on some real ompat set [0 , T ] . The v ery general mo del an b e written as: Y = f (( θ ( t ) , π ( t )) T >t> 0 ) + ε f is an unkno wn funtional from L 2 ([0 , T ]) × L 2 ([0 , T ]) to R and ε is a spatial 10 stationary random eld with orrelation funtion ρ ( . ) . 11 W e assume here that the funtional f ma y b e written as the sum of linear 12 terms in θ ( t ) and π ( t ) and a bilinear term mo deling the in teration 13 f ( θ , π ) = µ + Z [0 ,T ] A ( t ) θ ( t ) d t + Z [0 ,T ] B ( t ) π ( t ) d t + Z Z [0 ,T ] 2 C ( t, u ) θ ( t ) π ( u ) dudt = µ + h A ; θ i + h B ; π i + h C θ ; π i b y the Riesz represen tation of linear and bilinear forms. 14 A and B are in L 2 ([0 , T ]) and C is a k ernel of L 2 ([0 , T ]) . 15 Let ( e k ) k > 0 b e an orthonormal basis of L 2 ([0 , T ]) . Expanding all funtions on this basis w e get θ i ( t ) = + ∞ X k =1 α i k e k ( t ) π i ( t ) = + ∞ X k =1 β i k e k ( t ) 4 A ( t ) = + ∞ X k =1 a k e k ( t ) B ( t ) = + ∞ X k =1 b k e k ( t ) C ( t, u ) = + ∞ X k ,ℓ =1 c k ℓ e k ( t ) e ℓ ( u ) and y i = µ + + ∞ X k =1 a k α i k + + ∞ X k =1 b k β i k + + ∞ X k ,ℓ =1 c k ℓ α i k β i ℓ + ε i If the sums are trunated at k = ℓ = K the problem results in a linear 1 regression Y = µ + X φ + ε with spatially orrelated residuals with 2 X = α 1 1 . . . α 1 K β 1 1 . . . β 1 K α 1 1 β 1 1 . . . α 1 K β 1 K . . . . . . . . . α n 1 . . . α n K β n 1 . . . β n K α n 1 β n 1 . . . α n K β n K dim ( X ) = n × (2 K + K 2 ) o v ( ε i , ε j ) = ρ ( s i − s j ) In order to estimate the regression and the orrelation funtion parameters w e 3 pro eed b y Quasi Generalized Least Squares: a preliminary estimation of φ is 4 giv en b y Ordinary Least Squares, φ ∗ = ( X t X ) − 1 X t Y , the orrelation funtion 5 is estimated from the residuals b ε = Y − X φ ∗ and the nal estimate of φ is 6 giv en b y plugging the estimated orrelation matrix b Σ in the Generalized Least 7 Squares form ula b φ = ( X t b Σ − 1 X ) − 1 X t b Σ − 1 Y . If b oth estimations of φ and Σ are 8 on v ergen t and assuming normal distribution of the residuals then [9℄: 9 √ n ( b φ − φ ) → N (0 , lim n →∞ n ( X t Σ − 1 X ) − 1 ) The estimation of Σ is on v ergen t under mild onditions [4℄ and the on v er- 10 gene of φ is assessed for example when the funtions are expanded on a splines 11 basis [1 ℄ or on a Karh unen expansion [10℄. 12 Signiane of the preditors an b e tested if the residuals are assumed to b e 13 Gaussian, within the lassial framew ork of linear regression mo dels. 14 Sev eral parameters need to b e set. The rst hoie is that of the orthonormal 15 5 basis. It an b e F ourier, splines, orthogonal p olynomials, w a v elets. Then the 1 order of trunation has to b e determined. The spatial orrelation funtion 2 of the residuals ma y b e of parametri form (exp onen tial, Gaussian, spherial 3 et.). These hoies will b e made b y minimizing a ross v alidation riterion: a 4 sample with no missing data for all v ariables is determined, and for ea h site of 5 the sample a predition of the div ersit y is omputed aording to parameters 6 estimated without the site in the sample. The global riterion is the quadrati 7 mean of the predition error. 8 4 Results 9 P ollen w as olleted throughout Europ e pro viding temp oral estimation of tem- 10 p eratures and preipitation. These estimations are not regularly spaed, and 11 ha v e v ery dieren t ranges from 1 Ky ears to 15 Ky ears. Bee h geneti indies 12 are reorded in forests and do not oinide with the p ollen lo ations. Figure 13 1 sho ws the lo ations of the t w o datasets. 14 −10 0 10 20 30 40 45 50 55 60 Measurement sites pollens genetics Figure 1. Lo ations of p ollen (bla k dots) and geneti (op en irles) reords. 6 Climate v ariables are on tin uous all o v er Europ e but b ee h forests ha v e sp ei 1 lo ations. In order to mak e our data to spatially oinide, temp erature and 2 preipitation urv es are rstly estimated on a regular grid of time from 15 3 Ky ears to presen t on sites where are olleted the geneti measures. 15 Ky ears 4 orresp onds to the b eginning of migration of plan ts on to areas made free b y 5 the retreating ie sheets. 6 The in terp olation is done b y a spatio-temp oral kriging assuming the o v ariane 7 funtion is exp onen tial and separable. Figure 2 sho ws for a partiular site the 8 estimated temp erature urv e together with some neigh b oring urv es issued 9 from olleted p ollen. 10 −15000 −10000 −5000 0 −25 −20 −15 −10 −5 0 5 10 time Spatio−temporal kriging of temperature Figure 2. Resulting temp erature urv e (thi k bla k urv e) from spatio-temp oral kriging of 20 neigh b oring temp eratures urv es from reorded p ollen. W e aim to predit geneti div ersit y with preipitation and temp erature urv es. 11 This orresp onds to a funtional regression mo del with geneti div ersit y as de- 12 p enden t v ariable and temp erature and preipitation urv es as preditor v ari- 13 able. The ross v alidation riterion giv es b etter results with an expansion of 14 the preditor v ariables on a F ourier basis of order 5. Figures 3 and 4 sho w the 15 o eien t funtions A , B , and k ernel C . 16 7 −15000 −10000 −5000 0 −0.005 0.000 0.005 0.010 Function A time −15000 −10000 −5000 0 −6e−05 −4e−05 −2e−05 0e+00 2e−05 4e−05 Function B time Figure 3. Co eien t funtion A of the temp erature and o eien t funtion B of the preipitation time time Kernel C Figure 4. Kernel C of the in teration temp erature-preipitation The shap e of the o eien t funtion A sho ws that the term h A, θ i will b e 1 higher when the gap b et w een p erio ds b efore 7.5 Ky ears and after 7.5 Ky ears 2 8 is higher (temp eratures b efore 7.5 Ky ears are mostly negativ e), mean while the 1 shap e of the o eien t funtion B sho ws that the term h B , π i will b e higher 2 when the preipitation b efore 7.5 Ky ears is higher (preipitation is p ositiv e). 3 The surfae of k ernel C is ob viously not the pro dut of t w o urv es in the t w o 4 o ordinates, sho wing an eet of in teration. 5 In Figure 5 the residual v ariogram graph exhibits some spatial dep endene. An 6 exp onen tial v ariogram is tted, and the resulting o v ariane matrix is plugged 7 in to the GLS form ula to up date the o eien ts and test the eets of the 8 temp erature, preipitation and in teration. 9 0 200 400 600 800 1000 0.00030 0.00035 0.00040 0.00045 0.00050 0.00055 0.00060 Residuals variogram distance (km) semivariance nugget = 0.00037 sill = 0.00021 range = 81.5 Figure 5. Empirial and tted v ariogram on the residuals. The graphs in Figure 6 sho w that the mo del explains a part of the div ersit y 10 v ariabilit y . Ho w ev er it is far from explaining all the v ariabilit y as the R 2 is 11 equal to 0.31. 12 9 0.20 0.25 0.30 0.35 0.20 0.22 0.24 0.26 0.28 0.30 0.32 0.34 Observed response Predicted response 0.20 0.22 0.24 0.26 0.28 0.30 0.32 0.34 −0.06 −0.04 −0.02 0.00 0.02 0.04 0.06 0.08 Predicted response Residuals Figure 6. Observ ed-Predited resp onse and Predited-Residuals graphs. T able 1 giv es the analysis of v ariane of the four nested mo dels: 1 Mo del 1: E ( Y ) = µ + h A ; θ i + h B ; π i + h C θ ; π i 2 Mo del 2: E ( Y ) = µ + h A ; θ i + h B ; π i 3 Mo del 3: E ( Y ) = µ + h A ; θ i 4 Mo del 4: E ( Y ) = µ + h B ; π i 5 T able 1 Analysis of v ariane mo dels of nested mo dels The p -v alues (2.2e-16) of the tests H 0 : mo del 3 (mo del 4) against H 1 : mo del 6 2 and (1.430e-07) of the test H 0 : mo del 2 against H 1 : mo del 1 sho w that the 7 in teration and the t w o v ariables ha v e a strong eet. 8 10 −15000 −10000 −5000 0 −12 −8 −6 −4 −2 0 genetic class <0.24 years Temperature sample average class average −15000 −10000 −5000 0 −12 −8 −6 −4 −2 0 genetic class [0.24 ; 0.26[ years Temperature sample average class average −15000 −10000 −5000 0 −12 −8 −6 −4 −2 0 genetic class [0.26;0.28[ years Temperature sample average class average −15000 −10000 −5000 0 −12 −8 −6 −4 −2 0 genetic class >0.28 years Temperature sample average class average Temperature Figure 7. P attern of temp erature urv es aording to the range of the predited resp onse. −15000 −10000 −5000 0 500 600 700 800 900 genetic class <0.24 years Precipitation sample average class average −15000 −10000 −5000 0 500 600 700 800 900 genetic class [0.24 ; 0.26[ years Precipitation sample average class average −15000 −10000 −5000 0 500 600 700 800 900 genetic class [0.26;0.28[ years Precipitation sample average class average −15000 −10000 −5000 0 500 600 700 800 900 genetic class >0.28 years Precipitation sample average class average Precipitation Figure 8. P attern of preipitation urv es aording to the range of the predited resp onse. T o giv e a b etter understanding of the regression mo del w e divide the predited 1 11 resp onse range in to 4 lasses: less than 0.24, ℄0.24;0.26℄, ℄0.26;0.28℄ and greater 1 than 0.28. Figures 7 and 8 sho w the shap es of temp erature and preipitation 2 urv es for ea h lass. When lo w ( < 0.24) div ersit y is predited, temp erature 3 urv es are globally higher than the a v eraged temp erature urv es on all the 4 sample. As the predited div ersit y b eomes higher the gap b et w een the t w o 5 p erio ds b efore 7.5 Ky ears and after 7.5 Ky ears gets more pronouned. This 6 eet is less eviden t for preipitation, lo w div ersit y is predited when the pre- 7 ipitation is higher on the rst p erio d than the a v eraged preipitation urv es 8 on all the sample. When the predited div ersit y is higher than 0.24 there seems 9 to b e no eet of preipitation on its the lev el. 10 When the hange of limate during the Holo ene (12 Ky ears to presen t) is 11 signian t the div ersit y is higher. This mostly onerns northern and w est- 12 ern Europ e. This is oheren t with previous studies [ 2℄. After 12 Ky ears and 13 throughout the Holo ene the limate w as no longer uniform all o v er Europ e. 14 The largest mismat h b et w een NW and SE Europ e o urred around 9 Ky ears 15 and 5 Ky ears. By 5 Ky ears, all deiduous tree taxa (su h as b ee h) w ere outside 16 their glaial refugia. 17 5 Conlusion 18 The lassial linear funtional mo del has b een extended in a straigh tforw ard 19 manner to the ase of t w o funtional preditors with an in teration term, and 20 with spatially orrelated residuals. Su h a mo del applied to omplex paleo e- 21 ologial and bio div ersit y data emphasizes an in teresting relationship b et w een 22 limate hange and geneti div ersit y: div ersit y is higher when the hange in 23 limate (mostly temp erature) during the Holo ene (12 Ky ears to presen t) 24 w as sizeable and lo w er when temp erature and preipitation are b oth globally 25 higher o v er the whole p erio d. This mo del ma y b e impro v ed in sev eral w a ys. 26 The spatial eet ma y b e handled in other w a ys, b y means of a mixed stru- 27 ture or with other kinds of orrelation matrix struture. In this rst attempt 28 w e ha v e negleted the random struture and the orrelation of the predi- 29 12 tors. T aking in to aoun t these t w o harateristis should giv e a b etter w a y 1 to understand the real eet of limate on bio div ersit y . 2 13 Referenes 1 [1℄ H. Cardot, F. F errat y , P . Sarda, F untional linear mo del, Statist. Probab. 2 Lett. 45 (1999) 11-22. 3 [2℄ R. Cheddadi, A. Bar-Hen, Spatial gradien t of temp erature and p oten tial 4 v egetation feedba k aross Europ e during the late Quaternary , Climate 5 Dynamis (in press). 6 [3℄ B. Comps, D. Gömöry , J. Letouzey , B. Thiébaut, R.J. P etit, Div erging 7 trends b et w een heterozygosit y and alleli ri hness during p ostglaial ol- 8 onization in the Europ ean b ee h, Genetis 157(2001) 389-397. 9 [4℄ N. Cressie, Statistis for spatial data, Revised Edition, Wiley , New-Y ork, 10 1993 11 [5℄ D. Dahl-Jensen, K. Mosegaard, N. Gundestrup, G.D. Clo w, S.J. Johnsen, 12 A.W. Hansen, N. Balling, P ast T emp eratures Diretly from the Greenland 13 Ie Sheet, Siene 282 (1998) 268-271. 14 [6℄ J. F an, J.T. Zhang, T w o-step estimation of funtional linear mo dels with 15 appliation to longitudinal data. J. R. Stat. So . Ser. B Stat. Metho dol. 16 62 (2000) 303-322. 17 [7℄ J.J. F ara w a y , Regression analysis for a funtional resp onse, T e hnometris 18 39 (1997) 254-261. 19 [8℄ J. Guiot, Metho dology of the last limati yle reonstrution in F rane 20 from p ollen data, P alaeogeograph y P alaeo limatology P alaeo-eology 80 21 (1990) 49-69. 22 [9℄ X. Guy on, Statistique et éonométrie, Ellipses Mark eting, P aris, 2001 23 [10℄ H.G. Müller, U. Stadtm üller, Generalized funtional linear mo dels, Ann. 24 Stat. 33 (2005) 774-806 25 [11℄ J.O. Ramsa y , B.W. Silv erman, F untional Data Analysis, Springer, New- 26 Y ork, 1997 27 [12℄ J. Seierstad, A. Nesje, S.O. Dahl, J.R. Simonsen, Holo ene glaier u- 28 tuations of Gro v abreen and Holo ene sno w-a v alan he ativit y reon- 29 struted from lak e sedimen ts in Groningstolsv atnet, w estern Norw a y , The 30 Holo ene 12:2 (2002) 211-222. 199 14
Original Paper
Loading high-quality paper...
Comments & Academic Discussion
Loading comments...
Leave a Comment