Statistical power and prediction accuracy in multisite resting-state fMRI connectivity

Connectivity studies using resting-state functional magnetic resonance imaging are increasingly pooling data acquired at multiple sites. While this may allow investigators to speed up recruitment or increase sample size, multisite studies also potent…

Authors: Christian Dansereau, Yassine Benhajali, Celine Risterucci

Statistical power and prediction accuracy in multisite resting-state   fMRI connectivity
Statistical p o w er and prediction accuracy in m ultisite resting-state fMRI connectivit y Christian Dansereau a,b , Y assine Benha jali a,c , Celine Risterucci d , Emilio Merlo Pic h d, ∗∗ , Pierre Orban a , Douglas Arnold e , Pierre Bellec a,b, ∗ a Centr e de R e cher che de l’Institut Universitair e de G´ eriatrie de Montr´ eal, Montr´ eal, CA b Dep artment of Computer Scienc e and Op er ations R ese ar ch, University of Montr e al, Montr e al, CA c D´ ep artement d’anthr op ologie, Universit ´ de Montr´ eal, Montr´ eal, CA d Clinic al Imaging, pRED, F.Hoffman-L a R o che, Basel, CH e Neur oRx inc., Montr´ eal, CA Abstract Connectivit y studies using resting-state functional magnetic resonance imaging are increasingly p ooling data acquired at multiple sites. While this ma y allow in vestigators to sp eed up recruitmen t or increase sample size, m ultisite stud- ies also p oten tially introduce systematic biases in connectivity measures across sites. In this w ork, w e measure the in ter-site effect in connectivit y and its impact on our ability to detect individual and group differences. Our study was based on real, as opp osed to simulated, multisite fMRI datasets collected in N = 345 y oung, healthy sub jects across 8 scanning sites with 3T scanners and heteroge- neous scanning proto cols, drawn from the 1000 functional connectome pro ject. W e first empirically show that t ypical functional netw orks were reliably found at the group lev el in all sites, and that the amplitude of the in ter-site effects w as small to mo derate, with a Cohen’s effect size b elo w 0.5 on av erage across brain connections. W e then implemented a series of Mon te-Carlo simulations, based on real data, to ev aluate the impact of the m ultisite effects on detection pow er in statistical tests comparing tw o groups (with and without the effect) using a general linear model, as w ell as on the prediction of group labels with a support- v ector machine. As a reference, we also implemented the same simulations with fMRI data collected at a single site using an iden tical sample size. Simulations rev ealed that using data from heterogeneous sites only slightly decreased our abilit y to detect changes compared to a monosite study with the GLM, and had a greater impact on prediction accuracy . How ev er, the deleterious effect of m ultisite data po oling tended to decrease as the total sample size increased, to a p oin t where differences b et w een monosite and multisite simulations were small with N = 120 sub jects. T ak en together, our results supp ort the feasibility of ∗ Corresponding author: pierre.b ellec@criugm.qc.ca ∗∗ Present address: CNS T A U, T ak eda Dev elopment Centre Europ e, London, UK Email addr ess: pierre.bellec@criugm.qc.ca (Pierre Bellec) Pr eprint submitte d to Neur oimage January 30, 2017 m ultisite studies in rs-fMRI provided the sample size is large enough. Keywor ds: m ultisite, statistical p o wer, prediction accuracy, Monte-Carlo sim ulation, sample size, resting-state, fMRI connectivity, SVM Highligh ts • Small to mo derate systematic site effects in fMRI connectivity . • Small impact of site effects on the detection of group differences for sample size > 100. • Linear regression of the sites prior to multiv ariate prediction do not im- pro ve prediction accuracy . 1. In tro duction Main obje ctive. Multisite studies are b ecoming increasingly common in resting- state functional magnetic resonance imaging (rs-fMRI). In particular, some consortia hav e retrosp ectively p o oled rs-fMRI data from multiple indep endent studies comparing clinical cohorts with control groups, e.g. normal controls in the 1000 functional connectome pro ject (F CP) (Biswal et al., 2010), children and adolescents suffering from attention deficit h yp eractivit y disorder from the ADHD200 (Milham et al., 2012; F air et al., 2012), individuals diagnosed with autism sp ectrum disorder in ABIDE (Nielsen et al., 2013), individuals suffer- ing from schizophrenia (Cheng et al., 2015), or elderly sub jects suffering from mild cognitive impairment (T am et al., 2015). The rationale b ehind suc h initia- tiv es is to dramatically increase the sample size at the cost of decreased sample homogeneit y . The systematic v ariations of connectivity measures derived using differen t scanners, called site effects, ma y decrease the statistical pow er of group comparisons, and somewhat mitigate the b enefits of ha ving a large sample size (Bro wn et al., 2011; Jo vicich et al., 2016). In this w ork, our main ob jective w as to quantitativ ely assess the impact of site effects on group comparisons in rs-fMRI connectivit y . Gr oup c omp arison in rs-fMRI c onne ctivity. In this w ork, w e fo cused on the most common measure of individual functional connectivit y , which is the Pearson’s correlation co efficien t b etw een the av erage rs-fMRI time series of tw o brain re- gions. T o compare t wo groups, a general linear mo del (GLM) is typically used to establish the statistical significance of the difference in av erage connectiv- it y b et w een the groups. Finally a p -v alue is generated for each connection to quan tify the probability that the difference in av erage connectivity is signifi- can tly differen t from zero (W orsley and F riston, 1995; Y an et al., 2013). If the estimated p -v alue is smaller than a prescrib ed tolerable level of false-p ositive findings (see for more detail T able 1), generally adjusted for the num ber of tests p erformed across connections, say α = 0 . 001, then the difference in connectivity is deemed significan t. 2 Actual v alue Detected value patho no patho patho T rue Positiv e F alse Negative no patho F alse Positiv e T rue Negative T able 1: Confusion matrix. Statistic al p ower in gr oup c omp arisons at multiple sites. The statistical pow er of a group comparison study is the probability of finding a significant difference, when there is indeed a true difference. A careful study design inv olv es the selection of a sample size that is large enough to reach a set lev el of statistical p o w er, e.g. 80%. In the GLM, the statistical p ow er actually dep ends on a series of parameters (Desmond and Glo ver, 2002; Durnez et al., 2014): (1) the sample size (the larger the b etter); (2) the absolute size of the group difference (the larger the b etter), and, (3) the intrinsic v ariabilit y of measurements (the smaller the b etter) (4) the rejection threshold α for the null h yp othesis. Sour c es of variability: factors inher ent to the sc anning pr oto c ol. In a multisite (or multi-protocol) setting, differences in imaging or study parameters may add v ariance to rs-fMRI measures, e.g. the scanner make and mo del (F riedman et al., 2006, 2008), rep etition time, flip angle, v o xel resolution or acquisition volume (F riedman and Glov er, 2006), experimental design such as eyes-open/eyes-closed (Y an et al., 2009), experiment duration (V an Dijk et al., 2010), and scanning en vironment such as sound attenuation measures (Elliott et al., 1999), or head- motion restrain t tec hniques (Edw ard et al., 2000; V an Dijk et al., 2012), amongst others. These parameters can be harmonized to some exten t, but differences ar e una voidable in large m ultisite studies. The recent work of Y an et al. (2013) has indeed demonstrated the presence of significant site effects in rs-fMRI measures in the 1000 FCP . Site effects will increase the v ariabilit y of measures, and thus decrease statistical pow er. T o the b est of our knowledge, it is not yet known ho w imp ortan t this decrease in statistical p o wer may b e. Sour c es of variability: within-subje ct. The relative imp ortance of site effects in rs-fMRI connectivity depends on the amplitude of the many other sources of v ariance. First, rs-fMRI connectivit y only has moderate-to-go o d test-retest reliabilit y using standard 10-minute imaging proto cols (Shehzad et al., 2009), ev en when using a single scanner and imaging session. Differences in functional connectivit y across sub jects are also known to correlate with a m yriad of be- ha vioural and demographic sub ject characteristics (Anand et al., 2007; Sheline et al., 2010; Kilpatric k et al., 2006). T ak en together, these sources of v ariance reflect a fundamen tal volatilit y of h uman physiological signals. 3 Sour c es of variability: factors inher ent to the site. In addition to physiology , some imaging artefacts will v ary systematically from session to session, ev en at a single site. F or example, in tensity non-uniformities across the brain dep end on the p ositioning of sub jects (Caramanos et al., 2010). Ro om temperature has also b een sho wn to impact MRI measures (V anhoutte et al., 2006). Given the goo d consistency of key findings in resting-state connectivity across sites, suc h as the organization of distributed brain netw orks (Biswal et al., 2010), it is reasonable to hypothesize that site effects will b e small compared to the com bination of physiological and within-site imaging v ariance. Multivariate analysis. Another imp ortan t consideration regarding the impact of site effects on group comparison in rs-fMRI connectivity is the t yp e of metho d used to identify differences. The concept of statistical p o wer is v ery w ell estab- lished in the GLM framework, which tests one brain connection at a time (mass univ ariate testing). How ev er, multiv ariate methods that combine sev eral or all connectivit y v alues in a single prediction are also widely used and lik ely affected b y the site effects. A p opular multiv ariate technique in rs-fMRI is supp ort- v ector machine (SVM) (Cortes and V apnik, 1995). In this approach, the group sample is split into a training set and a test set. The SVM is trained to predict group lab els on the training set, and the accuracy of the prediction is ev aluated indep enden tly on the test set. The accuracy level of the SVM captures the qualit y of the prediction of clinical lab els from resting-state connectivity , but do es not explicitly tell which brain connection is critical for the prediction. The accuracy score can thus b e seen as a separability index b et w een the individuals of tw o groups in high dimensional space. Altogether, the ob jectiv es and mea- sures of statistical risk for SVM and GLM are quite different. Because SVM has the abilit y to combine measures across connections, unlike univ ariate GLM tests, w e hypothesized that the GLM and SVM will b e impacted differently b y site effects. Even though the accuracy is exp ected to b e low er for the multisite than the monosite configuration, it as b een shown that the generalizabilit y of a predictiv e mo del to unseen sites is greater for models trained on m ultisite than monosite datasets as sho wn by Abraham et al. (2016). Sp e cific obje ctives. Our first ob jective was to characterize, using real data, the amplitude of systematic site effects in rs-fMRI connectivit y measures across sites, as a function of within-site v ariance. W e based our ev aluation on images generated from indep endent groups at 8 sites equipp ed with 3T scanners, in a subset ( N = 345) of the 1000 F CP . Our second ob jective w as to ev aluate the impact of site effects on the detection p o w er of group differences in rs-fMRI connectivit y . T o answer this question directly , one w ould need to scan tw o dif- feren t cohorts of participants at least twice, once in a multisite setting and once in a monosite setting. Such an exp erimen t may b e too costly to implement for addressing a purely tec hnical ob jective. As a more feasible alternativ e, w e im- plemen ted a series of Mon te Carlo sim ulations, adding synthetic “pathological” effects in the 1000 FCP sample. One interesting feature of the ”1000 FCP” dataset is the presence of one large site of ∼ 200 sub jects and 7 small sites of 4 ∼ 20 sub jects per site. W e were therefore able to implement realistic scenarios follo wing either a monosite or a multisite design (with 7 sites), with the same total sample size. Our simulations ga v e us full control on critical aspects for the detection of group differences, such as the amplitude of the group difference, sample size, and the balancing of groups across sites. W e ev aluated the ability of detecting group differences b oth in terms of sensitivit y for a GLM and in terms of accuracy for a SVM mo del. 2. Metho d 2.1. Imaging sample char acteristics The full 1000 FCP sample includes 1082 sub jects, with images acquired ov er 33 sites spread across North America, Europ e, Australia and China. As the 1000 FCP is a retrosp ectiv e study , no effort was made to harmonize p opula- tion c haracteristics or imaging acquisition parameters (Biswal et al., 2010). A subset of sites was selected based on the following criteria: (1) 3T scanner field strength, (2) full brain cov erage for the rs-fMRI scan, and, (3) a minimum of 15 y oung or middle aged adult participants, with a mixture of males and females (4) samples drawn from a population with a predominan t Caucasian ethnicit y . In addition, only young and middle aged participants (18-46 years old) were included in the study , and we further excluded sub jects with excessive motion (see next Section). The final sample for our study thus included 345 cognitively normal y oung adults (150 males, age range: 18-46 y ears, mean ± std: 23.8 ± 5.14) with images acquired across 8 sites lo cated in Germany , the United Kingdom, Australia and the United States of America. The total time of av ailable rs- fMRI data for these sub jects ranged betw een 6 and 7.5 min and only one run w as av ailable p er sub ject. See T able 2 for more details on the demographics and imaging parameters at each site selected in the study . The exp erimen tal proto cols for all datasets as w ell as data sharing in the 1000 FCP w ere appro v ed b y the resp ectiv e ethics committees of each site. This secondary analysis of the 1000 FCP sample w as approv ed by the lo cal ethics committee at CRIUGM, Univ ersity of Montreal, QC, Canada. 2.2. Computational envir onment All exp erimen ts were p erformed using the NeuroImaging Analysis Kit, NIAK 1 (Bellec et al., 2011) v ersion 0.12.18, under CentOS version 6.3 with Octav e 2 v ersion 3.8.1 and the Minc toolkit 3 v ersion 0.3.18. Analyses were executed in parallel on the Mammouth sup ercomputer 4 , using the pip eline system for Oc- ta ve and Matlab, PSOM (Bellec et al., 2012) version 1.0.2. The scripts used 1 http://simexp.github.io/niak/ 2 http://gnu.octave.org/ 3 http://www.bic.mni.mcgill.ca/ServicesSoftware/ServicesSoftwareMincToolKit 4 http://www.calculquebec.ca/index.php/en/resources/compute- servers/ mammouth- serie- ii 5 Site Magnet Scanner Channels N Nfinal Sex Age TR #Slices #F rames Baltimore, USA 3T Philips Achiev a 8 23 21 8M/15F 20-40 2.5 47 123 Berlin, DE 3T Siemens Tim T rio 12 26 26 13M/13F 23-44 2.3 34 195 Cambridge, USA 3T Siemens Tim T rio 12 198 195 75M/123F 18-30 3 47 119 Newark, USA 3T Siemens Allegra 12 19 17 9M/10F 21-39 2 32 135 NewY ork b, USA 3T Siemens Allegra 1 20 18 8M/12F 18-46 2 33 175 Oxford, UK 3T Siemens Tim T rio 12 22 20 12M/10F 20-35 2 34 175 Queensland, AU 3T Bruker 1 19 17 11M/8F 20-34 2.1 36 190 SaintLouis, USA 3T Siemens Tim T rio 12 31 31 14M/17F 21-29 2.5 32 127 T able 2: Sites selected from the 1000 F unctional Connectome Pro ject. for processing can b e found on Gith ub 5 . Prediction was p erformed using the LibSVM library (Chang and Lin, 2011). Visualization w as implemen ted using Python 2.7.9 from the Anaconda 2.2.0 6 distribution, along with Matplotlib 7 (Hun ter, 2007), Seab orn 8 and Nilearn 9 for brain map visualizations. 2.3. Pr epr o c essing Eac h fMRI dataset was corrected for slice timing; a rigid-b ody motion was then estimated for eac h time frame, both within and b et ween runs, as well as b et w een one fMRI run and the T1 scan for each sub ject (Collins et al., 1994). The T1 scan was itself non-linearly co-registered to the Montreal Neurological Institute (MNI) ICBM152 stereotaxic symmetric template (F ono v et al., 2011), using the CIVET pipeline (Ad-Dab’bagh et al., 2006). The rigid-b ody , fMRI- to-T1 and T1-to-stereotaxic transformations w ere all com bined to re-sample the fMRI in MNI space at a 3 mm isotropic resolution. T o minimize artifacts due to excessive motion, all time frames sho wing a frame displacement, as defined in P ow er et al. (2012), greater than 0.5 mm w ere remo ved and a residual motion estimated after scrubbing. A minimum of 50 unscrubb ed v olumes p er run was required for further analysis (13 sub jects w ere rejected). The following n uisance co v ariates were regressed out from fMRI time series: slow time drifts (basis of discrete cosines with a 0.01 Hz highpass cut-off ), av erage signals in conserv ative masks of the white matter and the lateral v entricles (av erage P earson correlation across all sub jects is 0.242 b et ween gra y matter and white matter signals, and 0.031 b et w een gray matter and v en tricles signals) as well as the first principal comp onen ts (accoun ting for 95% v ariance) of the six rigid-bo dy motion parame- ters and their squares (Gio ve et al., 2009; Lund et al., 2006). The fMRI volumes w ere finally spatially smo othed with a 6 mm isotropic Gaussian blurring kernel. 5 https://github.com/SIMEXP/Projects/tree/master/multisite 6 http://docs.continuum.io/anaconda/index 7 http://matplotlib.org/ 8 http://stanford.edu/ ~ mwaskom/software/seaborn/index.html 9 http://nilearn.github.io/ 6 A more detailed description of the pip eline can be found on the NIAK website 10 and Gith ub 11 . 2.4. Inter-site bias in r esting-state c onne ctivity F unctional c onne ctomes. W e compared the functional connectivity measures deriv ed from different sites of the 1000 F CP . A functional brain parcellation with 100 regions was first generated using a bo otstrap analysis of stable clusters (Bellec et al., 2010b), on the Cambridge cohort of the 1000 FCP ( N = 195), as describ ed in Orban et al. (2015). F or a given pair of regions, the connectivit y measure w as defined b y the Fisher transformation of the Pearson’s correlation co efficien t b et w een the av erage temp oral rs-fMRI fluctuations of the tw o regions. F or each sub ject, a 100 × 100 functional connectome matrix w as thus generated, featuring the connections for ev ery p ossible pair of brain regions. Inter-site effe cts. The in ter-site effects at a particular connection were defined as the absolute difference in av erage connectivity b et ween t wo sites. In order to formally test the significance of the inter-site effects, we used a GLM including age, sex and residual motion as cov ariates (corrected to hav e a zero mean across sub jects), as well as dumm y v ariables co ding for the a verage connectivity at eac h site. F or each site, a “contrast” v ector w as co ded to measure the difference in av erage connectivit y b et ween this site and the grand av erage of functional connectivit y combining all other sites. A p -v alue w as generated for eac h connec- tion to quantify the probability that the observ ed effect using this con trast was significan tly different from zero (W orsley and F riston, 1995). The num ber of false disco very was also controlled ( q = 0 . 05) using a Benjamini-Hoch b erg false disco very rate (FDR) pro cedure (Benjamini and Ho c hberg, 1995). T o quan tify the severit y of inter-site effects, we derived Cohen’s d effect size measure for eac h connection: | β c | / ˆ σ , with β c b eing the weigh t asso ciated with the con trast. The standard deviation from the noise ˆ σ w as calculated as ˆ σ = p P e 2 / ( N − K ), e b eing the residuals from the GLM, N the sample size and K the num b er of co v ariates in the mo del. As secondary analyses, t -tests w ere also implemen ted in the GLM to v alidate that age, sex as well as residual motion made significan t con tributions to the mo del. 2.5. Simulations Data gener ation pr o c ess. W e implemented Monte-Carlo simulations to assess the detection sensitivity of group differences in rs-fMRI connectivity . The sim- ulations were based on the 1000 FCP sample, with 8 sites totaling 345 sub jects. The multisite sim ulations were sampled from 148 sub jects, a v ailable across S = 7 sites. The monosite simulations w ere sampled from 195 sub jects a v ailable at S = 1 site (Cambridge). F or eac h sim ulation, a subset of sub jects of a giv en size N was selected randomly and stratified b y site. F or each site, a ratio W 10 http://niak.simexp- lab.org/pipe_preprocessing.html 11 https://github.com/SIMEXP/ 7 of the selected sub jects was randomly assigned to a s o-called “patient” group. W e fo cus our analysis on connections showing a fair-to-go od test-retest relia- bilit y based on a previous study rep orting 11 connections lik ely impacted by Alzheimer’s disease, see Orban et al. (2015) for details. F or eac h connection, a “pathology” effect was added to the connectivity measures of the sub jects b e- longing to the “patien t” group. This additiv e shift in connectivit y for “patien ts” w as selected as to achiev e a specified effect size, defined b elow. Effe ct size (Cohen ’s d ). The Cohen’s d was used to quantify the effect size. F or a group comparison, Cohen’s d is defined as the difference µ b et w een the means of the t wo groups, divided b y the standard deviation of the measures within eac h group, here assumed to b e equal. F or a giv en connection b et ween brain regions i and j , let y i,j b e the functional connectivity measure for a particular sub ject of the 1000 FCP sample. If the sub ject w as assigned to the “patient” group in a particular sim ulation, an effect was added to generate a sim ulated connectivity measure y ∗ i,j equal to y i,j + µ . F or a sp ecified effect size d , the parameter µ w as set to d × s i,j , where s i,j is the standard deviation of connectivity b etw een region i and j . The parameter s i,j w as estimated as the standard deviation of connectivit y measures across sub jects in the mono-site sample (Cambridge), without an y “pathological” effect simulated. GLM tests. In order to detect c hanges betw een the sim ulated groups at each connection, a GLM was estimated from the simulated data, using age, sex and frame displacemen t as confounds (corrected to hav e a zero mean across sub- jects). T o accoun t for site-specific effects, S − 1 dumm y v ariables (binary v ectors co ding for each site) w ere added to the mo del, with S being the total num b er of sites used in the study , in addition to an intercept accounting for the global a v er- age. Finally , one dummy v ariable co ded for the “patien t” group. The regression co efficien ts of the linear mo del w ere estimated with ordinary least squares, and a t -test, with associated p -v alue, was calculated for the co efficien t of the “pa- tien t” v ariable. A significan t pathology effect w as detected if the p v alue was smaller than a prescrib ed α level. The α lev el needs to be adjusted for multiple comparisons (in our case 11 connections, but this would dep end on the num ber of connections selected in a particular study), which can b e done in an adaptive manner using FDR. When connections are pre-sp ecified, such as in e.g. W ang et al. (2012), a more lib eral threshold can b e applied. In our case, since we w anted to hav e a constant b eha vior indep endent of the effect size, w e tested differen t t ypical v alues for α in { 0 . 001 , 0 . 01 , 0 . 05 } . F or eac h simulation sample b and eac h connection, we deriv ed a p -v alue p ( ∗ b ) , and the effect was deemed detected if p ∗ b w as less than α . The sensitivity of the test for a particular con- nection w as ev aluated b y the frequency of p ositiv e detections o ver all sim ulation samples. Pr e diction ac cur acy. In addition to mass univ ariate GLM tests, w e also inv esti- gated a linear SVM (Cortes and V apnik, 1995) using a Mon te-Carlo simulation of the prediction of clinical lab els based on cross-v alidation. F or SVM simulations, 8 all p ossible connections b et w een the 100 brain regions were used sim ultaneously to predict the presence of the simulated pathology in a given sub ject. F or a participan t assigned to the “patient” group, a “pathology” effect w as only sim- ulated in a set p ercen tage of connections, whic h were randomly selected. The prop ortion of connections with a non-null effect was denoted as π 1 . F or a given sim ulation at sample size N , the SVM model w as trained on N sub jects selected randomly and stratified by site. The accuracy of the model w as ev aluated on a separate sample consisting of the remaining sub jects, unused during train- ing. F or example, for a multisite simulation with N = 80 sub jects for training, the mo del accuracy was tested on the remaining 68 sub jects: 148 (av ailable sub jects) minus 80 (sub jects in the training set). During training, a 10-fold cross-v alidation was used to optimize the hyper-parameters of the SVM inde- p enden tly for each sim ulation. The mean and standard deviation of accuracy scores across all samples w ere derived for each simulation scenario. Simulation exp eriments. All the sim ulation parameters ha ve b een summarized b elo w: • Sample size N . • P atient allo cation ratio W . • Num b er of sites S . • The t yp e of detection metho d, either GLM or SVM. • F or GLM tests, the false-p ositiv e rate α . • F or SVM tests, the prop ortion of “pathological” connections π 1 . • The effect size d . F or a giv en set of sim ulation parameters, w e generated B = 10 3 Mon te-Carlo samples to estimate either the sensitivity (for GLM test) or the accuracy (for SVM prediction) of the metho d. F or all experiments, w e inv estigated effect sizes d ∈ { 0 , 2 } with a step of 0 . 01 and α ∈ { 0 . 001 , 0 . 01 , 0 . 05 } . The num b er of site(s) w as S = 1 for the monosite analysis and S = 7 for the multisite analysis. W e implemen ted the following exp eriments: • ( E 1 ) T est the impact of the sample size on GLM N ∈ { 40 , 80 , 120 } , with a fixed allo cation ratio W = 0 . 5. • ( E 2 ) T est the impact of the allo cation ratio on GLM W ∈ { 0 . 5 , 0 . 3 , 0 . 15 } for a fixed sample size N = 120. • ( E 3 ) T est the impact of multisite correction (regressing out the site effects using dummy v ariables co ding for eac h site) and affected connection v ol- ume ( π 1 ) on the prediction accuracy . F or the prediction scenario, we used a range of π 1 ∈ { 0 . 1 , 1 , 5% } , and tw o sample sizes N ∈ { 80 , 120 } sub jects for training, with mo del accuracy estimated on N = 68 and N = 28, resp ectiv ely . 9 3. Results 3.1. Inter-site effe cts in fMRI c onne ctivity Site effe cts in the default-mo de network. W e first focused on the connections as- so ciated with a seed region lo cated in the p osterior cingulate cortex, a k ey no de of the default-mode net work (DMN), which is one of the most widely studied resting-state net w orks (Greicius et al., 2004). The connections were based on the Cam bridge 100 parcellation, and w ere represen ted as a connectivity map, (Figure 1). Figure 1A sho ws the p osterior cingulate cortex connectivity map, a veraged across all sub jects and all sites. The k ey regions of the DMN are easily iden tifiable, and include the p osterior cingulate cortex, precuneus, inferior pari- etal lobule, anterior cingulate cortex, medial pre-frontal cortex (dorsal, anterior and ven tral), sup erior frontal gyri and the medial temp oral lob e (Damoiseaux et al., 2006; Dansereau et al., 2014; Y an et al., 2013). The av erage connectivity map of the DMN was then extracted for each site, Figure 1A. Qualitatively , the DMN maps were consisten t across sites, as exp ected based on the literature. W e then tested for the significance of the site effects (Figure 1B), i.e. the differ- ence in a verage connectivit y at a giv en site and the a verage connectivit y at all remaining sites. The statistical maps w ere corrected for m ultiple comparisons across the brain with FDR at q ≤ 0 . 05 (Benjamini and Ho c hberg, 1995). A sig- nifican t site effect for at least one connection could be iden tified for every site, without exception, Figure 1B. Figure 1C shows ho w reproducible the significant site effects were in connectivit y across the brain and sites. The identified signif- ican t connections were quite v ariable across sites, most of them b eing identified at less than three sites. Site effe cts acr oss the c onne ctome. In order to extend these observ ations out- side of the DMN, we derived the en tire connectome using the Cam bridge 100 parcellation. Figure 2A shows the av erage connectome, p ooling all sub jects and sites together. The regions hav e b een re-ordered based on a hierarchical clus- tering with W ard criterion. A netw ork structure is clearly visible as squares of high connectivit y on the diagonal of the connectome (as outlined by blac k lines). Each diagonal square corresponds to the intra-net w ork connectivity for a partition into 7 net works (Figure 2A). These 7 netw orks 12 w ere consistent with the ma jor resting-state netw orks reported using a cluster analysis in previous w orks (e.g. v an den Heuvel et al., 2008; Bellec et al., 2010a; Y eo et al., 2011; P ow er et al., 2011): the DMN, visual, sensorimotor, dorsal and v en tral atten- tional netw orks, mesolim bic and cereb ellar net works were iden tified (Figure 2B). Figure 2C shows how this large-scale connectome organization v aried from site to site. The av erage connectivity per site as well as significant differences with the av erage of the remaining sites ( q ≤ 0 . 05) is shown in Figure 2C. Visually , consisten t with our previous observ ations in the DMN, the organization of the 12 http://neurovault.org/images/39184/ 10 Figure 1: Panel A: map of the DMN obtained using a seed in the p osterior cingulate cortex, av eraging all sub jects and sites together (first row) and then a veraging all sub jects for eac h of the 8 sites (subsequent rows). Panel B shows the n umber of sites with a significant inter-site difference for each brain re- gion (first ro w) and the significant differences b etw een the a verage functional connectivit y maps of one site versus all the others (subsequent rows). 11 Figure 2: Panel A shows the av erage functional connectomes for 8 sites of the 1000 F CP . Colors next to the x and y axis corresp ond to differen t netw orks in a 7-cluster solution of the matrix, obtained from a hierarchical clustering (W ard criterion). P anel B presen ts the corresp onding 7 brain net w orks, along with lab els. P anel C shows av erage connectomes for individual sites, as well as connections with a significant site effect. Panel D shows the num b er of sites at whic h a giv en connection w as detected as significan t. ML: mesolimbic, CB: cereb ellar, VIS: visual, vA TT: ven tral atten tional, dA TT: dorsal attentional, DMN: default mo de netw ork, SM: sensorimotor. 12 a verage connectome into large-scale resting-state netw orks was preserv ed across all sites. Some significant site effects were still detected in the connectivity b oth within eac h netw ork, as well as b et w een netw orks. By counting the n umber of sites sho wing a significant effect for eac h pair of regions, it was apparen t that sig- nifican t site effects were quite v ariable in their lo calization and spread across the full connectome (Figure 2D). Concerning the asso ciation with the other confounding v ariables in the mo del (sex, age and motion) many connections w ere found to b e significantly asso ciated with motion, see Supplementary Ma- terial Figure S5, although v ery few connections were found to b e significantly asso ciated with the sex and age, see Supplementary Material Figure S6 and S7. W e also chec ked that the analysis w as not predominantly driven by the larger Cambridge site. W e thus ran the same analysis excluding that site (see Supplemen tary Material Figure S8). The num b er of significan t pairs remained v ery similar, although the spatial lo cation of half of the significant connectivity pairs c hanged when the large Cam bridge site w as remov ed from the analysis. Those findings do not qualitativ ely change our conclusion, but they influence the lo cation of the significant connections. These differences may b e due to the in trinsic v ariabilit y in the statistical test, and not just the size of the Cam bridge site. In summary , those findings supp ort the inclusion of age, sex and motion parameters in a GLM in order to remo ve their confounding effects in addition to site effects. Site effe cts vs. within-site variations acr oss subje cts. W e measured the ampli- tude of inter-site effects, represented as violin plots across connections using either the absolute difference in a verage connectivity (Figure 3A,C) or Cohen’s d effect size measures (Figure 3B,D). The violin plots include either every con- nection from the BASC Cambridge parcellation (Figure 3A,B), or only the 11 connections selected for Monte-Carlo simulations (Figure 3C,D). F or absolute differences, the distributions were mostly consistent across sites, with a me- dian around 0.06, 5% p ercen tile near 0 and 95% p ercentiles in the 0.08- 0.1 range. F or Cohen’s d , the distributions were also consisten t across sites, with a median around 0.33, 5% p ercentile near 0 and 95% p ercentiles in the 0.4- 0.6 range. These effect sizes are typically deemed small-to-mo derate (Cohen, 1992), although such a qualitative assessment needs to b e refined based on each application. This result thus suggests that the impact of additiv e in ter-site ef- fects on statistical tests will be limited. Similar findings w ere observed across all p ossible connections, or across the 11 pairs of connections selected in the sim ulation study . Differ enc es in standar d deviation acr oss sites. W e also inv estigated the site dif- ferences in standard deviation of connectivity across sub jects, see Supplemen- tary Figure S1 for the DMN, Supplemen tary Material S2 for the connectomes. The standard group GLM assumes equal v ariance of resting-state connectiv- it y across all sub jects, or “homoscedasticit y”. Significant differences in across- sub ject standard deviation b et w een sites violates the homoscedastic assump- 13 Figure 3: Effect size of the inter-site effects from a subset of 8 sites from the 1000 F CP . Panels A,C sho w the distribution of absolute differences in functional con- nectivit y , while panels B,D show Cohen’s d measures of inter-site effects. P anels A,B sho w violin plots across ev ery connections in the BASC Cam bridge 100 par- cellation, while Panels C,D fo cus on the selected 11 functional connections used in sim ulations, only . 14 tion, and may jeopardize the v alidit y of the false-p ositiv e rates of the mo del. Qualitativ ely , w e first observed that the sites showing the larger num b er of dif- ferences w ere the one with the most temp oral v ariance among connections see Supplemen tary Figure S3. W e then ran a White’s test aimed at rejecting ho- moscedasticit y at each connection, indep enden tly . The White’s tests resulted in a family of p-v alues, whic h was corrected for m ultiple comparisons using FDR ( q < 0 . 05). The homoscedastic hypothesis w as rejected in a large p ortion of connections. This was exp ected due to the large ov erall num ber of sub jects and consequen tly large statistical p o wer of White’s pro cedure. How ev er, despite reac hing significance, the absolute difference in the av erage standard deviation b et w een tw o sites was 19% of the grand av erage standard deviation, on av erage across pairs of sites. Such a small departure from homoscedasticity lik ely has only a mild impact on the GLM, which we formally inv estigated using Mon te- Carlo sim ulations. 3.2. Multisite Monte-Carlo simulations V alidity of the c ontr ol of false p ositives in the GLM. An excellent control of the false p ositiv e rate was observed at all nominal levels α ∈ { 0 . 001 , 0 . 01 , 0 . 05 } , b oth in monosite sim ulations or in multisite sim ulations, when site co v ariates w ere included in the GLM, see Figure 4. This means that the nominal, user- sp ecified, false p ositiv e rate matched precisely with the effectiv e false p ositiv e rate measured in the sim ulations. This observ ation held for any com bination of allo cation ratio, W ∈ { 15% , 30% , 50% } , and sample size, N ∈ { 40 , 80 , 120 } . By con trast, when no site co v ariates w ere included in the GLM, the false p os- itiv e rate was not controlled appropriately , sometimes b y a wide margin. In the absence of site co v ariates, the pro cedure w as sometimes too conserv ative, e.g. W = 50%, and sometimes very liberal, e.g. N = 120 , W = 15%. This exp erimen t show ed that, despite the mild departure from homoscedasticity re- p orted ab o v e, the GLM do es control for false-p ositiv e rate at eac h connection v ery precisely , if and only if site cov ariates are included in the mo del. Statistic al p ower and effe ct size. Figure 5A sho ws the r elationship betw een effect size and a GLM detection pow er in exp erimen t ( E 1 ), i.e. for a fixed allo cation ra- tio ( W = 50%) and three different sample sizes, N ∈ { 40 , 80 , 120 } . The av erage and std of detection p ow er was plotted across the 11 selected connections. The v ariations of statistical p ow er across connections were very small for monosite sim ulations, as the effect size was adjusted based on the standard deviation of each connection within that sample. As exp ected, the sensitivit y increased with sample size, quite markedly . In multisite simulations ( S = 7), for a large effect size ( d = 1), the detection p ow er was 20% with 40 sub jects , 80% with 80 sub jects and 95% with 120 sub jects. The sensitivity was larger with a single site than a multisite sample, yet the difference b et ween the t wo decreased as sample size increased. With N = 40 and d = 1, the detection p o wer w as close to 30% for a single site sample, compared to 20% for the m ultisite sample. With N = 120 and d = 1, the difference in sensitivity was only of a few p ercen t. The 15 Figure 4: Monte-Carlo simulation of the false p ositiv e rate in the absence of group differences ( d = 0), either for a monosite ( S = 1, left), a multisite ( S = 7) with (middle) or without (right) site cov ariates included in the GLM. In panel A, three differen t α v alues ha ve b een tested, α ∈ { 0 . 001 , 0 . 01 , 0 . 05 } with a fixed sample size and patient allocation ratio ( N = 120 , W = 50%). In panel B, three differen t sample sizes ha ve b een tested, N ∈ { 40 , 80 , 120 } with a fixed patien t allo cation ratio ( W = 50%) (Exp erimen t ( E 1 )). In panel C, three different patien t allo cation ratios ha ve been tested, W ∈ { 50% , 30% , 15% } with a fixed sample size ( N = 120) (Exp eriment ( E 2 )). 16 Figure 5: Monte-Carlo simulation of detection p ow er as a function of the effect size d ∈ [0 , 2], either for a monosite ( S = 1, in red) or a m ultisite ( S = 7, in blue) sample, when testing differences b etw een t wo groups with a GLM and a false-p ositiv e rate α = 0 . 001. The plain curv es are the av erage statistical p ow er across 11 connections, and the shaded area represen ts ± 1 standard deviation across connections. In panel A, the patient allo cation ratio is fixed ( W = 50%) and three differen t sample sizes hav e b een tested, N ∈ { 40 , 80 , 120 } (Exp eriment ( E 1 )). In panel B, the sample size is fixed ( N = 120) and three different patient allo cation ratios hav e b een tested W ∈ { 15% , 30% , 50% } (Experiment ( E 2 )). 17 same trend was apparent for all tested effect sizes as well as for α ∈ { 0 . 01 , 0 . 05 } (not sho wn). Figure 6: Effect size detectable at 80% sensitivity as a function of sample size, for differen t false-p ositiv e rate α ∈ { 0 . 05 , 0 . 01 , 0 . 001 } (exp erimen t ( E 1 )). All sim ulations used a balanced patient allocation ratio W = 50%. The monosite p erformance is sho wn in red and the m ultisite in blue. The dotted blac k line sho ws the detectable effect size for a classical parametric t -test. Statistic al p ower and gr oup al lo c ation r atio. Figure 5B shows the relationship b et w een effect size and a GLM detection pow er in exp eriment ( E 2 ), i.e. for a fixed sample size ( N = 120) and three differen t patient allo cation ratio, W ∈ { 15% , 30% , 50% } . Overall, we found that the detection p o wer increased with W . F or example, with d = 1, the detection p o w er w as 65% for W = 15%, and increased to 90% with W = 30%, and finally 95% for W = 50%. The impact of W w as observed in b oth monosite and multisite samples, with an optimal allo cation ratio of W = 50% for b oth. This observ ation was also made for α ∈ { 0 . 01 , 0 . 05 } (not shown). Dete ctable effe ct size, as a function of sample size. An alternative summary of exp erimen t ( E 1 ) is to represen t the effect size that can be detected with 80% sensitivit y , as a function of sample size for monosite and m ultisite configurations, see Figure 6. As a reference, w e computed the same curve for parametric t -test comparisons, under assumptions of normality . As expected, the detectable effect size for parametric t -tests closely follow ed the monosite estimation. F or a small sample size ( N = 40), the detectable effect size was notably larger in multisite configurations than in a monosite configuration (difference of ab out 0.25 in Cohen’s d for α = 0 . 001). How ev er, the difference decreased for large sample sizes to b ecome smaller than 0.1 with N = 120 and α = 0 . 001. The low est detectable effect size for a sensitivit y of 80% at α = 0 . 05 was about d = 0 . 8, ac hieved in a monosite configuration with N = 120. At this sample size, the difference b et w een single and multisite configurations w as marginal, with only a few p ercen t’s of difference in detectable effect sizes. 18 Figure 7: Prediction accuracy of patient vs. con trols as a function of effect size. Three sim ulation settings are presented on each plot: monosite (red curve), m ultisite with regression of site effects ( S = 7, blue curve), and m ultisite without regression of site effects ( S = 7, black curve). Accuracy was estimated ov er B = 10 3 sim ulation samples with a patient allo cation ratio W = 50% and 3 v olumes of affected connections π 1 = 0 . 1% (left column), π 1 = 1% (middle column) and π 1 = 5% (right column). Two sample sizes w ere tested: N = 120 randomly selected sub jects for training, with the remaining N = 28 to estimate accuracy (first ro w), and N = 80 randomly selected sub jects for training, with the remaining N = 68 to estimate accuracy (second row). 19 Pr e diction ac cur acy. In exp eriment ( E 3 ), we examined the impact of effect size and the volume of affected connections on prediction accuracy in a SVM, see Figure 7. The volume of changes π 1 had a ma jor impact on prediction accu- racy . At π 1 = 0 . 1% (around 5 connections) the accuracy lev el was at chance lev el across all tested effect sizes, (Figure 7A). With π 1 = 1%, accuracy slightly increased, but effect sizes larger than d = 2 were still required to reac h o v er 80% accuracy (Figure 7B). With π 1 = 5%, 95% accuracy w as ac hieved at the same effect size (ab out d = 1 . 5) for monosite and multisite simulations, although the accuracy in multisite simulations w as notably low er than for monosite simula- tions across most effect sizes (Figure 7C). The relationship b et ween effect size and accuracy follo wed a sigmoidal curv e in b oth settings, yet a sharp er, and later transition b et ween very low and very high accuracy was observ ed in mul- tisite simulations. In terestingly , correcting for site effects by regressing out the dumm y v ariable b efore running the SVM classifier had no impact on accuracy lev els. The sample size ( N = 80 vs N = 120 for training) did hav e a moderate effect on prediction accuracy: for π 1 = 5% and d = 1 and monosite simulations, accuracy w as ab out 85% with N = 120 (Figure 7C) and 75% with N = 80 (Figure 7F). 4. Discussion and conclusions Inter-site effe cts in rs-fMRI c onne ctivity. Typical resting-state netw orks, such as the DMN, the atten tional, visual and sensorimotor netw orks, w ere reliably found across sites. This was strongly expected given the relativ e consistency of their distribution across individuals, studies, prepro cessing approac hes or even metho ds used to extract net w orks (e.g. Damoiseaux et al., 2006; v an den Heuv el et al., 2008; Bellec et al., 2010b; Y eo et al., 2011; P ow er et al., 2011). W e ho w- ev er found that significan t differences in a v erage connectivit y existed b et ween sites, as previously rep orted b y Y an et al. (2013). These site effects in con- nectivit y ma y undermine the generalization of the results derived at a single site. The in ter-sub ject (in tra-site) standard deviation of the connections was found to b e more than twice as large as the inter-site absolute effect, on av er- age across brain connections. This effect size measured in Cohen’s d w ould b e deemed small-to-mo derate, whic h suggests that the impact of additive in ter-site effects on statistical tests will b e limited. This is a reassuring finding supp ort- ing the feasibilit y of statistical tests p ooling fMRI data across multiple sites. Previous studies (Sutton et al., 2008; Brown et al., 2011) had reported inter-site v ariance up to 10 times smaller than in ter-sub ject v ariabilit y , but these studies had m uc h more homogeneous scanning en vironmen ts than ours and also used differen t fMRI outcome measures. In our case, we still inv estigated only 3T scanners, mostly Siemens, and inter-site effects ma y b e larger when considering other man ufacturers or field strengths. Statistic al p ower and multisite rs-fMRI. After accoun ting for site-related addi- tiv e effec ts in a GLM, the multisite simulation po oling 7 sites together sho wed 20 detection p o w er close to that of a monosite simulation with equiv alen t sample size. The difference w as noticeable for small sample size (total N = 40), and b ecame very small for a sample size N = 120. Another observ ation was that, for a giv en detection p o wer, the lo west effect size that w e were able to detect w as more v ariable across connections for a lo w sample size. W e demonstrated that a parametric group GLM does control precisely for the rate of false positive disco veries, ev en in multisite settings, as long as site cov ariates are included in the model. T ak en together, these observ ations suggest to use sample sizes larger than 100 sub jects for GLM m ultisite studies. This conclusion ma y dep end on the num b er of sites p o oled in the study and the actual n umber of sub jects in eac h of those sites, which we could not test in this work due to the size of the a v ailable sample. Mo deling site effe cts as r andom variables. W e mo deled the effect of eac h site on the av erage connectivity b et w een any given pair of regions as a fixed effect. This means that the prop osed GLM inference does apply only to collection of sites included in a given analysis. The linear mixed-effects mo del (Chen et al., 2013) would allow more p ow erful inferences: by mo deling site effects as random v ariables, follo wing a sp ecific distribution (e.g. Gaussian), we would b e able to generalize observ ations p oten tially to any collection of sites, provided our assumptions are accurate. The sample of sites av ailable for this study (7 at most) is how ev er to o small in our view to correctly estimate the v ariabilit y of effects across sites. This work would also require to formulate and in v estigate empirically as well as on simulations different mo dels for the distribution of in ter-site v ariations of site effects (e.g. Gaussian distribution). Site heter osc e dasticity. W e observed mild heteroscedasticity across sites. Our sim ulations show ed that this does not compromise the con trol of false p ositive rate in the GLM, ev en under homoscedastic assumptions, with the range of con trasts w e in vestigated. Regression mo dels more robust to heteroscedasticity ma y be inv estigated in the future, e.g. weigh ted least squares regres sion or linear mixed-effects mo deling (Chen et al., 2013). Statistic al p ower and sample size. F or a medium effect size, e.g. d = 0 . 5, the sensitivit y was low (below 20%), ev en for monosite sim ulations with N = 120 sub jects. This sob ering result supp orts the current trend in the literature to p ool multiple data samples to increase sample size, at the cost of decreased homogeneit y . W e also found that resting-state studies based on 40 sub jects or less, even at a single site, are seriously underp o wered, except for extremely large effect sizes (Cohen’s d greater than 1.5). Finally , un balanced patient allo cation ratio in site samples greatly reduces sensitivity , even in monosite studies. Bal- anced datasets, i.e. with equal n um b ers of patients and contr ols at eac h site, should therefore b e fav ored. Pr e diction. Comparing the monosite and the multisite accuracy curves rev eals a substan tial drop in accuracy from monosite to multisite across a broad range 21 of effect sizes. Ho wev er, it should b e noted that classifiers trained across multi- ple data sources will likely generalize b etter to new observ ations, which is likely a critical feature in most applications and reflects the true p otential clinical utilit y of this t yp e of technique. Our conclusions are consistent with the w ork of Nielsen et al. (2013), which compares the prediction of a clinical diagnosis of autism in monosite vs. m ultisite settings. The authors concluded that the prediction accuracy for the m ultisite sample w as significantly smaller than for the monosite sample. A somewhat surprising observ ation in our analysis w as that linear correction for site-specific effects did not improv e accuracy of pre- diction using SVM. The SVM mo del seems to learn features that are in v arian t across sites, maybe fo cusing on connections with the smallest site effect, or lo oking at differences b etw een connections similarly impacted by a site effect. Finally , an imp ortan t conclusion of our sim ulations w as that the volume of brain connections affected by a disease impacts accuracy as m uch as the effect size p er connection. This suggests that feature reduction and/or selection is a very imp ortan t step to impro ve sensitivity to small effect sizes. Beyond additive site effe ct. An important limitation to our study is that w e only in vestigated the impact of additive effects in brain connectivity across sites. Areas of future w ork include interactions b etw een site effects and pathology , p ossibly in the form of polynomial and non-linear interactions. W e hop e that, in the future, fMRI data acquired on clinical cohorts at tens of sites will b ecome a v ailable, whic h will enable researchers to test empirically the presence of such in teraction effects. Other typ es of multisite data. Another limitation of our study is that w e only in vestigated multisite data featuring roughly equal sample sizes with fairly bal- anced patient allo cation ratios at eac h site. Multisite studies including a very large n umber of sites with sometimes only a few sub jects per site are how ever quite common, e.g. the Alzheimer’s disease neuroimaging initiative (ADNI) (Mueller et al., 2005) and man y pharmaceutical clinical trials at phase I I and I II 13 . In this type of design, the m ultisite effect ma y pla y a m uc h more pro- nounced role than in our simulations as it cannot be mo deled in the GLM, and will b ecome an in trinsic added source of in ter-sub ject v ariance (F easter et al., 2011). Unfortunately , this type of design could not b e tested with the cur- ren t dataset due to the limited num ber of sites a v ailable. This represen ts an imp ortan t a ven ue of future w ork. Underlying c auses of the site effe cts. Not all sites seemed to b e equally impacted b y the site effects, with sites like Berlin or Sain t-Louis showing a small num b er of connections significan tly differen t then the grand a v erage connectivit y matrix, while sites like Baltimore, Queensland and Oxford show ed many more connec- tions affected by the site effects. Interestingly this can p otentially be due to 13 http://www.rochetrials.com/trialDetailsGet.action?studyNumber=BP28248 22 temp oral v ariance of the connections (see Supplementary Figure S3) partly ex- plained b y the scanner make since Queensland and Baltimore site used scanners from differen t mak ers (namely Bruk er and Philips) th an the rest of the sites used in this study (Siemens scanners). This ma y suggest that scanners SNR (signal to noise ratio) may partly explain the v ariance of connectivit y . These differences ma y not be statistically significan t, or they may reflect real differences due to proto col, scanner characteristics at these sites or differences in sampling across sites. Multiple causes ma y b e in teracting together to pro duce the site effects, as rep orted b y Y an et al. (2013), although some of these sources of v ariance could b e b etter controlled like the scanner parameters, paired with the use of a phan tom to promote more homogeneous configurations across sites (F riedman et al., 2006; F riedman and Glo v er, 2006; Glov er et al., 2012; F riedman et al., 2008). Ev en in standardized exp erimen ts, it should be noted that differences in scanner proto cols remain (Brown et al., 2011). A muc h larger multisite sample with systematically v arying parameters could e nable a data-driven identifi ca- tion of the critical parameters impacting site effects. The v arious releases made b y the INDI initiativ e ma y fill that gap in the literature in the future, as the scanner proto cols are muc h b etter describ ed in recent releases, suc h as CoRR (Zuo et al., 2014), than they w ere in the initial F CP release. These findings stress the need for more work to find the source of that v ariance rather than ad-ho c pro cedures to correct for them. 5. Ac kno wledgments P arts of this work were presented at the 2013 ann ual meeting of the Orga- nization for Human Brain Mapping, as w ell as the 2013 Alzheimer’s Asso cia- tion International Conference (AAIC) (Dansereau et al., 2013). The authors are grateful to the members of the 1000 functional connectome consortium for publicly releasing their datasets. The computational resources used to p er- form the data analysis were pro vided b y ComputeCanada 14 and CLUMEQ 15 , whic h is funded in part by NSERC (MRS), FQRNT, and McGill Universit y . This pro ject w as funded by NSERC gran t num ber RN000028 and the Canadian Consortium on Neuro degeneration in Aging (CCNA), through a grant from the Canadian Institute of Health Research and funding from sev eral partners in- cluding SANOFI-AD VENTIS R&D. PB is supp orted b y a salary a ward from “F onds de rec herche du Qu´ eb ec – Sant ´ e” and the Courtois F oundation. References Abraham, A., Milham, M., Di Martino, A., Craddo ck, R. C., Samaras, D., Thirion, B., V aro quaux, G., 2016. Deriving repro ducible biomark ers from m ulti-site resting-state data: An autism-based example. NeuroImage. 14 https://computecanada.org/ 15 http://www.clumeq.mcgill.ca/ 23 Ad-Dab’bagh, Y., Einarson, D., Lyttelton, O., Muehlb oeck, J. S., Mok, K., Iv ano v, O., Vincen t, R. D., Lepage, C., Lerc h, J., F ombonne, E., Ev ans, A. C., 2006. The CIVET Image-Pro cessing environmen t: A fully automated comprehensiv e pip eline for anatomical neuroimaging research. In: Corb etta, M. (Ed.), Pro ceedings of the 12th Annual Meeting of the Human Brain Map- ping Organization. Neuroimage, Florence, Italy . Anand, A., Li, Y., W ang, Y., Gardner, K., Low e, M. J., 2007. Recipro cal ef- fects of antidepressan t treatment on activity and connectivity of the mo o d regulating circuit: an fmri study . The Journal of neuropsychiatry and clinical neurosciences. Bellec, P ., Carb onell, F., Perlbarg, V., Ev ans, A. C., 2010a. A neuroimaging analysis kit for Octa ve and Matlab. URL http://code.google.com/p/niak/ Bellec, P ., Carb onell, F. M., Perlbarg, V., Lepage, C., Lyttelton, O., F onov, V., Janke, A., T ohk a, J., Ev ans, A. C., 2011. A neuroimaging analysis kit for Matlab and Octav e. In: Pro ceedings of the 17th International Conference on F unctional Mapping of the Human Brain. pp. In Press+. Bellec, P ., Lav oie-Courc hesne, S., Dic kinson, P ., Lerch, J. P ., Zijden b os, A. P ., Ev ans, A. C., 2012. The pip eline system for Octav e and Matlab (PSOM): a ligh tw eigh t scripting framework and execution engine for scientific workflo ws. F ron tiers in neuroinformatics 6. URL http://dx.doi.org/10.3389/fninf.2012.00007 Bellec, P ., Rosa-Neto, P ., Lyttelton, O. C., Benali, H., Ev ans, A. C., Jul. 2010b. Multi-lev el b ootstrap analysis of stable clusters in resting-state fMRI. Neu- roImage 51 (3), 1126–1139. URL http://dx.doi.org/10.1016/j.neuroimage.2010.02.082 Benjamini, Y., Ho c hberg, Y., 1995. Controlling the false-discov ery rate: a prac- tical and p ow erful approac h to m ultiple testing. J. Ro y . Statist. So c. Ser. B 57, 289–300. Bisw al, B. B., Mennes, M., Zuo, X.-N. N., Gohel, S., Kelly , C., Smith, S. M., Bec kmann, C. F., Adelstein, J. S., Buc kner, R. L., Colcom be, S., Dogonowski, A.-M. M., Ernst, M., F air, D., Hampson, M., Hoptman, M. J., Hyde, J. S., Kiviniemi, V. J., K¨ otter, R., Li, S.-J. J., Lin, C.-P . P ., Low e, M. J., Mac k a y , C., Madden, D. J., Madsen, K. H., Margulies, D. S., Ma yb erg, H. S., McMa- hon, K., Monk, C. S., Mostofsky , S. H., Nagel, B. J., P ek ar, J. J., Peltier, S. J., P etersen, S. E., Riedl, V., Rombouts, S. A., Rypma, B., Schlaggar, B. L., Sc hmidt, S., Seidler, R. D., Siegle, G. J., Sorg, C., T eng, G.-J. J., V eijola, J., Villringer, A., W alter, M., W ang, L., W eng, X.-C. C., Whitfield-Gabrieli, S., Williamson, P ., Windisch b erger, C., Zang, Y.-F. F., Zhang, H.-Y. Y., Castel- lanos, F. X., Milham, M. P ., Mar. 2010. T ow ard disco very science of h uman brain function. Pro ceedings of the National Academy of Sciences of the United 24 States of America 107 (10), 4734–4739. URL http://dx.doi.org/10.1073/pnas.0911855107 Bro wn, G. G., Mathalon, D. H., Stern, H., F ord, J., Mueller, B., Greve, D. N., McCarth y , G., V oyv odic, J., Glov er, G., Diaz, M., et al., 2011. Multisite reliabilit y of cognitive b old data. Neuroimage 54 (3), 2163–2175. Caramanos, Z., F onov, V. S., F rancis, S. J., Nara yanan, S., Pike, G. B., Collins, D. L., Arnold, D. L., 2010. Gradien t distortions in mri: Characterizing and correcting for their effects on siena-generated measures of brain volume c hange. NeuroImage 49 (2), 1601–1611. Chang, C.-C., Lin, C.-J., 2011. LIBSVM: A library for supp ort v ector mac hines. A CM T ransactions on Intelligen t Systems and T ec hnology 2, 27:1–27:27, soft- w are av ailable at http://www.csie.ntu.edu.tw/ ~ cjlin/libsvm . Chen, G., Saad, Z. S., Britton, J. C., Pine, D. S., Cox, R. W., 2013. Linear mixed-effects mo deling approac h to fmri group analysis. Neuroimage 73, 176– 190. Cheng, W., Palaniy appan, L., Li, M., Kendrick, K. M., Zhang, J., Luo, Q., Liu, Z., Y u, R., Deng, W., W ang, Q., Ma, X., Guo, W., F rancis, S., Liddle, P ., Ma yer, A. R., Sc humann, G., Li, T., F eng, J., May 2015. V o xel-based, brain- wide asso ciation study of aberrant functional connectivit y in schizophrenia implicates thalamo cortical circuitry . Np j Sc hizophrenia 1, –. URL http://dx.doi.org/10.1038/npjschz.2015.16 Cohen, J., 1992. A p o w er primer. Psychological bulletin 112 (1), 155. Collins, D. L., Neelin, P ., Peters, T. M., Ev ans, A. C., 1994. Automatic 3D in tersub ject registration of MR v olumetric data in standardized T alairach space. Journal of computer assisted tomograph y 18 (2), 192–205. URL http://view.ncbi.nlm.nih.gov/pubmed/8126267 Cortes, C., V apnik, V., Sep. 1995. Supp ort-vector net works. Machine Learning 20 (3), 273–297. URL http://dx.doi.org/10.1007/BF00994018 Damoiseaux, J. S., Rom b outs, S. A. R. B., Barkhof, F., Sc heltens, P ., Stam, C. J., Smith, S. M., Beckmann, C. F., Sep. 2006. Consistent resting-state net works across healthy sub jects. Proce edings of the National Academy of Sciences 103 (37), 13848–13853. URL http://dx.doi.org/10.1073/pnas.0601417103 Dansereau, C., Bellec, P ., Lee, K., Pittau, F., Gotman, J., Gro v a, C., 2014. Detection of abnormal resting-state net works in individual patien ts suffering from fo cal epilepsy: An initial step tow ard individual connectivity assess- men t. F rontiers in Neuroscience 8 (419). URL http://www.frontiersin.org/brain_imaging_methods/10.3389/ fnins.2014.00419/abstract 25 Dansereau, C., Risterucci, C., Pich, E. M., Arnold, D., Bellec, P ., 2013. A p o w er analysis for multisite studies in resting-state functional connectivity , with an application to clinical trials in alzheimer’s disease. V ol. 9. pp. P248 – P249, alzheimer’s Asso ciation International Conference 2013 Alzheimer’s Asso ciation International Conference 2013. URL http://www.sciencedirect.com/science/article/pii/ S1552526013011461 Desmond, J., Glo ver, G., Aug. 2002. Estimating sample size in functional mri (fmri) neuroimaging studies: Statistical p o w er analyses. Journal of Neuro- science Metho ds 118 (2), 115–128. URL http://dx.doi.org/10.1016/s0165- 0270(02)00121- 8 Durnez, J., Mo erkerk e, B., Nichols, T. E., 2014. Post-hoc p ow er estimation for top ological inference in fmri. Neuroimage 84, 45–64. Edw ard, V., Windisch b erger, C., Cunnington, R., Erdler, M., Lanzenberger, R., Ma yer, D., Endl, W., Beisteiner, R., Nov 2000. Quantification of fmri artifact reduction by a no vel plaster cast head holder. Hum Brain Mapp 11 (3), 207– 213. Elliott, M. R., Bowtell, R. W., Morris, P . G., Jun 1999. The effect of scanner sound in visual, motor, and auditory functional mri. Magn Reson Med 41 (6), 1230–1235. F air, D. A., Nigg, J. T., Iyer, S., Bathula, D., Mills, K. L., Dosen bac h, N. U. F., Sc hlaggar, B. L., Mennes, M., Gutman, D., Bangaru, S., Buitelaar, J. K., Dickstein, D. P ., Martino, A. D., Kennedy , D. N., Kelly , C., Luna, B., Sch w eitzer, J. B., V elanov a, K., W ang, Y.-F., Mostofsky , S., Castellanos, F. X., Milham, M. P ., 2012. Distinct neural signatures detected for adhd subt yp es after controlling for micro-mo vemen ts in resting state functional connectivit y mri data. F ront Syst Neurosci 6, 80. URL http://dx.doi.org/10.3389/fnsys.2012.00080 F easter, D., Mikulic h-Gilbertson, S., Brinc ks, A., Sep. 2011. Mo deling site effects in the design and analysis of multi-site trials. The American journal of drug and alcohol abuse 37 (5), 383–391. URL http://dx.doi.org/10.3109/00952990.2011.600386 F ono v, V., Ev ans, A. C., Botteron, K., Almli, C. R., McKinstry , R. C., Collins, D. L., Brain Developmen t Co operative Group, Jan. 2011. Unbiased a verage age-appropriate atlases for p ediatric studies. NeuroImage 54 (1), 313–327. URL http://dx.doi.org/10.1016/j.neuroimage.2010.07.033 F riedman, L., Glov er, G., Jun. 2006. Report on a m ulticen ter fmri qualit y assur- ance protocol. Journal of magnetic resonance imaging : JMRI 23 (6), 827–839. URL http://dx.doi.org/10.1002/jmri.20583 26 F riedman, L., Glov er, G., Consortium, T. F., Nov. 2006. Reducing interscanner v ariabilit y of activ ation in a multicen ter fmri study: Controlling for signal-to- fluctuation-noise-ratio (sfnr) differences. NeuroImage 33 (2), 471–481. URL http://dx.doi.org/10.1016/j.neuroimage.2006.07.012 F riedman, L., Stern, H., Brown, G. G., Mathalon, D. H., T urner, J., Glov er, G. H., Gollub, R. L., Lauriello, J., Lim, K. O., Cannon, T., et al., 2008. T est–retest and betw een-site reliabilit y in a m ulticen ter fmri study . Human brain mapping 29 (8), 958–972. Gio ve, F., Gili, T., Iacov ella, V., Macaluso, E., Maraviglia, B., Oct. 2009. Images-based s uppression of unw an ted global signals in resting-state func- tional connectivit y studies. Magnetic resonance imaging 27 (8), 1058–1064. URL http://dx.doi.org/10.1016/j.mri.2009.06.004 Glo ver, G. H., Mueller, B. A., T urner, J. A., v an Erp, T. G., Liu, T. T., Greve, D. N., V o yvodic, J. T., Rasm ussen, J., Bro wn, G. G., Keator, D. B., et al., 2012. F unction biomedical informatics research net work recommendations for prosp ectiv e multicen ter functional mri studies. Journal of Magnetic Reso- nance Imaging 36 (1), 39–54. Greicius, M. D., Sriv astav a, G., Reiss, A. L., Menon, V., Mar. 2004. Default- mo de netw ork activity distinguishes Alzheimer’s disease from healthy aging: Evidence from functional MRI. Pro ceedings of the National Academy of Sci- ences of the United States of America 101 (13), 4637–4642. URL http://dx.doi.org/10.1073/pnas.0308627101 Hun ter, J. D., 2007. Matplotlib: A 2d graphics en vironment. Computing In Science & Engineering 9 (3), 90–95. Jo vicich, J., Minati, L., Marizzoni, M., Marc hitelli, R., Sala-Llonc h, R., Bartr ´ es- F az, D., Arnold, J., Benninghoff, J., Fiedler, U., Ro ccatagliata, L., et al., 2016. Longitudinal repro ducibilit y of default-mo de netw ork connectivity in health y elderly participants: A m ulticentric resting-state fmri study . Neu- roImage 124, 442–454. Kilpatric k, L., Zald, D., Pardo, J., Cahill, L., 2006. Sex-related differences in amygdala functional connectivity during resting conditions. Neuroimage 30 (2), 452–461. Lund, T. E., Madsen, K. H., Sidaros, K., Luo, W.-L., Nichols, T. E., Jan. 2006. Non-white noise in fMRI: do es mo delling ha v e an impact? NeuroImage 29 (1), 54–66. URL http://dx.doi.org/10.1016/j.neuroimage.2005.07.005 Milham, M. P ., F air, D., Mennes, M., Mostofsky , S. H., 2012. The adhd-200 consortium: a model to adv ance the translational p otential of neuroimaging in clinical neuroscience. F ron tiers in Systems Neuroscience 6 (62). URL http://www.frontiersin.org/systems_neuroscience/10.3389/ fnsys.2012.00062/full 27 Mueller, S. G., W einer, M. W., Thal, L. J., P etersen, R. C., Jack, C., Jagust, W., T ro jano wski, J. Q., T oga, A. W., Bec kett, L., Nov 2005. The alzheimer’s disease neuroimaging initiativ e. Neuroimaging Clin N Am 15 (4), 869–77, xi– xii. URL http://dx.doi.org/10.1016/j.nic.2005.09.008 Nielsen, J., Zielinski, B., Fletcher, T., Alexander, A., Lange, N., Bigler, E., Lainhart, J., Anderson, J., 2013. Multisite functional connectivit y mri classi- fication of autism: Abide results. F rontiers in human neuroscience 7, –. URL http://view.ncbi.nlm.nih.gov/pubmed/24093016 Orban, P ., Madjar, C., Sa v ard, M., Dansereau, C., T am, A., Das, S., Ev ans, A. C., Rosa-Neto, P ., Breitner, J. C., Bellec, P ., o ct 2015. T est-retest resting- state fMRI in healthy elderly p ersons with a family history of alzheimer’s disease. Scien tific Data 2, 150043. URL http://dx.doi.org/10.1038/sdata.2015.43 P ow er, J. D., Barnes, K. A., Snyder, A. Z., Sc hlaggar, B. L., P etersen, S. E., F eb. 2012. Spurious but systematic correlations in functional connectivity MRI net works arise from sub ject motion. NeuroImage 59 (3), 2142–2154. URL http://dx.doi.org/10.1016/j.neuroimage.2011.10.018 P ow er, J. D., Cohen, A. L., Nelson, S. M., Wig, G. S., Barnes, K. A., Churc h, J. A., V ogel, A. C., Laumann, T. O., Miezin, F. M., Schlaggar, B. L., Pe- tersen, S. E., Nov. 2011. F unctional Netw ork Organization of the Human Brain. Neuron 72 (4), 665–678. URL http://dx.doi.org/10.1016/j.neuron.2011.09.006 Shehzad, Z., Kelly , C. M., Reiss, P . T., Gee, D. G., Gotimer, K., Uddin, L. Q., Lee, S. H. H., Margulies, D. S., Roy , A. K. K., Bisw al, B. B., Petk ov a, E., Castellanos, F. X., Milham, M. P ., Oct. 2009. The resting brain: uncon- strained yet reliable. Cerebral cortex (New Y ork, N.Y. : 1991) 19 (10), 2209– 2229. URL http://dx.doi.org/10.1093/cercor/bhn256 Sheline, Y. I., Price, J. L., Y an, Z., Min tun, M. A., 2010. Resting-state func- tional mri in depression unmasks increased connectivit y betw een netw orks via the dorsal nexus. Pro ceedings of the National Academ y of Sciences 107 (24), 11020–11025. Sutton, B. P ., Goh, J., Hebrank, A., W elsh, R. C., Chee, M. W., Park, D. C., 2008. In vestigation and v alidation of intersite fmri studies using the same imaging hardw are. Journal of Magnetic Resonance Imaging 28 (1), 21–28. T am, A., Dansereau, C., Badh war, A., Orban, P ., Belleville, S., Chertko w, H., Dagher, A., Hanganu, A., Monc hi, O., Rosa-Neto, P ., Shmuel, A., W ang, S., Breitner, J., Bellec, P ., 2015. Common effects of amnestic mild cognitive impairmen t on resting-state connectivity across four indep enden t studies. F ron tiers in Aging Neuroscience 7 (242). 28 URL http://www.frontiersin.org/aging_neuroscience/10.3389/ fnagi.2015.00242/abstract v an den Heuv el, M., Mandl, R., Hulshoff Pol, H., Apr. 2008. Normalized Cut Group Clustering of Resting-State fMRI Data. PLoS ONE 3 (4), e2001+. URL http://dx.doi.org/10.1371/journal.pone.0002001 V an Dijk, K. R., Hedden, T., V enk ataraman, A., Ev ans, K. C., Lazar, S. W., Buc kner, R. L., Jan. 2010. In trinsic functional connectivity as a to ol for human connectomics: theory , prop erties, and optimization. Journal of neurophysiol- ogy 103 (1), 297–321. URL http://dx.doi.org/10.1152/jn.00783.2009 V an Dijk, K. R., Sabuncu, M. R., Buckner, R. L., 2012. The influence of head motion on intrinsic functional connectivity mri. Neuroimage 59 (1), 431–438. V anhoutte, G., V erho ye, M., der Linden, A. V., May 2006. Changing b ody temp erature affects the t2* signal in the rat brain and reveals hypothalamic activit y . Magn Reson Med 55 (5), 1006–1012. URL http://dx.doi.org/10.1002/mrm.20861 W ang, D., Liu, B., Qin, W., W ang, J., Zhang, Y., Jiang, T., Y u, C., Dec. 2012. KIBRA gene v arian ts are asso ciated with synchronization within the default- mo de and executive control net works. NeuroImage. URL http://dx.doi.org/10.1016/j.neuroimage.2012.12.022 W orsley , K. J., F riston, K. J., Sep. 1995. Analysis of fMRI Time-Series Revisited—Again. NeuroImage 2 (3), 173–181. URL http://dx.doi.org/10.1006/nimg.1995.1023 Y an, C., Liu, D., He, Y., Zou, Q., Zh u, C., Zuo, X., Long, X., Zang, Y., Ma y 2009. Sp on taneous Brain Activit y in the Default Mode Net w ork Is Sensitive to Differen t Resting-State Conditions with Limited Cognitiv e Load. PLoS ONE 4 (5), e5743+. URL http://dx.doi.org/10.1371/journal.pone.0005743 Y an, C.-G. G., Craddo ck, C. C., Zuo, X.-N. N., Zang, Y.-F. F., Milham, M. P ., Oct. 2013. Standardizing the intrinsic brain: to wards robust measurement of inter-individual v ariation in 1000 functional connectomes. NeuroImage 80, 246–262. URL http://view.ncbi.nlm.nih.gov/pubmed/23631983 Y eo, B. T., Krienen, F. M., Sepulcre, J., Sabuncu, M. R., Lashk ari, D., Hollinshead, M., Roffman, J. L., Smoller, J. W., Z¨ ollei, L., Polimeni, J. R., Fisc hl, B., Liu, H., Buckner, R. L., Sep. 2011. The organization of the human cerebral cortex estimated by intrinsic functional connectivity . Journal of neu- roph ysiology 106 (3), 1125–1165. URL http://dx.doi.org/10.1152/jn.00338.2011 29 Zuo, X.-N., Anderson, J. S., Bellec, P ., Birn, R. M., Biswal, B. B., Blautzik, J., Breitner, J. C. S., Buc kner, R. L., Calhoun, V. D., Castellanos, F. X., Chen, A., Chen, B., Chen, J., Chen, X., Colcombe, S. J., Courtney , W., Craddock, R. C., Di Martino, A., Dong, H.-M., F u, X., Gong, Q., Gorgolewski, K. J., Han, Y., He, Y., He, Y., Ho, E., Holmes, A., Hou, X.-H., Huckins, J., Jiang, T., Jiang, Y., Kelley , W., Kelly , C., King, M., LaCon te, S. M., Lainhart, J. E., Lei, X., Li, H.-J., Li, K., Li, K., Lin, Q., Liu, D., Liu, J., Liu, X., Liu, Y., Lu, G., Lu, J., Luna, B., Luo, J., Lurie, D., Mao, Y., Margulies, D. S., Ma y er, A. R., Meindl, T., Meyerand, M. E., Nan, W., Nielsen, J. A., O’Connor, D., Paulsen, D., Prabhak aran, V., Qi, Z., Qiu, J., Shao, C., Shehzad, Z., T ang, W., Villringer, A., W ang, H., W ang, K., W ei, D., W ei, G.- X., W eng, X.-C., W u, X., Xu, T., Y ang, N., Y ang, Z., Zang, Y.-F., Zhang, L., Zhang, Q., Zhang, Z., Zhang, Z., Zhao, K., Zhen, Z., Zhou, Y., Zhu, X.-T., Milham, M. P ., 9 Dec. 2014. An open science resource for establishing reliability and repro ducibilit y in functional connectomics. Sci Data 1, 140049. 30 Supplementary Material – Statistic al p ower and pr e diction ac cur acy in multisite r esting-state fMRI c onne ctivity Submitted to Neuroimage. C. Dansereau 1 , 2 , Y. Benha jali 1 , 3 ,C. Risterucci 4 , E. Merlo Pic h 4 , P . Orban 1 , D. Arnold 5 , P . Bellec 1 , 2 1 Cen tre de Rec herche de l’Institut Universitaire de G´ eriatrie de Mon tr´ eal, Mon tr´ eal, CA 2 Departmen t of Computer Science and Op erations Researc h, Univ ersity of Mon- treal, Mon treal, CA 3 D ´ epartement d’an throp ologie, Univ ersit ´ de Mon tr´ eal, Mon tr´ eal, CA 4 Clinical Imaging, pRED, F.Hoffman-La Ro c he, Basel, CH 5 NeuroRx inc., Mon tr´ eal, CA F or all questions regarding the pap er, please address correspondence to Pierre Bellec, CRIUGM, 4545 Queen Mary , Montreal, QC, H3W 1W5, Canada. Email: pierre.b ellec (at) criugm.qc.ca. Figure S1: Standard deviation of resting-state connectivity across sub jects, in the DMN, for eac h site, sup erimp osed on the MNI152 template. 31 Figure S2: Standard deviation of resting-state connectivity across sub jects, for the full connectome and eac h site. 32 Figure S3: Standard deviation of resting-state time-series across sub jects, a ver- aged across all connections, at eac h site. Figure S4: Panel A sho ws the results of a White test for homoscedasticity , across sites. P anel B show the a verage absolute difference in standard deviation b et w een any pair of sites, and Panel C show the same difference, relative to the a verage of the standard deviation at the tw o sites. 33 Figure S5: The figure shows av erage connectomes across all sites, as well as connections with a significan t motion effect. Figure S6: The figure shows av erage connectomes across all sites, as well as connections with a significan t sex effect. 34 Figure S7: The figure shows av erage connectomes across all sites, as well as connections with a significan t age effect. 35 Figure S8: Average connectomes for individual sites, as w ell as connections with a significant site effect. This Figure is identical to Figure 2 in the paper, with the difference that the Cam bridge site w as excluded from the analysis. The in tersection ( ∩ ) of the significan t site effects are sho wn in red and the symmetric difference ( 4 ) of the significant site effects are shown in yello w. Baltimore ∩ : 9 , 4 : 16, Berlin ∩ : 318 , 4 : 333, Newark ∩ : 23 , 4 : 36, New-Y ork.b ∩ : 25 , 4 : 45, Oxford ∩ : 377 , 4 : 251, Queensland ∩ : 946 , 4 : 389, Sain t-Louis ∩ : 49 , 4 : 162 36

Original Paper

Loading high-quality paper...

Comments & Academic Discussion

Loading comments...

Leave a Comment