Evaluation of Granger causality measures for constructing networks from multivariate time series

Granger causality and variants of this concept allow the study of complex dynamical systems as networks constructed from multivariate time series. In this work, a large number of Granger causality measures used to form causality networks from multiva…

Authors: Elsa Siggiridou, Christos Koutlis, Alkiviadis Tsimpiris

Evaluation of Granger causality measures for constructing networks from   multivariate time series
Article Evaluation of Granger causality measures for constructing networks from multivariate time series Elsa Siggiridou 1 , Christos Koutlis 1,2 , Alkiviadis T simpiris 3 and Dimitris Kugiumtzis 1, * 1 Department of Electrical and Computer Engineering, Aristotle University of Thessaloniki, University Campus, 54124, Thessaloniki, Greece; E-mail: dkugiu@auth.gr; esingiri@auth.gr 2 Information T echnologies Institute, Centre of Resear ch and T echnology Hellas, 57001, Thessaloniki, Greece; E-mail: ckoutlis@iti.gr 3 Department of Computer , Informatics and T elecommunications Engineering, International Hellenic University , 62124, Serres, Gr eece; E-mail: atsimpiris@teicm.gr * Author to whom correspondence should be addr essed; E-Mail: dkugiu@auth.gr; T el: +30-2310995955. Received: date; Accepted: date; Published: date Abstract: Granger causality and variants of this concept allow the study of complex dynamical systems as networks constructed from multivariate time series. In this work, a large number of Granger causality measures used to form causality networks fr om multivariate time series ar e assessed. These measures are in the time domain, such as model-based and information measures, the frequency domain and the phase domain. The study aims also to compare bivariate and multivariate measures, linear and nonlinear measures, as well as the use of dimension reduction in linear model-based measures and information measures. The latter is particular relevant in the study of high-dimensional time series. For the performance of the multivariate causality measures, low and high dimensional coupled dynamical systems are considered in discrete and continuous time, as well as deterministic and stochastic. The measures ar e evaluated and ranked according to their ability to provide causality networks that match the original coupling structure. The simulation study concludes that the Granger causality measures using dimension reduction ar e superior and should be preferr ed particularly in studies involving many observed variables, such as multi-channel electroencephalograms and financial markets. Keywords: Granger causality; causality networks; dimension reduction measures; multivariate time series 1. Introduction Real-world complex systems have been studied as networks formed from multivariate time series, i.e. observations of a number of system variables, such as financial markets and brain dynamics [ 1 , 2 ]. The nodes in the network are the observed variables and the connections are defined by an inter dependence measure. The correct estimation of the interdependence between the observed system variables is critical for the formation of the network and consequently the identification of the underlying coupling structure of the observed system. Many interdependence measures that quantify the causal effect between the variables observed simultaneously in a time series ar e based on the concept of Granger causality [ 3 ]: a variable X Granger causes a variable Y if the information in the past of X improves the pr ediction of Y . The concept was first mentioned by W iener [ 4 ] and it is also referr ed to as W iener-Granger causality , but for brevity we use the common term Granger causality here. The methodology on Granger causality was first developed in econometrics and it has been widely applied to many other fields, such as car diology and neur oscience (analysis of electroencephalograms (EEG) and functional magnetic resonance imaging (fMRI) [ 5 – 8 ]) and climate [ 9 , 10 ]. 2 of 26 The initial form of Granger causality based on autoregressive models has been extended to nonlinear models, basically local linear models [ 11 – 14 ], but also kernel-based and radial basis models [ 15 – 18 ], and recently more advanced models, such as neural networks [ 19 , 20 ]. In a wider sense, the directed dependence inherent in Granger causality is r eferred to as coupling, synchr onization, connectivity and information flow depending on the estimation approach for the interdependence. Geweke [ 21 , 22 ] defined an analogue of Granger causality in the frequency domain, developed later to other frequency measures [ 23 , 24 ]. A number of nonlinear measures of interdependence inspired by the Granger causality idea have been developed, making use of state-space techniques [ 25 , 26 ], information measures [ 27 – 29 ], and techniques based on the concept of synchronization [ 30 – 32 ]. W e refer to all these measur es as causality measures (in the setting of multivariate time series) and the networks derived by these measures as causality networks. Distinguishing indirect and dir ect causality with the available methods is a difficult task [ 33 , 34 ], and multivariate measures ar e expected to address better this task than bivariate measures. The bivariate causality measures do not make use of the information of other observed variables besides the variables X and Y of the examined causality from X to Y , denoted X → Y , and thus estimate direct and indirect causality , where indirect causality is this mediated by a third variable Z , e.g. X → Z and Z → Y results in X → Y . The multivariate causality measures apply conditioning on the other observed variables to estimate direct causal effects, denoted as X → Y | Z , where Z stands for the other observed variables included as conditioning terms [ 35 – 37 ]. For high-dimensional time series, i.e. lar ge number K of observed variables, the estimation of direct causal effects is dif ficult and the use of multivariate causality measur es is problematic. One solution to this problem is to account only for a subset of the other observed variables based on some criterion of relevance to the driving or r esponse variable [ 38 ]. In a differ ent approach, dimension r eduction techniques have been embodied in the computation of the measure restricting the conditioning terms and they have been shown to improve the ef ficiency of the direct causality measur es [ 29 , 39 – 43 ]. Recent comparative studies have assessed causal effects with various causality measures, using also significance tests for each causal effect [ 14 , 44 – 51 ]. Some studies concentrated on the comparison of direct and indirect causality measures [ 46 , 52 ], wher eas other studies focused on specific types of causality measures, e.g. frequency domain measures [ 53 – 56 ], or differ ent significance tests for a causality measure [ 57 – 59 ]. These studies ar e done on specific r eal data types, mostly from brain, which limits the generalization of the conclusions. Whereas most of the abovementioned studies wer e concentrated on the estimation of specific causal effects by the tested measures, this study is merely focused on assessing the bivariate ( X → Y ) and multivariate and ( X → Y | Z ) causality measures that estimate best the whole set of causal effects for all pairs ( X , Y ), i.e. the true causality network. In particular , high dimensional systems and subsequently high-dimensional time series are consider ed, so that the estimated networks have up to 25 nodes. Some first results of the application of different causality measures on simulated systems and evaluation of their accuracy in matching the original network were presented earlier in [ 60 ]. As the focus is on the preservation of the original causality network, we assess the existence of each causality term applying appropriate significance criteria. The causality measures are ranked as to their ability to match the original causality networks of dif ferent dynamical systems and stochastic pr ocesses. For the computation of the causality measures, several softwar e are freely accessible [ 61 – 66 ], but we have developed most of the causality measures in the context of pr evious studies of our team, and few measur es were run fr om the software [ 63 ]. The structur e of the paper is as follows. In Sec. 2 , we pr esent the causality measur es, the identification of network connections from each measure, the statistical evaluation of the accuracy of each measur e in identifying the original coupling network, the formation of a score index for each measure, and finally the synthetic systems used in the simulations. In Sec. 3 , we present the results of the measures on these 3 of 26 systems, and we rank the measures as for their accuracy in matching the original coupling network. Discussion and conclusions follow in Sec. 4 . 2. Methodology The methodology implemented for the comparative study of causality measures aiming at evaluating the measures and ranking them as to their accuracy in identifying correctly the underlying coupling network is presented her e. The methodology includes the causality measures compared in the study , the techniques for the identification of network connections from each measur e, the statistical evaluation of the accuracy of each measure in identifying the original coupling network, and the formation of an appropriate scor e index for the overall performance of each measure. Finally , the synthetic systems used in the simulation study are pr esented. 2.1. Causality measures First, it is noted that in this comparative study correlation or in general symmetric measures of X and Y are not considered. Many such measures wer e initially included in the study , e.g. many phase-based measures such as phase locking value (PL V)[ 67 ], phase lag index (PLI) [ 68 ] and weighted phase lag index (wPLI) [ 69 ], rho index (RHO) [ 70 ], phase slope index (PSI) [ 31 ] and mean phase coherence (MPC) [ 71 ]. However , their evaluation in the designed framework is not possible as the identification of the exact directed connections of the original coupling network is quantified to assess the measur e performance. Causality measures can be divided in thr ee categories as to the domain of data repr esentation they are defined in: time, frequency and phase (see Fig. 1 ). The measures in time domain dominate and they are Figure 1. T ree structur e for the types of causality measures. The five main classes are highlighted (frame box in bold). Each measure type in a box is given a code number , used as refer ence in T able 1 . further divided in model-based and model-free measures. Many of the model-free measur es are based on information theory measures and the other model-free measures on the time domain are r eferred to as “other ” measures. Thus five main classes of causality measures ar e considered in this study: model-based measures, information measur es, frequency measur es, phase measures and other measures that cannot be defined in terms of the other four classes. The measures organized in these five classes and included in the comparative study are briefly discussed below , and they are listed in T able 1 , with refer ence and code 4 of 26 number denoting the type of measure (the class, bivariate or multivariate and with our without dimension reduction). T able 1. List of causality measures organized in the five classes. The first column has the measure notation including the measure parameters, the second column has a brief description, the third column has the code of each measure denoting its type, and the fourth column has a r elated refer ence. Symbol Description T ype Ref. Model-based GCI( p ) Granger causality index, p is the V AR order , for Hénon maps p = 2, 5, for V AR process p = 3, 5, for Mackey-Glass system and neural mass model p = 5, 10, 20 1.1.1.1 [ 3 ] CGCI( p ) Conditional Granger causality index 1.1.2 [ 22 ] PGCI( p ) Partial Granger causality index 1.1.2 [ 72 ] RCGCI( p ) Restricted Granger causality index 1.1.1.2 [ 43 ] Information TE( m , τ ) T ransfer entropy . Lag τ = 1 for all systems, embedding dimension for Hénon maps m = 2, 3, for V AR process m = 3, 5, for Mackey-Glass system and neural mass model m = 5, 10, 15 1.2.1.1 [ 27 ] PTE( m , τ ) Partial transfer entropy 1.2.1.2.1 [ 75 ] STE( m , τ ) Symbolic transfer entr opy 1.2.1.1 [ 76 ] PSTE( m , τ ) Partial symbolic transfer entropy 1.2.1.2.1 [ 77 ] TER V( m , τ ) T ransfer entropy on rank vectors 1.2.1.1 [ 78 ] PTER V( m , τ ) Partial transfer entr opy on rank vectors 1.2.1.2.1 [ 79 ] PMIME( L ) Partial mutual information from mixed embedding, maximum lag for Hénon maps and V AR model L m a x = 5, for Mackey-Glass system and neural mass model L m a x = 20 1.2.1.2.2 [ 42 ] Frequency PDC( p , i ) Partial directed coher ence, i denotes the power band i = δ , θ , α , β , γ (relative proportion of the whole spectrum). For Hénon maps p = 2, 5, for the V AR process p = 3, 5, for the Mackey-Glass system and the neural mass model p = 5, 10, 20 2.1 [ 24 ] GPDC( p , i ) Generalized partial directed coher ence 2.1 [ 80 ] DTF( p , i ) Directed transfer function 2.1 [ 23 ] dDTF( p , i ) Direct dir ected transfer function 2.1 [ 81 ] GGC( p , i ) Geweke’s spectral Granger causality 1.1.1.1 [ 82 ] RGPDC( p , i ) Restricted generalized partial dir ected coherence 2.2 [ 83 ] Phase DPI Phase directionality index 3 [ 30 ] Other MCR( m , τ ) Mean conditional recurr ence, m is the same as for the information measures 1.2.2 [ 26 ] DED Directed event delay 1.2.2 [ 84 ] The first class of model-based measures regar ds measures that implement the original concept of Granger causality , the bivariate measure of Granger causality index (GCI) [ 3 ] (only X and Y variables are considered), and the multivariate measur es of the conditional Granger causality index (CGCI) [ 22 ] and the partial Granger causality (PGCI) [ 72 ] (also the other observed variables denoted Z are included). All these measures r equire the fit of a vector autor egressive model (V AR), on the two or mor e variables. The 5 of 26 order p of V AR denoting the lagged variables of each variable contained in the model can be estimated by an information criterion such as AIC or BIC, which often does not pr ovide optimal lags, e.g. see the simulation study in [ 73 ] and citations ther ein, and the so-called p -hacking (in the sense of p -value) in terms of V AR models for Granger causality in [ 74 ]. T o overcome the use of order estimation criteria, here we use a couple of predefined appropriate orders p for each system (see T able 1 ). In the presence of many observed variables, dimension reduction in V AR has been pr oposed, and here we use the one developed from our team, the restricted conditional Granger causality index (RCGCI) [ 43 ]. Thus for this class of measures, ther e are bivariate and multivariate measures, and multivariate measures that apply dimension reduction, as noted in the sketched division of causality measur es in Fig. 1 . These ar e all linear measures and besides this constraint they have been widely used in applications. Other nonlinear extensions are not considered in this study for two reasons: either they were very computationally intensive, such as the cross pr edictions of local state space models [ 14 ], or they were too complicated so that discrepancies to the original methods may occur [ 19 , 20 ]. The information measures of the second class have also been popular in diverse applications recently due to their general form, as they do not require any specific model, they are inherently nonlinear measures and can be applied to both deterministic systems and stochastic processes of any type, e.g. oscillating flows and discrete maps of any dimension. The main measure they rely on is the mutual information (MI), and more pr ecisely the conditional mutual information (CMI). There have been several forms for causality measures based on MI and CMI in the literature, e.g. see the coarse-grained mutual information in [ 85 ], but the prevailing one is the transfer entr opy (TE), originally defined for two variables [ 27 ]. The multivariate version, termed partial transfer entropy (PTE) was later proposed together with dif ferent estimates of the entropies involved in the definition of PTE, bins [ 86 ], correlation sums [ 87 ] and nearest neighbors [ 75 ]. Here, we consider the near est neighbor estimate for both TE and PTE, found to be the most appropriate for high dimensions. Equivalent forms to TE and PTE are defined for the ranks of the embedding vectors rather than the observations directly . W e consider the bivariate measures of symbolic transfer entropy (STE) [ 76 ] and transfer entropy on rank vectors (TER V) [ 78 ], and the multivariate measures of partial symbolic transfer entropy (PSTE) [ 77 ] and partial transfer entropy on rank vectors (PTER V) [ 79 ]. The idea of dimension r eduction was implemented in TE first, applying a scheme for a sparse non-uniform embedding of both X and Y , termed mutual information on mixed embedding (MIME) [ 29 ]. This bivariate measure was later extended to the multivariate measure of partial MIME (PMIME) [ 42 ]. Only the PMIME is included in the study simply due to the computational cost, and it is noted that by construction the measure gives zer o for insignificant causal effects, so it does not r equire binarization when networks of binary connections have to be derived (the positive values are simply set to one). All the methods in the third class of frequency measur es rely on the estimation of the V AR model, either on only the two variables X and Y or on all the observed variables (we consider only the latter case here). Geweke’s spectral Granger causality (GGC) is the early measure implementing the concept of Grangre causality in the frequency domain [ 21 , 82 ], included in the study . Another older such measure included in the study is the direct transfer function (DTF) [ 23 ], which although it is multivariate measure it does not discriminate direct from indir ect causal effects. For this, an improvement is pr oposed and used also in this study , termed direct directed transfer function (dDTF) [ 81 ]. W e also consider the partial dir ected coherence (PDC) [ 24 ] and the improved version of generalized partial directed coher ence (GPDC) [ 80 ], which have been particularly popular in EEG analysis. When applied to EEG, the measures are defined in terms of frequency bands of physiological relevance ( δ , θ , α , β , γ , going from low to high frequencies), and the same proportional split of the fr equency range is followed here (as if the sampling frequency was 100 Hz). Finally , we consider a dimension reduction of V AR in the GPDC measure called restricted GPDC (RGPDC), recently pr oposed from our team [ 83 ]. 6 of 26 As mentioned above the class of phase measures contains a good number of measures used in connectivity analysis, mainly in neuroscience dealing with oscillating signals such as EEG, but most of these measures are symmetric and thus out of the scope of the current study . In the evaluation of the causality measures we consider the bivariate measur e of phase dir ectionality index (DPI) [ 30 ], which is a measure of synchronization designed for oscillating time series. Information measures have also been implemented in the phases, e.g. see [ 88 ], but not considered her e. Another class of measur es used mainly in neuroscience regar ds inter -dependence measures based on neighborhoods in the r econstructed state space of each of the two variables X and Y . A series of such measures have been pr oposed after the first work in [ 25 ], using also ranks of the reconstr ucted vectors [ 89 ], the latter making the measur e computationally very slow . The conver gent cross mapping is developed under the same approach [ 90 ], and the same yields for the measure of mean conditional recurrence (MCR) [ 26 ]. It is noted that all these measures are bivariate and they are expected to suffer from estimating indirect connections in the estimated causality network. The MCR is included as a representative of the state space bivariate measures in the class of other measures. In this class, we include also event synchronization measur es [ 84 ], and specifically the direct event delay (DED) that is a directional bivariate measure, consider ed as causality measure and included in the study . For the information measures wher e the estimation of entropies in high dimensions is hard, the comparison of the multivariate measures that do not include dimension reduction to these including dimension reduction would be unfair when high-dimensional systems ar e considered. T o address this, in the calculation of a multivariate information measure not making use of dimension reduction, we choose to restrict the set of the conditioning variables Z in the estimated causal effect X → Y | Z to only the three mor e r elevant variables. In the simulations, we consider the number of observed variables (equal to the subsystems being coupled) to be K = 5 and K = 25. While for K = 5 all the r emaining variables ar e considered in Z , for K = 25 only three of the r emaining 23 variables are selected. The criterion of selection is the mutual information of the r emaining variables to the driving variable X , which is common for the selection of variables [ 38 ]. 2.2. Identification of original network connections W e suppose that a dynamical system is formed by the coupling of K subsystems and we observe one variable from each subsystem, so that a multivariate time series of dimension K is derived. The coupling structure of the original system can be displayed as a network of K nodes where the connections are determined by the system equations. Formally , in the graph-theoretical framework, a network is repr esented by a graph G = ( V ; E ) , where V is the set of K nodes, and E is the set of the connections among the nodes of V . The original coupling network is given as a graph of directed binary connections, wher e the connection fr om node i to node j is assigned with a value a i , j being one or zer o, depending whether variables of the subsystem i are pr esent in the equation determining the variables of the subsystem j . The components a i , j , i , j = 1, . . . , K , form the adjacency matrix A that defines the network. The computation of any causality measure presented in Sec. 2.1 on all the directed pairs ( i , j ) of the K observed variables gives a weight matrix R (assuming only positive values of the measure, so that a transformation of the measur e can be applied if necessary). Thus, the pairwise causality matrix R with entries R i , j = R X i → X j defines a network of weighted connections, assigning the weighted directed connection R i , j from each node i to each node j . In applications, often binary networks ar e sought to better r epresent the estimated structur e of the underlying system. Here, we are inter ested to compar e how the causality measure retrieves the original directed coupling structure, and therefor e we want to transform the weighted network to a binary network. Commonly , the weighted matrix R is transformed into an adjacency matrix A by suitable thresholding, 7 of 26 keeping in the graph only connections with weights higher than some threshold (and setting their weights to one) and removing the weaker connections (setting their weights to zero). For each causality measure, an appropriate threshold criterion is sought to determine the significant values of the measure that correspond to connections in the binary network. W e consider three approaches for thr esholding that have been used in the literature. 1. Rigorous thr esholding is provided by appr opriate significance test for the causality measure R X i → X j . For all considered causality measur es in this study , we expect the causality measure to lie at the zero level if there is no causal effect and be positive if it is, so that the test is one-sided. So, the null and alternative hypotheses are r espectively: H 0 : R X i → X j = 0, H 1 : R X i → X j > 0 (1) Parametric significance tests have been developed only for the linear causality measures, and for consistency we apply the randomization significance test to all causality measures, making use of the simple technique of time-shifted surrogates. Specifically , we generate M resampled (surr ogates) time series for the driving variable X , each by shifting cyclically the original observations of X by a random step w . For the original time series of the driving variable X i denoted { X i , t } = { x 1, t , x 2, t , . . . , x n , t } , the surrogate time series is { X ∗ i , t } = { x w + 1, t , x w + 2, t , . . . , x n , t , x 1, t , . . . , x w − 1, t , x w , t } . In this way we destr oy any form of coupling of X i and any other variable X j , so that { X ∗ i , t } is consistent to H 0 , but it preserves the dynamics and the mar ginal distribution of X i . The test statistic is the causality measur e R X i → X j , and it takes the value R 0 on the original time series pair and the values R 1 , R 2 , . . . , R M on each of the M resampled time series pairs. The rank of R 0 in the order ed list of M + 1 values R 0 , R 1 , R 2 , . . . , R M , denoted r 0 , defines the p -value of the randomization test as p = 1 − r 0 − 0.326 M + 1 + 0.348 [ 91 ]. If R 0 is at the right tail of the empirical distribution formed by R 1 , R 2 , . . . , R M , then the H 0 is rejected, which suggests that p < α , where α is the significance level of the test determining the tail. For a multivariate time series of K variables, totally K ( K − 1 ) significance tests are performed for each causality measure, indicating that multiple tests ar e performed on the same dataset. This is a known issue in statistics and corrections for multiple testing can be further be applied, such as the false discovery rate [ 92 ]. Here, we refrain from using such a correction and rather use three different significance levels α = 0.01, 0.05, 0.1. W e opted for this as the same setting is applied for all causality measures. 2. The second thresholding criterion is given by the desired density of the binary network. In the simulation study , we know the density of the original network, denoted by the number of connections ρ 0 . W e consider five different values for the density ρ of the estimated causality binary network given in multiples of ρ 0 as 0.6, 0.8, 1.0, 1.2, 1.4. 3. The third thresholding criterion is simply given by a predefined magnitude threshold on the causality measure. Here, we select an appropriate threshold t h ρ separately for each causality measure and each coupling strength for the same system, where ρ indicates the corresponding density . Having 10 realizations for each scenario (system and coupling strength), the t h ρ is the average of the thresholds found for the given density ρ in the 10 realizations. 2.3. Performance indices for statistical evaluation of methods accuracy For a system of K variables there ar e K ( K − 1 ) order ed pairs of variables to estimate causality . In the simulations of known systems, we know the true coupling pairs and thus we can compute performance indices to rate the causality measures as for their overall matching of the original connections in the network. Here, we consider the performance indices of specificity , sensitivity , Matthews correlation coefficient, F-measur e and Hamming distance. 8 of 26 The sensitivity is the proportion of the true causal effects (true positives, TP) correctly identified as such, given as sens=TP/(TP+FN), where FN (false negatives) denotes the number of pairs having true causal ef fects but have gone undetected. The specificity is the proportion of the pairs correctly not identified as having causal effects (true negatives, TN), given as spec=TN/(TN+FP), where FP (false positives) denotes the number of pairs found falsely to have causal effects. An ideal causality measure would give values of sensitivity and specificity at one. T o weigh sensitivity and specificity collectively we consider the Matthews correlation coef ficient (MCC) [ 93 ] given as MCC = TP · TN − FP · FN p ( TP + FP ) · ( TP + FN ) · ( TN + FP ) · ( TN + FN ) . (2) MCC ranges from -1 to 1. If MCC=1 there is perfect identification of the pairs of true and no causality , if MCC=-1 there is total disagreement and pairs of no causality are identified as pairs of causality and vice versa, whereas MCC at the zero level indicates random assignment of pairs to causal and non-causal effects. Similarly , we consider the F-measur e that combines pr ecision and sensitivity . The precision, called also positive predictive value, is the number of detected true causal effects divided by the total number of detected casual effects, given as pr ec=TP/(TP+FP), and the F-measure (FM) is defined as FM = 2 · prec · sens prec + sens = 2TP 2TP + FN + FP , which ranges fr om 0 to 1. If FM=1 there is perfect identification of the pairs of true causality , wher eas if FM=0 no true coupling is detected. The Hamming distance (HD) is the sum of false positives (FP) and false negatives (FN), HD=FP+FN. Thus HD gets non-negative integer values bounded below by zer o (perfect identification) and above by K ( K − 1 ) if all pairs are misclassified. 2.4. Score Index In Sec. 2.3 , we presented five performance indices to evaluate in dif ferent ways the match of the original network and the estimated network fr om each causality measur e. Further , we want to quantify this match for differ ent settings, which involve differ ent systems and dif ferent scenarios for each system regar ding the number of variables K , the system free parameter , the time series length n and the coupling strength C , where applicable (to be presented below in Sec. 2.5 ). For this, we rely on the index MCC, and for each scenario of a differ ent system, we rank the causality measures according to their average MCC (across 10 r ealizations generated per scenario). For equal MCC, ordinal ranking (called also ”1234” ranking) is adopted [ 94 ]. Specifically , the order of measures of equal MCC is decided from distinct ordinal numbers given at random to each measure of equal MCC value. Next, we derive the average rank of a causality measure i for all differ ent coupling strengths C of a system j , P i , j , as the average of the ranks of the causality measure i in all coupling strengths tested for the system j . A score s i , j of the causality measure i for the system j is then derived by normalization of the average rank P i , j over the number N of all measures so as to scale between zero and one, s i , j = ( N − P i , j ) / ( N − 1 ) , where one is for the best measure ranked at top. The overall score of the causality measur e i over all systems, s i , is simply given by the average s i , j over all systems, including the two different K values for systems S1, S2 and S3, the two values for the system parameter ∆ and A for systems S2 and S3, respectively , and the two time series lengths n for system S1. The systems are presented in the next section. 9 of 26 2.5. Synthetic systems For the comparative study , we use four systems with differ ent properties: the coupled Hénon maps as an example of discrete-time chaotic coupled system [ 95 ], the coupled Mackey-Glass system as an example of continuous-time chaotic coupled system but of high complexity [ 95 , 96 ], the so-called neural mass model as an example of a continuous-time stochastic system used as an EEG model [ 97 , 98 ], and the vector autoregr essive model (V AR) as suggested in [ 99 ], used as an example of a discrete-time linear stochastic process. The four systems are briefly presented below . S1 : The system of coupled Hénon maps is a system of coupled chaotic maps defined as x i , t = 1.4 − x 2 i , t − 1 + 0.3 x i , t − 2 , for i = 1, K x i , t = 1.4 − ( 0.5 C ( x i − 1, t − 1 + x i + 1, t − 1 ) + ( 1 − C ) x i , t − 1 ) 2 + 0.3 x i , t − 2 , for j = 2, . . . , K − 1 (3) where K denotes the number of variables and C is the coupling strength. W e consider the system for K = 5 and K = 25 and the corresponding coupling network is shown in Fig. 2 a and b, r espectively . Figure 2. Coupling networks for S1, S2, S3 in ( a ) for K = 5, in ( b ) for K = 25 , and in ( c ) for S4 and K = 25. Multivariate time series of size K are generated and we use short and long time series of length n = 512 and n = 2048, respectively . An exemplary time series for K = 5 is given in Fig. 3 a. 10 of 26 Figure 3. T ime series for: ( a ) coupled Hénon maps for C = 0.2, ( b ) coupled Mackey-Glass for ∆ = 100 and C = 0.2, ( c ) coupled Mackey-Glass for ∆ = 300 and C = 0.2, ( d ) neural mass for A = 3.45 and C = 80, ( e ) neural mass for A = 3.7 and C = 80, ( f ) V AR(3) model for C = 0.23. S2 : The system of coupled Mackey-Glass is a system of coupled identical delayed differential equations defined as ˙ x j ( t ) = − 0.1 x j ( t ) + K ∑ i = 1 C i j x i ( t − ∆ ) 1 + x i ( t − ∆ ) 10 for j = 1, 2, . . . , K (4) where K is the number of subsystems coupled to each other , C i j is the coupling str ength and ∆ is the lag parameter . W e set C ii = 0.2 and C i , j for i 6 = j zero or C according to the ring coupling structur e shown in Fig. 2 a and b for K = 5 and K = 25, respectively . For details on the solution of the delay differ ential equations and the generation of the time series, see [ 95 ]. T wo scenarios are considered regar ding the inherent complexity of each of the K subsystems given by ∆ = 100 and ∆ = 300, regar ding high complexity (correlation dimension is about 7.0 [ 100 ]) and even higher complexity (not aware of any specific study for this regime), r espectively . Exemplary time series for each ∆ and K = 5 are given in Fig. 3 b and c. The time series used in the study have length n = 4096. 11 of 26 S3 : The neural mass model is a system of coupled differ ential equations with a stochastic term that produces time series similar to EEG simulating dif ferent states of brain activity , e.g. normal and epileptic. It is defined as ˙ y j 0 ( t ) = y j 3 ( t ) ˙ y j 3 ( t ) = A aS [ y j 1 ( t ) − y j 2 ( t ) ] − 2 a y j 3 ( t ) − a 2 y j 0 ( t ) ˙ y j 1 ( t ) = y j 4 ( t ) ˙ y j 4 ( t ) = A a      p j ( t ) + C 2 S ( C 1 y j 0 ) + K ∑ i = 1 i 6 = j C i j y i 6 ( t )      − 2 a y j 4 ( t ) − a 2 y j 1 ( t ) ˙ y j 2 ( t ) = y j 5 ( t ) ˙ y j 5 ( t ) = B b n C 4 S [ C 3 y j 0 ( t ) ] o − 2 b y j 5 ( t ) − b 2 y j 2 ( t ) ˙ y j 6 ( t ) = y j 7 ( t ) ˙ y j 7 ( t ) = A a d S ( y j 1 ( t ) − y j 2 ( t ) ) − 2 a d y j 7 ( t ) − a 2 d y j 6 ( t ) (5) where j denotes each of the K subsystems repr esenting the neuron population defined by eight interacting variables and the population (subsystem) interacts with other populations through variable y j 4 with coupling strength C i j . W e set C ii = 0.0 and C i , j for i 6 = j zero or C according to the ring coupling structur e shown in Fig. 2 a and b for K = 5 and K = 25, r espectively . The term p j ( t ) repr esents a random influence from neighboring or distant populations, A is an excitation parameter and B , a , b , a d , C 1- C 4 other parameters (see [ 97 ] for more details). The function S is the sigmoid function S ( v ) = 2 e 0 / ( 1 + e r ( v 0 − v ) ) , where r is the steepness of the sigmoid and e 0 , v 0 are other parameters explained in [ 97 ]. From each population j = 1, . . . , K we consider only the first variable y j 0 and obtain the multivariate time series of K variables. The value of the excitation parameter A , affects the form of the output signals combined with the coupling strength level, ranging fr om similar to normal brain activity with no spikes to almost periodic with many spikes similar to epileptic brain activity . W e consider two values for this parameter , one for low excitation with A = 3.45 and one for high excitation with A = 3.7. Exemplary time series for each A and K = 5 are given in Fig. 3 d and e. The time series used in the study have length n = 4096. S4 : The V AR process on K = 25 variables and order P = 3 as suggested in [ 99 ] is used as r epresentative of a high-dimensional linear stochastic pr ocess. Initially , 4% of the coefficients (total coefficients 1875) of V AR(3) selected randomly are set to 0.9 and the rest are zero and the positive coefficients are reduced iteratively until the stationarity condition is fulfilled. The autoregressive terms of lag one ar e set to one. The true couplings are 8% of a total of 600 possible order ed couplings. An exemplary coupling network of random type is shown in Figure 2 (c). The time series length is set to n = 512 and an exemplary time series of only five of the K = 25 variables of the V AR(3) process is shown in Fig. 3 f. For S1, S2 and S3, the coupling strength C , fixed for all couplings, is varied to study a wide range of coupling levels from zero coupling to very strong coupling. Specifically , for S1 and S2 C = 0, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, and for S3 C = 0, 20, 40, 80, 120, 160, 200. For S4, only one case of coupling strength is consider ed, given by the magnitude 0.23 of all non-zero coefficients, for which the stationarity of V AR(3) process is reached. For each system and scenario of differ ent coupling strength, 10 multivariate time series (realizations) ar e generated to obtain statistically valid results. The evaluation is performed as described in Section 2.4 . 12 of 26 3. Results In this section, the evaluation of the performance of all causality measures is presented for each system and scenario. First, the procedur e of the evaluation is shown in one specific setting, then the measures ar e evaluated and ranked for each system and finally the overall ranking is given. 3.1. Evaluation of measures in one exemplary setting W e consider a multivariate time series of length n = 512 from the system S1 of coupled Hénon maps for K = 5 variables and coupling strength C = 0.2. The original coupling network has the ring structur e as shown in Fig. 2 a. W e derive the estimated causality (weight) matrix by the bivariate measure of transfer entropy (TE) using the appr opriate parameters of embedding dimension m = 2 and τ = 1 R TE =        0 0.148 − 0.003 − 0.002 − 0.012 − 0.005 0 0.167 0.013 − 0.021 − 0.015 0.094 0 0.051 − 0.004 − 0.015 0.016 0.092 0 − 0.014 − 0.011 − 0.009 0.003 0.193 0        (6) where the negative values denote the negative bias in the estimation of TE with the nearest neighbors estimate. Applying the three criteria of measure significance for transforming the weight matrix to an adjacency matrix (see Sec. 2.2 ), we derive the causality binary networks. Specifically , as shown in Fig. 4 , differ ent binary networks are obtained for the different values of the significance level of the randomization test, the network density and the magnitude threshold. Figure 4. Binary networks of TE measure for S1 using ( a ) statistical significance test (light grey → α = 0.01 , black → α = 0.05, grey → α = 0.1), ( b ) density threshold (light grey → ρ = 4 , black → ρ = 6, grey → ρ = 8) and ( c ) magnitude threshold (light gr ey → t h 4 , black → t h 6 , grey → t h 8 ) In Fig. 4 a, the binary network for significance level α = 0.01 and α = 0.05 coincides with the original coupling network, whereas for α = 0.1 more connections ar e present r egar ding indirect causal ef fects (for the latter the statistical significant values are given in bold in the form of eq.( 6 )). When preserving the 4, 6, and 8 strongest connections, as shown in Fig. 4 b, the true structure is preserved only when the correct density is set (6 connections), indicating that the highest TE values are r eached at the true couplings. This seems the optimal strategy for thr esholding, but in real-world applications the actual network density is not a-priori known. Similarly , in Fig. 4 c, the binary networks obtained using three magnitude thr esholds on TE values are shown. Each of the three magnitude thresholds is computed as the average threshold for preserving the corresponding network density across 10 r ealizations, for this specific coupling str ength 13 of 26 and causality measure. These magnitude thresholds happen not to be the ones corresponding to the network densities for this r ealization and actually none of the thr ee thresholds identifies all the existing and non-existing connections. For illustration purposes, we compute the performance indices for TE at this scenario using the statistical significance criterion for α = 0.1, given in T able 2 . T able 2. Computation of the performance indices for the causality measure TE, wher e the binary causality network is derived using the statistical significance criterion for α = 0.1. true positive true negative positive found TP = 6 FP = 2 negative found FN = 0 TN = 12 sens : 6/6 = 1 spec : 12/14 = 0.86 MCC : 6 · 12 − 2 · 0 √ ( 6 + 2 ) · ( 6 + 0 ) · ( 12 + 2 ) · ( 12 + 0 ) = 0.80 FM : 2 · 6 2 · 6 + 0 + 2 = 0.86 HD : 2 + 0 = 2 W e note that the two extra connections found significant using α = 0.1 reduce the specificity to 0.86 while sensitivity is one, which affects accordingly the other three measur es. Note that the mismatch of just two out of 20 connections (HD=2) gives MCC=0.8, significantly lower fr om one, and the same holds for the F-measure index. For the same scenario, the measure of PMIME (for a maximum lag L = 5 well above the optimal lag 2) gives a weight matrix of zero and positive numbers R PMIME =        0 0.162 0 0 0 0 0 0.101 0 0 0 0.076 0 0.077 0 0 0 0.094 0 0 0 0 0 0.173 0        No significance criterion is applied here, and simply setting the positive numbers to one gives the adjacency matrix, and in this case the estimated causality network matches exactly the original coupling network giving HD=0 and all other indices equal to one. 3.2. Results with respect to performance indices, significance criteria and coupling str ength First, we give a comprehensive presentation reporting all the performance indices presented in Sec. 2.3 and the significance criteria in Sec. 2.2 for system S1 and K = 25, n = 2048 and C = 0.2. In T able 3 , the five performance indices of the eight highest ranked causality measures in terms of MCC ar e presented. 14 of 26 T able 3. The rankings of the eight best measures according to MCC for coupled Hénon maps ( K = 25, n = 2048, C = 0.2) are pr esented for the three binarization methods. The sensitivity (sens), specificity (spec), F measure (FM) and Hamming distance (HD) performance indices ar e also presented. Measure sens spec MCC FM HD statistical significance test ( α = 0.05) 1 PMIME ( L = 5) 0.79 0.99 0.86 0.86 11 2 RGPDC ( p = 5, β ) 0.86 0.85 0.49 0.49 84.5 3 RGPDC ( p = 5, α ) 0.87 0.84 0.47 0.47 91.6 4 GPDC( p = 5, β ) 0.87 0.84 0.47 0.46 92.3 5 RCGCI ( p = 5) 0.92 0.81 0.47 0.45 106.1 6 RGPDC ( p = 5, γ ) 0.86 0.83 0.46 0.46 95.8 7 GCI( p = 5) 0.83 0.84 0.45 0.45 92.9 8 RGPDC ( p = 5, θ ) 0.83 0.84 0.45 0.45 95.4 density threshold ( ρ 0 = 48) 1 PMIME ( L = 5) 0.79 0.99 0.86 0.86 10.9 2 TE( m = 2) 0.68 0.97 0.66 0.68 28.8 3 TE( m = 3) 0.67 0.97 0.64 0.67 30.2 4 PGCI( p = 5) 0.6 0.96 0.56 0.6 36.8 5 GPDC( p = 5, β ) 0.59 0.96 0.56 0.59 37 6 GPDC( p = 5, α ) 0.57 0.96 0.54 0.57 38.8 7 RGPDC ( p = 5, β ) 0.56 0.96 0.52 0.56 40 8 CGCI( p = 5) 0.56 0.96 0.52 0.56 40 magnitude threshold ( th 48 ) 1 PMIME ( L = 5) 0.78 0.99 0.86 0.86 11 2 TE( m = 2) 0.67 0.97 0.65 0.67 29.9 3 TE( m = 3) 0.66 0.97 0.64 0.67 29.9 4 GPDC( p = 5, β ) 0.58 0.96 0.55 0.57 39.8 5 GPDC( p = 5, α ) 0.57 0.95 0.53 0.56 41.8 6 RGPDC ( p = 5, β ) 0.54 0.96 0.51 0.55 40.4 7 PGCI( p = 5) 0.50 0.96 0.51 0.52 40.6 8 RGPDC ( p = 5, α ) 0.53 0.96 0.50 0.54 41.9 In T able 3 and all tables to follow , the measures making use of dimension reduction, i.e. PMIME, RCGCI and RGPDC, are highlighted (bold face) to accommodate comparison with the other measures. The results are or ganized in thr ee blocks, one for each of the three significance criteria. For the criterion of statistical significance at α = 0.05, the dimension r eduction measures score highest in all performance indices. The PMIME ( L = 5) measure obtains the greatest specificity value 0.99 and RCGCI ( p = 5) the greatest sensitivity value 0.92. A large dif ference between the first MCC=0.86 for PMIME and the MCC for the other highest ranked measur es is observed, while for the specificity and sensitivity indices this does not hold. This is explained by the fact that a small decrease in specificity implies increase in the number of falsely detected causal ef fects that for networks of low density dominates in the determination of MCC (see eq.( 2 )). For the significance criteria of density ( ρ 0 = 48 equal to the number of true couplings) and threshold ( t h 48 ), PMIME ( L = 5) is unaltered at first rank while the information and fr equency measur es exhibit better performance compared to the criterion of statistical significance. It is also observed that these two criteria show lower sensitivity and higher specificity , which questions the rule that the couplings of largest causality values are the true ones. When ρ is smaller or larger than the true density , sensitivity changes more than specificity again due to the sparseness of the true network. Similar conclusions are inferred for the significance criterion of magnitude thr eshold. 15 of 26 W e demonstrate further the dependence of the causality measure performance on the parameter in each significance criterion using the S4 system of V AR process ( K = 25, n = 512, P = 3). In T able 4 , the ranking of the eight best measures in terms of MCC is pr esented for three parameters of each of the thr ee significance criteria. T able 4. The rankings of the eight best measur es according to MCC for the system S4 of the V AR process ( K = 25, n = 512, p = 3) in conjunction with each significance criterion and its parameter . Three rankings ar e given in three blocks, one for each significance criterion and for three different choices of its parameter , where ρ 0 = 48 is the true density . statistical significance test Measure a = 0.01 a = 0.05 a = 0.1 1 RGPDC( p = 3 , α ) 0.944 0.868 0.861 2 RGPDC( p = 5 , α ) 0.944 0.867 0.861 3 RGPDC( p = 3 , θ ) 0.940 0.868 0.861 4 RGPDC( p = 5 , θ ) 0.939 0.867 0.861 5 RGPDC( p = 3 , δ ) 0.938 0.868 0.861 6 RGPDC( p = 5 , δ ) 0.936 0.867 0.861 7 RGPDC( p = 5 , β ) 0.933 0.867 0.861 8 RCGCI( p = 3 ) 0.933 0.867 0.861 density threshold Measure 0.6 ρ 0 ρ 0 1.4 ρ 0 1 RCGCI( p = 3 ) 0.758 0.974 0.862 2 RGPDC( p = 5 , α ) 0.758 0.972 0.862 3 RCGCI( p = 5 ) 0.758 0.972 0.862 4 RGPDC( p = 3 , α ) 0.758 0.972 0.862 5 RGPDC( p = 3 , γ ) 0.755 0.972 0.862 6 RGPDC( p = 5 , γ ) 0.755 0.972 0.862 7 RGPDC( p = 5 , θ ) 0.785 0.972 0.862 8 RGPDC( p = 3 , β ) 0.758 0.969 0.862 magnitude threshold Measure t h 0.6 ρ 0 t h ρ 0 t h 1.4 ρ 0 1 RCGCI( p = 3 ) 0.751 0.979 0.868 2 RGPDC( p = 3 , θ ) 0.753 0.976 0.868 3 RGPDC( p = 5 , θ ) 0.753 0.975 0.869 4 RGPDC( p = 3 , α ) 0.757 0.975 0.868 5 RGPDC( p = 3 , δ ) 0.750 0.975 0.868 6 RCGCI( p = 5 ) 0.752 0.974 0.868 7 RGPDC( p = 5 , α ) 0.752 0.974 0.869 8 RGPDC( p = 5 , δ ) 0.751 0.974 0.868 For this linear system, the highest ranked causality measures for all three significance criteria are the linear measures using dimension r eduction RCGCI and RGPDC for various parameter values. This is somehow expected as these measures ar e both linear and the underlying system is linear , and they use dimension r eduction as the number of variables is K = 25. For such a high-dimensional time series, the bivariate linear measures give indirect (and false in our evaluation) causality effects, whereas the multivariate linear measures without dimension r eduction cannot r each the performance of RCGCI and RGPDC as the time series length n = 512 is relatively small for estimating accurately the V AR model parameters (75 coefficients in V AR(3) are to be estimated for each of the K = 25 variables). The best performance of the measures is achieved for the criterion of statistical significance when a = 0.01 and for 16 of 26 the other two significance criteria when the parameters corresponding to the true density ρ 0 , as expected. Comparing the three rankings for the best parameter choice of each criterion, it is observed that the statistical significance gives MCC values almost as high as the other two methods. This fact indicates the advantage of the statistical significance criterion, wher e the a-priori knowledge of the true network density is not requir ed. Fr om this point on, all the presented r esults are for the criterion of statistical significance. W e here discuss the dependence of the measure accuracy on the coupling strength C and use as example the system S1 with K = 25 and n = 2048. In Fig. 5 , the MCC for PMIME, RGPDC, RCGCI and TE is given as a function of C for the three significance criteria. Figure 5. MCC of PMIME ( L = 5), TE ( m = 3), RGPDC ( p = 5, δ ), RCGCI ( p = 5) as a function of the coupling strength C in system S1 of the coupled Hénon maps ( K = 25, n = 2048) for the three significance criteria: (a) statistical testing at α = 0.05, (b) true density threshold ρ 0 = 48 and (c) magnitude threshold t h ρ 0 . The parameters in the criteria are α = 0.05 for the statistical testing, the true density of the original network ρ 0 = 48, and the average magnitude threshold t h ρ 0 over all magnitude thresholds corr esponding to the true density for the 10 realizations. For all significance criteria, the PMIME exhibits the best performance for large C > 0.1 while TE has the highest MCC for small C at 0.05 and 0.1. Though S1 is a nonlinear system, the linear measures RCGCI and RGPDC (using the same dimension reduction step) ar e competitive and as good as or better than TE for large C . Thus for this system and setting of n , K and large C , the rate of indirect (false) couplings found by the nonlinear bivariate measure TE is as large as or larger than the undetected nonlinear couplings from the linear measures RCGCI and RGPDC. This indicates that linear measures with dimension reduction may even perform better than nonlinear ones in settings of time series from nonlinear systems. It is noted that the coupling strength C = 0.05 is very weak and the dimension reduction methods PMIME, RCGCI and RGPDC find no significant causal effects giving zer o, which cannot change with any significance criterion. On the other hand, the small TE values for C = 0.05 are still found significant at a good pr oportion giving rather large MCC at the level of 0.8. 17 of 26 3.3. Ranking of causality measures for each synthetic system W e derive summary results of all measures at each system over all coupling strengths C and for differ ent number of variables K and time series lengths n where applicable. For this, we use the average score index s i , j for each measure i at each system j as defined in Sec. 2.4 . In all results in this section, the statistical significance testing for α = 0.05 has been used. In T able 5 , the average score s i , j for system S1 (coupled Hénon maps) is presented for the eight measures scoring highest at each scenario combining K = 5 and K = 25 with n = 512 and n = 2048. T able 5. The ranking of the eight best measures according to the score index for system S1 of coupled Hénon maps for all scenarios of number of variables K and time series length n . K = 5 K = 25 n = 512 n = 2048 n = 512 n = 2048 Measure Score Measure Score Measure Score Measure Score PMIME ( L = 5) 0.87 PMIME ( L = 5) 0.84 PMIME ( L = 5) 0.92 PMIME ( L = 5) 0.91 TER V( m = 3) 0.78 RGPDC ( p = 5, δ ) 0.77 TE( m = 2) 0.87 RGPDC ( p = 5, δ ) 0.83 STE( m = 3) 0.77 RGPDC ( p = 5, θ ) 0.76 TE( m = 3) 0.86 RGPDC ( p = 5, θ ) 0.83 TE( m = 3) 0.72 RCGCI ( p = 5) 0.76 RGPDC ( p = 5, β ) 0.81 RCGCI ( p = 5) 0.82 RCGCI ( p = 5) 0.71 RGPDC ( p = 5, γ ) 0.75 RGPDC ( p = 5, α ) 0.81 RGPDC ( p = 5, β ) 0.82 RGPDC ( p = 5, α ) 0.71 RGPDC ( p = 5, β ) 0.75 RGPDC ( p = 5, θ ) 0.79 RGPDC ( p = 5, α ) 0.81 CGCI( p = 5) 0.70 PDC( p = 5, α ) 0.75 RGPDC ( p = 5, δ ) 0.78 RGPDC ( p = 5, γ ) 0.80 RGPDC ( p = 5, γ ) 0.70 GPDC( p = 5, α ) 0.73 RCGCI ( p = 5) 0.77 CGCI( p = 5) 0.76 In all scenarios of S1 the PMIME measure is found to have the best performance. Also, the other measures of dimension reduction RCGCI and RGPDC reach highly ranked positions in all scenarios, especially in the case of lar ge time series length. It is noted that these two measur es ar e linear and they beat many other nonlinear measur es showing the importance of pr oper dimension reduction. For small time series length ( n = 512), the information measures show better performance and it is again notable that the bivariate measures, such as TE, STE and TER V , score higher than the corresponding multivariate measures, PTE PSTE and PTER V . Again, the explanation for this lies in the inability of the multivariate measures to deal with high dimensions if dimension r eduction is not employed. Having even as low as three conditioning variables in the conditional mutual information used by these measures (in the case of K = 25 the three mor e corr elated variables in terms of MI to the driving variable ar e selected fr om the 23 remaining variables) does not provide as accurate estimates of the causal effects as the respective bivariate measures. These multivariate measures (along other multivariate measur es of no dimension r eduction) give non-existing causal effects even to the beginning and end of the ring, whereas the respective bivariate measures do not, and only estimate additionally indir ect causal effects (r esults not shown her e). In T able 6 , the average score for system S2 of coupled Mackey-Glass subsystems is presented for the eight measures scoring highest at each scenario combining K = 5 and K = 25 with ∆ = 100 and ∆ = 300, where ∆ controls the complexity of each subsystem. 18 of 26 T able 6. The ranking of the eight best measures according to the score index for system S2 of coupled Mackey-Glass subsystems for all scenarios of number of variables K and delay parameter ∆ that controls the complexity of each subsystem. K = 5 K = 25 ∆ = 100 ∆ = 300 ∆ = 100 ∆ = 300 Measure Score Measure Score Measure Score Measure Score RGPDC ( p = 20, α ) 0.90 RCGCI ( p = 20) 0.88 PMIME ( L = 50) 1.00 PMIME ( L = 50) 0.95 RGPDC ( p = 20, δ ) 0.88 RGPDC ( p = 20, γ ) 0.86 PGCI( p = 5) 0.87 RCGCI ( p = 5) 0.89 RGPDC ( p = 20, β ) 0.88 RGPDC ( p = 20, δ ) 0.86 RGPDC ( p = 20, γ ) 0.85 RGPDC ( p = 5, δ ) 0.88 RGPDC ( p = 20, γ ) 0.88 RGPDC ( p = 20, α ) 0.86 RGPDC ( p = 20, δ ) 0.85 RGPDC ( p = 5, γ ) 0.87 RCGCI ( p = 20) 0.86 RGPDC ( p = 20, θ ) 0.85 RGPDC ( p = 20, β ) 0.84 RGPDC ( p = 5, β ) 0.86 RGPDC ( p = 20, θ ) 0.86 RGPDC ( p = 20, β ) 0.85 RCGCI ( p = 20) 0.84 RGPDC ( p = 5, α ) 0.86 RGPDC ( p = 5, θ ) 0.86 RGPDC ( p = 5, γ ) 0.80 RGPDC ( p = 20, α ) 0.83 RGPDC ( p = 5, θ ) 0.85 RGPDC ( p = 5, δ ) 0.84 RCGCI ( p = 5) 0.78 PGCI( p = 20) 0.82 RGPDC ( p = 20, α ) 0.81 This system is comprised of highly complex systems with complexity incr easing with ∆ . For K = 5 and regardless of ∆ , the linear measures using dimension reduction RCGCI and RGPDC show the best performance, indicating again the importance of dimension r eduction, her e for oscillating complex systems. The PMIME scor es slightly lower than these measur es and given that RGPDC scor es equally high at differ ent bands, together with RCGCI they occupy the first eight places, so that the PMIME is not listed. For K = 25 on the other hand, the PMIME scores much higher than the RCGCI and RGPDC measures and is at the first place for both ∆ = 100 and ∆ = 300. Apparently , the dimension reduction in the information measure of PMIME is mor e effective than in the V AR-based measur e of RCGCI and RGPDC for lar ger K . In T able 7 , the average score for system S3 of the neural mass model is presented as for S2, but having as system parameter A = 3.45, 3.7, where the latter value indicates mor e clear oscillating behavior . T able 7. The ranking of the eight best measures accor ding to the score index for system S3 of the neural mass model for all scenarios of number of variables K and oscillation controlling parameter A . K = 5 K = 25 A = 3.45 A = 3.7 A = 3.45 A = 3.7 Measure Score Measure Score Measure Score Measure Score RCGCI ( p = 20) 0.88 GPDC( p = 20, θ ) 0.94 GPDC( p = 5, θ ) 0.83 GPDC( p = 20, θ ) 0.92 GPDC( p = 20, θ ) 0.88 PDC( p = 20, θ ) 0.88 GPDC( p = 20, θ ) 0.82 CGCI( p = 20) 0.90 RGPDC ( p = 20, α ) 0.88 PMIME( L = 20 ) 0.80 RCGCI ( p = 20) 0.81 dDTF( p = 20, δ ) 0.89 RGPDC ( p = 20, γ ) 0.87 RGPDC ( p = 5, α ) 0.78 RGPDC ( p = 20, δ ) 0.79 GPDC( p = 5, α ) 0.87 PGCI( p = 20) 0.87 RCGCI ( p = 5) 0.78 RGPDC ( p = 20, β ) 0.79 PMIME( L = 20 ) 0.87 CGCI( p = 20) 0.87 RGPDC ( p = 5, α ) 0.77 RGPDC ( p = 20, θ ) 0.79 PDC( p = 5, α ) 0.82 RGPDC ( p = 20, θ ) 0.86 RGPDC ( p = 20, θ ) 0.77 CGCI( p = 20) 0.78 PDC( p = 20, θ ) 0.80 RGPDC ( p = 20, δ ) 0.85 RGPDC ( p = 20, γ ) 0.77 PDC( p = 20, θ ) 0.78 GPDC( p = 20, α ) 0.80 In all scenarios GPDC shows the best performance. The RCGCI measure for A = 3.45 reaches the next position and also RGPDC reaches a high position on the ranking in the first three scenarios. The fact that GPDC scores higher than RGPDC also for K = 25 indicates that for this system and both A the inclusion of all lagged terms in V AR of order p = 5 or p = 20 gives somehow better identification of the correct couplings after significance testing. This is so, due to the relative lar ge length n = 4096 of the time series that allows for the reliable estimation of the coef ficients being as many as 20 · 25 = 500. For A = 3.45, the PMIME does not score high as its sensitivity is comparatively small (fails to find significant proportion 19 of 26 of true causal effects), whereas for A = 3.7 the PMIME is also among the first eight best measures. It is observed that in all settings the frequency measures, and particularly at low frequency bands, have the ability to identify the tr ue causality interactions better than information and other measures. This is reasonable since this system is characterized by str ongly harmonic oscillations. For S4, no average r esults are shown as the system is run for only one scenario, and the ranking for this was shown in T able 4 and discussed earlier . 3.4. Overall ranking of causality measures For an overall evaluation of the causality measur es, the average score s i over all systems and scenarios is computed for each measure i , as defined in Sec. 2.4 . In T able 8 , the ten measures with highest scor e s i are listed. T able 8. A verage score index over all systems and scenarios. Measure Score PMIME 0.80 RGPDC 0.79 RCGCI 0.78 GPDC 0.63 CGCI 0.61 PGCI 0.61 PDC 0.56 TE 0.51 dDTF 0.50 TER V 0.46 It is noted that for each measur e computed for varying parameters, such as the fr equency bands for the frequency measur es, only the one with the highest score is listed. The best performance is achieved by the three measur es making use of dimension reduction, with the information measure PMIME scoring highest. It is noted that there is a jump in score fr om the third to the fourth place, showing the superiority of the measures of dimension reduction over the other measures. The remaining places in the list are dominated by the linear measures in the time and frequency domain. Comparing the frequency measur es based all on the same V AR model we note that GPDC, and even PDC score higher than dDTF . As for the information measures, the bivariate measur es TE and TER V scor e much higher than the multivariate respective measur es (results not shown), indicating the inability of multivariate information measur es to perform well unless an appropriate dimension r eduction is applied. 4. Discussion In this paper , a simulation study is performed for the estimation of causality networks from multivariate time series. For the network construction, Granger causality measures, simply termed here as causality measur es, of dif ferent type wer e employed as information and model-based measur es, measures based on phase, fr equency measures and measures making use of dimension r eduction. These measures ar e applied to linear and nonlinear (chaotic), deterministic and stochastic, coupled simulated systems, to evaluate their ability to detect the existing coupled pairs of observed variables of these systems. W e considered the nonlinear dynamical systems of coupled Hénon maps (S1), coupled Mackey-Glass subsystems (S2), the so-called neural mass model (S3), and a linear vector autoregr essive process (V AR) of order 3 (S4). For systems S1, S2, S3 we used K = 5 and K = 25 subsystems, whereas S4 was defined only on K = 25 variables. For S2 and S3, we considered two regimes of dif ferent complexity for each system, 20 of 26 controlled by a system parameter . For S1, S2, and S3, a range of coupling strengths C were designed covering the setting of none to weak and strong coupling. For S1, a small and a lar ge time series length n were used. This design of the simulation aimed at testing the causality measur es on dif ferent types of systems with respect to time (S1, S4 are discrete and S2, S3 continuous in time), low and high dimensional having K = 5 and K = 25, linear (S4) and nonlinear (S1, S2, S3), deterministic (S1, S2) and stochastic (S3, S4), and for a range of coupling strengths (S1, S2, S3). Thus, rather than concentrating on a particular system or type of systems, e.g. often met in EEG studies, we aimed at evaluating the performance of the causality measures on many differ ent system settings. Measures that were best suited for strongly oscillating systems, such as the fr equency measures, may not be appr opriate for maps, and on the other hand, information measures on ranks (such as TER V) that are more appr opriate for discrete-time systems (maps) may not be appropriate for str ongly oscillating signals. However , the evaluation showed that this was not the case, and, in the overall ranking, frequency measur es dominated, but also TER V was included among the ten best. The evaluation of the measur es was based on the match of the causality network constr ucted from each measure to the original coupling network of the system generating the multivariate time series. For this, three significance criteria wer e used to transform the value of each causality measure, corresponding to a weighted network connection, to a binary value, corresponding to a binary connection. While the criteria of network density thr eshold and magnitude threshold are arbitrary and best r esults are only attained when the thresholds are given based on the knowledge of the original coupling network, the statistical testing, which does not requir e a-priori information on the underlying system, was competitive and was further suggested as the criterion of choice to derive binary connections. Performance indices were computed checking the pr eservation of the original binary connections (true) in the causality binary network (estimated), and we used the Matthews correlation coefficient (MCC) to quantify best the matching performance of each causality measure as it weighs sensitivity and specificity . W e considered bivariate and multivariate causality measures, and a subset of three multivariate measures making use of dimension reduction. The first is the information measur e of partial mutual information from mixed embedding (PMIME), which can be considered as a r estriction of the partial transfer entropy (PTE) to the most r elevant lagged variables. The other two measures ar e linear and they are both based on V AR model. The dimension reduction suggests fitting a sparse V AR rather than a full V AR. While the conditional Granger causality index (CGCI) is defined in the time domain on the full V AR, the restricted CGCI (RCGCI) is computed on the sparse V AR, and accor dingly in the fr equency domain the generalized partial directed coher ence (GPDC) is modified using a sparse V AR to the restricted GPDC (RGPDC). While linear models can be estimated sufficiently well in high-dimensional time series, provided the length of the time series is much larger than the number of the unknown model coefficients, the estimation of entropies, used in information measures, in high dimension is problematic, even when using the most appropriate estimate of nearest neighbors. T o make a fair comparison between the multivariate information measure making use of dimension r eduction (PMIME) to the other multivariate information measures in high-dimensional time series (her e K = 25), for the latter measur es we do not condition the causal relationship among the driving and response to all r emaining variables (23 in our case), but rather select the three variables that are best correlated in terms of mutual information to the driving variable. In this way we avoid to some degree the curse of dimensionality , but still the embedding is done separately for each of the five variables, i.e. the driving, the response and the other thr ee variables, whereas for the measures using dimension reduction, the embedding is built jointly for all variables selecting only the most appropriate lagged variables. The evaluation of the causality measures showed differences in their performance in the differ ent systems and their parameters ( n , K , C and system parameter ∆ for S2 and A for S3). For system S1, the 21 of 26 measures making use of dimension reduction scored highest regardless of n and K with the PMIME attaining highest MCC for all but very small C . Bivariate information measures scored high here but only for small n . For system S2, the frequency measur es were the most accurate at all fr equency bands, especially for small C while for stronger couplings dimension reduction measures r eached higher MCC. For high-dimensional time series ( K = 25) again the PMIME scored highest followed mainly by the linear dimension reduction measures for differ ent parameter values. For system S3 characterized by str ong oscillations, the frequency measur es performed best occupying the highest ranks and only the PMIME entered the list of eight highest rankings for the system parameter A = 3.7 for both K . For the linear V AR system S4, as expected, the linear measures with dimension r eduction performed best and the respective linear measures of full dimension had also incr eased performance. The conclusions of the simulation study on comparatively low-dimensional ( K = 5) and high-dimensional ( K = 25) time series from differ ent systems are itemized as follows: 1. The multivariate measures making use of dimension reduction (PMIME, RCGCI, RGPDC) outperform all other bivariate and multivariate measures. 2. Among the dimension r eduction measures, the information measur e of PMIME is overall best but the overall score is slightly higher than that of the other two linear measures. Though the PMIME outperforms the other measures in the chaotic systems S1 and S2, for the strongly oscillating stochastic system S3 and the linear stochastic process S4 it scor es lower than RCGCI and RGPDC. 3. Linear measures, especially these applying dimension reduction, exhibited a competitive performance to other nonlinear measures also on nonlinear systems, such as S1, S2 and S3. This remark supports the use of linear measures (pr eferably with dimension reduction) to settings that may involve nonlinear relationships. Certainly , results still depend on the studied system. 4. Though bivariate measures tend to identify causality relationships that are not dir ect, they do not fail in identifying non-existing causal relationships. The latter occurs when using multivariate measures without dimension r eduction. Though this effect cannot be captur ed by standar d performance indices used in this study , such as the sensitivity and specificity , it is a significant finding advocating against the use of multivariate measures, unless dimension r eduction is applied. Admittedly , the collection of causality measures is biased including all measur es our team has developed. On the other hand, the collection is not compr ehensive, leaving out nonlinear measur es that are more difficult to implement and could not be found freely available when the study was initiated. It is noted that initially , many connectivity measures that are not directional, especially these based on phases, were included, but they could not be fairly evaluated in the designed framework comparing the derived network to the original network of directed connections. The simulation study was conducted using four systems, leaving out other systems, such as the coupled Lorenz or coupled Rössler systems, as well as dif fer ent coupling str uctures, such as the random (used her e only in S4), small-world and scale-fr ee, used by our team in other studies. Besides these shortcomings, we believe the current study can be useful for methodologists and practitioners to assess the strengths and weaknesses of the differ ent causality measures and their applicability especially to high-dimensional time series. Funding: Part of the research was supported by a grant from the Greek General Secretariat for Research and T echnology (Aristeia II, No 4822). References 1. Arenas, A.; Diaz-Guilera, A.; Kurths, J.; Moreno, Y .; Zhou, C. Synchronization in Complex Networks. Physics Reports 2008 , 469 , 93–153. 2. Zanin, M.; Sousa, P .; Papo, D.; Bajo, R.; García-Prieto, J.; Pozo, F .; Menasalvas, E.; Boccaletti, S. Optimizing Functional Network Representation of Multivariate T ime Series. Scientific Reports 2012 , 2 , 630. 22 of 26 3. Granger , J. Investigating Causal Relations by Econometric Models and Cross-Spectral Methods. Acta Physica Polonica B 1969 , 37 , 424–438. 4. W iener , N. The theory of prediction. Modern Mathematics for Engineers; Beckenbach, E., Ed. McGraw-Hill, 1956. 5. Smith, S.; Miller , K.; Salimi-Khorshidi, G.; W ebster , M.; Beckmann, C.; Nichols, T .; Ramsey , J.; W oolrich, M. Network Modelling Methods for fMRI. Neuroimage 2011 , 54 , 875–891. 6. Bastos, A.; Schoffelen, J.M. A T utorial Review of Functional Connectivity Analysis Methods and Their Interpretational Pitfalls. Frontiers in Systems Neuroscience 2016 , 9 . 7. Porta, A.; Faes, L. W iener-Granger Causality in Network Physiology W ith Applications to Cardiovascular Control and Neur oscience. Proceedings of the IEEE 2016 , 104 , 282–309. 8. Fornito, A.; Zalesky , A.; Bullmore, E.T . Fundamentals of Brain Network Analysis , 1 ed.; Academic Press, Elsevier , 2016. 9. Hlinka, J.; Hartman, D.; V ejmelka, M.; Runge, J.; Marwan, N.; Kurths, J.; Palus, M. Reliability of Infer ence of Directed Climate Networks Using Conditional Mutual Information. Entropy 2011 , 15 , 2023–2045. 10. Dijkstra, H.; Hernández-García, E.; Masoller , C.; Barreiro, M. Networks in Climate ; Cambridge University Press: Cambridge, 2019. 11. Schiff, S.; So, P .; Chang, T .; Burke, R.; Sauer , T . Detecting Dynamical Interdependence and Generalized Synchrony thr ough Mutual Prediction in a Neural Ensemble. Physical Review E 1996 , 54 , 6708–6724. 12. Le V an Quyen, M.; Martinerie, J.; Adam, C.; V arela, F .J. Nonlinear Analyses of Interictal EEG Map the Brain Interdependences in Human Focal Epilepsy . Physica D: Nonlinear Phenomena 1999 , 127 , 250–266. 13. Chen, Y .; Rangarajan, G.; Feng, J.; Ding, M. Analyzing Multiple Nonlinear T ime Series with Extended Granger Causality . Physics Letters A 2004 , 324 , 26–35. 14. Faes, L.; Porta, A.; Nollo, G. Mutual Nonlinear Prediction as a T ool to Evaluate Coupling Strength and Directionality in Bivariate T ime Series: Comparison among Different Strategies Based on k Nearest Neighbors. Physical Review E 2008 , 78 , 026201. 15. Marinazzo, D.; Pellicoro, M.; Stramaglia, S. Nonlinear parametric model for Granger causality of time series. Physical Review E 2006 , 73 . 16. Marinazzo, D.; Pellicoro, M.; St ramaglia, S. Kernel Method for Nonlinear Granger Causality . Physical Review Letters 2008 , 100 , 144103. 17. Marinazzo, D.; Liao, W .; Chen, H.; Stramaglia, S. Nonlinear Connectivity by Granger Causality . NeuroImage 2011 , 58 , 330–338. 18. Karanikolas, G.V .; Giannakis, G.B. Identifying Directional Connections in Brain Networks via Multi-kernel Granger Models. 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017, pp. 6304–6308. 19. Montalto, A.; Stramaglia, S.; Faes, L.; T essitore, G.; Prevete, R.; Marinazzo, D. Neural Networks with Non-uniform Embedding and Explicit V alidation Phase to Assess Granger Causality . Neural Networks 2015 , 71 , 159–171. 20. Abbasvandi, Z.; Nasrabadi, A.M. A Self-organized Recurrent Neural Network for Estimating the Effective Connectivity and its Application to EEG Data. Computers in Biology and Medicine 2019 , 110 , 93–107. 21. Geweke, J. Measurement of Linear Dependence and Feedback Between Multiple T ime Series. Journal of the American Statistical Association 1982 , 77 , 304–313. 22. Geweke, J.F . Measures of Conditional Linear Dependence and Feedback between T ime Series. Journal of the American Statistical Association 1984 , 79 , 907–915. 23. Kaminski, M.J.; Blinowska, K.J. A New Method of the Description of the Information Flow in the Brain Structures. Biological Cybernetics 1991 , 65 , 203–210. 24. Baccalá, L.; Sameshima, K. Partial Directed Coherence: a New Concept in Neural Structure Determination. Biological Cybernetics 2001 , 84 , 463–474. 25. Arnhold, J.; Grassberger , P .; Lehnertz, K.; Elger , C.E. A Robust Method for detecting Interdependences: Application to Intracranially Recorded EEG. Physica D 1999 , 134 , 419–430. 23 of 26 26. Romano, M.C.; Thiel, M.; Kurths, J.; Gr ebogi, C. Estimation of the Direction of the Coupling by Conditional Probabilities of Recurr ence. Physical Review E 2007 , 76 , 036211. 27. Schreiber , T . Measuring Information T ransfer . Physical Review Letters 2000 , 85 , 461–464. 28. Palus, M. Coarse-grained entropy rates for characterization of complex time series. Physica D:Nolinear Phenom. 1996 , 93 , 64–77. 29. Vlachos, I.; Kugiumtzis, D. Non-uniform State Space Reconstruction and Coupling Detection. Physical Review E 2010 , 82 , 016207. 30. Rosenblum, M.G.; Pikovski, A.S. Detecting Direction of Coupling in Interacting Oscillators. Physical Review E 2001 , 64 , 045202. 31. Nolte, G.; Ziehe, A.; Nikulin, V .V .; Schlögl, A.; Krämer , N.; Brismar , T .; Müller , K.R. Robustly Estimating the Flow Direction of Information in Complex Physical Systems. Physical Review Letters 2008 , 100 , 234101. 32. Lehnertz, K.; Bialonski, S.; Horstmann, M.T .; Kr ug, D.; Rothkegel, A.; Staniek, M.; W agner , T . Synchronization Phenomena in Human Epileptic Brain Networks. Journal of Neuroscience Methods 2009 , 183 , 42–48. 33. Rings, T .; Lehnertz, K. Distinguishing between Direct and Indir ect Directional Couplings in Large Oscillator Networks: Partial or Non-partial Phase Analyses? Chaos 2016 , 26 . 34. Han, X.; Shen, Z.; W ang, W .X.; Lai, Y .C.; Grebogi, C. Reconstructing Direct and Indirect Interactions in Networked Public Goods Game. Scientific Reports 2016 , 6 . 35. Blinowska, K.J.; Ku, R.; Kamiski, M. Granger Causality and Information Flow in Multivariate Processes. Physical Review E 2004 , 50 , 050902. 36. Kus, R.; Kaminski, M.; Blinowska, K. Determination of EEG Activity Propagation: Pair-wise versus Multichannel Estimate. IEEE T ransactions on Biomedical Engineering 2004 , 51 , 1501–1510. 37. Eichler , M. Graphical modelling of multivariate time series. Probability Theory and Related Fields 2012 , 153 , 233–268. 38. Marinazzo, D.; Pellicoro, M.; Stramaglia, S. Causal Information Approach to Partial Conditioning in Multivariate Data Sets. Computational and Mathematical Methods in Medicine 2012 , 2012 . 39. Faes, L.; Nollo, G.; Porta, A. Information-based Detection of Nonlinear Granger Causality in Multivariate Processes via a Nonuniform Embedding T echnique. Physical Review E 2011 , 83 , 051112. 40. Runge, J.; Heitzig, J.; Petoukhov , V .; Kurths, J. Escaping the Curse of Dimensionality in Estimating Multivariate T ransfer Entropy . Physical Review Letters 2012 , 108 , 258701. 41. W ibral, M.; W ollstadt, P .; Meyer , U.; Pampu, N.; Priesemann, V .; V icente, R. Revisiting W iener ’s Principle of Causality - Interaction-Delay Reconstruction Using T ransfer Entropy and Multivariate Analysis on Delay-weighted Graphs. 2012 Annual International Conference of the IEEE Engineering in Medicine and Biology Society , 2012, pp. 3676–3679. 42. Kugiumtzis, D. Direct Coupling Information Measure fr om Non-uniform Embedding. Physical Review E 2013 , 87 , 062918. 43. Siggiridou, E.; Kugiumtzis, D. Granger Causality in Multivariate T ime Series Using a T ime-Order ed Restricted V ector Autoregressive Model. IEEE T ransactions on Signal Processing 2016 , 64 , 1759–1773. 44. Smirnov , D.; Andrzejak, R. Detection of weak Directional Coupling: Phase-Dynamics Approach versus State-Space Approach. Physical Review E 2005 , 71 , 036207. 45. Papana, A.; Kugiumtzis, D.; Larsson, P .G. Reducing the Bias of Causality Measures. Physical Review E 2011 , 83 , 036207. 46. Silfverhuth, M.J.; Hintsala, H.; Kortelainen, J.; Seppanen, T . Experimental Comparison of Connectivity Measures with Simulated EEG Signals. Medical & Biological Engineering & Computing 2012 , 50 , 683–688. 47. Fasoula, A.; Attal, Y .; Schwartz, D.; Ding, M.; Liang, H. Comparative Performance Evaluation of Data-Driven Causality Measures Applied to Brain Networks. Journal of Neuroscience Methods 2013 , 215 , 170–189. 48. Papana, A.; Kyrtsou, C.; Kugiumtzis, D.; Diks, C. Simulation Study of Dir ect Causality Measur es in Multivariate T ime Series. Entropy 2013 , 15 , 2635–2661. 49. Zaremba, A.; Aste, T . Measures of Causality in Complex Datasets with Application to Financial Data. Entropy 2014 , 16 , 2309–2349. 24 of 26 50. Cui, D.; Li, X. Multivariate EEG Synchronization Strength Measur es. Signal Processing in Neuroscience; Li, X., Ed. Springer , 2016, pp. 235–259. 51. Bakhshayesh, H.; Fitzgibbon, S.P .; Janani, A.S.; Gr ummett, T .S.; Pope, K.J. Detecting Connectivity in EEG: A Comparative Study of Data-Driven Effective Connectivity Measures. Computers in Biology and Medicine 2019 , 111 , 103329. 52. Blinowska, K.J. Review of the Methods of Determination of Directed Connectivity from Multichannel Data. Medical & Biological Engineering & Computing 2011 , 49 , 521–529. 53. Astolfi, L.; Cincotti, F .; Mattia, D.; Lai, M.; Baccala, L.; De V ico Fallani, F .; Salinari, S.; Ursino, M.; Zavaglia, M.; Babiloni, F . Comparison of Differ ent Multivariate Methods for the Estimation of Cortical Connectivity: Simulations and Applications to EEG Data. 2005 IEEE Engineering in Medicine and Biology 27th Annual Conference, 2005, pp. 4484–4487. 54. Florin, E.; Gross, J.; Pfeifer , J.; Fink, G.; T immermann, L. Reliability of Multivariate Causality Measures for Neural Data. Journal of Neuroscience Methods 2011 , 198 , 344–358. 55. W u, M.H.; Frye, R.E.; Zouridakis, G. A Comparison of Multivariate Causality based Measures of Effective Connectivity . Computers in Biology and Medicine 2011 , 21 , 1132–1141. 56. Sommariva, S.; Sorrentino, A.; Piana, M.; Pizzella, V .; Marzetti, L. A Comparative Study of the Robustness of Frequency-Domain Connectivity Measur es to Finite Data Length. Brain T opography 2019 , 32 , 675–695. 57. Diks, C.; Fang, H. T ransfer Entropy for Nonparametric Granger Causality Detection: An Evaluation of Different Resampling Methods. Entropy 2017 , 19 . 58. Papana, A.; Kyrtsou, C.; Kugiumtzis, D.; Diks, C. Assessment of Resampling Methods for Causality T esting: A Note on the US Inflation Behavior . PLoS ONE 2017 , 12 . 59. Moharramipour , A.; Mostame, P .; Hossein-Zadeh, G.A.; Wheless, J.; Babajani-Feremi, A. Comparison of Statistical T ests in Effective Connectivity Analysis of ECoG Data. Journal of Neuroscience Methods 2018 , 308 , 317–329. 60. Siggiridou, E.; Koutlis, C.; T simpiris, A.; Kimiskidis, V .K.; Kugiumtzis, D. Causality Networks from Multivariate T ime Series and Application to Epilepsy . 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2015, pp. 4041–4044. 61. Cui, J.; Xu, L.; Br essler , L.; Ding, M.; Liang, H. BSMAR T: A Matlab/C T oolbox for Analysis of Multichannel Neural T ime Series. Neural Networks 2008 , 21 , 1094–1104. 62. Seth, A.K. A MA TLAB T oolbox for Granger Causal Connectivity Analysis. Journal of Neuroscience Methods 2010 , 186 , 262 – 273. 63. Niso, G.; Bruña, R.; Pereda, E.; Gutiérrez, R.; Bajo, R.; Maesté, F .; Del-Pozo, F . HERMES: T owards an Integrated T oolbox to Characterize Functional and Effective Brain Connectivity . Neuroinformatics 2013 , 11 , 405–434. 64. Lizier , J. JIDT: An Information-Theoretic T oolkit for Studying the Dynamics of Complex Systems. Frontiers in Robotics and AI 2014 , 1 . 65. Barnett, L.; Seth, K.A. The MVGC Multivariate Granger Causality T oolbox: A New Approach to Granger -Causal Inference. Journal of Neuroscience Methods 2014 , 223 , 50–68. 66. Montalto, A.; Faes, L.; Marinazzo, D. MuTE: A MA TLAB T oolbox to Compar e Established and Novel Estimators of the Multivariate T ransfer Entropy . PLoS ONE 2014 , 9 . 67. Lachaux, J.P .; Rodriguez, E.; Martinerie, J.; V arela, F .J. Measuring Phase Synchrony in Brain Signals. Human Brain Mapping 1999 , 8 , 194–208. 68. Stam, C.J.; Nolte, G.; Daffertshofer , A. Phase Lag Index: Assessment of Functional Connectivity from Multi Channel EEG and MEG with Diminished Bias from Common Sources. Human Brain Mapping 2007 , 28 , 1178–1193. 69. V inck, M.; Oostenveld, R.; V an W ingerden, M.; Battaglia, F .; Pennartz, C.M.A. An Improved Index of Phase-Synchronization for Electrophysiological Data in the Presence of V olume-Conduction, Noise and Sample-Size Bias. Neuroimage 2011 , 55 , 1548–1565. 25 of 26 70. T ass, P .; Rosenblum, M.G.; W eule, J.; Kurths, J.; Pikovsky , A.; V olkmann, J.; Schnitzler , A.; Fr eund, H.J. Detection of n:m Phase Locking from Noisy Data: Application to Magnetoencephalography . Physical Review Letters 1998 , 81 , 3291–3294. 71. Mormann, F .; Lehnertz, K.; David, P .; Elger , C.E. Mean Phase Coherence as a Measure for Phase Synchronization and its Application to the EEG of Epilepsy Patients. Physica D 2000 , 144 , 358–369. 72. Guo, S.; Seth, A.K.; Kendrick, K.; Zhou, C.; Feng, J. Partial Granger Causality – Eliminating Exogenous Inputs and Latent V ariables. Journal of Neuroscience Methods 2008 , 172 , 79–93. 73. Hatemi-J, A.; Hacker , R.S. Can the LR T est Be Helpful in Choosing the Optimal Lag Order in the V AR Model when Information Criteria Suggest Differ ent Lag Orders?. Applied Economics 2009 , 41 , 1121-1125. 74. Bruns, S. B; Stern, D. I. Lag Length Selection and P-hacking in Granger Causality T esting: Prevalence and Performance of Meta-regr ession Models. Empirical Economics 2019 , 56 , 797-830. 75. Papana, A.; Kugiumtzis, D.; Larsson, P .G. Detection of Direct Causal Effects and Application in the Analysis of Electroencephalograms fr om Patients with Epilepsy . International Journal of Bifurcation and Chaos 2012 , 22 , 1250222. 76. Staniek, M.; Lehnertz, K. Symbolic T ransfer Entropy . Physical Review Letters 2008 , 100 . 77. Papana, A.; Kyrtsou, C.; Kugiumtzis, D.; Diks, C. Detecting Causality in Non-stationary T ime Series Using Partial Symbolic T ransfer Entropy: Evidence in Financial Data. Computational Economics 2015 , pp. 1–25. doi:10.1007/s10614-015-9491-x. 78. Kugiumtzis, D. T ransfer Entropy on Rank V ectors. Journal of Nonlinear Systems and Applications 2012 , 3 , 73–81. 79. Kugiumtzis, D. Partial T ransfer Entropy on Rank V ectors. European Physical Journal Special T opics 2013 , 222 , 401–420. 80. Baccala, L.A.; Sameshima, K.; T akahashi, D.Y . Generalized Partial Directed Coherence. 2007 15th International Conference on Digital Signal Pr ocessing, 2007, pp. 163–166. 81. Korzeniewska, A.; Manczak, M.and Kaminski, M.; Blinowska, K.; Kasicki, S. Determination of Information Flow Direction between Brain Structur es by a Modified Directed T ransfer Function Method (dDTF). Journal of Neuroscience Methods 2003 , 125 , 195–207. 82. Hu, S.; Dai, G.; W orrell, G.A.; Dai, Q.; Liang, H. Causality Analysis of Neural Connectivity: Critical Examination of Existing Methods and Advances of New Methods. IEEE T ransactions on Neural Networks 2011 , 22 , 829–844. 83. Siggiridou, E.; Kimiskidis, V .K.; Kugiumtzis, D. Dimension Reduction of Frequency-Based Direct Granger Causality Measures on Short T ime Series. Journal of Neur oscience Methods 2017 , 289 , 64–74. 84. Quian Quiroga, R.; Kreuz, T .; Grassberger , P . Event Synchronization: A Simple and Fast Method to Measure Synchronicity and T ime Delay Patterns. Physical Review E 2002 , 66 , 041904. 85. Paluš, M.; Komárek, V .; Hrncír , Z.; Šterbová, K. Synchronization as adjustment of information rates: detection from bivariate time series. Physical Review E 2001 , 63 , 046211. 86. V erdes, P . Assessing causality from multivariate time series. Phys. Rev . E 2005 , 72 , 026222. 87. V akorin, V .A.; Krakovska, O.A.; McIntosh, A.R. Confounding Effects of Indirect Connections on Causality Estimation. Journal of Neuroscience Methods 2009 , 184 , 152–160. 88. Paluš, M.; Stefanovska, A. Direction of Coupling from Phases of Interacting Oscillators: An Information-Theoretic Appr oach. Physical Review E 2003 , 67 , 055201. 89. Chicharro, D.; Andrzejak, R. Reliable Detection of Directional Couplings Using Rank Statistics. Physical Review E 2009 , 80 , 026217. 90. Sugihara, G.; May , R.; Y e, H.; Hsieh, C.; Deyle, E.; Fogarty , M.; Munch, S. Detecting Causality in Complex Ecosystems. Science 2012 , 338 , 496–500. 91. Y u, G.H.; Huang, C.C. A Distribution Free Plotting Position. Stochastic Environmental Research and Risk Assessment 2001 , 15 , 462 – 476. 92. Benjamini, Y .; Hochberg, Y . Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple T esting. Journal of the Royal Statistical Society. Series B (Methodological) 1995 , 57 , 289–300. 93. Matthews, B.W . Comparison of the Predicted and Observed Secondary Structure of T4 Phage L ysozyme. Biochimica et Biophysica Acta 1975 , 405 , 442–451. 26 of 26 94. Lansdown, F . Ordinal Ranking Methods for Multicriterion Decision Making. Naval Research Logistics (NRL) 1998 , 43 , 613–627. 95. Kugiumtzis, D. Direct-Coupling Information Measure fr om Nonuniform Embedding. Physical Review E 2013 , 87 , 062918. 96. Senthilkumar , D.V .; Lakshmanan, M.; Kurths, J. T ransition from Phase to Generalized Synchronization in T ime-Delay Systems. Chaos: An Interdisciplinary Journal of Nonlinear Science 2008 , 18 , 023118. 97. W endling, F .; Bellanger , J.; Bartolomei, F .; Chauvel, P . Relevance of Nonlinear Lumped-Parameter Models in the Analysis of Depth-EEG Epileptic Signals. Biological Cybernetics 2000 , 83 , 367–378. 98. Koutlis, C.; Kugiumtzis, D. Discrimination of Coupling Structur es Using Causality Networks from Multivariate T ime Series. Chaos: An Interdisciplinary Journal of Nonlinear Science 2016 , 26 . 99. Basu, S.; Michailidis, G. Regularized Estimation in Sparse High-Dimensional T ime Series Models. Annals of Statistics 2015 , 43 , 1535–1567. 100. Kugiumtzis, D. State Space Reconstruction Parameters in the Analysis of Chaotic T ime Series - the Role of the T ime W indow Length. Physica D 1996 , 95 , 13–28.

Original Paper

Loading high-quality paper...

Comments & Academic Discussion

Loading comments...

Leave a Comment