Typical Methods
Network Structure Reconstruction
There is a wide variety of algorithms for reconstructing network structure from observed time series. They stem from applied neuroscience, but also from machine learning and econometrics, which have fueled the area of causal inference from temporal data with numerous novel techniques (Popescu-Guyon, 2013). Although statistics rose to prominence in the 20th century, initially to resolve causal quandaries in the refinement of agricultural and industrial processes, the field of statistical causal inference is relatively young. Its pioneers have received wide praise (Clive Granger received the Nobel Prize and Judea Pearl the ACM Turing Award), but the methods they developed are not yet widely known and are still subject to refinement. Even though temporal precedence is one of the least controversial necessary criteria for establishing a cause-effect relationship, many causal inference algorithms do not require time information and establish possible causal relations among observations on other grounds: conditional independence testing (Pearl, 2000) or, more recently, statistics of the joint distribution of pairs of variables (Cause-effect, 2008-2011). The work of Clive Granger, built upon the 20th century development of time series modeling in engineering and economics, with some input from physiology, led to a framework that admittedly does not allow us to identify causality unequivocally, but that has received a lot of attention because of its simplicity and its practical successes in econometrics and neuroscience (Popescu-Guyon, 2013).
To test whether observations of the time series of two variables A and B are symptomatic of an underlying process “A causes B” rather than “B causes A”, the basic idea behind Granger causality is to fit predictive models of A(present time) and B(present time) as functions of A(past times) and B(past times). Evidence for “A causes B” is obtained if B is better predicted from past values of both A and B than from past values of B alone, while A is not better predicted from past values of both A and B than from past values of A alone. Numerous improved methods have been derived, incorporating, for instance, frequency domain analysis in lieu of time domain analysis (Nolte-Muller, 2010). One recent idea is to add contemporaneous values of B to predict A, and vice versa, to take into account instantaneous causal effects, due for instance to insufficient time resolution (Moneta-Spirtes, 2005). In neuroscience, the simple linear auto-regressive (AR) models underlying Granger causality do not capture the complexity of neural signals well. A non-linear version of Granger causality called Transfer Entropy (Schreiber, 2000), which reduces to Granger causality for simple AR models (Barnett et al., 2009), is gaining popularity (Wibral, 2011; Battaglia, 2012; Stetter, 2012).
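The model comparison behind Granger causality can be sketched in a few lines. This is a minimal illustration, not the challenge's sample code: the function names (lagged, residual_variance, granger_score), the lag order p=2, and the simulated data are all assumptions made for the example. The score is the log ratio of residual variances between a model that uses only the target's own past and a model that also uses the source's past.

```python
import numpy as np

def lagged(x, p):
    """Columns are x[t-1], ..., x[t-p] for t = p .. n-1."""
    n = len(x)
    return np.column_stack([x[p - k : n - k] for k in range(1, p + 1)])

def residual_variance(target, predictors):
    """Ordinary least squares with intercept; returns the mean squared residual."""
    X = np.column_stack([np.ones(len(target)), predictors])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    return float(np.mean(resid ** 2))

def granger_score(source, target, p=2):
    """log(var_restricted / var_full); larger when the source's past helps."""
    y = target[p:]
    own_past = lagged(target, p)
    both_pasts = np.column_stack([own_past, lagged(source, p)])
    restricted = residual_variance(y, own_past)
    full = residual_variance(y, both_pasts)
    return float(np.log(restricted / full))

# Simulated example (assumption): A drives B with a one-step delay.
rng = np.random.default_rng(0)
n = 2000
a = np.zeros(n)
b = np.zeros(n)
for t in range(1, n):
    a[t] = 0.5 * a[t - 1] + rng.normal()
    b[t] = 0.5 * b[t - 1] + 0.8 * a[t - 1] + rng.normal()

score_ab = granger_score(a, b)  # does the past of A help predict B?
score_ba = granger_score(b, a)  # does the past of B help predict A?
print(score_ab, score_ba)
```

On data simulated this way, score_ab should be clearly larger than score_ba. In practice the score would be compared against a null distribution (for instance an F-test or surrogate data) rather than read off directly.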
It is well known that causal relationships can be confounded: the fact that A and B are correlated or co-variant does not mean that A and B are in a causal relationship, because there may be a third, common cause C. A typical way of alleviating the problem of false positive causal relationships is to perform conditional independence tests. If A and B are independent given C, the existence of a direct causal relationship between A and B is ruled out and the remaining possibilities are A->C->B, B->C->A or A<-C->B. However, one of the greatest challenges that network structure reconstruction methods have to face is the “curse of dimensionality”. With the explosion of the number of variables, it quickly becomes impractical to reliably conduct conditional independence tests, which require a number of samples exponential in the number of variables jointly tested. Moreover, it is practically never possible to record all the neurons of a network (with fluorescence methods, for instance, some neurons may be invisible or not marked). Hence it is likely that one would violate the assumption of “causal sufficiency” (no neuron that influences two observed neurons is unobserved), which is often made by methods relying on conditional independence tests.
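The effect of a common cause can be illustrated with partial correlation, which serves as a conditional independence test only under linear-Gaussian assumptions. The sketch below is an illustration with simulated data; the helper name partial_correlation and the noise levels are assumptions, and real neural signals would call for a test suited to their statistics.

```python
import numpy as np

def partial_correlation(x, y, z):
    """Correlation of x and y after linearly regressing z out of both."""
    Z = np.column_stack([np.ones(len(z)), z])
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return float(np.corrcoef(rx, ry)[0, 1])

# Simulated confounding (assumption): A <- C -> B, no direct A-B link.
rng = np.random.default_rng(1)
n = 5000
c = rng.normal(size=n)
a = c + 0.5 * rng.normal(size=n)
b = c + 0.5 * rng.normal(size=n)

marginal = float(np.corrcoef(a, b)[0, 1])   # strong, but spurious
conditional = partial_correlation(a, b, c)  # close to zero given C
print(marginal, conditional)
```

A and B are strongly correlated marginally, yet nearly uncorrelated once C is accounted for, which rules out a direct link in this toy setting. If C were unobserved, as is common with real recordings, the spurious dependence could not be removed this way.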
Another approach, which is not limited to statistics of pairs of variables, is to use score-based methods. These perform a search in the space of all possible architectures, guided by an objective function that assesses the goodness of signal reconstruction (possibly penalized to favor sparse connectivity). Such methods include Bayesian approaches such as Dynamic Causal Modeling (DCM) (Friston, 2003), which compare data-generating models formulated as differential equations describing the dynamics of hidden states in the nodes of a probabilistic graphical model, where conditional dependencies are parameterized in terms of directed effective connections. Other related methods include L1 and/or L2 penalized regression methods (Ryali, 2012).
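As a rough illustration of the penalized regression idea (not of DCM, and not the specific method of Ryali, 2012), the sketch below fits a first-order linear model x[t] = x[t-1] W + noise with an L1 penalty on W, using a plain proximal gradient (ISTA) loop. The function names, the penalty strength lam, and the simulated five-node network are assumptions made for the example.

```python
import numpy as np

def soft_threshold(w, t):
    """Proximal operator of the L1 norm."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def lasso_var1(x, lam=0.05, n_iter=500):
    """Minimize (1/2n)||present - past @ W||^2 + lam * ||W||_1 by ISTA."""
    past, present = x[:-1], x[1:]
    n, d = past.shape
    # Step size 1/L, where L is the largest eigenvalue of the Gram matrix.
    step = 1.0 / np.linalg.eigvalsh(past.T @ past / n).max()
    W = np.zeros((d, d))
    for _ in range(n_iter):
        grad = past.T @ (past @ W - present) / n
        W = soft_threshold(W - step * grad, step * lam)
    return W  # a non-zero W[i, j] suggests a directed edge i -> j

# Simulated sparse network (assumption): edges 0->1, 1->2, 3->4.
rng = np.random.default_rng(2)
d, n = 5, 4000
W_true = 0.3 * np.eye(d)
for i, j in [(0, 1), (1, 2), (3, 4)]:
    W_true[i, j] = 0.6
x = np.zeros((n, d))
for t in range(1, n):
    x[t] = x[t - 1] @ W_true + rng.normal(size=d)

W_hat = lasso_var1(x)
print(np.round(W_hat, 2))
```

The L1 penalty pushes coefficients of absent connections toward zero, at the cost of slightly shrinking the true ones. The value of lam sets that trade-off and would normally be chosen by cross-validation.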
Another possible remedy to the problem of confounding, which attacks the curse of dimensionality from a different angle, is to condition on the average activity of the population of nearby neurons rather than on combinations of single neurons, and then to rely only on statistics of the joint activity of pairs of neurons (Stetter, 2012; Steinmetz, 2013). The promising results of the “Cause-Effect Pairs” challenge that we recently organized demonstrate that we can go a long way toward inferring causal relationships from pairs of variables, without conditioning on other variables (Guyon et al., 2013b). The Area under the ROC Curve (AUC) of the top ranking participants exceeded 0.8 on a combination of real and artificial data (an AUC of 0.5 is obtained for random guesses and the perfect score is 1). The methods used by the participants are model-free. They exploit features of the joint distribution of two variables, some of which are derived from information theoretic principles. The predictions are made with pattern recognition algorithms trained on thousands of examples of cause-effect pairs. The challenge was limited to data samples that were not time-ordered. For this reason, we are planning a new Cause-Effect Pairs Challenge for Time-Series Data, which will be held in conjunction with the proposed challenge (but is not part of this proposal). The hope is that such techniques could be applied with success to our new network structure reconstruction challenge, possibly reaching even better performance after preprocessing by conditioning on the average activity of nearby neurons and by exploiting the time ordering of samples.
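For reference, the AUC mentioned above can be computed directly from its ranking interpretation: it is the probability that a randomly chosen true connection receives a higher score than a randomly chosen non-connection, with ties counted as one half. The sketch below is a generic illustration with made-up labels and scores, not the challenge's official scoring code.

```python
import numpy as np

def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))

# Hypothetical edge scores: label 1 means the connection truly exists.
print(auc([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]))  # 1.0, perfect ranking
print(auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1]))  # 0.75, one pair misordered
print(auc([1, 0], [0.5, 0.5]))                  # 0.5, uninformative scores
```

The pairwise form uses memory proportional to the number of positives times the number of negatives, so for large networks a rank-based formulation would be preferable.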
Software and other resources
Sample code: Code we provide to get you started.
Network reconstruction: A web site providing a comparison of various network reconstruction methods.
Jovo's webpage: Several useful papers and free code.