The causal relationships determining the behaviour of a system under study are inherently directional: by manipulating a cause we can control its effect, but an effect cannot be used to control its cause. Understanding the network of causal relationships is necessary, for example, if we want to predict the behaviour in settings where the system is subject to different manipulations. However, we are rarely able to directly observe the causal processes in action; we only see the statistical associations they induce in the collected data. This thesis considers the discovery of the fundamental causal relationships from data in several different learning settings and under various modeling assumptions. Although the research is mostly theoretical, possible application areas include biology, medicine, economics and the social sciences.
Latent confounders, unobserved common causes of two or more observed parts of a system, are especially troublesome when discovering causal relations. The statistical dependence relations induced by such latent confounders often cannot be distinguished from those induced by directed causal relationships. The possible presence of feedback, which induces a cyclic causal structure, is another complicating factor. To achieve informative learning results in this challenging setting, some restrictive assumptions need to be made. One option is to constrain the functional forms of the causal relationships to be smooth and simple. In particular, we explore how linearity of the causal relations can be effectively exploited. Another common assumption under study is causal faithfulness, which allows us to infer the absence of causal relations from the absence of statistical associations. Along with these assumptions, we use data from randomized experiments, in which the system under study is observed under different interventions and manipulations.
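For concreteness, one standard way to formalize a linear cyclic model with latent confounding, consistent with how such models are commonly presented in this literature (the symbols and exact parameterization here are illustrative and may differ in detail from those in the thesis), is

\[ x \;=\; B\,x + e, \qquad \operatorname{Cov}(e) = \Sigma_e, \]

where the direct causal effects are collected in the matrix \(B\), feedback loops correspond to cycles in \(B\), and latent confounding appears as non-zero off-diagonal entries of the disturbance covariance \(\Sigma_e\). For a stable \(B\), the covariance of the observed equilibrium distribution is

\[ \operatorname{Cov}(x) \;=\; (I - B)^{-1}\,\Sigma_e\,(I - B)^{-\top}. \]

Randomizing a subset of the variables in an experiment cuts the causal influences into those variables, and the resulting changes in such second-order statistics across experiments are what make the model identifiable.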
In particular, we present a full theoretical foundation for learning linear cyclic models with latent variables from second-order statistics in several experimental data sets. This includes necessary and sufficient conditions on the experimental settings needed for full model identification, a provably complete learning algorithm, and a characterization of the underdetermination that remains when the data do not allow full model identification. We also consider several ways of exploiting the faithfulness assumption for this model class. We are able to learn from overlapping data sets, in which different (but overlapping) subsets of variables are observed. In addition, we formulate a model class called Noisy-OR models with latent confounding. We prove sufficient and worst-case necessary conditions for the identifiability of the full model and derive several learning algorithms. The thesis also suggests optimal sets of experiments for the identification of these and other models. For settings without latent confounders, we develop a Bayesian learning algorithm that is able to exploit non-Gaussianity in passively observed data.
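To illustrate the Noisy-OR model class, the textbook noisy-OR parameterization of a binary effect given its (possibly latent) parents is

\[ P\big(x_i = 1 \mid \mathrm{pa}(x_i)\big) \;=\; 1 - (1 - b_{i0}) \prod_{j \in \mathrm{pa}(x_i)} (1 - b_{ij})^{x_j}, \]

where \(b_{i0}\) is a background (leak) probability and \(b_{ij}\) is the probability that an active parent \(x_j = 1\) alone switches \(x_i\) on; latent confounding can then be represented by unobserved parents shared by several observed variables. This is the standard noisy-OR form, and the exact parameterization and notation used in the thesis may differ.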