Scalable Probabilistic Causal Structure Discovery

Dhanya Sridhar, Jay Pujara, Lise Getoor

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
Main track. Pages 5112-5118. https://doi.org/10.24963/ijcai.2018/709

Complex causal networks underlie many real-world problems, from the regulatory interactions between genes to the environmental patterns used to understand climate change. Computational methods seek to infer these causal networks using observational data and domain knowledge. In this paper, we identify three key requirements for inferring the structure of causal networks for scientific discovery: (1) robustness to noise in observed measurements; (2) scalability to handle hundreds of variables; and (3) flexibility to encode domain knowledge and other structural constraints. We first formalize the problem of joint probabilistic causal structure discovery. We develop an approach using probabilistic soft logic (PSL) that exploits multiple statistical tests, supports efficient optimization over hundreds of variables, and can easily incorporate structural constraints, including imperfect domain knowledge. We compare our method against multiple well-studied approaches on biological and synthetic datasets, showing improvements of up to 20% in F1-score over the best-performing baseline in realistic settings.
Keywords:
Knowledge Representation and Reasoning: Action, Change and Causality
Uncertainty in AI: Bayesian Networks
Machine Learning: Learning Graphical Models
Constraints and SAT: Constraints and Data Mining; Machine Learning