Generalized Score Functions for Causal Discovery

Authors:
Biwei Huang, Carnegie Mellon University, Pittsburgh, PA, USA
Kun Zhang, Carnegie Mellon University, Pittsburgh, PA, USA
Yizhu Lin, Carnegie Mellon University, Pittsburgh, PA, USA
Bernhard Schölkopf, MPI for Intelligent Systems, Tuebingen, Germany
Clark Glymour, Carnegie Mellon University, Pittsburgh, PA, USA
KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, July 2018, Pages 1551–1560. https://doi.org/10.1145/3219819.3220104

Published: 19 July 2018

ABSTRACT

Discovery of causal relationships from observational data is a fundamental problem. Roughly speaking, there are two types of methods for causal discovery: constraint-based and score-based. Score-based methods avoid the multiple testing problem and enjoy certain advantages compared to constraint-based ones. However, most of them need strong assumptions on the functional forms of causal mechanisms, as well as on data distributions, which limit their applicability. In practice, precise information about the underlying model class is usually unavailable. If the above assumptions are violated, both spurious and missing edges may result. In this paper, we introduce generalized score functions for causal discovery based on the characterization of general (conditional) independence relationships between random variables, without assuming particular model classes. In particular, we exploit regression in a reproducing kernel Hilbert space (RKHS) to capture the dependence in a nonparametric way. The resulting causal discovery approach produces asymptotically correct results in rather general cases, which may have nonlinear causal mechanisms, a wide class of data distributions, mixed continuous and discrete data, and multidimensional variables. Experimental results on both synthetic and real-world data demonstrate the efficacy of our proposed approach.
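To make the idea of scoring a candidate parent set with RKHS regression concrete, the following is a minimal sketch and not the paper's score function. It assumes a Gaussian kernel, kernel ridge regression, and a held-out mean squared error as a stand-in for a proper likelihood-based score. The names `rbf_kernel` and `local_score`, and the parameters `lam`, `width`, and `folds`, are hypothetical and chosen only for illustration.

```python
import numpy as np

def rbf_kernel(a, b, width):
    # Gaussian kernel matrix between the rows of a and the rows of b.
    sq = (np.sum(a**2, axis=1)[:, None]
          + np.sum(b**2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * width**2))

def local_score(x, parents, lam=1e-3, width=1.0, folds=5):
    """Illustrative local score (an assumption, not the paper's definition):
    negative cross-validated squared error of kernel ridge regression of x
    on its candidate parents. Higher is better."""
    n = x.shape[0]
    idx = np.arange(n)
    errs = []
    for k in range(folds):
        test = idx[k::folds]                 # interleaved held-out fold
        train = np.setdiff1d(idx, test)
        mean = x[train].mean()
        if parents.shape[1] == 0:
            # Empty parent set: predict the training mean.
            pred = np.full(test.shape[0], mean)
        else:
            K = rbf_kernel(parents[train], parents[train], width)
            ridge = lam * len(train) * np.eye(len(train))
            alpha = np.linalg.solve(K + ridge, x[train] - mean)
            pred = rbf_kernel(parents[test], parents[train], width) @ alpha + mean
        errs.append(np.mean((x[test] - pred) ** 2))
    return -float(np.mean(errs))

# Synthetic example: x depends nonlinearly on z.
rng = np.random.default_rng(0)
n = 300
z = rng.normal(size=(n, 1))
x = np.sin(2.0 * z[:, 0]) + 0.1 * rng.normal(size=n)

score_with_parent = local_score(x, z)
score_without_parent = local_score(x, np.empty((n, 0)))
print(score_with_parent > score_without_parent)
```

In a score-based search, a comparison of this kind would be repeated over candidate parent sets. The sketch only shows that a nonparametric regression can register a nonlinear dependence without fixing a functional form in advance.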

Index Terms

1. Mathematics of computing
  1. Probability and statistics
    1. Probabilistic representations
2. Theory of computation
  1. Theory and algorithms for application domains
    1. Machine learning theory
      1. Kernel methods

Published in
KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
July 2018
2925 pages
ISBN: 9781450355520
DOI: 10.1145/3219819
General Chairs: Yike Guo (Imperial College London), Faisal Farooq (IBM)
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org
Publisher
Association for Computing Machinery
New York, NY, United States
Qualifiers
- research-article
Acceptance Rates
KDD '18 Paper Acceptance Rate: 107 of 983 submissions, 11%. Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%.