Causal inference in highly parameterized ML

September 17, 2020 — February 19, 2025

algebra

graphical models

hidden variables

hierarchical models

how do science

machine learning

networks

neural nets

probability

statistics

Suspiciously similar content

Applying a causal graph structure in the challenging environment of a no-holds-barred nonparametric machine learning algorithm such as a neural net or its ilk. I am interested in this because it seems necessary and kind of obvious for handling things like dataset shift, but is often ignored. What is that about?

I do not know at the moment. This is a link salad for now.

For a good example application, see Bernhard Schölkopf: Causality and Exoplanets.

Perhaps I should start with the book (Peters, Janzing, and Schölkopf 2017), or the chatty casual introduction (Schölkopf 2022).

See also the brain salad graphical models and supervised models.

1 With agency

See causality, agency, decisions.

2 Invariances implied by causality

Léon Bottou, From Causal Graphs to Causal Invariance:

For many problems, it’s difficult to even attempt drawing a causal graph. While structural causal models provide a complete framework for causal inference, it is often hard to encode known physical laws (such as Newton’s gravitation, or the ideal gas law) as causal graphs. In familiar machine learning territory, how does one model the causal relationships between individual pixels and a target prediction? This is one of the motivating questions behind the paper Invariant Risk Minimization (IRM). In place of structured graphs, the authors elevate invariance to the defining feature of causality.

He commends the Cloudera Fast Forward tutorial Causality for Machine Learning, which is a nice bit of applied work.

3 Causality for feedback and continuous fields

See causality under feedback.

4 Double learning

See Double learning.

5 As “Deep Causality”

Not sure what this is yet (Berrevoets et al. 2024; Deng et al. 2022; Lagemann et al. 2023). End-to-end deep learning of graphical structures?

6 Benchmarking

CauseMe: A platform to benchmark causal discovery methods (Runge et al. 2019)

Detecting causal associations in time series datasets is a key challenge for novel insights into complex dynamical systems such as the Earth system or the human brain. Interactions in such systems present a number of major challenges for causal discovery techniques and it is largely unknown which methods perform best for which challenge.

The CauseMe platform provides ground truth benchmark datasets featuring different real data challenges to assess and compare the performance of causal discovery methods. The available benchmark datasets are either generated from synthetic models mimicking real challenges, or are real world data sets where the causal structure is known with high confidence. The datasets vary in dimensionality, complexity and sophistication.

7 Tooling

7.1 Dowhy

py-why/dowhy: DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.

7.2 Causalnex

quantumblacklabs/causalnex: A Python library that helps data scientists to infer causation rather than observing correlation.

CausalNex is a Python library that uses Bayesian Networks to combine machine learning and domain expertise for causal reasoning. You can use CausalNex to uncover structural relationships in your data, learn complex distributions, and observe the effect of potential interventions.

7.3 caus2e

MLResearchAtOSRAM/cause2e: The cause2e package provides tools for performing an end-to-end causal analysis of your data.

The main contribution of cause2e is the integration of two established causal packages that have currently been separated and cumbersome to combine:

Causal discovery methods from the py-causal package, which is a Python wrapper around parts of the Java TETRAD software. It provides many algorithms for learning the causal graph from data and domain knowledge.

Causal reasoning methods from the DoWhy package, which is the current standard for the steps of a causal analysis starting from a known causal graph and data

7.4 TETRAD

TETRAD (source, tutorial) is a tool for discovering and visualizing and calculating giant empirical DAGs, including general graphical inference and causality. It’s written by eminent causality inference people.

Tetrad is a program which creates, simulates data from, estimates, tests, predicts with, and searches for causal and statistical models. The aim of the program is to provide sophisticated methods in a friendly interface requiring very little statistical sophistication of the user and no programming knowledge. It is not intended to replace flexible statistical programming systems such as Matlab, Splus or R. Tetrad is freeware that performs many of the functions in commercial programs such as Netica, Hugin, LISREL, EQS and other programs, and many discovery functions these commercial programs do not perform. …

The Tetrad programs describe causal models in three distinct parts or stages: a picture, representing a directed graph specifying hypothetical causal relations among the variables; a specification of the family of probability distributions and kinds of parameters associated with the graphical model; and a specification of the numerical values of those parameters.

py-causal is a wrapper around TETRAD for python, and R-causal for R.

8 Incoming

Nisha Muktewar and Chris Wallace, Causality for Machine Learning is the book Bottou recommends on this theme.
For coders, Ben Dickson writes on Why machine learning struggles with causality.
Cheng Soon Ong recommends Finn Lattimore to me as an important perspective.
biomedia-mira/deepscm: Repository for Deep Structural Causal Models for Tractable Counterfactual Inference (Pawlowski, Coelho de Castro, and Glocker 2020).
ICML 2022 Tutorial on causality and deep learning
Causality and Deep Learning: Synergies, Challenges and the Future Tutorial

9 References

Arjovsky, Bottou, Gulrajani, et al. 2020. “Invariant Risk Minimization.”

Athey, and Wager. 2019. “Estimating Treatment Effects with Causal Forests: An Application.” arXiv:1902.07409 [Stat].

Bareinboim, Correa, Ibeling, et al. 2022. “On Pearl’s Hierarchy and the Foundations of Causal Inference.” In Probabilistic and Causal Inference: The Works of Judea Pearl.

Berrevoets, Kacprzyk, Qian, et al. 2024. “Causal Deep Learning.”

Besserve, Mehrjou, Sun, et al. 2019. “Counterfactuals Uncover the Modular Structure of Deep Generative Models.” In arXiv:1812.03253 [Cs, Stat].

Bishop. 2021. “Artificial Intelligence Is Stupid and Causal Reasoning Will Not Fix It.” Frontiers in Psychology.

Black, Koepke, Kim, et al. 2023. “Less Discriminatory Algorithms.” SSRN Scholarly Paper.

Bongers, Forré, Peters, et al. 2021. “Foundations of Structural Causal Models with Cycles and Latent Variables.” The Annals of Statistics.

Bongers, and Mooij. 2018. “From Random Differential Equations to Structural Causal Models: The Stochastic Case.” arXiv:1803.08784 [Cs, Stat].

Bongers, Peters, Schölkopf, et al. 2016. “Structural Causal Models: Cycles, Marginalizations, Exogenous Reparametrizations and Reductions.” arXiv:1611.06221 [Cs, Stat].

Christiansen, Pfister, Jakobsen, et al. 2020. “A Causal Framework for Distribution Generalization.”

Deng, Zheng, Tian, et al. 2022. “Deep Causal Learning: Representation, Discovery and Inference.”

Everitt, Carey, Langlois, et al. 2021. “Agent Incentives: A Causal Perspective.” In Proceedings of the AAAI Conference on Artificial Intelligence.

Everitt, Kumar, Krakovna, et al. 2019. “Modeling AGI Safety Frameworks with Causal Influence Diagrams.”

Ferasso, and Alnoor. 2022. “Artificial Neural Network and Structural Equation Modeling in the Future.” In Artificial Neural Networks and Structural Equation Modeling: Marketing and Consumer Research Applications.

Fernández-Loría, and Provost. 2021. “Causal Decision Making and Causal Effect Estimation Are Not the Same… and Why It Matters.” arXiv:2104.04103 [Cs, Stat].

Friedrich, Antes, Behr, et al. 2020. “Is There a Role for Statistics in Artificial Intelligence?” arXiv:2009.09070 [Cs].

Gendron, Witbrock, and Dobbie. 2023. “A Survey of Methods, Challenges and Perspectives in Causality.”

Goyal, Lamb, Hoffmann, et al. 2020. “Recurrent Independent Mechanisms.” arXiv:1909.10893 [Cs, Stat].

Hammond, Fox, Everitt, et al. 2023. “Reasoning about Causality in Games.” Artificial Intelligence.

Hartford, Lewis, Leyton-Brown, et al. 2017. “Deep IV: A Flexible Approach for Counterfactual Prediction.” In Proceedings of the 34th International Conference on Machine Learning.

Huang, Fu, and Franzke. 2020. “Detecting Causality from Time Series in a Machine Learning Framework.” Chaos: An Interdisciplinary Journal of Nonlinear Science.

Hubinger, Jermyn, Treutlein, et al. 2023. “Conditioning Predictive Models: Risks and Strategies.”

Johnson, Duvenaud, Wiltschko, et al. 2016. “Composing Graphical Models with Neural Networks for Structured Representations and Fast Inference.” In Advances in Neural Information Processing Systems 29.

Jordan, Wang, and Zhou. 2022. “Empirical Gateaux Derivatives for Causal Inference.”

Kaddour, Lynch, Liu, et al. 2022. “Causal Machine Learning: A Survey and Open Problems.”

Karimi, Barthe, Schölkopf, et al. 2021. “A Survey of Algorithmic Recourse: Definitions, Formulations, Solutions, and Prospects.”

Karimi, Muandet, Kornblith, et al. 2022. “On the Relationship Between Explanation and Prediction: A Causal View.”

Kirk, Zhang, Grefenstette, et al. 2023. “A Survey of Zero-Shot Generalisation in Deep Reinforcement Learning.” Journal of Artificial Intelligence Research.

Kocaoglu, Snyder, Dimakis, et al. 2017. “CausalGAN: Learning Causal Implicit Generative Models with Adversarial Training.” arXiv:1709.02023 [Cs, Math, Stat].

Kosoy, Chan, Liu, et al. 2022. “Towards Understanding How Machines Can Learn Causal Overhypotheses.”

Künzel, Sekhon, Bickel, et al. 2019. “Metalearners for Estimating Heterogeneous Treatment Effects Using Machine Learning.” Proceedings of the National Academy of Sciences.

Lagemann, Lagemann, Taschler, et al. 2023. “Deep Learning of Causal Structures in High Dimensions Under Data Limitations.” Nature Machine Intelligence.

Lattimore. 2017. “Learning How to Act: Making Good Decisions with Machine Learning.”

Leeb, Lanzillotta, Annadani, et al. 2021. “Structure by Architecture: Disentangled Representations Without Regularization.” arXiv:2006.07796 [Cs, Stat].

Liao, Chen, Yang, et al. 2020. “Provably Efficient Neural Estimation of Structural Equation Models: An Adversarial Approach.” In Advances in Neural Information Processing Systems.

Li, Dai, Shangguan, et al. 2022. “Causality-Structured Deep Learning for Soil Moisture Predictions.” Journal of Hydrometeorology.

Liu, Zhang, Gong, et al. 2022. “Identifying Latent Causal Content for Multi-Source Domain Adaptation.”

Locatello, Bauer, Lucic, et al. 2019. “Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations.” arXiv:1811.12359 [Cs, Stat].

Locatello, Poole, Raetsch, et al. 2020. “Weakly-Supervised Disentanglement Without Compromises.” In Proceedings of the 37th International Conference on Machine Learning.

Louizos, Shalit, Mooij, et al. 2017. “Causal Effect Inference with Deep Latent-Variable Models.” In Advances in Neural Information Processing Systems 30.

Lu, Wu, Hernández-Lobato, et al. 2021. “Nonlinear Invariant Risk Minimization: A Causal Approach.” arXiv:2102.12353 [Cs, Stat].

Mehta, Albiero, Chen, et al. 2022. “You Only Need a Good Embeddings Extractor to Fix Spurious Correlations.”

Melnychuk, Frauen, and Feuerriegel. 2022. “Causal Transformer for Estimating Counterfactual Outcomes.” In Proceedings of the 39th International Conference on Machine Learning.

Mishler, and Kennedy. 2021. “FADE: FAir Double Ensemble Learning for Observable and Counterfactual Outcomes.” arXiv:2109.00173 [Cs, Stat].

Mooij, Peters, Janzing, et al. 2016. “Distinguishing Cause from Effect Using Observational Data: Methods and Benchmarks.” Journal of Machine Learning Research.

Ng, Fang, Zhu, et al. 2020. “Masked Gradient-Based Causal Structure Learning.” arXiv:1910.08527 [Cs, Stat].

Ng, Zhu, Chen, et al. 2019. “A Graph Autoencoder Approach to Causal Structure Learning.” In Advances In Neural Information Processing Systems.

Ortega, Kunesch, Delétang, et al. 2021. “Shaking the Foundations: Delusions in Sequence Models for Interaction and Control.” arXiv:2110.10819 [Cs].

Pawlowski, Coelho de Castro, and Glocker. 2020. “Deep Structural Causal Models for Tractable Counterfactual Inference.” In Advances in Neural Information Processing Systems.

Peters, Bühlmann, and Meinshausen. 2016. “Causal Inference by Using Invariant Prediction: Identification and Confidence Intervals.” Journal of the Royal Statistical Society Series B: Statistical Methodology.

Peters, Janzing, and Schölkopf. 2017. Elements of Causal Inference: Foundations and Learning Algorithms. Adaptive Computation and Machine Learning Series.

Poulos, and Zeng. 2021. “RNN-Based Counterfactual Prediction, with an Application to Homestead Policy and Public Schooling.” Journal of the Royal Statistical Society Series C: Applied Statistics.

Rakesh, Guo, Moraffah, et al. 2018. “Linked Causal Variational Autoencoder for Inferring Paired Spillover Effects.” In Proceedings of the 27th ACM International Conference on Information and Knowledge Management. CIKM ’18.

Richardson, and Robins. 2013. “Single World Intervention Graphs (SWIGs): A Unification of the Counterfactual and Graphical Approaches to Causality.”

Richens, and Everitt. 2024. “Robust Agents Learn Causal World Models.”

Roscher, Bohn, Duarte, et al. 2020. “Explainable Machine Learning for Scientific Insights and Discoveries.” IEEE Access.

Rotnitzky, and Smucler. 2020. “Efficient Adjustment Sets for Population Average Causal Treatment Effect Estimation in Graphical Models.” Journal of Machine Learning Research.

Rubenstein, Bongers, Schölkopf, et al. 2018. “From Deterministic ODEs to Dynamic Structural Causal Models.” In Uncertainty in Artificial Intelligence.

Runge, Bathiany, Bollt, et al. 2019. “Inferring Causation from Time Series in Earth System Sciences.” Nature Communications.

Schölkopf. 2022. “Causality for Machine Learning.” In Probabilistic and Causal Inference: The Works of Judea Pearl.

Schölkopf, Locatello, Bauer, et al. 2021. “Toward Causal Representation Learning.” Proceedings of the IEEE.

Shalit, Johansson, and Sontag. 2017. “Estimating Individual Treatment Effect: Generalization Bounds and Algorithms.” arXiv:1606.03976 [Cs, Stat].

Shi, Blei, and Veitch. 2019. “Adapting Neural Networks for the Estimation of Treatment Effects.” In Proceedings of the 33rd International Conference on Neural Information Processing Systems.

Simchoni, and Rosset. 2023. “Integrating Random Effects in Deep Neural Networks.”

Tigas, Annadani, Jesson, et al. 2022. “Interventions, Where and How? Experimental Design for Causal Models at Scale.” Advances in Neural Information Processing Systems.

Veitch, and Zaveri. 2020. “Sense and Sensitivity Analysis: Simple Post-Hoc Analysis of Bias Due to Unobserved Confounding.”

Vowels, Camgoz, and Bowden. 2022. “D’ya Like DAGs? A Survey on Structure Learning and Causal Discovery.” ACM Computing Surveys.

Wang, Lijing, Adiga, Chen, et al. 2022. “CausalGNN: Causal-Based Graph Neural Networks for Spatio-Temporal Epidemic Forecasting.” Proceedings of the AAAI Conference on Artificial Intelligence.

Wang, Yixin, and Jordan. 2021. “Desiderata for Representation Learning: A Causal Perspective.” arXiv:2109.03795 [Cs, Stat].

Wang, Sifan, Sankaran, and Perdikaris. 2022. “Respecting Causality Is All You Need for Training Physics-Informed Neural Networks.”

Wang, Yuhao, Solus, Yang, et al. 2017. “Permutation-Based Causal Inference Algorithms with Interventions.”

Wang, Xingqiao, Xu, Tong, et al. 2021. “InferBERT: A Transformer-Based Causal Inference Framework for Enhancing Pharmacovigilance.” Frontiers in Artificial Intelligence.

Ward, MacDermott, Belardinelli, et al. 2024. “The Reasons That Agents Act: Intention and Instrumental Goals.”

Willig, Zečević, Dhami, et al. 2022. “Can Foundation Models Talk Causality?”

Yang, Liu, Chen, et al. 2020. “CausalVAE: Disentangled Representation Learning via Neural Structural Causal Models.” arXiv:2004.08697 [Cs, Stat].

Yoon. n.d. “E-RNN: Entangled Recurrent Neural Networks for Causal Prediction.”

Zhang, Kun, Gong, Stojanov, et al. 2020. “Domain Adaptation as a Problem of Inference on Graphical Models.” In Advances in Neural Information Processing Systems.

Zhang, Rui, Imaizumi, Schölkopf, et al. 2021. “Maximum Moment Restriction for Instrumental Variable Regression.” arXiv:2010.07684 [Cs].

Zhang, Jiaqi, Jennings, Zhang, et al. 2023. “Towards Causal Foundation Model: On Duality Between Causal Inference and Attention.”

Zhou, Xie, Hao, et al. 2023. “Emerging Synergies in Causality and Deep Generative Models: A Survey.”