Basic Details
"Had I started writing the paper earlier, I would have made it to the conference deadline." The ability to think about how things could have turned out differently from how they did in reality, often referred to as counterfactual reasoning, is a fundamental aspect of human cognition. Is counterfactual reasoning a human capacity that machines cannot have? Surprisingly, recent advances at the interface of psychology, causality and machine learning have demonstrated that it is possible to build machines that perform and benefit from counterfactual reasoning, much as humans do. This tutorial aims to introduce students and researchers to the current state of research in this area, providing them both with insights from cognitive science (🧠) and with an overview of relevant technical advances in machine learning (🤖).
In the first part of the tutorial, we look into counterfactual reasoning from the perspective of psychology. We first discuss the functional roles of counterfactuals in human cognition, the factors that affect which counterfactuals we think about, and how we make counterfactual inferences using mental simulation. Then, we shift focus to machine learning. We start by introducing structural causal models (SCMs), a mathematical framework that allows us to formalize probabilistic counterfactual reasoning. We then discuss the identification of counterfactual quantities and present a range of areas where counterfactual reasoning has been successfully applied in machine learning. We conclude with an overview of recent advances at the intersection of counterfactual reasoning and large language models (LLMs) and a technical deep dive into using SCMs to enable counterfactual generation in LLMs.
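As a rough illustration of how an SCM formalizes a counterfactual such as the opening example about the paper deadline, the sketch below follows the standard three steps of abduction, action and prediction. The structural equation, the constants `BASE_WRITING_DAYS` and `DEADLINE_DAY`, and all observed values are hypothetical and chosen only for illustration; they are not taken from the tutorial materials.

```python
# Hypothetical toy SCM (an assumption for illustration, not the tutorial's model):
#   start  := U_start
#   finish := start + BASE_WRITING_DAYS + U_delay
BASE_WRITING_DAYS = 30  # assumed nominal writing time, in days
DEADLINE_DAY = 40       # assumed conference deadline, as a day index


def finish_day(start: float, u_delay: float) -> float:
    """Structural equation for the day on which the paper is finished."""
    return start + BASE_WRITING_DAYS + u_delay


def abduct_delay(start: float, finish: float) -> float:
    """Abduction: recover the exogenous delay consistent with what was observed."""
    return finish - start - BASE_WRITING_DAYS


def counterfactual_finish(obs_start: float, obs_finish: float, cf_start: float) -> float:
    """Finish day had the start been cf_start, keeping the same exogenous delay."""
    u_delay = abduct_delay(obs_start, obs_finish)  # 1. abduction
    # 2. action: replace the equation for `start` with do(start = cf_start)
    # 3. prediction: re-evaluate the downstream equation with the inferred noise
    return finish_day(cf_start, u_delay)


# Factual world (hypothetical numbers): started on day 10, finished on day 45.
factual_start, factual_finish = 10, 45
cf_finish = counterfactual_finish(factual_start, factual_finish, cf_start=3)

made_deadline_factual = factual_finish <= DEADLINE_DAY
made_deadline_cf = cf_finish <= DEADLINE_DAY
print(f"factual finish: day {factual_finish}, made deadline: {made_deadline_factual}")
print(f"counterfactual finish: day {cf_finish}, made deadline: {made_deadline_cf}")
```

Because the assumed mechanism is additive in its noise term, the observation pins down the exogenous delay exactly and the counterfactual is a single value. In general, abduction yields a posterior distribution over the noise variables, and the counterfactual is a distribution rather than a point.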
What, When, and Where
Resources
Slides and Materials
Key References
- Counterfactual thinking. Roese, Psychological Bulletin 1997
- Counterfactual thought. Byrne, Annual Review of Psychology 2016
- Counterfactual simulation in causal cognition. Gerstenberg, Trends in Cognitive Sciences 2024
- Antecedents to spontaneous counterfactual thinking: Effects of expectancy violation and outcome valence. Sanna & Turley, Personality and Social Psychology Bulletin 1996
- Unlucky victims or lucky survivors? Teigen & Jensen, European Psychologist 2010
- When less is more: Counterfactual thinking and satisfaction among Olympic medalists. Medvec et al., Journal of Personality and Social Psychology 1995
- Downward counterfactuals and motivation: The wake-up call and the Pangloss effect. McMullen & Markman, Personality and Social Psychology Bulletin 2000
- The lessons we (don't) learn: Counterfactual thinking and organizational accountability after a close call. Morris & Moore, Administrative Science Quarterly 2000
- The functional theory of counterfactual thinking. Epstude & Roese, Personality and Social Psychology Review 2008
- Counterfactual thinking: An fMRI study on changing the past for a better future. Van Hoeck et al., Social Cognitive and Affective Neuroscience 2013
- Causation. Lewis, Journal of Philosophy 1973
- Conversational processes and causal explanation. Hilton, Psychological Bulletin 1990
- Making things happen: A theory of causal explanation. Woodward, Oxford University Press 2003
- Causation in the law. Hart & Honoré, Oxford University Press 1985
- Causal responsibility and counterfactuals. Lagnado et al., Cognitive Science 2013
- A theory of blame. Malle et al., Psychological Inquiry 2014
- Culpable control and counterfactual reasoning in the psychology of blame. Alicke et al., Personality and Social Psychology Bulletin 2008
- Norm theory: Comparing reality to its alternatives. Kahneman & Miller, Psychological Review 1986
- Counterfactual potency. Petrocelli et al., Journal of Personality and Social Psychology 2011
- Crediting causality. Spellman, Journal of Experimental Psychology 1997
- When contributions make a difference: Explaining order effects in responsibility attribution. Gerstenberg & Lagnado, Psychonomic Bulletin & Review 2012
- Event controllability in counterfactual thinking. Girotto et al., Acta Psychologica 1991
- The nature of explanation. Craik, Cambridge University Press 1943
- The simulation heuristic. Kahneman & Tversky, In "Judgment Under Uncertainty: Heuristics and Biases", Cambridge University Press 1982
- Probabilistic models of physical reasoning. Smith et al., In "Bayesian Models of Cognition: Reverse Engineering the Mind", MIT Press 2025
- A counterfactual simulation model of causal judgments for physical events. Gerstenberg et al., Psychological Review 2021
- Programs as causal models: Speculations on mental programs and mental representation. Chater & Oaksford, Cognitive Science 2013
- Concepts in a probabilistic language of thought. Goodman et al., In "The Conceptual Mind: New Directions in the Study of Concepts", MIT Press 2015
- A computational model of responsibility judgments from counterfactual simulations and intention inferences. Wu et al., CogSci 2023
- Causality. Pearl, Cambridge University Press 2009
- Elements of causal inference: Foundations and learning algorithms. Peters et al., The MIT Press 2017
- Causal machine learning: A survey and open problems. Kaddour et al., arXiv preprint 2022
- On Pearl's hierarchy and the foundations of causal inference. Bareinboim et al., In "Probabilistic and Causal Inference: The Works of Judea Pearl" 2022
- Complete identification methods for the causal hierarchy. Shpitser & Pearl, JMLR 2008
- Complete graphical characterization and construction of adjustment sets in Markov equivalence classes of ancestral graphs. Perkovic et al., JMLR 2018
- Estimating individual treatment effect: Generalization bounds and algorithms. Shalit et al., ICML 2017
- Treatment effect risk: Bounds and inference. Kallus, Management Science 2023
- Counterfactual explanations without opening the black box: Automated decisions and the GDPR. Wachter et al., Harvard Journal of Law & Technology 2017
- Counterfactual explanations and algorithmic recourses for machine learning: A review. Verma et al., ACM Computing Surveys 2024
- Causal explanations and XAI. Beckers, CLeaR 2022
- Counterfactual explanations as interventions in latent space. Crupi et al., Data Mining and Knowledge Discovery 2022
- Algorithmic recourse: From counterfactual explanations to interventions. Karimi et al., FAccT 2021
- Algorithmic recourse under imperfect causal knowledge: A probabilistic approach. Karimi et al., NeurIPS 2020
- Decisions, counterfactual explanations and strategic behavior. Tsirtsis & Gomez-Rodriguez, NeurIPS 2020
- Counterfactual fairness. Kusner et al., NeurIPS 2017
- Path-specific counterfactual fairness. Chiappa, AAAI 2019
- Causal fairness analysis. Bareinboim & Plečko, ICML 2022 Tutorial
- Counterfactual harm. Richens et al., NeurIPS 2022
- Human-aligned calibration for AI-assisted decision making. Corvelo Benz & Gomez-Rodriguez, NeurIPS 2023
- Counterfactual explanations in sequential decision making under uncertainty. Tsirtsis et al., NeurIPS 2021
- Towards causal foundations of safe AI. Fox & Everitt, UAI 2023 Tutorial
- Woulda, coulda, shoulda: Counterfactually-guided policy search. Buesing et al., ICLR 2019
- Counterfactual off-policy evaluation with Gumbel-max structural causal models. Oberst & Sontag, ICML 2019
- Designing decision support systems using counterfactual prediction sets. Straitouri & Gomez-Rodriguez, ICML 2024
- CLadder: Assessing causal reasoning in language models. Jin et al., NeurIPS 2023
- Causal reasoning and large language models: Opening a new frontier for causality. Kiciman et al., Transactions on Machine Learning Research 2023
- What if the TV was off? Examining counterfactual reasoning abilities of multi-modal language models. Zhang et al., CVPR 2024
- Counterfactual token generation in large language models. Chatzi et al., CLeaR 2025
Presenters

Tobias Gerstenberg
Stanford University
Tobias is an assistant professor of psychology at Stanford University. He leads the Causality in Cognition Lab, which studies the role that causality plays in people’s understanding of the world and of each other.

Manuel Gomez-Rodriguez
Max Planck Institute for Software Systems
Manuel is a tenured faculty member at MPI-SWS. He develops human-centric machine learning algorithms to enhance the functioning of social, information and networked systems.

Stratis Tsirtsis
Max Planck Institute for Software Systems
Stratis is a final-year Ph.D. candidate at MPI-SWS. He is interested in developing machine learning systems for decision making that account for human behavior and emulate aspects of human cognition.