Fabio De Sousa Ribeiro, Ainkaran Santhirasekaram, Ben Glocker
This paper addresses a major open problem: whether high-dimensional, multivariate outcomes (e.g. images) admit identifiable counterfactuals from observational data alone. In standard causal modelling, identifiability of counterfactuals is key to making valid causal claims – yet prior work on high-dimensional outcome variables has neglected theoretical guarantees.
The authors propose a novel framework combining Dynamic Optimal Transport (OT) with continuous-time flows (flow matching) to recover a unique, monotone, rank-preserving transport map from factual to counterfactual distributions under standard assumptions. Given observational data, this map lets one consistently derive counterfactual outcomes in a way that respects the joint multivariate structure, rather than assuming coordinate-wise independence or an arbitrary ordering of the dimensions.
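To make "monotone, rank-preserving transport map" concrete, here is a minimal one-dimensional sketch. It is not the authors' method, which uses flow matching for high-dimensional outcomes. It only illustrates the scalar special case, where the optimal transport map under quadratic cost is the quantile map T(x) = F_target^{-1}(F_source(x)). The function name, the Gaussian populations, and the sample sizes are all assumptions chosen for illustration.

```python
import numpy as np


def monotone_transport_1d(x, source_samples, target_samples):
    """Empirical 1D transport map T(x) = F_target^{-1}(F_source(x)).

    Hypothetical helper for illustration only: it preserves ranks, so a
    point at the q-th quantile of the source lands on the q-th quantile
    of the target.
    """
    src = np.sort(source_samples)
    # Empirical CDF of the source evaluated at x (a rank in [0, 1]).
    ranks = np.searchsorted(src, x, side="right") / len(src)
    # Empirical quantile function of the target at those ranks.
    return np.quantile(target_samples, np.clip(ranks, 0.0, 1.0))


rng = np.random.default_rng(0)
# Assumed toy populations: outcomes under the factual and the
# counterfactual value of the parent variable.
factual = rng.normal(loc=0.0, scale=1.0, size=5000)
counterfactual = rng.normal(loc=3.0, scale=2.0, size=5000)

x = np.array([-1.0, 0.0, 1.0])
x_cf = monotone_transport_1d(x, factual, counterfactual)
print(x_cf)  # increasing in x; the source median maps near the target median
```

In one dimension the ordering of outcomes is unambiguous, which is why this map is unique. The difficulty the paper targets is that images have no such natural ordering, so a multivariate notion of monotonicity is needed instead.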
The authors provide a theoretical analysis characterising the conditions required for identifiability and demonstrate empirically that their method yields sound counterfactuals both in a toy setting and on a real-world chest X-ray dataset. This work significantly advances the foundations for counterfactual inference in high-dimensional domains, potentially enabling more reliable causal analysis in areas like fairness, image editing, or treatment-effect estimation.
I have previously worked on counterfactual estimation in the medical imaging domain, and this work closes some gaps in one of my own papers. It is exciting to see an approach that is both theoretically sound and empirically performant.
Counterfactual Identifiability via Dynamic Optimal Transport