Scalable Intervention Target Estimation in Linear Models
Abstract
This paper considers the problem of estimating the unknown intervention targets in a causal directed acyclic graph from observational and interventional data generated under soft interventions and linear structural equation models (SEMs). Current causal structure learning methods either work with known intervention targets or use hypothesis testing to discover the unknown intervention targets even for linear SEMs, limiting their scalability and sample efficiency. This paper proposes a scalable and efficient algorithm that consistently identifies all intervention targets. The pivotal idea is to estimate the intervention sites from the difference in precision matrices, across observational and interventional datasets, by repeatedly applying them on different subsets of variables. This approach facilitates establishing sample complexity guarantees as well as scalability to larger graphs. The proposed algorithm can be used to also update a given observational Markov equivalence class into the interventional Markov equivalence class. Consistency, Markov equivalency, and sample complexity are established analytically. Finally, simulation results on both real and synthetic data demonstrate the gains of the proposed approach for scalable causal structure recovery.