UC Berkeley Center for Targeted Machine Learning and Causal Inference

The UC Berkeley Center for Targeted Machine Learning and Causal Inference (CTML) harnesses the power of big data and statistical machine learning to improve health. CTML builds on developments in statistical machine learning, a methodology pioneered by experts in the UC Berkeley Biostatistics Group, and works to adapt these methods for research and applications.

The Targeted Machine Learning approach provides a template for constructing optimal machine learning algorithms that answer any type of (often causal) question about any type of observed data system, while providing formal statistical inference. The range of potential applications is therefore very broad, covering randomized trials, sequentially adaptive randomized trials, and complex longitudinal observational studies. Indeed, there is strong interest in adopting these methods, including from the FDA, industry, and researchers.

However, the uptake of targeted learning methods in specific applications (beyond prediction) has been slowed by educational and computational barriers. Our efforts to address this challenge have taken three overlapping paths: 1) developing new software tools (in R) that end users can apply to analyze their data with relative ease (see https://github.com/tlverse), 2) applying these methods in collaborative research by working closely with groups to help analyze their data, and 3) training to translate these methods into practice (see, for instance, https://tlverse.org/acic2019-workshop/).

Current global health collaborations include funded projects with the Gates Foundation and the Inter-American Development Bank (IADB). Please visit CTML's Projects page for more details.

Data Science
Causal Inference and Machine Learning

Alan Hubbard

Dr. Alan Hubbard is Professor of Biostatistics, Head of the Division of Biostatistics at UC Berkeley, and Head of the data analytics core at UC Berkeley SuperFund. His current research interests include causal inference, variable importance analysis, statistical machine learning, estimation of and inference for data-adaptive statistical target parameters, and targeted minimum loss-based estimation. Research in his group is generally motivated by applications to problems in computational biology, epidemiology, and precision medicine.


Maya Petersen

Dr. Maya L. Petersen is an Associate Professor of Biostatistics and Epidemiology at the University of California, Berkeley. Dr. Petersen’s methodological research focuses on the development and application of novel causal inference methods to problems in health, with an emphasis on longitudinal data and adaptive treatment strategies (dynamic regimes), machine learning methods, adaptive designs, and study design and analytic strategies for cluster randomized trials.

Mark van der Laan

Mark Johannes van der Laan is the Jiann-Ping Hsu/Karl E. Peace Professor of Biostatistics and Statistics at the University of California, Berkeley. He has made contributions to survival analysis, semiparametric statistics, multiple testing, and causal inference. He also developed the targeted maximum likelihood methodology. He is a founding editor of the Journal of Causal Inference.

Contact Information

2121 Berkeley Way West
Berkeley, CA 94704
United States