Abstract: In recent years, a popular task in statistics, data science, and machine learning has been to estimate a feature of a new, target population, denoted as f(Q), using existing datasets from a source population. Prominent examples of this task include (i) generalizing results from randomized trials to new, target populations or (ii) deploying pre-trained machine learning algorithms in out-of-sample environments. Unfortunately, the success of this task hinges on a key assumption called conditional exchangeability, which is unverifiable with data and often untenable in practice.
This talk explores the sensitivity of learning about f(Q) when conditional exchangeability is violated. In the first part, we present a method for nonparametric and efficient inference for f(Q) under a well-known sensitivity model from causal inference. We also propose a new technique to benchmark or calibrate the sensitivity parameter based on an idea from design sensitivity by Rosenbaum (2004). In the second part, we propose a novel measure to assess the sensitivity of learning f(Q) under local violations of conditional exchangeability. This measure, which we call SLOPE, is inspired by Hampel's (1974) influence curve and sensitivity analysis from causal inference. Practically, SLOPE helps investigators address questions about which datasets to use for robust generalization. We conclude with two applications of our results, one in the context of a multi-state randomized trial in political science and another in the context of a multi-national randomized trial in health economics.
This is joint work with Xinran Miao (UW-Madison) and Jiwei Zhao (UW-Madison).
About the speaker: Hyunseung (pronounced Hun-Sung) is an Associate Professor in the Department of Statistics at the University of Wisconsin–Madison. In 2015, he received his Ph.D. in Statistics from the University of Pennsylvania, where he was advised by Tony Cai and Dylan Small. From 2015 to 2016, he completed an NSF postdoctoral fellowship under Guido Imbens at Stanford University, and he has been at Madison since 2017. His research focuses on developing methods for causal inference, with particular emphasis on (a) instrumental variables and unmeasured confounding, (b) semi/nonparametric inference, and (c) dependent data. He is also interested in applications to genetics, health, and political science.