Evaluating Robustness to Dataset Shift via Parametric Robustness Sets

Abstract

We give a method for proactively identifying small, plausible shifts in distribution which lead to large differences in model performance. These shifts are defined via parametric changes in the causal mechanisms of observed variables, where constraints on parameters yield a “robustness set” of plausible distributions and a corresponding worst-case loss over the set. While the loss under an individual parametric shift can be estimated via reweighting techniques such as importance sampling, the resulting worst-case optimization problem is non-convex, and the estimate may suffer from large variance. For small shifts, however, we can construct a local second-order approximation to the loss under shift and cast the problem of finding a worst-case shift as a particular non-convex quadratic optimization problem, for which efficient algorithms are available. We demonstrate that this second-order approximation can be estimated directly for shifts in conditional exponential family models, and we bound the approximation error. We apply our approach to a computer vision task (classifying gender from images), revealing sensitivity to shifts in non-causal attributes.
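As a rough illustration of the ideas in the abstract, the sketch below compares an importance-sampling estimate of the loss under a parametric shift with a local second-order approximation, then picks the worst-case shift within a bound. It is a minimal toy under stated assumptions, not the authors' implementation: it uses a single unconditional Bernoulli attribute W with natural parameter eta (the paper treats conditional exponential family models), a scalar shift delta, and synthetic data. All function and variable names are hypothetical. In the scalar case the constrained quadratic problem reduces to checking the interval endpoints and any interior stationary point, which stands in for the general non-convex quadratic solver mentioned above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def importance_weights(w, eta, delta):
    """Density ratio p_delta(w) / p(w) when W ~ Bernoulli(sigmoid(eta))
    is shifted to Bernoulli(sigmoid(eta + delta))."""
    p0, p1 = sigmoid(eta), sigmoid(eta + delta)
    return np.where(w == 1, p1 / p0, (1 - p1) / (1 - p0))

def shifted_loss_is(loss, w, eta, delta):
    """Importance-sampling estimate of the expected loss under the shift."""
    return np.mean(importance_weights(w, eta, delta) * loss)

def shift_gradients(loss, w, eta):
    """First and second derivatives of the shifted loss at delta = 0.
    For an exponential family with sufficient statistic T(W) = W these are
    E[loss * (W - mu)] and E[loss * ((W - mu)^2 - Var(W))]."""
    mu = sigmoid(eta)
    var = mu * (1 - mu)
    eps = w - mu
    g = np.mean(loss * eps)
    h = np.mean(loss * (eps ** 2 - var))
    return g, h

def shifted_loss_taylor(loss, w, eta, delta):
    """Second-order Taylor approximation of the shifted loss around delta = 0."""
    g, h = shift_gradients(loss, w, eta)
    return np.mean(loss) + g * delta + 0.5 * h * delta ** 2

def worst_case_shift(g, h, lam):
    """Maximize g*d + 0.5*h*d^2 over |d| <= lam (scalar case only)."""
    candidates = [-lam, lam]
    # An interior stationary point is a maximum only if the quadratic is concave.
    if h < 0 and abs(g / h) <= lam:
        candidates.append(-g / h)
    values = [g * d + 0.5 * h * d * d for d in candidates]
    return candidates[int(np.argmax(values))]

# Synthetic data (assumption): the model errs more often when W = 1.
rng = np.random.default_rng(0)
eta, n = -1.0, 20000
w = rng.binomial(1, sigmoid(eta), size=n)
loss = rng.binomial(1, np.where(w == 1, 0.4, 0.1)).astype(float)

delta, lam = 0.3, 0.5
is_estimate = shifted_loss_is(loss, w, eta, delta)
taylor_estimate = shifted_loss_taylor(loss, w, eta, delta)
true_shifted_loss = 0.1 + 0.3 * sigmoid(eta + delta)  # known only because the data are synthetic

g, h = shift_gradients(loss, w, eta)
delta_star = worst_case_shift(g, h, lam)
print(is_estimate, taylor_estimate, true_shifted_loss, delta_star)
```

For small delta the two estimates should agree closely. The Taylor version avoids density ratios, which is the source of the variance concern raised in the abstract when shifts are larger or higher-dimensional.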

Publication
Advances in Neural Information Processing Systems (NeurIPS)
Nikolaj Thams
Visiting Student

PhD Student, University of Copenhagen

Michael Oberst
PhD Student

Postdoc CMU, Incoming Asst Prof Johns Hopkins

David Sontag
Professor of EECS

My research focuses on advancing machine learning and artificial intelligence, and using these to transform health care.
