Current UROP openings

Project title: Analyzing deterioration of ML models due to distribution shift

Group: Clinical Machine Learning (Professor David Sontag) in EECS

Opportunity type: EECS SuperUROP for academic year 2022-2023

Project description:

Machine learning models are sensitive to differences between training and deployment environments. In healthcare, deploying a model in another hospital with different clinical practices, disease trends, or electronic health record coding systems may degrade the performance of clinical decision support tools. Identifying when models have deteriorated and adapting them is important for patient safety. Many transfer learning methods for adapting to new domains have been developed, including importance reweighting, conditional distribution matching, two-stage offset estimation, multi-task learning, and few-shot learning. Other methods, such as invariant risk minimization and group distributionally robust optimization, learn a single model that generalizes across multiple domains.

WILDS is a recently released set of real-world benchmark datasets for distribution shift. The WILDS paper discusses how the ideal comparison between out-of-distribution loss (from training and testing on different distributions) and in-distribution loss (from both training and testing on the test distribution) is not feasible when there is insufficient test data. Instead, most analyses compare the out-of-distribution loss with the loss from evaluating the same model on held-out training data. While this comparison is feasible, it is not equivalent to the ideal comparison because the test distribution may be harder to model than the training distribution.

To examine how we can gain insight into the ideal comparison from this feasible comparison, we are interested in decomposing the difference in the feasible comparison into two components: 1) Bias: How much error is due to selecting the wrong model in the hypothesis class because the model was learned on the training distribution? 2) Irreducible error: How much error arises simply because the test distribution is noisier than the training distribution, so that predictions would be worse even if the right model were selected for both distributions? Bias is the difference measured in the ideal comparison, so optimal transfer learning methods would minimize only bias, as irreducible error cannot be addressed.
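
One way to write this decomposition down, as a sketch rather than the project's official formulation: assume P is the training distribution, Q is the test distribution, R_D(h) is the expected loss of model h under distribution D, and h_P and h_Q are the best models in the hypothesis class H for P and Q respectively. This notation is introduced here for illustration and is not taken from the posting or from the WILDS paper. It also ignores finite-sample estimation error.

```latex
% Requires amsmath. All symbols are illustrative assumptions.
\begin{align*}
h_P &= \arg\min_{h \in \mathcal{H}} R_P(h), \qquad
h_Q = \arg\min_{h \in \mathcal{H}} R_Q(h), \\
\underbrace{R_Q(h_P) - R_P(h_P)}_{\text{feasible comparison}}
&= \underbrace{R_Q(h_P) - R_Q(h_Q)}_{\text{bias (ideal comparison)}}
 + \underbrace{R_Q(h_Q) - R_P(h_P)}_{\text{irreducible error}}
\end{align*}
```

Under these assumptions the identity holds by adding and subtracting R_Q(h_Q). The bias term is non-negative because h_Q minimizes R_Q over H, while the irreducible term compares the best achievable loss on each distribution and can take either sign.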

The primary goal of this UROP is to create a simulator where we can control the amount of bias and irreducible error introduced by the distribution shift. This will allow us to examine when standard transfer learning methods can minimize the introduced bias in two-domain and multi-domain settings. Depending on the UROP student’s interests, the project can be extended to developing a method to decompose bias and irreducible error in real-world experiments to determine whether a model is the best possible for a test distribution with limited data. Another potential extension is creating a new transfer learning objective that minimizes the bias.
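
To make the idea of such a simulator concrete, here is a minimal two-domain sketch in Python with NumPy. It is not the group's simulator. It assumes a linear regression setting with standard normal features and squared loss, where a coefficient shift controls the bias and the label noise level in each domain controls the irreducible error. The function names and the parameters coef_shift, train_noise, and test_noise are hypothetical.

```python
import numpy as np


def simulate_domain(n, w, noise_std, rng):
    """Draw n samples with x ~ N(0, I) and y = x @ w + Gaussian noise."""
    x = rng.standard_normal((n, w.shape[0]))
    y = x @ w + noise_std * rng.standard_normal(n)
    return x, y


def fit_least_squares(x, y):
    """Fit a linear model (no intercept) by ordinary least squares."""
    w_hat, *_ = np.linalg.lstsq(x, y, rcond=None)
    return w_hat


def mse(w, x, y):
    """Mean squared error of the linear model w on (x, y)."""
    return float(np.mean((x @ w - y) ** 2))


def decompose_shift(coef_shift, train_noise, test_noise,
                    n=50_000, dim=5, seed=0):
    """Simulate a two-domain shift and split the feasible gap into two parts.

    Assumption (illustrative only): the test coefficients equal the training
    coefficients plus coef_shift along the first coordinate.
    """
    rng = np.random.default_rng(seed)
    w_train = np.ones(dim)
    direction = np.zeros(dim)
    direction[0] = 1.0
    w_test = w_train + coef_shift * direction

    # Separate samples for fitting and for evaluation in each domain.
    x_tr, y_tr = simulate_domain(n, w_train, train_noise, rng)
    x_tr_ho, y_tr_ho = simulate_domain(n, w_train, train_noise, rng)
    x_te_fit, y_te_fit = simulate_domain(n, w_test, test_noise, rng)
    x_te, y_te = simulate_domain(n, w_test, test_noise, rng)

    h_train = fit_least_squares(x_tr, y_tr)        # learned on training data
    h_test = fit_least_squares(x_te_fit, y_te_fit)  # only possible in simulation

    ood_loss = mse(h_train, x_te, y_te)
    heldout_train_loss = mse(h_train, x_tr_ho, y_tr_ho)
    in_dist_test_loss = mse(h_test, x_te, y_te)

    return {
        "feasible_gap": ood_loss - heldout_train_loss,
        "bias": ood_loss - in_dist_test_loss,
        "irreducible": in_dist_test_loss - heldout_train_loss,
    }


if __name__ == "__main__":
    print(decompose_shift(coef_shift=1.0, train_noise=0.5, test_noise=1.0))
```

In this particular setup, with large samples, the estimated bias should be close to coef_shift squared and the estimated irreducible term close to test_noise squared minus train_noise squared, so each knob moves one component while leaving the other roughly unchanged. Fitting a model directly on test-domain data is only possible here because the simulator can generate as much test data as needed.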


Requirements:

  • Earned an A in 6.867, 6.438, 6.437, 6.871, or equivalent grad-level ML or math classes. We may also consider an applicant with an A in 6.036 and significant ML experience from UROPs or internships.
  • Proficient in Python
  • Pass an ML test during an interview
  • Work on the project for at least 12 hours a week
  • Commit to the project for the full SuperUROP year
  • Motivated and passionate about the project

If interested, email Christina Ji with your resume and transcript.