Empirical Study of the Benefits of Overparameterization in Learning Latent Variable Models

Abstract

One of the most surprising and exciting discoveries in supervised learning was the benefit of overparameterization (i.e. training a very large model) in improving the optimization landscape of a problem, with minimal effect on statistical performance (i.e. generalization). In contrast, unsupervised settings have been under-explored, even though the benefits of overparameterization were observed as early as Dasgupta & Schulman (2007). We perform an empirical study of different aspects of overparameterization in unsupervised learning of latent variable models via synthetic and semi-synthetic experiments. We discuss benefits with respect to different metrics of success (recovering the parameters of the ground-truth model, held-out log-likelihood), sensitivity to variations in the training algorithm, and behavior as the amount of overparameterization increases. We find that across a variety of models (noisy-OR networks, sparse coding, probabilistic context-free grammars) and training algorithms (variational inference, alternating minimization, expectation-maximization), overparameterization can significantly increase the number of ground-truth latent variables recovered.
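To make the experimental recipe concrete, here is a minimal sketch (not the paper's code) of the kind of measurement the abstract describes: generate data from a known latent variable model, fit a model with more latent variables than the ground truth using EM, and count how many ground-truth components are recovered. The sketch uses a Gaussian mixture rather than the paper's noisy-OR, sparse coding, or PCFG models, and the sizes, noise scale, and recovery threshold are all assumptions chosen for this toy example.

```python
# Illustrative sketch: overparameterized EM on synthetic data,
# then count recovered ground-truth components.
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Ground-truth model: k_true components in 2D; fit with k_fit > k_true.
k_true, k_fit, n = 5, 15, 5000
true_means = rng.uniform(-10, 10, size=(k_true, 2))
labels = rng.integers(k_true, size=n)
X = true_means[labels] + rng.normal(scale=0.5, size=(n, 2))

# Fit an overparameterized mixture with EM.
gmm = GaussianMixture(n_components=k_fit, random_state=0).fit(X)

# Match each ground-truth mean to a distinct learned mean and count
# how many fall within a (hypothetical) recovery threshold.
dists = np.linalg.norm(true_means[:, None, :] - gmm.means_[None, :, :], axis=-1)
row, col = linear_sum_assignment(dists)
recovered = int((dists[row, col] < 1.0).sum())
print(f"recovered {recovered} / {k_true} ground-truth components")
```

Repeating this while varying k_fit gives a recovery-versus-overparameterization curve, which is the style of comparison the abstract refers to.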

Publication
Proceedings of the Thirty-Seventh International Conference on Machine Learning (ICML 2020)
Yoni Halpern (PhD student; now at Google Research)

Yoon Kim (Master's student; now Assistant Professor at MIT)

David Sontag (Professor of EECS, MIT)