Contributed Talk: Revisiting Model Complexity in the Wake of Overparameterized Learning

Speaker: Pratik Patil, Carnegie Mellon University
Talk title: Revisiting Model Complexity in the Wake of Overparameterized Learning

Time: Tuesday, April 5, 1:15pm-1:40pm (ET)

Abstract:
Modern machine learning models have a large number of parameters relative to the number of observations. Such overparameterized models are often fit (near) perfectly to the training data and can exhibit double or multiple descents in the generalization error curve when plotted against the raw number of model parameters or similar notions of model complexity. In light of this, we investigate the following questions: (1) Is there a more suitable general measure of model complexity for overparameterized models? (2) Specifically, how do we compare the complexity of different (near) interpolating models? We attempt to address these questions through the lens of model optimism and degrees of freedom. In particular, we first reinterpret degrees of freedom (a classical notion of complexity in statistics) in the fixed-X prediction setting, which allows us to extend the concept to the random-X prediction setting. We then define a family of complexity measures, whose two extreme ends we call the emergent and intrinsic degrees of freedom of a prediction model. We demonstrate the utility of the proposed measures on several example models, both linear and nonlinear, and illustrate how they may help reconcile the subtle multiple-descent behavior seen in modern machine learning with the typical single-descent behavior observed in traditional statistical prediction.

Joint work with Ryan Tibshirani.
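
For concreteness, the minimal sketch below (not part of the talk materials) computes the classical fixed-X degrees of freedom that the abstract builds on, for ridge regression, as the trace of its smoother matrix; the Gaussian design and the choice of ridge are illustrative assumptions, and the emergent and intrinsic measures proposed in the work are not reproduced here.

    import numpy as np

    # Classical fixed-X degrees of freedom for ridge regression:
    # df(lam) = tr( X (X^T X + lam * I)^{-1} X^T ), the trace of the smoother matrix.
    # The design and estimator below are illustrative assumptions, not the talk's examples.

    rng = np.random.default_rng(0)
    n, p = 50, 200                      # overparameterized: more features than samples
    X = rng.standard_normal((n, p))

    def ridge_df(X, lam):
        """Trace of the ridge smoother matrix H = X (X^T X + lam * I)^{-1} X^T."""
        p = X.shape[1]
        H = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
        return np.trace(H)

    for lam in (100.0, 1.0, 1e-2, 1e-6):
        print(f"lambda = {lam:g}: fixed-X df = {ridge_df(X, lam):.2f}")
    # As lam -> 0, the trace approaches rank(X) = n (here 50), not the raw count p (= 200).

As the penalty shrinks toward zero in the p > n regime, the fixed-X degrees of freedom saturate at n rather than at the raw parameter count, which is one way to see why all (near) interpolators look equally complex under the classical notion and why finer measures are needed to compare them.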
