Invited Talk: Classification versus Regression in Overparameterized Regimes: Does the Loss Function Matter?
Speaker: Vidya Muthukumar, Georgia Tech
Talk title: Classification versus Regression in Overparameterized Regimes: Does the Loss Function Matter?
Time: Tuesday, April 5, 4:05pm-5:05pm (ET)
Abstract:
While the initial mathematical explanations for “benign overfitting” were provided for regression, almost all success stories of modern machine learning have occurred in classification tasks. In this talk, I will compare classification and regression tasks in the overparameterized linear model (in both noiseless and noisy settings) and present the following results:
1. On the side of optimization, the minimum-norm interpolating solution is identical to the hard-margin SVM solution under either sufficient effective overparameterization or “neural collapse” conditions (a numerical sketch of this equivalence follows this list). For multiclass classification, we show an equivalence between two different SVM formulations (the multiclass SVM and the one-versus-all SVM) and interpolation. Coupled with characterizations of the implicit bias of gradient descent, our results imply that training with the cross-entropy loss and training with the squared loss yield identical solutions.
2. On the side of generalization, we uncover high-dimensional regimes where the minimum-norm interpolating solution generalizes well for a classification task but does not generalize in the corresponding regression task. We show that good classification generalization is possible even when signal recovery is poor, and in regimes where margin-based bounds are not predictive of generalization.
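As an illustration of the first point, the following is a minimal numerical sketch (not taken from the talk): when the number of random features greatly exceeds the number of samples, every training point typically becomes a support vector, so the hard-margin SVM solution coincides with the minimum-l2-norm interpolator of the ±1 labels. The dimensions, Gaussian features, and the use of a very large soft-margin penalty C to approximate the hard-margin SVM are illustrative assumptions.

```python
# Sketch: in a sufficiently overparameterized linear model, the hard-margin
# SVM and the minimum-norm interpolator of the +/-1 labels can coincide,
# because every training point ends up a support vector.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n, d = 20, 2000                       # n samples, d >> n features
X = rng.standard_normal((n, d))
y = rng.choice([-1.0, 1.0], size=n)   # arbitrary binary labels

# Minimum-l2-norm interpolator of the labels: w = X^T (X X^T)^{-1} y
w_interp = X.T @ np.linalg.solve(X @ X.T, y)

# Approximate hard-margin linear SVM via a very large C
svm = SVC(kernel="linear", C=1e10).fit(X, y)
w_svm = svm.coef_.ravel()

print("support vectors:", svm.n_support_.sum(), "of", n)   # expect all n
cos = w_interp @ w_svm / (np.linalg.norm(w_interp) * np.linalg.norm(w_svm))
print("cosine similarity between solutions:", cos)          # expect ~1.0
```

In this regime, all n points should be reported as support vectors and the printed cosine similarity should be essentially 1, consistent with the equivalence described above.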
These results illustrate the contrasting roles of the training and test loss functions in the overparameterized regime. I will highlight connections between the proofs of these results and analyses of benign overfitting of noise.
I will conclude the talk with partial extensions of these results to kernel interpolation, and a brief discussion of consequences for adversarial robustness.