Invited Talk: T-Cal: An Optimal Test for the Calibration of Predictive Models

Speaker: Edgar Dobriban, University of Pennsylvania

Time: Wednesday, April 6, 4:20pm-5:20pm (ET)

Abstract:
The prediction accuracy of machine learning (ML) methods is steadily increasing, driven by the use of powerful overparametrized models. At the same time, overparametrization has been found to hinder the calibration of the uncertainty predictions of ML models. In this work we study how to test the calibration of ML models, in a framework specifically suited to overparametrized models. Our framework is non-parametric (it makes no restrictive assumptions about the model's predictions), and its running time depends only on the size of the calibration dataset, not on the internal complexity of the model (making it applicable to large models). We find that detecting miscalibration is only possible when the conditional probabilities of the classes are sufficiently smooth functions of the predictions. This means that for sufficiently rich overparametrized models, detecting miscalibration using a small dataset may be hard or impossible. In contrast, when the conditional class probabilities are sufficiently smooth, we propose T-Cal, a minimax optimal test for calibration. We support our theoretical findings with a broad range of experiments, including with several popular deep neural network architectures.
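To make concrete what "testing calibration" consumes and produces, here is a minimal sketch of a calibration test in Python. This is not the T-Cal statistic; it is a simple illustrative baseline (a binned calibration error combined with consistency resampling under the null of perfect calibration), and all names, bin counts, and toy data below are assumptions for illustration. As in the abstract, its cost depends only on the size of the calibration set, never on the model itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def ece_binned(probs, labels, n_bins=15):
    """Binned expected calibration error for binary predictions.

    probs:  predicted P(Y=1), shape (n,)
    labels: observed labels in {0, 1}, shape (n,)
    """
    bins = np.clip((probs * n_bins).astype(int), 0, n_bins - 1)
    err = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            # |mean confidence - empirical frequency|, weighted by bin mass
            err += mask.mean() * abs(probs[mask].mean() - labels[mask].mean())
    return err

def calibration_test(probs, labels, n_resamples=2000, n_bins=15):
    """Resampling test of H0: the model is calibrated.

    Under H0 the labels are Bernoulli(probs), so the null distribution of
    the binned ECE can be simulated by redrawing labels from probs
    ("consistency resampling"). Returns a p-value.
    """
    stat = ece_binned(probs, labels, n_bins)
    null = np.array([
        ece_binned(probs, rng.binomial(1, probs), n_bins)
        for _ in range(n_resamples)
    ])
    return (1 + (null >= stat).sum()) / (1 + n_resamples)

# Toy check: an overconfident (miscalibrated) model on n = 2000 points.
n = 2000
p_true = rng.uniform(0.05, 0.95, n)                       # true P(Y=1 | x)
y = rng.binomial(1, p_true)                               # observed labels
p_hat = np.clip(1.3 * (p_true - 0.5) + 0.5, 0.01, 0.99)   # overconfident predictions
print(f"p-value: {calibration_test(p_hat, y):.4f}")       # small => reject H0
```

One caveat this sketch shares with the talk's motivation: with a fixed bin count and a small calibration set, subtle miscalibration can be invisible, which is the regime where the smoothness conditions and minimax analysis behind T-Cal become essential.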
