Identifying Homogeneous and Interpretable Groups for Conformal Prediction
Abstract
Conformal prediction methods quantify the uncertainty of a model's predictions through a model-agnostic, distribution-free statistical wrapper that generates prediction intervals or sets for a given model with finite-sample generalization guarantees. However, these guarantees hold only on average, or conditional on the predictor's output values or on a set of predefined groups, which a priori may not relate to the prediction task at hand. We propose a method to learn a generalizable partition function of the input space (or representation mapping) into interpretable groups of varying sizes, such that the non-conformity scores (a measure of discrepancy between prediction and target) are as homogeneous as possible when conditioned on the group. The learned partition can be integrated with any group-conditional conformal approach to produce conformal sets with group-conditional guarantees on the discovered regions. Since the learned groups are expressed strictly as a function of the input, they can be used for downstream tasks such as data collection or model selection. We show the effectiveness of our method in reducing worst-case group coverage outcomes across a variety of datasets.
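To make the group-conditional setting concrete, the following is a minimal sketch of split conformal calibration performed separately within each group of a partition. It is illustrative only and is not the partition-learning method proposed here: the partition (`x > 0`), the synthetic data, the predictor `f(x) = x`, and the helper name `group_conformal_quantiles` are all assumptions made for the example.

```python
import numpy as np

def group_conformal_quantiles(scores, groups, alpha):
    """Per-group split-conformal thresholds from calibration scores.

    For a group with n calibration points, the threshold is the
    ceil((n + 1) * (1 - alpha))-th smallest score, or +inf when the
    group is too small for that rank to exist.
    """
    thresholds = {}
    for g in np.unique(groups):
        s = np.sort(scores[groups == g])
        n = s.size
        k = int(np.ceil((n + 1) * (1 - alpha)))  # conformal rank
        thresholds[int(g)] = s[k - 1] if k <= n else np.inf
    return thresholds

# Synthetic heteroscedastic data (assumed for illustration): the noise
# level differs on the two sides of x = 0.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=4000)
noise_scale = np.where(x > 0, 2.0, 0.5)
y = x + noise_scale * rng.normal(size=x.size)

scores = np.abs(y - x)          # non-conformity score for predictor f(x) = x
groups = (x > 0).astype(int)    # hypothetical fixed partition of the input

cal, test = slice(0, 2000), slice(2000, None)
alpha = 0.1

# Group-conditional thresholds versus a single marginal threshold.
per_group = group_conformal_quantiles(scores[cal], groups[cal], alpha)
marginal = group_conformal_quantiles(
    scores[cal], np.zeros(2000, dtype=int), alpha
)[0]

thr = np.array([per_group[int(g)] for g in groups[test]])
for g in (0, 1):
    mask = groups[test] == g
    cov_group = np.mean(scores[test][mask] <= thr[mask])
    cov_marg = np.mean(scores[test][mask] <= marginal)
    print(f"group {g}: group-conditional {cov_group:.3f}, marginal {cov_marg:.3f}")
```

In this toy setting a single marginal threshold tends to over-cover the low-noise group and under-cover the high-noise one, while per-group thresholds target the nominal level in each group. This is the kind of behavior a partition with homogeneous within-group scores is meant to expose.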