Feeling Validated: Constructing Validation Sets for Few-Shot Intent Classification

Ari Kobren, Naveen Jafer Nizar, Swetasudha Panda, Qinlan Shen, Michael Wick, Jason Peck, Gioacchino Tangari

07 December 2022

We study validation set construction via data augmentation in true few-shot intent classification. Empirically, we demonstrate that with scarce data, model selection via a moderate number of generated examples consistently leads to higher test set accuracy than either model selection via a small number of held-out training examples or selection of the model with the lowest training loss. For each of these methods of model selection, including validation sets built from task-agnostic data augmentation, validation accuracy significantly overestimates test set accuracy. To support better estimates and effective model selection, we propose PanGeA, a generative method for domain-specific augmentation that is trained once on out-of-domain data and then employed for augmentation on any domain-specific dataset. In experiments with 6 datasets that have been subsampled to both 5 and 10 examples per class, we show that PanGeA is better than or competitive with other methods in terms of model selection while also facilitating higher-fidelity estimates of test set accuracy.
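The contrast between the selection strategies named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code and does not implement PanGeA: the `Candidate` class, the two toy classifiers, the intent labels, and the hand-written `generated_validation` examples are all hypothetical stand-ins, chosen only to show how selecting by lowest training loss and selecting by accuracy on a generated validation set can disagree.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Example = Tuple[str, str]  # (utterance, intent label)


@dataclass
class Candidate:
    """A hypothetical trained model: a name, its final training loss, and a predictor."""
    name: str
    train_loss: float
    predict: Callable[[str], str]


def accuracy(predict: Callable[[str], str], examples: Sequence[Example]) -> float:
    """Fraction of examples whose predicted intent matches the label."""
    if not examples:
        raise ValueError("validation set must not be empty")
    correct = sum(predict(text) == label for text, label in examples)
    return correct / len(examples)


def select_by_train_loss(candidates: Sequence[Candidate]) -> Candidate:
    """Baseline: pick the candidate with the lowest training loss."""
    return min(candidates, key=lambda c: c.train_loss)


def select_by_validation(
    candidates: Sequence[Candidate], validation_set: Sequence[Example]
) -> Candidate:
    """Pick the candidate with the highest accuracy on a validation set."""
    return max(candidates, key=lambda c: accuracy(c.predict, validation_set))


# Toy candidates (assumptions for illustration only).
overfit = Candidate("overfit", train_loss=0.01, predict=lambda text: "balance")
general = Candidate(
    "general",
    train_loss=0.20,
    predict=lambda text: "transfer"
    if ("send" in text or "transfer" in text)
    else "balance",
)
candidates: List[Candidate] = [overfit, general]

# Stand-in for augmented examples; a real pipeline would generate these.
generated_validation: List[Example] = [
    ("how much money do I have", "balance"),
    ("send 20 dollars to Sam", "transfer"),
    ("please transfer funds to savings", "transfer"),
    ("show my current balance", "balance"),
]

by_loss = select_by_train_loss(candidates)
by_validation = select_by_validation(candidates, generated_validation)
```

In this toy setup the two criteria pick different candidates: training loss favors the model that always predicts one intent, while the generated validation set favors the one that separates both intents. Accuracy measured this way may still overestimate test accuracy, as the abstract notes.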


Venue : Empirical Methods in Natural Language Processing (EMNLP) 2022