Learning by Interpreting

Xuting Tang, Abdul Rafae Khan, Shusen Wang, Jia Xu

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence
Main Track. Pages 4390-4396. https://doi.org/10.24963/ijcai.2022/609

This paper introduces a novel way of enhancing NLP prediction accuracy by incorporating model interpretation insights. Conventional efforts often focus on balancing the trade-off between accuracy and interpretability, for instance, sacrificing model performance to increase explainability. Here, we take a different approach and show that model interpretation can ultimately help improve NLP quality. Specifically, we use interpretability results obtained with attention mechanisms, LIME, and SHAP to train our model. We demonstrate significant gains of up to +3.4 BLEU points on NMT and up to +4.8 points on GLUE tasks, verifying our hypothesis that better model learning is possible by incorporating model interpretation knowledge.
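The abstract names LIME among the interpretation sources. The core idea behind LIME-style explanations is to perturb the input (here, by masking tokens) and measure how the model's prediction shifts. Below is a minimal, self-contained sketch of that perturbation idea using a hypothetical toy classifier; it is an illustration of the general technique, not the paper's actual models or training pipeline.

```python
import random

def toy_classifier(tokens):
    # Hypothetical sentiment scorer (stand-in for a real model):
    # fraction of tokens that are positive cue words.
    cues = {"good", "great", "excellent"}
    return sum(t in cues for t in tokens) / max(len(tokens), 1)

def perturbation_importance(tokens, predict, n_samples=200, seed=0):
    """LIME-style token importance: sample random masks over the tokens,
    score each masked input with the model, and compare the mean score
    when a token is kept versus removed."""
    rng = random.Random(seed)
    n = len(tokens)
    masks = [[rng.random() < 0.5 for _ in range(n)] for _ in range(n_samples)]
    scores = [predict([t for t, keep in zip(tokens, mask) if keep])
              for mask in masks]
    mean_score = sum(scores) / n_samples
    importances = []
    for i in range(n):
        on = [s for s, m in zip(scores, masks) if m[i]]
        off = [s for s, m in zip(scores, masks) if not m[i]]
        # Difference in mean prediction with token i present vs. absent.
        on_mean = sum(on) / len(on) if on else mean_score
        off_mean = sum(off) / len(off) if off else mean_score
        importances.append(on_mean - off_mean)
    return importances

tokens = "the movie was great".split()
imps = perturbation_importance(tokens, toy_classifier)
```

For the toy classifier above, "great" receives the highest (positive) importance, since it is the only cue word; such per-token scores are the kind of interpretation signal the paper feeds back into training.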
Keywords:
Natural Language Processing: Interpretability and Analysis of Models for NLP
Machine Learning: Explainable/Interpretable Machine Learning
Machine Learning: Attention Models
AI Ethics, Trust, Fairness: Explainability and Interpretability