Is Semantic-aware BERT more Linguistically Aware? A Case Study on Natural Language Inference
Abstract
Recent work has shown that predicate-argument label representations from semantic role labeling (SRL) can be concatenated with BERT representations to improve natural language understanding tasks such as natural language inference (NLI) and reading comprehension. Two natural questions arise: does infusing BERT with SRL representations 1) improve model performance, and 2) increase the model's linguistic awareness? This paper aims to answer both questions with a case study on the NLI task. We start by analyzing whether and how infusing SRL information helps BERT learn linguistic knowledge. We compare model performance on two benchmark datasets, SNLI and MNLI, and conduct in-depth analysis on two probing datasets, Breaking NLI and HANS, which contain abundant examples where SRL information is expected to be helpful. We find that combining SRL representations with BERT representations generally outperforms BERT-only representations, with better awareness of lexical meaning and world knowledge but not of logical knowledge. We also find that infusing SRL information via predicate-wise concatenation with BERT word representations, followed by an interaction layer, is more effective than sentence-wise concatenation.
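To make the contrast between the two fusion strategies concrete, the following PyTorch sketch illustrates one plausible reading of them: predicate-wise fusion concatenates an SRL label embedding to every token representation for each predicate's frame and then applies an interaction layer, while sentence-wise fusion pools the SRL labels into a single vector attached to the sentence representation. All module names, dimensions, and the choice of a BiGRU interaction layer are illustrative assumptions rather than the exact architecture studied in the paper.

```python
# Minimal sketch of predicate-wise vs. sentence-wise SRL fusion (assumptions,
# not the authors' implementation). BERT outputs are simulated with random
# tensors to keep the example self-contained.
import torch
import torch.nn as nn

HIDDEN = 768        # assumed BERT hidden size
LABEL_DIM = 32      # assumed SRL label embedding size
NUM_LABELS = 50     # assumed size of the SRL label vocabulary
NUM_PREDICATES = 3  # example: three predicates, i.e., three SRL frames


class PredicateWiseFusion(nn.Module):
    """Concatenate an SRL label embedding to every token for each predicate's
    frame, then let an interaction layer mix the token-level features."""

    def __init__(self):
        super().__init__()
        self.label_emb = nn.Embedding(NUM_LABELS, LABEL_DIM)
        # Interaction layer over the concatenated features (assumption: BiGRU).
        self.interaction = nn.GRU(
            HIDDEN + NUM_PREDICATES * LABEL_DIM, HIDDEN,
            batch_first=True, bidirectional=True)

    def forward(self, bert_states, srl_labels):
        # bert_states: (batch, seq_len, HIDDEN)
        # srl_labels:  (batch, NUM_PREDICATES, seq_len) label ids per frame
        frames = self.label_emb(srl_labels)               # (B, P, T, LABEL_DIM)
        frames = frames.permute(0, 2, 1, 3).flatten(2)    # (B, T, P*LABEL_DIM)
        fused = torch.cat([bert_states, frames], dim=-1)  # token-wise concat
        out, _ = self.interaction(fused)                  # (B, T, 2*HIDDEN)
        return out


class SentenceWiseFusion(nn.Module):
    """Pool the SRL label embeddings into one sentence-level vector and
    concatenate it to a [CLS]-style sentence representation."""

    def __init__(self):
        super().__init__()
        self.label_emb = nn.Embedding(NUM_LABELS, LABEL_DIM)

    def forward(self, cls_state, srl_labels):
        # cls_state: (batch, HIDDEN); srl_labels: (batch, P, seq_len)
        sent_srl = self.label_emb(srl_labels).mean(dim=(1, 2))  # (B, LABEL_DIM)
        return torch.cat([cls_state, sent_srl], dim=-1)         # (B, HIDDEN+LABEL_DIM)


if __name__ == "__main__":
    B, T = 2, 16
    bert_states = torch.randn(B, T, HIDDEN)  # stand-in for BERT token outputs
    labels = torch.randint(0, NUM_LABELS, (B, NUM_PREDICATES, T))
    print(PredicateWiseFusion()(bert_states, labels).shape)       # (2, 16, 1536)
    print(SentenceWiseFusion()(bert_states[:, 0], labels).shape)  # (2, 800)
```

The key difference the sketch highlights is where the SRL signal enters: predicate-wise fusion keeps label information aligned with individual tokens before the interaction layer, whereas sentence-wise fusion collapses it into a single vector and loses that alignment.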