Does entity abstraction help generative Transformers reason?

Nicolas Gontier, Siva Reddy, Christopher Pal

arXiv

Abstract

We study the utility of incorporating entity type abstractions into pre-trained Transformers and test these methods on four NLP tasks requiring different forms of logical reasoning: (1) compositional language understanding with text-based relational reasoning (CLUTRR), (2) abductive reasoning (ProofWriter), (3) multi-hop question answering (HotpotQA), and (4) conversational question answering (CoQA). We propose and empirically explore three ways to add such abstraction: (i) as additional input embeddings, (ii) as a separate sequence to encode, and (iii) as an auxiliary prediction task for the model. Overall, our analysis demonstrates that models with abstract entity knowledge perform better than those without it. The best abstraction-aware models achieve overall accuracies of 88.8% and 91.8%, compared to the baseline model's 62.9% and 89.8%, on CLUTRR and ProofWriter respectively. However, for HotpotQA and CoQA, we find that F1 scores improve by only 0.5% on average. Our results suggest that the benefit of explicit abstraction is significant in formally defined logical reasoning settings that require many reasoning hops, but is less pronounced for NLP tasks with less formal logical structure.
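To make approach (i) concrete, the following is a minimal sketch, assuming a PyTorch BERT-style encoder, of how entity-type abstractions could be injected as additional input embeddings. The module and parameter names (e.g. AbstractionInputEmbeddings, num_entity_types) are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch (not the authors' code): approach (i), entity-type
# abstraction added as extra input embeddings to a Transformer encoder.
import torch
import torch.nn as nn

class AbstractionInputEmbeddings(nn.Module):
    def __init__(self, vocab_size, num_entity_types, hidden_size):
        super().__init__()
        self.token_embeddings = nn.Embedding(vocab_size, hidden_size)
        # One embedding per abstract entity type (e.g. PERSON, LOCATION),
        # plus index 0 reserved for tokens that are not entities.
        self.type_embeddings = nn.Embedding(num_entity_types + 1, hidden_size)

    def forward(self, token_ids, entity_type_ids):
        # Sum token and entity-type embeddings, in the same way segment and
        # position embeddings are summed in BERT-style models.
        return self.token_embeddings(token_ids) + self.type_embeddings(entity_type_ids)

# Usage: each token is paired with the id of its entity type (0 = none).
emb = AbstractionInputEmbeddings(vocab_size=30522, num_entity_types=18, hidden_size=768)
token_ids = torch.tensor([[101, 2040, 2003, 102]])
entity_type_ids = torch.tensor([[0, 3, 0, 0]])
inputs_embeds = emb(token_ids, entity_type_ids)  # shape: (1, 4, 768)
```

The resulting embeddings would then be fed to the encoder in place of the standard token embeddings; approaches (ii) and (iii) instead encode the type sequence separately or predict it as an auxiliary loss.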