MaskEval: Weighted MLM-Based Evaluation for Text Summarization and Simplification

Yu Lu Liu, Rachel Bawden, Thomas Scialom, Benoît Sagot, J. Cheung

arXiv

Abstract

In text summarization and simplification, system outputs must be evaluated along multiple dimensions such as relevance, factual consistency, fluency, and grammaticality, and a wide range of possible outputs could be of high quality. These properties make the development of an adaptable, reference-less evaluation metric both necessary and challenging. We introduce MaskEval, a reference-less metric for text summarization and simplification that operates by performing masked language modeling (MLM) on the concatenation of the candidate and the source texts. It features an attention-like weighting mechanism to modulate the relative importance of each MLM step, which crucially allows it to be adapted to evaluate different quality dimensions. We demonstrate its effectiveness on English summarization and simplification in terms of correlations with human judgments, and explore transfer scenarios between the two tasks.
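
As a rough illustration of the weighting idea, the sketch below combines per-step MLM scores into a single value using softmax-normalized weights. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names `softmax` and `weighted_mlm_score`, the use of a softmax for normalization, and the example scores are all hypothetical, since the abstract does not specify how the weights are computed or learned.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Normalize raw weights into a distribution that sums to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_mlm_score(step_scores: Sequence[float],
                       step_logits: Sequence[float]) -> float:
    """Aggregate per-step MLM scores with attention-like weights.

    step_scores: one score per masked position (hypothetical example: the
        probability the MLM assigns to the original token at that position).
    step_logits: one unnormalized importance weight per masked position
        (assumed here to be given; a real system would learn or predict them).
    """
    if len(step_scores) != len(step_logits):
        raise ValueError("scores and logits must have the same length")
    if not step_scores:
        raise ValueError("at least one MLM step is required")
    weights = softmax(step_logits)
    return sum(w * s for w, s in zip(weights, step_scores))


if __name__ == "__main__":
    scores = [0.9, 0.2, 0.7]  # illustrative per-step scores, not real data
    # Equal logits reduce to a plain average of the step scores.
    print(weighted_mlm_score(scores, [0.0, 0.0, 0.0]))
    # Raising one logit shifts the result toward that step's score.
    print(weighted_mlm_score(scores, [0.0, 3.0, 0.0]))
```

Under this assumption, changing the weights while keeping the same per-step scores gives a different aggregate, which is one way a single MLM-based scorer could be tuned toward different quality dimensions.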