Feeding two birds or favoring one? Adequacy–fluency tradeoffs in evaluation and meta-evaluation of machine translation.

Behzad Shayegh

RBC Borealis/University of Alberta

The NLP Reading Group is excited to host Behzad Shayegh, who will give an in-person talk on “Feeding two birds or favoring one? Adequacy–fluency tradeoffs in evaluation and meta-evaluation of machine translation.”

Talk Description

When we evaluate machine translation, we’re judging two things: adequacy (Does it convey the correct meaning?) and fluency (Does it sound natural?). But are our evaluation metrics balancing these two goals, or are they forced to favor one?

In this talk, based on research from his time at Google, Behzad Shayegh will investigate the severe tradeoff between adequacy and fluency in machine translation evaluation. We will explore how popular metrics lean heavily toward adequacy, potentially overlooking critical fluency errors.

More importantly, this bias persists even one level up, in meta-evaluation, the process we use to judge the evaluation metrics themselves. We’ll show how the standard WMT meta-evaluation favors adequacy-oriented metrics, and how this bias is partly due to the systems included in the test datasets. Finally, we’ll introduce a novel method for synthesizing new translation systems to create a more balanced and reliable meta-evaluation. This work highlights the critical need to understand these tradeoffs as we build and rank the next generation of translation models.

Speaker Bio

Behzad Shayegh recently completed his Master’s degree in Computer Science at the University of Alberta. His research focuses on natural language processing, machine translation, and model evaluation. Behzad has conducted research at Google, where he investigated biases in translation evaluation, and at RBC Borealis, where he is working on methods for evaluating LLMs’ performance with uncertain contextual information.

Logistics

Date: November 20th
Time: 10:00 AM
Location: H04 or via Google Meet (see email)