FaithDial: A Faithful Benchmark for Information-Seeking Dialogue

TACL 2022

Nouha Dziri, Ehsan Kamalloo, Sivan Milton, Osmar Zaiane, Mo Yu, Edoardo Maria Ponti, Siva Reddy

The goal of information-seeking dialogue is to respond to user queries with natural language utterances that are grounded in knowledge sources. In our recent investigation, we show that existing knowledge-grounded benchmarks are fraught with hallucinations (>60% of the responses). To mitigate this behavior, we adopt a data-centric solution and create FaithDial, a new benchmark for hallucination-free dialogues, by editing hallucinated responses in the Wizard of Wikipedia benchmark. FaithDial contains around 50K turns across 5.5K conversations. When trained on FaithDial, state-of-the-art dialogue models are significantly more faithful, while also improving on other dialogue aspects such as cooperativeness, creativity, and engagement.

@article{dziri-etal-2022-faithdial,
  title     = {FaithDial: A Faithful Benchmark for Information-Seeking Dialogue},
  author    = {Dziri, Nouha and Kamalloo, Ehsan and Milton, Sivan and Zaiane, Osmar and Yu, Mo and Ponti, Edoardo M. and Reddy, Siva},
  journal   = {Transactions of the Association for Computational Linguistics},
  volume    = {10},
  pages     = {1473--1490},
  year      = {2022},
  month     = {12},
  publisher = {MIT Press},
}


@inproceedings{dziri-etal-2022-origin,
  title     = {On the Origin of Hallucinations in Conversational Models: Is it the Datasets or the Models?},
  author    = {Dziri, Nouha and Milton, Sivan and Yu, Mo and Zaiane, Osmar and Reddy, Siva},
  booktitle = {Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
  year      = {2022},
  pages     = {5271--5285},
  address   = {Seattle, United States},
  publisher = {Association for Computational Linguistics},
}
