AI Conversationalists know what to say but not when to speak
Muhammad Umair
Tufts University
The NLP Reading Group is happy to host Muhammad Umair from Tufts University, who will speak remotely about Conversational Agents and Transition Relevance Places on Friday, March 14th at 1PM in A14.
Talk Description
In verbal interactions, humans use a sophisticated turn-taking mechanism that enables smooth transitions between listening and speaking. This process prevents overlaps, avoids unintended social signals, and fosters mutual understanding. Unlike formal settings where turn order is predetermined, everyday conversations rely on a locally managed turn-taking system, where speakers anticipate Transition Relevance Places (TRPs)—moments where a listener may, but is not required to, take over the turn. Humans effortlessly predict these moments using lexico-syntactic, contextual, and prosodic cues, ensuring seamless conversational flow. Deviations from normative turn-timing can lead to negative perceptions or miscommunications, making smooth turn-taking a critical aspect of a successful conversation.
For Spoken Dialogue Systems (SDS), anticipating opportunities for speech remains a challenge. Many systems rely on silence-based thresholds rather than the anticipatory mechanism that humans use, leading to poorly timed responses and limited fluidity.
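For readers unfamiliar with silence-based thresholds, the following is a minimal sketch of the general idea, not the method of any particular system. It assumes the audio has already been split into fixed-length frames and labeled as voiced or unvoiced by some voice activity detector; the function name, the frame length, and the 700 ms threshold are illustrative assumptions, not values from the talk.

```python
from typing import List, Optional


def silence_threshold_endpoint(
    voiced: List[bool],
    frame_ms: int = 10,       # assumed frame length (hypothetical)
    threshold_ms: int = 700,  # assumed silence threshold (hypothetical)
) -> Optional[int]:
    """Return the index of the frame at which a silence-threshold system
    would take the turn, or None if the threshold is never reached."""
    silent_ms = 0
    heard_speech = False
    for i, is_voiced in enumerate(voiced):
        if is_voiced:
            # Any speech resets the silence counter.
            heard_speech = True
            silent_ms = 0
        elif heard_speech:
            # Only count silence that follows some speech.
            silent_ms += frame_ms
            if silent_ms >= threshold_ms:
                return i
    return None
```

Because the system reacts only after the full threshold has elapsed, its response always lags the end of the speaker's turn by at least that amount, and a pause in the middle of a turn can be mistaken for its end.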
In this talk, we present a comprehensive, evidence-based approach to improving turn-taking in SDS. First, we develop methods to annotate paralinguistic features at scale and analyze this data to create an empirically grounded operationalization of TRPs. Next, we investigate whether Large Language Models (LLMs) trained on written language can generalize to unscripted spoken dialogue. To test this, we construct a participant-labeled corpus of TRPs in spontaneous interaction and design a simple modeling task where LLMs predict whether a TRP occurs after each word in a turn. Finally, we explore whether local increases in linguistic entropy serve as reliable indicators for anticipating upcoming TRPs.
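As a rough illustration of the entropy idea mentioned above, the sketch below computes Shannon entropy over a next-word probability distribution and flags word positions where entropy rises relative to the previous word. The abstract does not specify how entropy is computed or how a "local increase" is defined, so the function names, the `min_rise` parameter, and the comparison against the immediately preceding word are all assumptions. In practice the distributions would come from a language model; here they are simply passed in.

```python
import math
from typing import Dict, List


def shannon_entropy(dist: Dict[str, float]) -> float:
    """Shannon entropy in bits of a next-word probability distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def local_entropy_increases(entropies: List[float], min_rise: float = 0.5) -> List[int]:
    """Indices where entropy rises by at least min_rise bits over the
    previous position. min_rise is a hypothetical tuning parameter."""
    return [
        i
        for i in range(1, len(entropies))
        if entropies[i] - entropies[i - 1] >= min_rise
    ]


# Toy usage with hand-written per-word entropy values (not real data).
per_word_entropy = [1.0, 1.2, 2.0, 1.5, 2.5]
candidate_positions = local_entropy_increases(per_word_entropy, min_rise=0.5)
```

The flagged positions are only candidates; whether such rises reliably precede TRPs is the empirical question the talk addresses.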
Speaker Bio
Muhammad Umair is a PhD candidate in Computer Science at Tufts University, specializing in Human-Robot Interaction (HRI), Natural Language Processing (NLP), and Conversational AI. His research focuses on improving turn-taking in Spoken Dialogue Systems (SDS) by developing data-driven models grounded in human conversational behavior. Inspired by Conversation Analysis (CA), he investigates how speakers anticipate Transition Relevance Places (TRPs) to enable more natural interactions in AI-driven systems.
He is a member of the Human Interaction Lab at Tufts University, advised by Dr. J.P. de Ruiter and Dr. Liping Liu. Prior to his PhD, Umair earned a BS in Computer Science with a minor in Cognitive and Brain Sciences from Tufts University.
Logistics
Date: March 14th
Time: 1PM
Location: A14 or via Zoom (See email)