NLP Roundtable: DeepSeek

Sara Vera Marjanovic

Mila

The NLP Reading Group is excited to host a roundtable discussion led by Sara Vera Marjanovic, who will discuss recent investigations of the DeepSeek model by members of Prof. Siva Reddy’s group. In-person attendance is highly encouraged!

Roundtable Description

Large Reasoning Models (LRMs) mark a fundamental shift in LLM-based problem-solving by mimicking human reasoning processes. In this paper, we analyze DeepSeek-R1, the first LRM for which we have access to its reasoning chains. In our investigation of DeepSeek-R1, we focus on the content of these reasoning chains and their patterns. We assess DeepSeek-R1’s capabilities in tasks that require deep reasoning: mathematical reasoning, code generation, and world modeling. Further, we evaluate DeepSeek-R1’s performance in in-context and long-context tasks, and examine the efficiency of its reasoning process. We explore parallels between human reasoning and DeepSeek-R1’s reasoning. We then assess its safety, machine morality, and cultural biases. Finally, we discuss the future of reasoning models and outline directions for further research in this domain.

Speaker Bio

Sara Vera Marjanovic is a second-year PhD student in Computer Science at the University of Copenhagen, currently visiting Mila as a research assistant with Siva Reddy and Karolina Stanczak. In Copenhagen, she is supervised by Isabelle Augenstein, Christina Lioma, and Maria Maistro in the NLP and IR research groups. Her previous research has looked at bias, interpretability, and uncertainty. During her time at Mila, however, she has been working much more with reasoning chains and using this information to better interpret the output of DeepSeek. Who knows what the future holds for her.

Logistics

Date: March 21st
Time: 1PM
Location: A14 or Zoom (see email)