Natural Language Understanding with Deep Learning / Computational Semantics
- Course Code: COMP 545 & LING 545 (Fall 2025)
- Instructors: Siva Reddy (office hours: Tuesday 3pm, ENGMC 104N) and Marius Mosbach
- Teaching Assistants: Xing Han Lu, Nicholas Meade, Aditi Khandelwal, Mehar Bhatia
- Classroom: Stewart Biology N2/2, 1205 Dr. Penfield Avenue
- Time: Tuesdays and Thursdays, 1:05 PM to 2:25 PM
- Links: MyCourses
Description
The field of natural language processing (NLP) has seen multiple paradigm shifts over the decades, from symbolic AI to statistical methods to deep learning. We review these shifts through the lens of natural language understanding (NLU), the branch of NLP that deals with “meaning”. We start with two questions: what is meaning, and what does it mean for a machine to understand language? We then explore how to represent the meaning of words, phrases, sentences, and discourse, and dive into many useful NLU applications.
Throughout the course, we take several concepts in NLU, such as meaning, or applications, such as question answering, and study how the paradigm has shifted: what we gained with each paradigm shift, and what we lost. We will critically evaluate existing ideas and try to come up with new ideas that challenge existing limitations. In particular, we will work on making deep learning models for language more robust.
This course will also delve into how large language models like ChatGPT are built, and into the latest advances in training and using these models for the task at hand.
Prerequisites
You are expected to have taken one of the following courses at McGill: Natural Language Processing (COMP/LING 550), Computational Linguistics (COMP/LING 445), Applied Machine Learning (COMP 551), or From Language to Data Science (COMP/LING 345). Make sure you are comfortable with advanced Python programming. If you have taken similar courses at another university, feel free to take this course. If you are not sure, email the instructor.
Grading
Assignments (45%): Automatic grading + written report
- Basics of deep learning and neural networks for NLP (10%)
- word2vec and word representation (10%)
- Char-RNN and ELMo (12.5%)
- Transformers and applications (12.5%)
Project (50%): You will do a project in groups of two.
- Proposal (10%)
- Presentation (10%)
- Final report (30%)
Participation (5%): Class participation, among other things. Details to be determined.
Topics (Tentative)
| Topic | Subtopics |
|---|---|
| Word meaning | distributional semantics, word embeddings, evaluation |
| Phrase and sentence meaning | logical representation, sentence embeddings, evaluation |
| Meaning in context | word senses, contextual word embeddings, fine-tuning |
| Interpretability | feature-based vs. deep learning models, linguistic tests, probing |
| Compositionality | syntax and semantics interfaces, inductive priors, tests for compositionality, limitations |
| Reasoning | inference, question answering, other applications |
| Discourse | conversational systems |
| Language and physical world | model-theoretic semantics, grounded environment, reinforcement learning |
| Bias | word association tests, probing |
Schedule
| Lecture | Date | Topic | Due dates | Additional Readings | Instructor |
|---|---|---|---|---|---|
| 1 | Aug 28 | Course outline and perceptron | | | |
| 2 | Sep 2 | Multi-layer perceptron and Deep Neural Networks | Sep 1: Assignment 1 release | | |
| 3 | Sep 4 | Word Embeddings | | Distributed Representations of Words and Phrases and their Compositionality; Neural Word Embedding as Implicit Matrix Factorization | |
| 4 | Sep 9 | Word embeddings (cont.), Bias | | Enriching Word Vectors with Subword Information; Man is to Computer Programmer as Woman is to Homemaker?; Semantics derived automatically from language corpora contain human-like biases | |
| 5 | Sep 11 | Sentence Representations (CNNs, RNNs, LSTMs) | | | |
| 6 | Sep 16 | LSTMs, Backpropagation through time | | | |
| 7 | Sep 18 | ELMo, Attention | | | |
| 8 | Sep 23 | Transformers and Pretraining | Sep 22: Assignment 1 due; Assignment 2 release | | |
| 9 | Sep 25 | Analyzing large-scale models for syntax, Structure Prediction | | | |
| 10 | Sep 30 | Large Language Models | | Coreference Resolution: Survey; End-to-end Neural Coreference Resolution; Improving Machine Learning Approaches to Coreference Resolution | |
| 11 | Oct 2 | Large Language Models | | | |
| 12 | Oct 7 | Prompting, In-context learning, Parameter-efficient fine-tuning | Oct 6: Assignment 2 due; Assignment 3 release | | |
| 13 | Oct 9 | Efficient in-context learning | | | |
| 14 | Oct 14 | Reading week | | | |
| 15 | Oct 16 | Reading week | | | |
| 16 | Oct 21 | Efficient training and inference methods | | Building Machine Translation Systems for the Next Thousand Languages; Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets | |
| 17 | Oct 23 | Question Answering, Retrieval-augmented LMs and IR | | | |
| 18 | Oct 28 | Guest Speaker: TBA | Oct 27: Assignment 3 due; Assignment 4 release | | |
| 19 | Oct 30 | Vision and Language | | | |
| 20 | Nov 4 | Vision and Language | Nov 3: Project proposal deadline | | |
| 21 | Nov 6 | Alignment, RLHF, DPO | | | |
| 22 | Nov 11 | Retrieval-augmented QA | Nov 10: Assignment 4 due | | |
| 23 | Nov 13 | Guest Speaker: TBA | | | |
| 24 | Nov 18 | Guest Speaker: TBA | | | |
| 25 | Nov 20 | Guest Speaker: TBA | | | |
| 26 | Nov 25 | Project presentations | | | |
| 27 | Nov 27 | Project presentations | | | |
| 28 | Dec 2 | Project presentations | | | |
| 29 | Dec 4 | Project presentations | | | |
| 30 | Dec 9 | Project presentations | Dec 15: Final project report submission | | |
FAQs
Question: What are some books for learning basics of linguistics?
Bender 2013: Linguistic Fundamentals for Natural Language Processing (login with McGill credentials for free access)
Question: What are some books on deep learning for NLP?
Jurafsky and Martin 2019: Speech and Language Processing
Tunstall et al. 2022: Natural Language Processing with Transformers (login with McGill credentials)
Eisenstein 2019: Introduction to Natural Language Processing
Goldberg 2017: Neural Network Methods for Natural Language Processing (login with McGill credentials)
Generative AI Policy
If you use any generative AI tool (e.g., ChatGPT, GitHub Copilot, Claude) for your submitted work, you must cite it and submit a detailed statement describing how you used it, as well as a log of the chat where you used it.
The following uses of AI tools are not permitted:
- Copying AI-generated prose or non-trivial code chunks; this is plagiarism
- Writing whole assignments or code files
- Writing large chunks of an assignment or code
- Using AI without citing it in your assignment
Inappropriate use of generative AI may result in grade penalties or referral to disciplinary authorities. If you have any questions about the appropriate use of AI applications for course work, please contact the instructors during their office hours.
Language of Submission
In accord with McGill University’s Charter of Students’ Rights, students in this course have the right to submit in English or in French any written work that is to be graded.
Academic Integrity
McGill University values academic integrity. Therefore, all students must understand the meaning and consequences of cheating, plagiarism, and other academic offences under the Code of Student Conduct and Disciplinary Procedures (see www.mcgill.ca/students/srr/honest/ for more information).
Inclusivity
As the instructor of this course, I endeavor to provide an inclusive learning environment. However, if you experience barriers to learning in this course, do not hesitate to discuss them with me or with the Office for Students with Disabilities.