CSC 498 Winter 2021: Introduction to Reinforcement Learning

Weekly schedule

Lectures: posted online as videos, viewing Tue 5:00pm - 7:00pm ET (optional), Zoom

Tutorials: Tue 4:00pm - 5:00pm ET, Zoom

Animesh Garg office hours: Mon 4pm - 5pm ET, Zoom Mail:

Matthew Zhang office hours: Fri 11am - 12pm ET, Zoom Mail:

Claas Voelcker office hours: Fri 11am - 12pm ET, Zoom Mail:

All Emails Subject: “CSC498-W21:

Accessing resources

Course webpage: here
Piazza: piazza
Zoom: Link in Quercus Announcement

Online delivery: Lectures will be posted online as videos. In the official lecture slot, the TA will give a walkthrough of the video and answer questions about the material. Please come prepared. Tutorials and office hours for this course will be delivered synchronously via Zoom. The link has been posted on Quercus.


Reinforcement learning is a powerful paradigm for modeling autonomous and intelligent agents interacting with their environment, and it is relevant to an enormous range of tasks, including robotics, game playing, consumer modeling, and healthcare. This course provides an introduction to reinforcement learning, which focuses on the study and design of agents that interact with a complex, uncertain world to achieve a goal. We will study agents that can make near-optimal decisions in a timely manner with incomplete information and limited computational resources. The course will cover Markov decision processes, reinforcement learning, planning, and function approximation (online supervised learning). The course will take an information-processing approach to the concept of mind and briefly touch on perspectives from psychology, neuroscience, and philosophy.

Learning objectives

At the end of this course, you will have gained both knowledge and system building abilities in:

List of Topics covered in this course (expected)

With a focus on AI as the design of agents learning from experience to predict and control their environment, topics will include


Repository for code:


Priority will be given to students who meet prerequisites for the course. Knowledge of probability, multivariate calculus, and linear algebra is expected.



Algorithm implementation will be done mainly in Python. Please familiarize yourself with the language and common tools (git, the command line, and the frameworks NumPy and PyTorch).

Textbook & Resources

There is no required textbook. The course will provide all material in class and in handouts.
Students can refer to the following material for additional help:

Additional resources:

Reinforcement learning resources:

Evaluation format

This course combines lectures with tutorials, encouraging both fundamental knowledge acquisition and hands-on experience. Each student will be responsible for four individual assignments (60%), two take-home midterms (20%), and one team project in teams of two (20%).
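As a rough illustration of how the weights above combine, the following minimal sketch computes a final grade from component averages. It assumes that each component is scored out of 100 and that the 60% and 20% figures are totals for all assignments and both midterms respectively; the names WEIGHTS and final_grade are hypothetical and not part of any course tooling.

```python
# Component weights in percent, taken from the evaluation format above.
# Assumption: each weight is the total for that component, not per item.
WEIGHTS = {"assignments": 60, "midterms": 20, "project": 20}


def final_grade(scores):
    """Return the weighted final grade (0-100).

    `scores` maps each component name to its average score out of 100.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


if __name__ == "__main__":
    # Example with made-up scores for illustration only.
    print(final_grade({"assignments": 80, "midterms": 70, "project": 90}))
```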

Late penalties

Each student will have 3 grace days throughout the semester for late assignment submissions. Once the grace days are used up, a late submission loses 33% of its value for each additional late day. Submissions that are more than three days late after the grace days have been used will unfortunately not be accepted. The official policy of the Registrar’s Office at UTM regarding missed exams can be found here.
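The late policy can be sketched as a small calculation. This is only one reading of the rule, under the following assumptions that the policy text does not state: lateness is counted in whole days, remaining grace days are applied automatically before any penalty, and a rejected submission does not consume grace days. The function name late_percent_kept is hypothetical.

```python
PENALTY_PERCENT_PER_DAY = 33  # value lost per late day beyond the grace days
MAX_PENALIZED_DAYS = 3        # later than this (after grace days): not accepted


def late_percent_kept(days_late, grace_left):
    """Return (percent of value kept, grace days remaining).

    The first element is None when the submission is not accepted.
    Assumption: grace days are spent first, in whole days.
    """
    if days_late < 0 or grace_left < 0:
        raise ValueError("days_late and grace_left must be non-negative")
    grace_used = min(days_late, grace_left)
    penalized_days = days_late - grace_used
    if penalized_days > MAX_PENALIZED_DAYS:
        # Assumption: a rejected submission leaves the grace days untouched.
        return None, grace_left
    kept = max(0, 100 - PENALTY_PERCENT_PER_DAY * penalized_days)
    return kept, grace_left - grace_used


if __name__ == "__main__":
    # Four days late with all three grace days still available:
    # three days are covered by grace, one day is penalized.
    print(late_percent_kept(4, 3))
```

Under these assumptions, a submission four days late with three grace days available keeps 67% of its value and leaves no grace days, while a submission four days late with no grace days left is not accepted.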