CSC 498 Winter 2021: Introduction to Reinforcement Learning
Lectures: posted online as videos, viewing Tue 5:00pm - 7:00pm ET (optional), Zoom
Tutorials: Tue 4:00 - 5:00pm ET, Zoom
Animesh Garg office hours: Mon 4pm - 5pm ET, Zoom Mail: email@example.com
Matthew Zhang office hours: Fri 11am - 12pm ET, Zoom Mail: firstname.lastname@example.org
Claas Voelcker office hours: Fri 11am - 12pm ET, Zoom Mail: email@example.com
Please use the subject line prefix “CSC498-W21:” in all emails.
Online delivery: Lectures will be posted online as videos. In the official lecture slot, the TA will give a walkthrough of the video and answer questions about the material; please come prepared. Tutorials and office hours for this course will be delivered synchronously via Zoom. The link has been posted on Quercus.
Reinforcement learning is a powerful paradigm for modeling autonomous and intelligent agents interacting with their environment, and it is relevant to an enormous range of tasks, including robotics, game playing, consumer modeling, and healthcare. This course provides an introduction to reinforcement learning, which focuses on the study and design of agents that interact with a complex, uncertain world to achieve a goal. We will study agents that can make near-optimal decisions in a timely manner with incomplete information and limited computational resources. The course will cover Markov decision processes, reinforcement learning, planning, and function approximation (online supervised learning). The course will take an information-processing approach to the concept of mind and briefly touch on perspectives from psychology, neuroscience, and philosophy.
At the end of this course, you will have gained both knowledge and system-building abilities. You will be able to:
- Define the key features of reinforcement learning that distinguish it from other branches of AI and from non-interactive machine learning (as assessed by the exam).
- Given an application problem (e.g., from computer vision, robotics, etc.), decide whether it should be formulated as an RL problem; if so, define it formally (in terms of the state space, action space, dynamics, and reward model), state which algorithm from class is best suited for addressing it, and justify your answer (as assessed by the project and the exam).
- Implement in code common RL algorithms (as assessed by the homeworks).
- Describe (list and define) multiple criteria for analyzing RL algorithms and evaluate algorithms on these criteria, e.g., regret, sample complexity, computational complexity, empirical performance, and convergence (as assessed by homeworks and the exam).
- Describe the exploration vs exploitation challenge and compare and contrast at least two approaches for addressing this challenge (in terms of performance, scalability, complexity of implementation, and theoretical guarantees) (as assessed by an assignment and the exam).
List of Topics covered in this course (expected)
With a focus on AI as the design of agents learning from experience to predict and control their environment, topics will include
- Markov decision processes
- Planning by approximate dynamic programming
- Monte Carlo and Temporal Difference Learning for prediction
- Monte Carlo, Sarsa, and Q-learning for control
- Dyna and planning with a learned model
- Prediction and control with function approximation
- Policy gradient methods
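As a small taste of the control algorithms listed above, here is a minimal sketch of tabular Q-learning on a hypothetical two-state, two-action MDP. The MDP, hyperparameters, and function name are purely illustrative and not course material:

```python
import numpy as np

# Hypothetical deterministic MDP, for illustration only:
# P[s, a] gives the next state, R[s, a] the reward.
P = np.array([[0, 1],
              [0, 1]])
R = np.array([[0.0, 1.0],
              [0.0, 2.0]])

def q_learning(episodes=500, steps=10, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    rng = np.random.default_rng(seed)
    Q = np.zeros((2, 2))  # one value per (state, action) pair
    for _ in range(episodes):
        s = 0  # each episode starts in state 0
        for _ in range(steps):
            # epsilon-greedy action selection
            a = int(rng.integers(2)) if rng.random() < eps else int(Q[s].argmax())
            s_next, r = int(P[s, a]), R[s, a]
            # Q-learning update: bootstrap from the greedy value of the next state
            Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
            s = s_next
    return Q

Q = q_learning()
policy = Q.argmax(axis=1)  # greedy policy: preferred action in each state
```

In this toy MDP, action 1 always yields the higher reward, so the learned greedy policy selects action 1 in both states; the homeworks will apply the same update rule to larger problems.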
Repository for code: https://github.com/pairlab/csc498-material
- Homework 1: due Wed 02/11/2021
- Homework 2
- Homework 3
- Homework 4
Priority will be given to students who meet prerequisites for the course. Knowledge of probability, multivariate calculus, and linear algebra is expected.
- Intro to ML (CSC 311) or Intro to AI () or equivalent.
- CSC 209, MAT223, MAT232 and STA256
- CSC375, MAT224 and related
Algorithm implementation will be done mainly in Python. Please familiarize yourself with the language and common tools (git, the command line, and the frameworks numpy and pytorch).
Textbook & Resources
There is no required textbook. The course will provide all material in class and in handouts.
Students can refer to the following material for additional help:
- A great introductory text on reinforcement learning: Sutton and Barto, Reinforcement Learning
- If you need a refresher on common mathematical tools and tricks for ML: Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong, Mathematics for Machine Learning
- If you want to learn more about convex optimization, I recommend: Stephen Boyd's EE364a: Convex Optimization I and Stephen Boyd's EE364b: Convex Optimization II. Both of them have all course materials, including lecture videos, available online.
Reinforcement learning resources:
- RL Course from UW - Byron Boots (https://homes.cs.washington.edu/~bboots/RL-Fall2020/)
- RL Course from Stanford - Emma Brunskill (http://web.stanford.edu/class/cs234/index.html)
- RL Course from University of Alberta - Martha White (https://marthawhite.github.io/rlcourse/schedule.html)
- RL course at ASU/MIT - Dimitri Bertsekas (http://web.mit.edu/dimitrib/www/RLbook.html)
- David Silver's course on Reinforcement Learning (http://www0.cs.ucl.ac.uk/staff/D.Silver/web/Teaching.html)
- Deep RL Course from Berkeley - Sergey Levine (http://rail.eecs.berkeley.edu/deeprlcourse/resources/#courses)
This course combines lectures with tutorials, encouraging both fundamental knowledge acquisition and hands-on experience. Each student will be responsible for four individual assignments (60%), two take-home midterms (20%), and one team project, in teams of two (20%).
Each student will have 3 grace days throughout the semester for late assignment submissions. Late submissions that exceed those grace days will lose 33% of their value for every day beyond the allotted grace days. Submissions more than three days late after the grace days have been used will unfortunately not be accepted. The official policy of the Registrar’s Office at UTM regarding missed exams can be found on the Registrar’s Office website.