CS 8803 DRL: Deep Reinforcement Learning

Course Logistics

Instructor: Animesh Garg (office: CODA S1145)
Teaching Assistants: Jeremy Collins, Mili Das, Jenny (Hye-Young) Chung
Canvas will be used to take quizzes, view grades and view assignments.
EdStem should be your first stop for questions and announcements.
Lecture: Monday / Wednesday - 3:30 PM - 4:45 PM - Instructional Center Room 111
Office Hours:

Animesh Garg: Monday 4:45 PM - 5:45 PM
TAs: Monday 3:00 PM - 4:00 PM (Zoom Link)

Course Overview

Description

Robots of the future will need to operate autonomously in unstructured and unseen environments. It is imperative that these systems are built on intelligent and adaptive algorithms. Learning by interaction through reinforcement offers a natural framework for formulating these problems.

This graduate-level seminar course will cover topics and new research frontiers in reinforcement learning (RL). Planned topics include: Model-Based and Model-Free RL, Policy Search, Monte Carlo Tree Search, off-policy evaluation, temporal abstraction/hierarchical approaches, inverse reinforcement learning and imitation learning.

Learning objectives

At the end of this course, you will:

Acquire familiarity with the state of the art in RL.
Articulate limitations of current work, identify open frontiers, and scope research projects.
Constructively critique research papers, and deliver a tutorial-style presentation.
Work on a research-based project, implement and evaluate experiments, and discuss future work in a project paper.

Textbooks and Resources

There is no official textbook for the class.

Many of the supporting readings will come from Reinforcement Learning: An Introduction, Sutton and Barto, 2nd Edition. The book is available for free here, and reading references point to the final PDF version available here.

Some additional references that may be useful are listed below:

Additional Resources from similar courses.

Prerequisites

You need to be comfortable with:

introductory machine learning concepts (CS 4644/7643/7641)
linear algebra
basic multivariable calculus
intro to probability

You also need to have strong programming skills in Python.

Note: if you don’t meet all the prerequisites above, please discuss with the instructor after class.

Optional, but recommended: experience with neural networks and introductory-level familiarity with reinforcement learning and control.

Grading

Homeworks: 60%
- Four homework assignments
- Programming + Short questions
Paper presentation and implementation: 30%
- Presentation (30 mins): 10%
- Write up with critique and follow-ups: 5%
- Hands-on-implementation + notebook: 15%
Quizzes and participation (live only): 10%

Late Policy

You have a total of 4 late days to use across all homework assignments with no penalty. After those are exhausted, each additional late day incurs a 20% deduction (rounded up, e.g., 2 days and 1 minute late = 60% penalty). Late submissions for the paper presentation and implementation will not be accepted.
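
The homework late-day arithmetic above can be sketched in a few lines of Python. This is only an illustration, not an official grading tool: the function name late_penalty, the choice to round free late days up to whole days in the same way as penalized days, and the cap at 100% are all assumptions that the policy text does not spell out.

```python
import math
from datetime import timedelta

PENALTY_PERCENT_PER_DAY = 20   # stated in the late policy
SECONDS_PER_DAY = 24 * 60 * 60


def late_penalty(lateness: timedelta, free_days_remaining: int = 0):
    """Return (penalty in percent, free late days left) for one homework.

    Assumptions (not stated in the syllabus): free late days are consumed
    in whole-day units, rounded up like penalized days, and the total
    deduction is capped at 100%.
    """
    seconds = lateness.total_seconds()
    if seconds <= 0:
        return 0, free_days_remaining  # submitted on time

    # Any fraction of a day counts as a full late day ("rounded up").
    days_late = math.ceil(seconds / SECONDS_PER_DAY)

    # Spend free late days first; only the remainder is penalized.
    free_used = min(days_late, free_days_remaining)
    penalized_days = days_late - free_used

    penalty = min(100, penalized_days * PENALTY_PERCENT_PER_DAY)
    return penalty, free_days_remaining - free_used


if __name__ == "__main__":
    # The syllabus example: 2 days and 1 minute late, no free days left.
    print(late_penalty(timedelta(days=2, minutes=1)))
    # Same lateness with all 4 free late days still available.
    print(late_penalty(timedelta(days=2, minutes=1), free_days_remaining=4))
```

With no free late days left, a submission 2 days and 1 minute late counts as 3 late days, which matches the 60% penalty in the example above. This sketch applies only to homework; late paper presentation and implementation submissions are not accepted at all.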