E1 396 : Topics in Stochastic Approximation Algorithms

August 2019

Instructor

Rajesh Sundaresan

Lecture Hours

Location

MP20 (Microelectronics and Photonics building auditorium, ground floor)

Course syllabus

Introduction to stochastic approximation algorithms, ordinary differential equation based convergence analysis, stability of iterates, multi-timescale stochastic approximation, asynchronous update algorithms, gradient search based techniques, topics in stochastic control, infinite horizon discounted and long run average cost criteria, algorithms for reinforcement learning.

Course Grade and Examination Dates

50/100 : Mid-term exam, take-home -- to be sent by email on Friday 04 October 2019 and will be due Monday 07 October 2019.
50/100 : Final exam, 14:00 - 17:00 hrs, Friday 29 November 2019 (revised schedule)

Reference Texts

V. S. Borkar, Stochastic Approximation: A Dynamical Systems Viewpoint, Hindustan Book Agency, 2008.

Lecture topics

Lecture 01: Motivating example - the urn scheme
Lecture 02: Motivating example - regression
Lecture 03: Preliminaries - Gronwall lemma discrete-time and continuous-time, fixed point theorem
Lecture 04: Well-posedness of an ODE
Lecture 05: Martingales and stopping times -- examples
Lecture 06: Optional stopping and sampling theorems, Doob's upcrossing inequality
Lecture 07: Martingale convergence theorems
Lecture 08: Stochastic approximation: Basic convergence lemma
Lecture 09: Preliminaries on asymptotic behaviour of dynamical systems
Lecture 10: Convergence of stochastic approximation iterates to a suitable set
Lecture 11: Examples of connected internally chain transitive subsets, essentiality of chain property
Lecture 12: Liapunov functions
Lecture 13: Lock-in probability, concentrating on a concentration
Lecture 14: Concentration inequalities and locking in the lock-in probability
Lecture 15: Avoidance of traps I (E.g., escaping saddle points)
Lecture 16: Avoidance of traps II (E.g., escaping saddle points)
Lecture 17: Stability: the first criterion, three props and a proof
Lecture 18: The props that provide for stability
Lecture 19: A second stability criterion
Lecture 20: Multiple timescales
Lecture 21: Averaging the natural timescale; Asynchronous schemes I with no delay, motivation for a generalisation
Lecture 22: Preliminaries on functional analysis I
Lecture 23: Preliminaries on functional analysis II and on probability measures on metric spaces
Lecture 24: Stochastic recursive inclusions I
Lecture 25: Stochastic recursive inclusions II, properties of solution spaces of differential inclusions
Lecture 26: Stochastic recursive inclusions III, convergence to a connected ICT set, applications
Lecture 27: Applications I: Averaging the natural timescale, asynchronous schemes revisted
Lecture 28: Applications II: Stochastic gradient descent
Lecture 29: Applications III: Reinforcement learning
Lecture 30: Applications IV: Collective phenomena
Lecture 31: Constant stepsize algorithms overview