E1 396 : Topics in Stochastic Approximation Algorithms
August 2019
Instructor
Lecture Hours
Lectures: Mondays and Wednesdays : 15:30 - 17:00 hrs
Location
MP20 (Microelectronics and Photonics building auditorium, ground floor)
Course syllabus
Introduction to stochastic approximation algorithms, ordinary differential equation based convergence analysis, stability of iterates, multi-timescale stochastic approximation, asynchronous update algorithms, gradient search based techniques, topics in stochastic control, infinite horizon discounted and long run average cost criteria, algorithms for reinforcement learning.
Course Grade and Examination Dates
- 50/100 : Mid-term exam, take-home -- to be sent by email on Friday 04 October 2019 and will be due Monday 07 October 2019.
- 50/100 : Final exam, 14:00 - 17:00 hrs, Friday 29 November 2019 (revised schedule)
Reference Texts
- V. S. Borkar, Stochastic Approximation: A Dynamical Systems Viewpoint, Hindustan Book Agency, 2008.
Lecture topics
- Lecture 01: Motivating example - the urn scheme
- Lecture 02: Motivating example - regression
- Lecture 03: Preliminaries - Gronwall lemma discrete-time and continuous-time, fixed point theorem
- Lecture 04: Well-posedness of an ODE
- Lecture 05: Martingales and stopping times -- examples
- Lecture 06: Optional stopping and sampling theorems, Doob's upcrossing inequality
- Lecture 07: Martingale convergence theorems
- Lecture 08: Stochastic approximation: Basic convergence lemma
- Lecture 09: Preliminaries on asymptotic behaviour of dynamical systems
- Lecture 10: Convergence of stochastic approximation iterates to a suitable set
- Lecture 11: Examples of connected internally chain transitive subsets, essentiality of chain property
- Lecture 12: Liapunov functions
- Lecture 13: Lock-in probability, concentrating on a concentration
- Lecture 14: Concentration inequalities and locking in the lock-in probability
- Lecture 15: Avoidance of traps I (E.g., escaping saddle points)
- Lecture 16: Avoidance of traps II (E.g., escaping saddle points)
- Lecture 17: Stability: the first criterion, three props and a proof
- Lecture 18: The props that provide for stability
- Lecture 19: A second stability criterion
- Lecture 20: Multiple timescales
- Lecture 21: Averaging the natural timescale; Asynchronous schemes I with no delay, motivation for a generalisation
- Lecture 22: Preliminaries on functional analysis I
- Lecture 23: Preliminaries on functional analysis II and on probability measures on metric spaces
- Lecture 24: Stochastic recursive inclusions I
- Lecture 25: Stochastic recursive inclusions II, properties of solution spaces of differential inclusions
- Lecture 26: Stochastic recursive inclusions III, convergence to a connected ICT set, applications
- Lecture 27: Applications I: Averaging the natural timescale, asynchronous schemes revisted
- Lecture 28: Applications II: Stochastic gradient descent
- Lecture 29: Applications III: Reinforcement learning
- Lecture 30: Applications IV: Collective phenomena
- Lecture 31: Constant stepsize algorithms overview