E2 237, Fall 2024 | Parimal Parag

Statistical Learning Theory

Lectures

06 Aug 2024: Lecture-01 Introduction
08 Aug 2024: Lecture-02 Review – linear algebra and convex optimization
13 Aug 2024: Lecture-03 Support vector machines – separable case
20 Aug 2024: Lecture-04 Support vector machines – non-separable case
22 Aug 2024: Lecture-05 Kernel methods – PDS kernels
27 Aug 2024: Lecture-06 Kernel methods – reproducing kernel Hilbert space
29 Aug 2024: Lecture-07 Sample complexity – probably approximately correct learning
03 Sep 2024: Lecture-08 Sample complexity – Rademacher complexity
05 Sep 2024: Lecture-09 Sample complexity – VC dimensions
10 Sep 2024: Lecture-10 Margin theory – complexity bounds on separating hyperplanes
12 Sep 2024: Lecture-11 Margin theory – margin-based generalization bounds
17 Sep 2024: Lecture-12 Statistical decision theory – Bayes and minimax risk
19 Sep 2024: Lecture-13 Statistical decision theory – Minimax theorem
24 Sep 2024: Lecture-14 Statistical decision theory – Tensorization and log-concavity
01 Oct 2024: Lecture-15 Divergence – definitions
03 Oct 2024: Lecture-16 Divergence – local behavior
08 Oct 2024: Lecture-17 Fisher information – parametrized family
10 Oct 2024: Lecture-18 Fisher information – local behavior of divergence
15 Oct 2024: Lecture-19 Large scale asymptotics – Minimax lower bounds
17 Oct 2024: Lecture-20 Large scale asymptotics – Bayesian lower bounds
22 Oct 2024: Lecture-21 Information-theoretic methods – Information theory and rate distortion
24 Oct 2024: Lecture-22 Information-theoretic methods – mutual information method
05 Nov 2024: Lecture-23 Reduction to hypothesis testing – Le Cam method
07 Nov 2024: Lecture-24 Reduction to hypothesis testing – Examples
12 Nov 2024: Lecture-25 Reduction to hypothesis testing – Assouad’s Lemma
14 Nov 2024: Lecture-26 Reduction to hypothesis testing – Fano’s method

Homework

15 Aug 2024: Homework-01
29 Aug 2024: Homework-02
12 Sep 2024: Homework-03
26 Sep 2024: Homework-04
10 Oct 2024: Homework-05
24 Oct 2024: Homework-06
07 Nov 2024: Homework-07

Tests and grading policy

Mid Term Hours: 08:30-10:00
Mid Term Venue: EC 1.08, ECE main building

Final Hours: 14:00-17:00
Final Venue: EC 1.07, ECE main building

13 Sep 2024: Mid Term 1 (25)
18 Oct 2024: Mid Term 2 (25)
23 Nov 2024: Final Exam (50)

Course Syllabus

Binary classification: SVM, kernel methods
Complexity bounds: bias-complexity trade-off, Rademacher complexity, VC-dimension
Multiclass classification: decision trees, nearest neighbours
Estimation: parameter estimation, nonparametric regression
Optimization: stochastic gradient descent, minimax
Decision theory: statistical decision theory, large-sample asymptotics
Information-theoretic bounds: mutual information method, lower bound via hypothesis testing, entropic bounds for statistical estimation, strong data processing inequality

Prerequisite

The instructor’s approval is required to take this course for credit. The course requires background equivalent to a first graduate course in probability theory and random processes.

Description

The aim of this course is to provide performance guarantees for various data-driven algorithms for classification, estimation, and decision problems under uncertainty. The guarantees take the form of upper and lower bounds on algorithm accuracy as a function of the number of samples. The upper bounds are derived from classical complexity results, and the lower bounds follow from information-theoretic techniques.

Teams/GitHub/Overleaf Information

Teams

We will use Microsoft Teams for all course-related communication.
Please do not send any email regarding the course.
You can sign up for the course team Statistical-Learning-2024 using the code 3m0ywvq.

Instructor

Parimal Parag
Office: EC 2.17
Hours: By appointment.

Time and Location

Classroom: EC 1.07, ECE main building
Hours: Tue/Thu 08:30-10:00.

Teaching Assistants

Varshini Mylabathula
Email: varshinim@iisc
Hours: By appointment.

References

Foundations of Machine Learning, Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar, 2nd edition, MIT Press, 2018.
Information Theory: From Coding to Learning, Yury Polyanskiy and Yihong Wu, Cambridge University Press, 2023.
Information-theoretic Methods for High-dimensional Statistics, Yihong Wu, Lecture notes.
High-Dimensional Statistics: A Non-asymptotic Viewpoint, Martin Wainwright, Cambridge University Press, 2019.
Introduction to Statistical Learning Theory, Olivier Bousquet, Stephane Boucheron, and Gabor Lugosi, Lecture notes.