E2 201 Information Theory (Aug-Dec 2018)

Instructor: Himanshu Tyagi
Teaching Assistant: K R Sahasranand

Latest Announcements

  • Homework 5 posted. Quiz on Nov 23, 4:00pm in ECE 108.
  • Extra class on Friday, Nov 16, 3:00pm in ECE 108.
  • Final exam on Friday, Nov 30, 10:00am in ECE 108.
  • Midterm 2 in class on Nov 15.
  • Homework 4 posted. Quiz on Nov 9, 4:00pm in ECE 108.
  • Homework 3 posted. Quiz on Sep 28, 2:30pm in ECE 108. (Note the different time!)
  • Midterm 1 on Thursday, October 4, 3:30pm-5:00pm in ECE 108.
  • Homework 2 posted. Quiz on Sep 21, 4:00pm in ECE 108.
  • Homework 1 posted. Quiz on Aug 31, 3:30pm in ECE 108.
  • A review session on basic probability will be held by KR Sahasranand on Friday, August 10 at 5pm in ECE 108.
  • First class on Thursday, August 2 at 3:30pm in ECE 108.

Time & Venue

  • Class: Tu-Th 3:30-5:00pm in ECE 108
  • Review session: Friday 4-5pm in ECE 108
  • Instructor office hours: By appointment

Homework

Agenda

This is a roughly 14-week course. It will be divided into the following three parts:
  1. Basics: Introduction to Information theory; a simple data compression problem; transmission of two messages over a noisy channel (hypothesis testing); measures of information and their properties.
  2. Source and Channel coding: Data compression; transmission over noisy channels
  3. Special topics (time permitting): Differential entropy; rate-distortion theory.
Lecture notes will be uploaded either before the class or within a week after it. However, these are only a sketch of what was or will be discussed in class, and I suggest that students maintain their own class notes.

    Part 1: Basics

  • Lecture 1: Introduction to Information Theory, course logistics; introduction to data compression; review of probability [pdf]
  • Lecture 2: Review of probability, a source coding theorem [pdf]
  • Lecture 3: A source coding theorem (contd.); Memoryless sources and Shannon's lossless source coding theorem [pdf]
  • Lecture 4: Proof of Shannon's lossless source coding theorem; introduction to transmission [pdf]
  • Lecture 5: Introduction to binary hypothesis testing; bounds for error in binary hypothesis testing [pdf]
  • Lecture 6: Bounds for error in binary hypothesis testing (contd.); Stein's lemma; example application: Distinguishing two coins [pdf]
  • Lecture 7: M-ary hypothesis testing and the MAP rule; Union bound for reducing M-ary to binary; Introduction to the channel coding problem [pdf]
  • Lecture 8: Heuristic derivation of Shannon's channel coding theorem and introduction of mutual information [pdf]
  • Lecture 9: Properties of Information Measures [pdf]
  • Lecture 10: Properties of Information Measures (contd.): Shapes of information measures [pdf]
  • Lecture 11: Properties of Information Measures (contd.): Data processing inequalities; Fano's inequality; Continuity of entropy [pdf]
  • Lecture 12: Properties of Information Measures (contd.): Variational forms of information measures and corollaries [pdf]

    Part 2a: Data Compression

  • Lecture 13: Introduction to data compression; fixed- and variable-length codes; types of error-free codes: Nonsingular, Uniquely-decodable and Prefix-free [pdf]
  • Lecture 14: Kraft's inequality; Shannon codes; minimum average length for prefix-free and uniquely decodable codes [pdf]
  • Lecture 15: Bounds for length of nonsingular codes; Shannon-Fano code [pdf]
  • Lecture 16: Shannon-Fano-Elias code; Huffman code; codes with error allowed [pdf]
  • Lecture 17: Lower bounds for average length of codes with error allowed; coding a sequence of symbols; Arithmetic code; Introduction to universal codes [pdf]
  • Lecture 18: Fixed-length universally rate-optimal codes; introduction to types; duality between probability assignment and universal coding [pdf]
  • Lecture 19: Variable-length universal codes using types; Universal arithmetic codes [pdf]

    Part 2b: Transmission over Noisy Channels

  • Lecture 20: Channel codes and capacity; Bounds for capacity of a binary symmetric channel (sphere packing bound and the Gilbert-Varshamov (GV) bound) [pdf]
  • Lecture 21: Feinstein's maximal code construction for a binary symmetric channel [pdf]
  • Lecture 22 (two classes): Extending maximal code construction to a general DMC; method of types [pdf]
  • Lecture 23: Maximal code construction for a general DMC; sphere packing bound for a general DMC; Shannon's channel coding theorem [pdf]
  • Lecture 24: Strong converse for average probability of error; examples; Fano's inequality based weak converse; feedback does not improve capacity; source-channel separation [pdf]
  • Lecture 25: Random coding proof of achievability based on typical sets (We skipped this lecture in the 2018 edition) [pdf]
  • Lecture 26: Random coding proof of achievability based on information density [pdf]
  • [Still to come ...]
  • Lecture 27: Additive Gaussian noise channel; capacity of AGN; proof sketch of the converse via sphere packing [pdf]
  • Lecture 28: Random coding achievability proof for AGN [pdf]
  • Summary of what we have covered in channel coding [pdf]

    Part 3: Special topics

  • Lecture 29: Differential entropy; volume of large probability sets; weak converse for AGN capacity
  • Lecture 30: Lossy compression and rate distortion theorem

Prerequisites

Undergraduate-level probability and an appetite for formal mathematical reasoning.

Grading

Homework (5-6 assignments):   20%   A quiz will be organised based on each homework; the lowest quiz grade will be dropped.
Mid-term 1:   20%   Mid-September
Mid-term 2:   20%   Late October
Final exam:   40%   Late November

References

While there are many very good textbooks available for Information Theory, there is no required textbook for this course. Here I list a subset from which I will draw my lecture notes:
  1. T. Cover and J. Thomas, Elements of Information Theory, Second edition, Wiley, 2006.
  2. I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems, Second edition, Cambridge, 2011.
  3. J. Wolfowitz, Coding Theorems of Information Theory, Probability Theory and Stochastic Processes series, Springer, 1978.
  4. A. Khinchin, Mathematical Foundations of Information Theory, Dover, 2001.
  5. A. Feinstein, Foundations of Information Theory, McGraw-Hill, 1958.
  6. T. S. Han, Information-Spectrum Methods in Information Theory, Stochastic Modeling and Applied Probability series, Springer, 2003.
  7. R. G. Gallager, Information Theory and Reliable Communication, Wiley, 1968.

Course description

This is a graduate-level introductory course in Information Theory and will be a prerequisite for many advanced courses. Information Theory is the science of measuring, preserving, transmitting, and estimating information in random data. It was initially proposed by Shannon as a mathematical theory of communication seven decades ago. It provides the fundamental limits of performance for transmission of messages generated by a random source over a noisy communication channel. On the one hand, Information Theory has been the driving force behind the revolution in digital communication and has led to various practical data compression and error-correcting codes that meet the fundamental theoretical limits of performance. On the other hand, over the years, techniques and concepts from Information Theory have found applications well beyond communication theory. In this course, we will introduce the basic notions and results of Information Theory, keeping in mind both its fundamental role in communication theory and its varied applications beyond it. This course, and the follow-up advanced courses to be offered in the future, will be of interest to students from various backgrounds.

Broad list of topics

  • Compression of random data: fixed and variable length source coding theorems and entropy
  • Hypothesis testing: Stein's lemma and Kullback-Leibler divergence
  • Measures of randomness: Properties of Shannon entropy, Kullback-Leibler divergence, and Mutual Information
  • Transmission over a noisy channel: channel coding theorems, joint source-channel coding theorem, the Gaussian channel
  • Lower bounds on minimax cost in parameter estimation using Fano's inequality
  • Quantisation and lossy compression: rate-distortion theorem
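
For quick orientation, here are the standard definitions of the three information measures named in the list above, for discrete distributions P and Q on a common finite alphabet; the notation is the usual textbook one and is not taken from the lecture notes, which may differ slightly in conventions.

  H(P) = -\sum_{x} P(x) \log P(x)                                  (Shannon entropy)
  D(P \| Q) = \sum_{x} P(x) \log \frac{P(x)}{Q(x)}                 (Kullback-Leibler divergence)
  I(X; Y) = D(P_{XY} \| P_X \times P_Y) = H(X) + H(Y) - H(X, Y)    (mutual information)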

Who should take this course

  • Any communication engineer!
  • Computer scientists interested in data compression
  • Computer scientists interested in communication complexity
  • Cryptographers interested in information theoretic security and quantum cryptography
  • Data scientists interested in information theoretic methods and measures of information
  • Statisticians interested in information theoretic lower bounds
  • Physicists interested in Quantum Information Theory