|SPEAKER||TITLE OF THE TALK|
|Krishna Narayanan||Analog Joint Source-Channel Coding for Gaussian Sources over AWGN Channels with Deep Learning|
|Arya Mazumdar||Learning Mixtures and Trace Reconstruction|
|Suhas Diggavi||Compression for Learning|
|Jonathan Scarlett||Sample Complexity Lower Bounds for Compressive Sensing with Generative Models|
|Changho Suh||FR-Train: Fair and Robust Training via Information Theory|
Krishna Narayanan is the Eric D. Rubin Professor in the Department of Electrical and Computer Engineering at Texas A&M University. His research interests are broadly in coding theory, information theory, joint source-channel coding, and signal processing, with applications to wireless networks, data storage, and data science. He served as an associate editor for coding techniques for the IEEE Transactions on Information Theory from 2015 to 2018, and as the area editor for the coding theory and applications area of the IEEE Transactions on Communications from 2007 to 2011. In 2014, he received the Professional Progress in Engineering Award, given each year to an outstanding Iowa State University alumnus under the age of 44. He was elected a Fellow of the IEEE for contributions to coding for wireless communications and data storage. He has won several awards within Texas A&M University, including the 2017 university-level Distinguished Achievement Award for teaching.
Abstract: We consider the design of neural-network-based joint source-channel coding (JSCC) schemes for transmitting an independent and identically distributed (i.i.d.) Gaussian source over additive white Gaussian noise (AWGN) channels with bandwidth mismatch when the source dimension is small. Unlike existing deep-learning-based works on this topic, we do not resort to domain expertise to constrain the model; rather, we propose to employ fine-tuning techniques to optimize it. We show that our proposed techniques can provide performance comparable to that of the state of the art when the source dimension is small. Furthermore, the proposed model can spontaneously learn encoding functions similar to those designed by conventional schemes. Finally, we empirically show that the learned JSCC scheme is robust to mismatch between the assumed and actual channel signal-to-noise ratios.
Arya Mazumdar is an Associate Professor of Computer Science, with additional affiliations to the Department of Mathematics and Statistics and the Center for Data Science, at the University of Massachusetts Amherst (UMass). He is also a researcher at Amazon AI and Search. In the past, he was a faculty member at the University of Minnesota, Twin Cities (2013--15), and a postdoctoral scholar at the Massachusetts Institute of Technology (2011--12). Arya received his Ph.D. from the University of Maryland, College Park, where his thesis won a Distinguished Dissertation Award (2011). Arya is a recipient of multiple other awards, including an NSF CAREER award (2015), a EURASIP Best Paper Award (2020), and an ISIT Jack K. Wolf Paper Award (2010). He is currently serving as an Associate Editor for the IEEE Transactions on Information Theory. Arya's research interests include coding theory (error-correcting codes and related combinatorics), information theory, and foundations of machine learning.
Abstract: We present a complex-analytic method for learning mixtures of distributions and apply it to learn Gaussian mixtures with shared variance, binomial mixtures with shared success probability, and Poisson mixtures, among others. The complex-analytic method was originally introduced to reconstruct a sequence from its random subsequences, a problem known as trace reconstruction. We show some new results in trace reconstruction and mention some potential extensions of the complex-analytic method to learning mixtures.
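The complex-analytic method itself does not compress into a short snippet, but the model class in the abstract is easy to instantiate. As an illustrative baseline only (standard EM, not the talk's method), here is a fit of a two-component Gaussian mixture with shared variance:

```python
import numpy as np

rng = np.random.default_rng(1)

def em_shared_variance(x, iters=300):
    # EM for a two-component 1-D Gaussian mixture with a shared variance.
    mu0, mu1 = x.min(), x.max()        # spread-out initialization
    sigma2, pi = x.var(), 0.5
    for _ in range(iters):
        # E-step: posterior responsibility of component 1 for each point.
        w1 = pi * np.exp(-(x - mu1) ** 2 / (2 * sigma2))
        w0 = (1 - pi) * np.exp(-(x - mu0) ** 2 / (2 * sigma2))
        r = w1 / (w0 + w1)
        # M-step: update the weight, the means, and the shared variance.
        pi = r.mean()
        mu0 = ((1 - r) * x).sum() / (1 - r).sum()
        mu1 = (r * x).sum() / r.sum()
        sigma2 = ((1 - r) * (x - mu0) ** 2 + r * (x - mu1) ** 2).mean()
    return mu0, mu1, sigma2, pi

# Synthetic data from the mixture 0.4*N(-2, 1) + 0.6*N(3, 1).
n = 5000
z = rng.random(n) < 0.6
x = np.where(z, 3.0 + rng.standard_normal(n), -2.0 + rng.standard_normal(n))
mu0, mu1, sigma2, pi = em_shared_variance(x)
```

Unlike EM, the complex-analytic approach gives worst-case guarantees by analyzing a moment-generating function in the complex plane, which is what transfers to trace reconstruction.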
Suhas N. Diggavi received the B.Tech. degree in electrical engineering from the Indian Institute of Technology, Delhi, India, and the Ph.D. degree in electrical engineering from Stanford University, Stanford, CA, in 1998. After completing his Ph.D., he was a Principal Member of Technical Staff in the Information Sciences Center at AT&T Shannon Laboratories, Florham Park, NJ. He was then on the faculty of the School of Computer and Communication Sciences, EPFL, where he directed the Laboratory for Information and Communication Systems (LICOS). He is currently a Professor in the Department of Electrical Engineering at the University of California, Los Angeles, where he directs the Information Theory and Systems Laboratory.
His research interests include information theory and its applications to several areas, including wireless networks, cyber-physical systems, distributed computation and learning, security and privacy, genomics, and data compression; more information can be found at http://licos.ee.ucla.edu. He has received several recognitions for his research, including the 2013 IEEE Information Theory Society & Communications Society Joint Paper Award, the 2013 ACM International Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc) Best Paper Award, the 2006 IEEE Donald G. Fink Prize Paper Award, and the 2019 Google Faculty Research Award. He has served as a Distinguished Lecturer and currently serves on the Board of Governors of the IEEE Information Theory Society. He is a Fellow of the IEEE.
He has been an associate editor for the IEEE Transactions on Information Theory, the ACM/IEEE Transactions on Networking, and IEEE Communications Letters, a guest editor for the IEEE Journal of Selected Topics in Signal Processing, and has served on the program committees of several IEEE conferences. He has also helped organize IEEE and ACM conferences, serving as Technical Program Co-Chair for the 2012 IEEE Information Theory Workshop (ITW), Technical Program Co-Chair for the 2015 IEEE International Symposium on Information Theory (ISIT), and General Co-Chair for MobiHoc 2018. He has 8 issued patents.
Abstract: As machine learning gets deployed on edge (wireless) devices, in contrast to datacenter applications, the problem of building learning models with communication-efficient training on local (heterogeneous) data becomes more important. These applications motivate learning when data is collected and available locally, but devices collectively help build a model over wireless links with significant communication rate (bandwidth) constraints. In this talk we review some of our recent work on compression of gradients for distributed optimization using aggressive sparsification with quantization and local computation, along with error compensation (keeping track of the difference between the true and compressed gradients). We begin with a distributed optimization framework, proposing and analyzing Qsparse-local-SGD, a new gradient compression scheme, for both non-convex and convex objective functions. We demonstrate that Qsparse-local-SGD converges at the same rate as vanilla distributed SGD for many important classes of sparsifiers and quantizers. We extend these ideas to the decentralized setting, where we design and analyze SQuARM-SGD, which further improves compression by including event-triggered communication and also incorporates Nesterov's momentum into the optimization framework. We provide convergence guarantees for SQuARM-SGD for strongly convex and non-convex smooth objectives, and here again show that SQuARM-SGD matches the convergence rate of vanilla SGD, effectively demonstrating compression without much degradation in convergence rates. We use Qsparse-local-SGD and SQuARM-SGD to train ResNet-50 on ImageNet and show that they yield significant savings over the state of the art in the number of bits transmitted to reach a target accuracy, without sacrificing much on accuracy or convergence rate.
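The compression operator described in the abstract (sparsification plus quantization, with error compensation carrying the residual to the next step) can be sketched directly. The snippet below is a simplified single-worker illustration, not the authors' implementation; the class and function names are invented here.

```python
import numpy as np

def topk_sparsify(g, k):
    # Keep the k largest-magnitude coordinates of g, zero out the rest.
    out = np.zeros_like(g)
    idx = np.argpartition(np.abs(g), -k)[-k:]
    out[idx] = g[idx]
    return out

def quantize(g, levels=16):
    # Uniform scalar quantization to a small per-vector codebook.
    scale = np.max(np.abs(g))
    if scale == 0:
        return g
    return np.round(g / scale * levels) / levels * scale

class ErrorFeedbackCompressor:
    """Sparsify-then-quantize with error compensation: the difference
    between the true and compressed gradient is stored and added back
    before compressing the next gradient."""

    def __init__(self, dim, k, levels=16):
        self.memory = np.zeros(dim)   # accumulated compression error
        self.k, self.levels = k, levels

    def compress(self, grad):
        corrected = grad + self.memory
        compressed = quantize(topk_sparsify(corrected, self.k), self.levels)
        self.memory = corrected - compressed   # carry the residual forward
        return compressed
```

By construction, the compressed message plus the stored residual always equals the error-corrected gradient, which is the property the convergence analyses of such schemes rely on. Qsparse-local-SGD additionally runs several local SGD steps between communications, which this sketch omits.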
Jonathan Scarlett is an assistant professor in the Department of Computer Science and Department of Mathematics, National University of Singapore. His research interests are in the areas of information theory, machine learning, signal processing, and high-dimensional statistics. He received the Singapore National Research Foundation (NRF) fellowship, and the NUS Early Career Research Award.
Previously, Jonathan received the B.Eng. degree in electrical engineering and the B.Sci. degree in computer science from the University of Melbourne, Australia. From October 2011 to August 2014, he was a Ph.D. student in the Signal Processing and Communications Group at the University of Cambridge. From September 2014 to September 2017, he was a post-doctoral researcher at LIONS, EPFL.
Abstract: Recently, it has been shown that for compressive sensing, significantly fewer measurements may be required if the sparsity assumption is replaced by the assumption that the unknown vector lies near the range of a suitably chosen generative model. In particular, in (Bora et al., 2017) it was shown that roughly O(k log L) random Gaussian measurements suffice for accurate recovery when the generative model is an L-Lipschitz function with bounded k-dimensional inputs, and O(kd log w) measurements suffice when the generative model is a k-input ReLU network with depth d and width w. In this work, we establish corresponding algorithm-independent lower bounds on the sample complexity using tools from minimax statistical analysis. In accordance with the above upper bounds, our results are summarized as follows: (i) we construct an L-Lipschitz generative model capable of generating group-sparse signals, and show that the resulting necessary number of measurements is Ω(k log L); (ii) using similar ideas, we construct ReLU networks with high depth and/or high width for which the necessary number of measurements scales as Ω(kd log(w)/log(n)) (with output dimension n), and in some cases Ω(kd log w). As a result, we establish that the scaling laws derived in (Bora et al., 2017) are optimal or near-optimal in the absence of further assumptions.
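The measurement model behind these bounds is y = A G(z) + noise, with recovery performed over the latent code z rather than the high-dimensional signal. The toy sketch below uses a random linear map as a stand-in for the Lipschitz/ReLU generative model (an assumption made purely so recovery reduces to least squares); the dimensions are chosen arbitrarily for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# An n-dimensional signal near the range of a k-input generative model,
# observed through m << n random Gaussian measurements.
k, n, m = 5, 100, 20

# Linear stand-in for the generative model: G(z) = Bz. Bora et al. treat
# general L-Lipschitz functions and ReLU networks instead.
B = rng.standard_normal((n, k)) / np.sqrt(n)
A = rng.standard_normal((m, n)) / np.sqrt(m)   # measurement matrix

z_true = rng.standard_normal(k)
x_true = B @ z_true                            # signal in the model's range
y = A @ x_true + 0.01 * rng.standard_normal(m)  # noisy measurements

# Recover the latent code: min_z ||(A B) z - y||^2, then map back.
z_hat, *_ = np.linalg.lstsq(A @ B, y, rcond=None)
x_hat = B @ z_hat

rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
```

With a nonlinear G, the least-squares step is replaced by gradient descent over z; the point of the talk's lower bounds is that no algorithm, of this form or otherwise, can get by with substantially fewer than the stated number of measurements.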
Changho Suh is an Associate Professor in the School of Electrical Engineering at the Korea Advanced Institute of Science and Technology (KAIST). He received the B.S. and M.S. degrees in Electrical Engineering from KAIST in 2000 and 2002, respectively, and the Ph.D. degree in Electrical Engineering and Computer Sciences from UC Berkeley in 2011. From 2011 to 2012, he was a postdoctoral associate at the Research Laboratory of Electronics at MIT, and from 2002 to 2006 he was with the Telecommunication R&D Center, Samsung Electronics. Dr. Suh received the 2018 IEIE/IEEE Joint Award, the 2015 IEIE Haedong Young Engineer Award, the 2013 IEEE Communications Society Stephen O. Rice Prize, the 2011 David J. Sakrison Memorial Prize (the top research award in the UC Berkeley EECS Department), and the 2009 IEEE ISIT Best Student Paper Award, and was a finalist for the 2015 Bell Labs Prize. He also founded Suh & Cho Academy in 2020, received the Google Education Grant in 2019, and received Department Teaching Awards in 2012 and 2019. He is currently a Distinguished Lecturer for the IEEE Information Theory Society (2020-21) and a Member of the Young Korean Academy of Science and Technology (Y-KAST).
Abstract: Trustworthy AI is becoming a critical issue in a widening array of applications where one needs to ensure fair and robust model training in the presence of data bias and poisoning. We take an information-theoretic approach, using mutual information (MI) to develop FR-Train, an integrated framework that holistically performs fair and robust training of a classifier of interest. We provide an MI-based interpretation of an adversarial learning framework and apply this insight to architect two discriminators that serve to ensure fairness and robustness, respectively. We conduct extensive experiments on both synthetic and benchmark real datasets to demonstrate that FR-Train shows almost no decrease in fairness-accuracy tradeoff performance in the presence of data poisoning.
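The MI-based view of fairness is easy to illustrate: for a binary classifier, zero mutual information between its predictions and a sensitive attribute corresponds exactly to demographic parity. The snippet below is a minimal empirical estimator of that quantity, not FR-Train itself (which minimizes such dependence adversarially during training):

```python
import numpy as np

def mutual_information(a, b):
    # Empirical mutual information (in nats) between two binary arrays,
    # computed from the plug-in joint and marginal frequencies.
    mi = 0.0
    for va in (0, 1):
        for vb in (0, 1):
            p_ab = np.mean((a == va) & (b == vb))
            p_a, p_b = np.mean(a == va), np.mean(b == vb)
            if p_ab > 0:
                mi += p_ab * np.log(p_ab / (p_a * p_b))
    return mi

rng = np.random.default_rng(3)
n = 10000
s = rng.integers(0, 2, n)                      # sensitive attribute

fair_pred = rng.integers(0, 2, n)              # independent of s: MI ~ 0
biased_pred = (s + (rng.random(n) < 0.1)) % 2  # mostly copies s: MI large
```

A fairness discriminator in an adversarial framework can be read as estimating (a variational bound on) this MI, with the classifier trained to drive it toward zero.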