Welcome to NCC 2019!

Generative Text-to-Speech Synthesis: From HMM-based speech synthesis to neural end-to-end TTS



Speaker:  Heiga Zen
Google Brain, Tokyo



Abstract: Generative model-based text-to-speech (TTS) synthesis has grown in popularity over the last few years. Thanks to recent progress in neural end-to-end approaches, it has reached human-level naturalness for synthesizing isolated sentences, with the flexibility to change voice characteristics. Some of these systems are already in production and have served millions of queries. This tutorial will follow the progress of this technology from its fundamental concepts to real implementations, covering approaches from conventional statistical parametric speech synthesis to the latest neural end-to-end models.

Bio: Dr. Heiga Zen received his AE from Suzuka National College of Technology, Suzuka, Japan, in 1999, and his PhD from the Nagoya Institute of Technology, Nagoya, Japan, in 2006. He was an Intern/Co-Op researcher at the IBM T.J. Watson Research Center, Yorktown Heights, NY, USA (2004–2005), and a Research Engineer at Toshiba Research Europe Ltd. Cambridge Research Laboratory, Cambridge, UK (2008–2011). At Google, he was with the Speech team from July 2011 to July 2018 and joined the Brain team in August 2018. He is currently a Senior Staff Research Scientist with the Google Brain team in Tokyo, Japan. His research interests include speech technology and machine learning. He was one of the original authors and the first maintainer of the HMM-based speech synthesis system (HTS).

mmWave Radio: The Next Era in Wireless Communication



Speaker:  Aditya K. Jagannatham
IIT, Kanpur



Abstract: 5G wireless networks are envisaged to support gigabit links to meet the ever-increasing demand for higher data rates. In this context, millimeter wave (mmWave) communication, which leverages the vast spectral opportunities in the mmWave band (30–300 GHz), has shown significant promise towards realizing the goals of next-generation wireless systems. Further, because the shorter wavelength allows many antenna elements to be packed into a compact array, multiple-input multiple-output (MIMO) is a natural technology of choice for such systems and can significantly enhance their throughput. However, the practical realization of mmWave MIMO technology is fraught with challenges: it suffers significantly higher losses at its higher carrier frequencies, and the signal processing required to support high-bandwidth communication increases hardware complexity. To overcome these barriers, hybrid RF-baseband processing has emerged as a popular choice due to its lower complexity and improved RF/baseband load distribution. This tutorial will present a brief analytical introduction to transceiver design and signal processing for mmWave MIMO systems, coupled with practical insights.

Bio: Aditya K. Jagannatham received his Bachelor's degree from the Indian Institute of Technology, Bombay, and his M.S. and Ph.D. degrees from the University of California, San Diego, U.S.A. From April 2007 to May 2009 he was employed as a senior wireless systems engineer at Qualcomm Inc., San Diego, California, where he was part of the Qualcomm CDMA Technologies (QCT) division. His research interests are in the area of next-generation wireless cellular and WiFi networks, with special emphasis on various 5G technologies such as massive MIMO, mmWave MIMO, FBMC, NOMA, Full Duplex and others. He is currently a Professor in the Electrical Engineering department at IIT Kanpur.

Modern Methods of Machine Learning



Speaker:  Sargur N. Srihari
University at Buffalo, USA



Abstract: Today's Artificial Intelligence (AI) methods are based on learning from examples, a methodology commonly known as machine learning. This tutorial provides an overview of modern methods of machine learning while introducing topics of current research interest, including deep learning, probabilistic graphical models and reinforcement learning.

There will be four parts to the tutorial:
  1. overview of machine learning
  2. deep learning
  3. probabilistic graphical models
  4. reinforcement learning

We will begin with an overview of artificial intelligence and how simple machine learning techniques for classification and regression have largely replaced earlier knowledge-based methods. We will describe supervised/unsupervised learning, discriminative/generative models and specific algorithms such as logistic regression and SVMs.

The second part is about how deep learning differs from simple machine learning. We will describe commonly used deep learning architectures such as convolutional neural networks and recurrent neural networks. We will also describe several deep learning research topics, such as learning representations where variables are disentangled and decision making is simplified.

The third part will review probabilistic graphical models (PGMs). These are methods that allow reasoning with a diverse set of variables. They also allow explanations to accompany decisions. Methods of inference that overcome computational intractability will also be discussed.

The final part will concern reinforcement learning. Here the learning data comes from the environment as an autonomous agent performs actions. We will describe methods known as Q-learning and policy learning. In particular, deep reinforcement learning will be described.

Bio: Srihari is a SUNY Distinguished Professor in the Department of Computer Science and Engineering at the University at Buffalo, The State University of New York. He held the Rukmini Govindachar chair in the School of Automation, Indian Institute of Science during 2018.

A laboratory that Srihari founded at Buffalo, known as CEDAR, developed the world’s first automated system for reading handwritten postal addresses. It was deployed by the United States Postal Service, which saved hundreds of millions of dollars in labor costs, helping keep US postal rates the lowest in the western world. A side effect of this project was that recognizing handwritten digits came to be considered the fruit fly of AI methods.

Srihari also spent a decade developing AI and machine learning methods for forensics, focusing on pattern evidence such as latent prints, handwriting and footwear impressions. In particular, he worked on quantifying the value of handwriting evidence so that such testimony can be presented in US courts. Srihari has served on the National Academy of Sciences Committee on Identifying the Needs of the Forensic Science Community, which led to an influential report. He has also served on NIJ-NIST committees on Human Factors in Fingerprint Analysis and Handwriting Comparison. At present he serves on the Houston Forensics Technical Advisory Board.

At Buffalo he teaches a sequence of three courses in artificial intelligence and machine learning: (i) introduction to machine learning, (ii) probabilistic graphical models and (iii) deep learning. During 2018-19 he is teaching these subjects to over 400 graduate and undergraduate students.

Srihari's honors include: Fellow of the Institute of Electronics and Telecommunications Engineers (IETE, India), Fellow of the IEEE, Fellow of the International Association for Pattern Recognition and distinguished alumnus of the Ohio State University College of Engineering. He received an Excellence in Graduate Mentorship award from the University at Buffalo in 2018. Srihari received a B.Sc. in Physics and Mathematics from Bangalore University, a B.E. in Electrical Communication Engineering from the Indian Institute of Science and a Ph.D. in Computer and Information Science from the Ohio State University.

Open-Prototyping of 4G/5G systems with OpenAirInterface



Speaker:  Rajeev Gangula



Abstract: This tutorial starts with a brief introduction to the OpenAirInterface Software Alliance (OSA), which aims at creating a global community developing open source software and hardware for 3GPP cellular networks. We then give an overview of the internals of the radio access network component of this project, called openairinterface5g. We will describe the real-time processing architecture of the OpenAirInterface (OAI) eNodeB (or base station), with special emphasis on topics related to the physical and access layers of the protocol stack. Finally, to show the usage of OAI in research, we present a scenario where an OAI-enabled drone is used as a flying relay to enhance LTE connectivity to mobile users.

Bio: Rajeev Gangula obtained his M.Tech degree from the Indian Institute of Technology, Guwahati, in 2010, and his M.Sc. and Ph.D. degrees from Télécom ParisTech (Eurecom), France, in 2015. From October 2015 to December 2016 he was with Sequans Communications, Paris, developing physical layer algorithms for LTE CAT-M chipsets. He is currently working at Eurecom as a research engineer, building prototypes for autonomous aerial cellular relay drones capable of providing flexible and enhanced (LTE, 5G) connectivity to mobile users.

Introduction to Deep Learning in Signal Processing & Communications with MATLAB



Speakers:  Dr. Amod Anandkumar, Ms. Pallavi Kar
MathWorks, India




Abstract: Deep learning can achieve state-of-the-art accuracy for many tasks considered algorithmically unsolvable using traditional techniques. In this session, you will gain practical knowledge of deep learning and discover new MATLAB® features that simplify these tasks and eliminate low-level programming. From prototype to production, you’ll see demonstrations on training and deploying neural networks for signal processing and communications applications, including:

  • Building deep learning models from scratch and performing transfer learning
  • Understanding network behaviour using visualizations and other techniques
  • Simplifying data labelling using MATLAB apps
  • Accelerating deep learning training using multiple GPUs and computer clusters
  • Generating code automatically from MATLAB algorithms and deploying it on enterprise applications and embedded devices

Bio: Dr. Amod Anandkumar is a senior team lead for signal processing and communications in the Application Engineering group at MathWorks India. Prior to this, he was a lead engineer with the Advanced Technology group at Samsung Research India, Noida where he developed physical layer techniques for LTE wireless communications systems and novel healthcare applications for smartphones. He was also a post-doctoral research fellow at the Biomedical Signal Analysis Lab, GE Global Research Bangalore, working on advanced beamforming techniques for ultrasound imaging and novel signal processing solutions for ICU patient monitoring systems, resulting in one US patent filing. Amod holds a B.Tech degree from National Institute of Technology Karnataka and a Ph.D. degree from Loughborough University, UK. His research interests include applied signal processing, next-generation wireless networks, computer vision, game theory, and convex optimization. He has published and reviewed papers in numerous international conferences and journals.

Ms. Pallavi Kar works as an application engineer at MathWorks in the area of Language of Technical Computing. She primarily focuses on the area of data analytics from intuition building and preprocessing of data to model development. Pallavi has five years of experience working across many industries. Over the years, she has worked on prognostics, lithium-ion batteries, model development and simulation, telematics, and server management. She has worked as a senior member of the Advanced Technologies team at Mahindra Reva Electric Vehicles in Bangalore. Pallavi holds a bachelor’s degree in electronics and communication engineering and a master’s degree in energy.