E0 259 : Data Analytics
Ramesh Hariharan (Strand Genomics and CSA, IISc) and Rajesh Sundaresan (ECE and RBCCPS, IISc)
A. Y. Sarath
Lectures: Tuesdays and Thursdays, 11:30 - 13:00 hrs
Location: CSA 117
09:00 - 17:00 hrs, Sunday 24 November 2019
09:00 - 12:00 hrs, Thursday 05 December 2019 (revised schedule)
2019 Lectures and assignments
2018 complementary lectures and assignments
2017 complementary lectures and assignments
2016 complementary lectures and assignments
- Module 11: Functional connectivity patterns of the brain (Rajesh Sundaresan)
2015 complementary lectures and assignments
Data sets from astronomy, genomics, neuroscience, sports, surveillance cameras, and social networks will be analysed to answer specific scientific questions. Statistical tools and modeling techniques will be introduced as needed to analyse the data and eventually address the scientific question.
Data analytics has assumed increasing importance in recent times. Several industries are now built around the use of data for decision making. Several research areas as well, genomics and neuroscience being notable examples, increasingly rely on large-scale data generation rather than small-scale experimentation to generate initial hypotheses. This creates a need for data analytics. This course will develop modern statistical tools and modeling techniques through hands-on data analysis in a variety of application domains.
The course will illustrate the principles of hands-on data analytics through about ten case studies. On each topic, we will introduce a scientific question and discuss why it should be addressed. Next, we will present the available data and how it was collected. We will then discuss models, provide analyses, and finally touch upon how to address the scientific question using the analyses.
In 2017, we covered the following case studies.
- Astronomy: From Tycho Brahe's observations to the conclusion that Mars moves in an elliptical orbit.
- Visual Neuroscience: Neural correlates predict search difficulty.
- Genomics: Understanding the causes of cancer.
- Sports: The Duckworth-Lewis-Stern method for setting targets in shortened limited-overs cricket matches.
- Genomics: The basis for red-green colour blindness.
- Genomics: Population history of India.
- Signal Processing: Video background separation.
- Networks: Community detection.
- Recommendation systems.
- Random Processes (E2 202) OR Probability and Statistics (E0 232) OR equivalent.
There will be about eight assignments, one on each of the first eight modules. A fair amount of hands-on work is expected. Students will use Python.
- 50/100 : Assignments
- 20/100 : Final examination
- 30/100 : Course project and presentation
- There is no textbook for this course. Lecture slides will be available online.