E0 259 : Data Analytics
August 2023
Instructors
Ramesh Hariharan (Strand Genomics and ECE, IISc)
Vikram Srinivasan (Founder, Needl.ai and ECE, IISc)
Rajesh Sundaresan (ECE and RBCCPS, IISc)
Teaching Assistants
Parag Dutta
Harsh Gupta
Aastha Vijay Balpande
Soumyadeep Chatterjee
Lecture Hours
Lectures: Tuesdays and Thursdays 15:30 - 17:00 hrs
Location: MP 20
First class: Thursday 03 August 2023, 15:30 - 17:00 hrs
Teaching Assistant Hours
TA hour: Thursdays, 18:00 - 19:00 hrs
Location: CSA 112
Project Presentation
25 November 2023, 09:00 hrs onward (full day); attendance is compulsory
Location: MP 20
Final Examination
Friday 08 December 2023, 14:00 - 17:00 hrs, as per SCC final exam timetable.
2023 Lecture Progression
- Cricket - Duckworth-Lewis-Stern method
- Primer on hypothesis testing (in preparation for the following modules)
- Community detection
- Effects of smoking
- Visual neuroscience
- Mars orbit
- Covid-19 modelling
- Colour blindness
- Recommendation systems
- Natural language processing
2021 Lectures and assignments
2018 complementary lectures and assignments
2017 complementary lectures and assignments
2016 complementary lectures and assignments
- Module 12: Functional connectivity patterns of the brain (Rajesh Sundaresan)
2015 complementary lectures and assignments
Course syllabus
Data sets from astronomy, genomics, neuroscience, sports, surveillance cameras, and social networks will be analysed to answer specific scientific questions. Statistical tools and modelling techniques will be introduced as needed to analyse the data and, eventually, to address the scientific question.
Course Description
Data analytics has grown in importance in recent years. Several industries are now built around the use of data for decision making. Several research areas, genomics and neuroscience being notable examples, increasingly rely on large-scale data generation, rather than small-scale experimentation, to generate initial hypotheses. This creates a need for data analytics. This course will develop modern statistical tools and modelling techniques through hands-on data analysis in a variety of application domains.
The course will illustrate the principles of hands-on data analytics through about 10 case studies. For each case study, we will introduce a scientific question and discuss why it should be addressed. Next, we will present the available data and how it was collected. We will then discuss models, carry out analyses, and finally show how the analyses can be used to address the scientific question.
In one of the previous offerings, we covered the following case studies.
- Astronomy: From Tycho Brahe's observations to the conclusion that Mars moves in an elliptical orbit.
- Visual Neuroscience: Neural correlates predict search difficulty.
- Genomics: Understanding the causes of cancer.
- Sports: The Duckworth-Lewis-Stern method for setting targets in shortened limited-overs cricket matches.
- Genomics: The basis for red-green colour blindness.
- Genomics: Population history of India.
- Signal Processing: Video background separation.
- Networks: Community detection.
- Recommendation systems.
Prerequisites
- Random Processes (E2 202) OR Probability and Statistics (E0 232) OR equivalent.
Course Grade
There will be about seven assignments, one on each of the first seven modules. A fair amount of hands-on work is expected. Students will use Python.
- 50/100 : Assignments
- 20/100 : Final examination
- 30/100 : Course project and presentation
Reference Texts
- There is no textbook for this course. Lecture slides will be available on the Moodle page (2023) or from here (previous years).