Module 7 of E0 259, Data Analytics, August 2017

Population history of India data set

Lectures

Lecture 1 (Ramesh Hariharan)
Lecture 2 (Ramesh Hariharan)
Material on PCA (Ramesh Hariharan)

Data set

We do not have permission to make this data set public. Registered students will get a link to download the data.

Assignment 7

Due: 23:55 hrs, Thursday 16 November 2017. Discussion is encouraged, but write your own code. Please comply with the ethics policy.

Instructions: Question 1 below is exploratory. Question 2 requires a written argument in addition to programming. Please prepare a single PDF file of your responses (to Q1 and Q2) and include it in the zipped directory that contains your code. Submit the zipped directory.

Please read the README (txt) file for more information on how the data is arranged.

1. Explore the pc-subsets.csv file in the Avadis software (also available in the folder) or any other tool you like, and tabulate any observations about the data that were not already made in class.

2. Recall the notion of the Fixation Index. Argue that the estimator covered in class is close to being an unbiased estimator of the Fixation Index. Then compute this index for all pairs of Indian groups and report the average value over these pairs.
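
For orientation on Question 2, here is a minimal sketch of one common pairwise estimator of the Fixation Index, the Hudson-style ratio-of-sums form. It may not be the estimator covered in class; substitute that one if it differs. The input layout (per-group lists of (allele count, total alleles sampled) for each SNP) and the names hudson_fst and average_pairwise_fst are assumptions made for illustration, not the format described in the README.

```python
from itertools import combinations


def hudson_fst(counts_a, counts_b):
    """Hudson-style F_ST between two groups (illustrative sketch).

    counts_a, counts_b: per-SNP lists of (allele_count, total_alleles),
    a hypothetical input layout, not the one from the course data.
    """
    num = 0.0
    den = 0.0
    for (a1, n1), (a2, n2) in zip(counts_a, counts_b):
        if n1 < 2 or n2 < 2:
            continue  # the sampling correction needs at least 2 alleles
        p1 = a1 / n1
        p2 = a2 / n2
        # Under binomial sampling, p(1-p)/(n-1) is unbiased for Var(p_hat),
        # so this numerator is unbiased for (p1 - p2)^2.
        num += (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
        # With independent samples, this is unbiased for p1(1-p2) + p2(1-p1).
        den += p1 * (1 - p2) + p2 * (1 - p1)
    if den == 0:
        return float("nan")  # no SNP differs between or within the groups
    # A ratio of unbiased sums is only approximately unbiased.
    return num / den


def average_pairwise_fst(groups):
    """Return ({(g, h): F_ST}, mean F_ST) over all unordered group pairs."""
    values = {
        (g, h): hudson_fst(groups[g], groups[h])
        for g, h in combinations(sorted(groups), 2)
    }
    mean = sum(values.values()) / len(values)
    return values, mean
```

Calling average_pairwise_fst with a dictionary that maps group names to such per-SNP count lists returns the per-pair values and their mean. Summing the numerator and denominator over SNPs before dividing, rather than averaging per-SNP ratios, is a design choice that keeps low-diversity SNPs from dominating the estimate.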