Module 6 of E0 259, Data Analytics, August 2019

Visual neuroscience data set

Lectures

Lecture 1 (Rajesh Sundaresan)
Lecture 2 (Rajesh Sundaresan)
Lecture 3 (Rajesh Sundaresan)

Data set (Thanks to Arun Sripati and Carl Olson.)

Search times data (.csv file):
This file contains search times for certain image pairs. See slide 10/22 of Lecture 1 of this module.
a) There are four sets of data. Each set has six groups of experiments. Each group is for a fixed oddball image against its distracting pair.
b) Column A contains the search times of six individuals for the "Set 1 - Colour" pair on slide 10/22 of Lecture 1, with the left image on that slide (red on top of green) as the oddball. This column is labeled Oddball L. Column B contains the search times when the oddball image is the one on the right (Oddball R, green on top). Each individual was presented this oddball test 12 times in random order. The data is similarly arranged for columns C through R (Sets 1 through 3).
c) Set 4 gives data for a 'compound search'. Consider column S. This corresponds to the 'Far' pair on slide 10/22. The oddball image is one of the worms, while the distracter is one of the bugs. The worm is either the picture on the left (quarter circles on left bottom and right top) or its flip around the horizontal line (quarter circles on left top and right bottom). The bug is either the picture on the right or its flip around the horizontal line. One of the two worms is picked at random as the oddball, and one of the two bugs is picked at random as the distracter.

Firing rate data (.csv file):
This file contains data on the firing rates of neurons.
a) Each set has a different number of neurons whose firing rates were recorded.
b) Consider columns A,B. Column A records the average firing rates of 174 neurons on "Set 1 - Colour - left image with red on top", while Column B refers to "Set 1 - Colour - right image with green on top".
c) Column S records the average firing rates of the appropriate number of neurons on "Set 4 - Far - Worm" with the worm as shown on slide 10/22 of Lecture 1. Column T is for "Set 4 - Far - Worm flipped", that is, the worm flipped around the horizontal line.

Assignment 6

Due: Skipped for year 2019. Discussion is encouraged, but write your own code. Please comply with the ethics policy.

1. For each image pair on which the behavioural tests were conducted, compute the average search delay, the relative entropy "distance" per neuron, and the L1 distance between the firing rates per neuron. When computing average search delays, remember to subtract the baseline reaction time of 328 ms. Remember also to treat the compound searches appropriately. Finally, use the relative entropy estimates suggested in class to reduce estimation bias.
(i) With relative entropy estimates on the x-axis and inverse of search delays on the y-axis, find the best straight line passing through the origin that explains the observations.
(ii) Now, with L1 distance on the x-axis and inverse of search delays on the y-axis, find the best straight line passing through the origin. Which of the two is better?
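
The quantities in question 1 can be sketched as follows. This is a minimal illustration, not the method from class: the function names are hypothetical, the search times are assumed to be in milliseconds, and the relative entropy shown is the plain expression for independent Poisson firing with strictly positive rates. It does not include the bias-reducing estimate suggested in class, and it does not handle the compound searches of Set 4.

```python
import math
from statistics import fmean

BASELINE_MS = 328.0  # baseline reaction time stated in the assignment


def mean_search_delay(search_times_ms):
    """Average search time minus the baseline (assumes milliseconds)."""
    return fmean(search_times_ms) - BASELINE_MS


def l1_per_neuron(rates_a, rates_b):
    """Mean absolute difference between paired firing rates."""
    return fmean([abs(a - b) for a, b in zip(rates_a, rates_b)])


def poisson_relative_entropy_per_neuron(rates_a, rates_b):
    """Plain (not bias-corrected) relative entropy per neuron.

    ASSUMPTION: independent Poisson firing and strictly positive rates.
    The direction (a relative to b) is the caller's choice.
    """
    terms = [a * math.log(a / b) - a + b for a, b in zip(rates_a, rates_b)]
    return fmean(terms)


def fit_through_origin(x, y):
    """Least-squares slope of y = slope * x, plus residual sum of squares."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    slope = sxy / sxx
    rss = sum((yi - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, rss
```

One way to compare (i) and (ii) is to call fit_through_origin once with the relative entropy values and once with the L1 distances as x, with the inverse search delays as y in both cases, and then compare the residuals. Since the two x-axes have different scales, a scale-free measure such as the fraction of variation explained may be more informative than the raw residual sum of squares.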

2. Compute and compare the AM/GM measure of spread for the products "search delay x relative entropy" and "search delay x L1 distance".
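
A minimal sketch of the AM/GM measure, assuming it is taken as the ratio of the arithmetic mean to the geometric mean of strictly positive values (the function name is hypothetical). The ratio is at least 1 and equals 1 only when all values are identical, so a value closer to 1 indicates that the products are closer to constant.

```python
import math
from statistics import fmean


def am_gm_ratio(values):
    """Arithmetic mean divided by geometric mean of positive values."""
    vals = list(values)
    if any(v <= 0 for v in vals):
        raise ValueError("AM/GM ratio needs strictly positive values")
    arithmetic_mean = fmean(vals)
    # Geometric mean computed through logs to avoid overflow in the product.
    geometric_mean = math.exp(fmean([math.log(v) for v in vals]))
    return arithmetic_mean / geometric_mean
```

For question 2, the input would be the list of products (search delay times relative entropy, or search delay times L1 distance), one product per image pair.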

3. In this part, we will try to fit a Gamma distribution to the search delays. The Gamma distribution Gamma(a,b) has two parameters (shape a and rate b). Its standard deviation to mean ratio is not arbitrary, but is tied to the shape parameter a: the ratio equals 1/sqrt(a). If the shape parameter is 1, we get the exponential distribution. If the shape parameter is 2, we get a random variable that is the sum of two independent exponential random variables with the same rate parameter, and so on. In general, the shape parameter need not be an integer.
(i) Randomly select half the number of groups. Plot the standard deviation of the search times against their means. Estimate the shape parameter.
(ii) On each of the groups that did not contribute to the shape parameter estimation, randomly select one half of the samples and estimate the rate parameter. Justify your estimation procedure.
(iii) On each of the groups of (ii), taking the samples that did not contribute to the rate parameter estimation, plot the empirical cumulative distribution function (cdf). On the same figure, plot the Gamma(a,b) cdf where a is the shape parameter estimated in (i) and b is the rate parameter estimated in (ii) on this group. Output the so-called Kolmogorov-Smirnov statistic which is the max separation between the two cdfs. Comment on the outputs, and in particular, tell us if the search delays obey the Gamma distribution (with some indication on the confidence level). You are welcome to use a statistical package.
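
The three steps of question 3 can be sketched as below. This is one possible approach under stated assumptions, not a prescribed solution; all function names are hypothetical. The shape is read off the slope of a least-squares line through the origin (standard deviation against mean), using the fact that Gamma(a,b) has standard deviation to mean ratio 1/sqrt(a). The rate uses the fact that, with the shape held fixed, the maximum likelihood estimate is a divided by the sample mean. The Kolmogorov-Smirnov statistic is computed directly from the sorted samples against any model cdf passed in as a function.

```python
from statistics import fmean


def estimate_shape(group_means, group_stds):
    """Shape a from the slope s of std = s * mean, using a = 1 / s**2."""
    sxy = sum(m * s for m, s in zip(group_means, group_stds))
    sxx = sum(m * m for m in group_means)
    slope = sxy / sxx
    return 1.0 / slope ** 2


def estimate_rate(samples, shape):
    """Rate b with the shape fixed: b = shape / sample mean."""
    return shape / fmean(samples)


def ks_statistic(samples, cdf):
    """Largest gap between the empirical cdf and the model cdf.

    The empirical cdf jumps at each sorted sample, so the gap is checked
    just before and just after every jump.
    """
    xs = sorted(samples)
    n = len(xs)
    largest_gap = 0.0
    for i, x in enumerate(xs):
        model_value = cdf(x)
        gap_after_jump = (i + 1) / n - model_value
        gap_before_jump = model_value - i / n
        largest_gap = max(largest_gap, gap_after_jump, gap_before_jump)
    return largest_gap
```

If SciPy is available, scipy.stats.gamma(a, scale=1/b).cdf can serve as the model cdf, and scipy.stats.kstest(samples, that_cdf) returns both the statistic and a p-value. The p-value should be read with some care, since the parameters were themselves estimated from data.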

4. (Theory question): In the third lecture, we proposed a lower bound on the expected time to stop under hypothesis h = H0 for the problem of detecting a single image. Go back to the initial experiment of search for the oddball. What is the lower bound on the expected search time, given image 0 is the oddball at location ℓ (hypothesis h = (ℓ,0,1)), when the subject can control where to look at the beginning of each time slot?