Instructions:
Prepare a report (including your answers/plots) to be uploaded on Moodle.
The report should be typeset (for lengthy derivations, the solution can be scanned and embedded into the report).
Show all the steps of your work clearly.
Unclear presentation of results will be penalized heavily.
No partial credit will be given for unjustified answers.
Use Matlab or Python for computations.
Return all Matlab/Python code that you wrote in a single file.
Code should be commented, and code for different HW questions should be clearly separated.
The code file should NOT return an error during runtime.
If the code returns an error at any point, the remaining part of your code will not be evaluated (i.e., 0 points).
Question Points Your Score
Q1 25
Q2 25
Q3 25
Q4 25
TOTAL 100
Question 1. [25 points]
Stimuli consisting of face images have been used in a study on visual perception. Experimental stimuli are provided in the file hw4_data1.mat, which contains face images downsampled to a 32 × 32 square grid. The images are stored in a matrix faces, with 1000 rows (number of different images) and 1024 columns (number of image pixels). Answer the questions below.
a) The experimenter would like to fit encoding models between the stimuli (i.e., face images) and the measured neural responses. However, one first needs to define explanatory variables (i.e., regressors) that capture important variations in stimulus properties during the experiments. To accomplish this goal, perform PCA on the 1000 face images. Plot the proportion of variance explained by each individual PC, for the first 100 PCs. Display the first 25 PCs using the function dispImArray.m.
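As a starting point, the PCA step could be sketched in Python roughly as follows, using an SVD of the mean-centered data. This is only a sketch under assumptions: the random matrix stands in for faces (loading hw4_data1.mat and the display step are not shown), and the helper name pca_svd is hypothetical.

```python
import numpy as np

def pca_svd(X):
    """PCA of X (samples x features) via SVD of the mean-centered data.

    Returns the PCs as rows, the proportion of variance explained by
    each PC, and the feature-wise mean that was subtracted.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal components, ordered by singular value.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = s ** 2                      # proportional to the per-PC variance
    return Vt, var / var.sum(), mean

# Placeholder data standing in for the 1000 x 1024 `faces` matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))
pcs, explained, mean = pca_svd(X)
print(explained[:5])                  # proportion of variance, first 5 PCs
```

Each row of pcs can be reshaped to the image grid for display, and explained can be plotted directly against the PC index.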
b) The experimenter would like to know how many PCs are sufficient to obtain a reasonable representation of the stimuli. Reconstruct each image in the matrix faces using its PC projections (i.e., reconstructed images are a linear weighted combination of stimulus PCs). Obtain separate reconstructions based on the first 10, 25, and 50 PCs. Display the original images and the reconstructions using dispImArray.m, for the first 36 images. Find the mean-squared error (MSE) between the original and reconstructed images, and report the mean and standard deviation of MSE across 1000 images. Interpret the results.
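The reconstruction and per-image MSE described above might look like the following in Python. It is a minimal sketch on placeholder data (so the values of k differ from the 10, 25, and 50 requested), and the function name reconstruct is hypothetical.

```python
import numpy as np

def reconstruct(X, k):
    """Reconstruct X (samples x features) from its first k PC projections."""
    mean = X.mean(axis=0)
    Xc = X - mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:k]                         # first k PCs, shape k x features
    return (Xc @ W.T) @ W + mean       # project, then map back and add the mean

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 40))         # placeholder for `faces`
mse_by_k = {}
for k in (10, 25, 40):
    # One MSE value per image (row), averaged over pixels.
    mse_by_k[k] = ((X - reconstruct(X, k)) ** 2).mean(axis=1)
    print(k, mse_by_k[k].mean(), mse_by_k[k].std())
```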
c) Instead of PCA, find explanatory variables to capture stimulus properties using independent component analysis (ICA). Use the FastICA package available on Moodle, and use PCA-based reduction to 50 dimensions during the calls (i.e., by setting lastEig). Note that unlike PCA, ICA results are not deterministic. Call fastica.m separately to return 10 ICs, 25 ICs, and 50 ICs. Display the obtained ICs using dispImArray.m. Reconstruct face images based on their IC projections. Display the original and reconstructed images based on 10, 25, and 50 ICs. Report the mean and std of MSE in each case. Compare your results with part b.
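For Python users, the idea behind this step can be illustrated with a compact NumPy version of symmetric FastICA. This is not the FastICA package from Moodle: it is a simplified sketch with a tanh nonlinearity that whitens directly to k dimensions (rather than reducing to 50 dimensions first), and the name fastica_sym and the toy data are assumptions.

```python
import numpy as np

def fastica_sym(X, k, n_iter=200, seed=0):
    """Symmetric FastICA (tanh nonlinearity) after PCA whitening to k dims.

    X is samples x features. Returns S (samples x k, the IC projections),
    A (k x features, the ICs in pixel space) and the feature-wise mean,
    so that S @ A + mean approximates X.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    mean = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    Z = U[:, :k] * np.sqrt(n)                       # whitened: Z.T @ Z / n = I
    dewhiten = (s[:k, None] / np.sqrt(n)) * Vt[:k]  # maps Z back to centered X
    W = np.linalg.qr(rng.normal(size=(k, k)))[0]    # random orthogonal start
    for _ in range(n_iter):
        G = np.tanh(Z @ W.T)
        # Fixed-point update: E[z g(w'z)] - E[g'(w'z)] w, one row per component.
        W_new = G.T @ Z / n - (1 - G ** 2).mean(axis=0)[:, None] * W
        u, _, vt = np.linalg.svd(W_new)             # symmetric decorrelation
        W = u @ vt
    S = Z @ W.T
    return S, W @ dewhiten, mean

# Toy data: 5 non-Gaussian sources mixed into 20 features.
rng = np.random.default_rng(3)
X = rng.laplace(size=(300, 5)) @ rng.normal(size=(5, 20))
S, A, mean = fastica_sym(X, k=5)
X_hat = S @ A + mean                                # reconstruction from ICs
print(((X - X_hat) ** 2).mean())
```

Because the unmixing matrix is a rotation of the whitened PCA subspace, the reconstruction error of this sketch depends only on the number of retained dimensions; changing seed changes the ICs themselves.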
d) Finally, find explanatory variables to capture stimulus properties using non-negative matrix factorization (NNMF). NNMF requires its input to be strictly positive, so add a single scalar constant to all entries of the matrix faces to satisfy this constraint (but only add the minimum amount required). Call nnmf.m separately to return 10, 25, and 50 MFs. Display the obtained MFs using dispImArray.m. Reconstruct face images based on their MF projections. Display the original and reconstructed images based on 10, 25, and 50 MFs. Report the mean and std of MSE in each case. Compare your results with parts b and c.
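If nnmf.m is not available (e.g., when working in Python), a basic factorization can be sketched with the Lee-Seung multiplicative updates for the squared error. This is a swapped-in technique and not the algorithm inside nnmf.m; the function name nmf_mu, the placeholder data, and the choice to shift the data so its minimum becomes exactly 0 are assumptions.

```python
import numpy as np

def nmf_mu(V, k, n_iter=300, seed=0, eps=1e-9):
    """Factorize non-negative V (n x m) as W (n x k) @ H (k x m).

    Uses multiplicative updates that keep W and H non-negative while
    reducing the squared reconstruction error.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

rng = np.random.default_rng(2)
X = rng.normal(size=(60, 30))          # placeholder for `faces`
offset = -X.min()                      # smallest shift that removes negatives
V = X + offset
W, H = nmf_mu(V, k=5)
X_hat = W @ H - offset                 # undo the shift before computing MSE
mse = ((X - X_hat) ** 2).mean(axis=1)
print(mse.mean(), mse.std())
```

Rows of H play the role of the MFs (in pixel space) and rows of W hold each image's projections.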
Question 2. [25 points]
BOLD responses evoked by two distinct auditory stimulus categories, human speech versus nature sounds, have been recorded across human auditory cortex. The experimental data are provided in hw4_data2.mat. This file contains an array stype, identifying the stimulus category as 1 (speech) or 2 (nature sound), and a matrix vresp, representing the responses to each of 181 stimuli across 1626 voxels in auditory cortex. Answer the questions below.
a) Examine whether the two stimulus categories are discriminable based on the response patterns they elicit across auditory cortex. Using pdist2.m, find the similarity of the response patterns across 181 stimuli. Obtain separate confusion matrices (i.e., similarity estimates) based on Euclidean, Cosine and Correlation distance metrics. Show the confusion matrices for each distance metric with imagesc. Comment on the results. Which distance metric seems to discriminate best among the stimulus categories?
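In Python, the three distance matrices can be computed with plain NumPy, as sketched below. The definitions follow the usual conventions (cosine distance is one minus the cosine of the angle between two patterns, correlation distance is one minus their Pearson correlation); the function name pairwise_distances and the random placeholder for vresp are assumptions.

```python
import numpy as np

def pairwise_distances(R, metric):
    """Distance matrix between the rows of R (stimuli x voxels)."""
    if metric == "euclidean":
        sq = (R ** 2).sum(axis=1)
        d2 = sq[:, None] + sq[None, :] - 2 * R @ R.T
        return np.sqrt(np.maximum(d2, 0))       # clip tiny negative values
    if metric == "correlation":
        # Correlation distance is cosine distance on mean-centered rows.
        R = R - R.mean(axis=1, keepdims=True)
    elif metric != "cosine":
        raise ValueError(f"unknown metric: {metric}")
    Rn = R / np.linalg.norm(R, axis=1, keepdims=True)
    return 1 - Rn @ Rn.T

rng = np.random.default_rng(5)
R = rng.normal(size=(6, 50))                    # placeholder for `vresp`
D_euc = pairwise_distances(R, "euclidean")
D_cos = pairwise_distances(R, "cosine")
D_cor = pairwise_distances(R, "correlation")
print(D_euc.shape, D_cos.shape, D_cor.shape)
```

Sorting the rows by stype before computing the matrices makes any block structure between the two categories easier to see in the image.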
b) It is hard to visualize the stimulus categories in the space of response patterns, due to the sheer number of voxels. Obtain a low-dimensional visualization of the data, using multi-dimensional scaling (MDS, cmdscale.m). Perform MDS analysis with Euclidean, Cosine and Correlation distance metrics. Show scatter plots of the stimuli, where the x-axis represents projections onto the first MDS component, and the y-axis represents projections onto the second MDS component. Based on the information provided in the array stype, use different colors and symbols for the two distinct stimulus categories. Are these categories distinguishable in the MDS space?
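Classical MDS, the method behind cmdscale.m, reduces to an eigendecomposition of the double-centered squared distance matrix. A minimal NumPy sketch is given below; the function name classical_mds and the toy points are assumptions, and plotting is omitted.

```python
import numpy as np

def classical_mds(D, n_dims=2):
    """Classical (Torgerson) MDS of a symmetric distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centered Gram matrix
    evals, evecs = np.linalg.eigh(B)
    order = np.argsort(evals)[::-1][:n_dims]   # largest eigenvalues first
    lam = np.maximum(evals[order], 0)          # guard against small negatives
    return evecs[:, order] * np.sqrt(lam)

# Toy check: 2-D points should be recovered up to rotation/reflection.
rng = np.random.default_rng(6)
P = rng.normal(size=(30, 2))
D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=2)
Y = classical_mds(D, n_dims=2)
D_Y = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=2)
print(np.abs(D - D_Y).max())
```

The two columns of Y give the x- and y-coordinates for the scatter plot. For non-Euclidean inputs such as cosine or correlation distances, some eigenvalues may be negative, which is why they are clipped at zero here.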
c) Since the experimenter used mutually-exclusive stimulus categories, you would like to separate the 181 stimulus instances into two groups based on their response patterns. Run a simple k-means algorithm to separate the stimuli into two clusters. Perform separate cluster analyses based on Euclidean, Cosine and Correlation distance metrics. Show scatter plots of the stimuli in the two-dimensional MDS space that was computed in part b. This time, however, use different colors/symbols to label the data points for the two stimulus clusters returned by the k-means algorithm. Compare the results to those in part b. Are your results consistent with ground truth?
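A simple k-means (Lloyd's algorithm) can be written in a few lines of NumPy, as sketched below. The Euclidean version is standard. For the cosine and correlation metrics, this sketch assumes one common workaround: normalize each row to unit length (after centering it, for correlation) and then run Euclidean k-means, since the squared Euclidean distance between unit vectors equals twice their cosine distance. The names kmeans and unit_rows are hypothetical.

```python
import numpy as np

def unit_rows(X, center=False):
    """Scale rows to unit norm; center each row first for correlation."""
    if center:
        X = X - X.mean(axis=1, keepdims=True)
    return X / np.linalg.norm(X, axis=1, keepdims=True)

def kmeans(X, k=2, n_iter=100, seed=0):
    """Lloyd's algorithm with Euclidean distance; returns labels and centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for it in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        new_labels = d.argmin(axis=1)
        if it > 0 and np.array_equal(new_labels, labels):
            break                                # assignments stopped changing
        labels = new_labels
        for j in range(k):
            if np.any(labels == j):              # leave empty clusters in place
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# Toy data: two well-separated groups standing in for `vresp`.
rng = np.random.default_rng(4)
X = np.vstack([rng.normal(size=(50, 3)) + 10, rng.normal(size=(50, 3)) - 10])
truth = np.repeat([0, 1], 50)
labels, _ = kmeans(X, k=2)                       # Euclidean
labels_cos, _ = kmeans(unit_rows(X), k=2)        # cosine (assumed workaround)
agreement = max(np.mean(labels == truth), np.mean(labels != truth))
print(agreement)
```

Cluster indices are arbitrary, so agreement with stype should be evaluated up to a swap of the two labels, as done for agreement above.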
Question 3. [25 points]
Consider a population of 21 independent neurons with Gaussian-shaped tuning curves:
$$f_i(x) = A\, e^{-(x - \mu_i)^2 / (2\sigma_i^2)} \qquad (1)$$
The tuning curves have an amplitude of $A = 1$ and a standard deviation of $\sigma_i = 1$, with centers $\mu_i$ evenly spaced between -10 and 10 along the x-axis. Answer the questions below.
a) Plot all tuning curves in the population on the same axis. Simulate the population response to the stimulus x = 1, and plot the population response as a function of each neuron's preferred stimulus value.
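The Gaussian tuning curves and the noiseless population response can be evaluated as in the sketch below. The parameter values come from the question text; the function name tuning and the array layout (stimuli x neurons) are assumptions, and plotting is left out.

```python
import numpy as np

A, sigma = 1.0, 1.0
mu = np.linspace(-10, 10, 21)          # preferred stimuli of the 21 neurons

def tuning(x, mu=mu, sigma=sigma, A=A):
    """Mean response of every neuron to stimulus value(s) x.

    Returns an array of shape (number of stimuli, number of neurons).
    """
    x = np.atleast_1d(x)
    return A * np.exp(-(x[:, None] - mu[None, :]) ** 2 / (2 * sigma ** 2))

x_axis = np.linspace(-15, 15, 601)
curves = tuning(x_axis)                # one column per tuning curve
resp = tuning(1.0)[0]                  # population response to x = 1
print(mu[np.argmax(resp)])             # preferred value of the most active neuron
```

Plotting each column of curves against x_axis gives the tuning curves, and plotting resp against mu gives the population response.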
b) Perform a simulated experiment with 200 trials. In each trial, sample a stimulus intensity uniformly from the interval [-5, 5], and simulate the 21-element population response vector $\vec{r}$. Assume that each neuron's response is corrupted by additive Gaussian noise with zero mean and a standard deviation of 1/20. Implement a winner-take-all decoder, and calculate the stimulus estimate $x_{WTA}$ for each trial. Plot the actual and estimated stimulus on the same graph. Compute the mean and standard deviation of error in stimulus estimation across 200 trials.
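A winner-take-all decoder simply reports the preferred stimulus of the most active neuron. The sketch below simulates the trials and applies it; the noise standard deviation of 1/20, the random seed, and the signed error definition (actual minus estimated) are assumptions of this example.

```python
import numpy as np

mu = np.linspace(-10, 10, 21)          # preferred stimuli
noise_sd = 1 / 20                      # assumed noise standard deviation

def tuning(x, sigma=1.0, A=1.0):
    """Mean responses, shape (number of stimuli, number of neurons)."""
    x = np.atleast_1d(x)
    return A * np.exp(-(x[:, None] - mu) ** 2 / (2 * sigma ** 2))

rng = np.random.default_rng(0)
x_true = rng.uniform(-5, 5, size=200)
R = tuning(x_true) + rng.normal(0, noise_sd, size=(200, 21))

# Winner-take-all: preferred stimulus of the neuron with the largest response.
x_wta = mu[np.argmax(R, axis=1)]
err = x_true - x_wta
print(err.mean(), err.std())
```

Because the estimate can only take the 21 preferred values, its resolution is limited by the spacing of the tuning-curve centers.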
c) For the experimental trials simulated in part b, implement a maximum-likelihood decoder, and calculate the stimulus estimate $x_{ML}$ for each trial. Plot the actual and estimated stimulus on the same graph. Compute the mean and standard deviation of error in stimulus estimation across 200 trials. (Hint: To find $x_{ML}$, you can calculate the log-likelihood for the entire stimulus range. Note that the tuning curve gives you the expected value (i.e., mean) of the neural response, and there is additional variability due to additive noise that you need to consider.)
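Following the hint, one way to find the maximum-likelihood estimate is a grid search. With independent Gaussian noise of equal variance, the log-likelihood of a candidate stimulus is, up to a constant, the negative sum of squared differences between the observed responses and the tuning-curve means, divided by twice the noise variance. The sketch below assumes a grid step of 0.01, a noise standard deviation of 1/20, and the hypothetical helper ml_decode.

```python
import numpy as np

mu = np.linspace(-10, 10, 21)
noise_sd = 1 / 20                      # assumed noise standard deviation

def tuning(x, sigma=1.0, A=1.0):
    """Mean responses, shape (number of stimuli, number of neurons)."""
    x = np.atleast_1d(x)
    return A * np.exp(-(x[:, None] - mu) ** 2 / (2 * sigma ** 2))

def ml_decode(R, grid):
    """Grid-search ML estimate for each row of R (trials x neurons)."""
    F = tuning(grid)                                   # grid x neurons
    sse = ((R[:, None, :] - F[None, :, :]) ** 2).sum(axis=2)
    loglik = -sse / (2 * noise_sd ** 2)                # up to a constant
    return grid[np.argmax(loglik, axis=1)]

rng = np.random.default_rng(0)
x_true = rng.uniform(-5, 5, size=200)
R = tuning(x_true) + rng.normal(0, noise_sd, size=(200, 21))
grid = np.linspace(-5, 5, 1001)                        # candidate stimuli
x_ml = ml_decode(R, grid)
err = x_true - x_ml
print(err.mean(), err.std())
```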
d) For the experimental trials simulated in part b, implement a maximum-a-posteriori decoder, and calculate the stimulus estimate $x_{MAP}$ for each trial. Assume that the prior of the stimulus value x follows a Gaussian distribution with a mean of 0 and a standard deviation of 2.5. Plot the actual and estimated stimulus on the same graph. Compute the mean and standard deviation of error in stimulus estimation across 200 trials. Interpret your results.
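The MAP estimate maximizes the log-posterior, which is the log-likelihood plus the log-prior. For a zero-mean Gaussian prior with standard deviation 2.5, the log-prior is $-x^2 / (2 \cdot 2.5^2)$ up to a constant. The grid-search sketch below makes the prior optional so that the same hypothetical function, decode, returns either the ML or the MAP estimate; the noise standard deviation of 1/20 and the grid are assumptions.

```python
import numpy as np

mu = np.linspace(-10, 10, 21)
noise_sd = 1 / 20                      # assumed noise standard deviation

def tuning(x, sigma=1.0, A=1.0):
    """Mean responses, shape (number of stimuli, number of neurons)."""
    x = np.atleast_1d(x)
    return A * np.exp(-(x[:, None] - mu) ** 2 / (2 * sigma ** 2))

def decode(R, grid, prior_sd=None):
    """Grid-search decoder: ML if prior_sd is None, otherwise MAP."""
    F = tuning(grid)
    sse = ((R[:, None, :] - F[None, :, :]) ** 2).sum(axis=2)
    log_post = -sse / (2 * noise_sd ** 2)              # log-likelihood
    if prior_sd is not None:
        log_post = log_post - grid ** 2 / (2 * prior_sd ** 2)   # add log-prior
    return grid[np.argmax(log_post, axis=1)]

rng = np.random.default_rng(0)
x_true = rng.uniform(-5, 5, size=200)
R = tuning(x_true) + rng.normal(0, noise_sd, size=(200, 21))
grid = np.linspace(-5, 5, 1001)
x_ml = decode(R, grid)
x_map = decode(R, grid, prior_sd=2.5)
err = x_true - x_map
print(err.mean(), err.std())
```

Since the prior favors values near zero, each MAP estimate on this grid is never farther from zero than the corresponding ML estimate.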
e) Perform an experiment with 200 trials of stimulus intensity. In each trial, sample a stimulus intensity from the interval [-5, 5]. For the resulting stimulus vector (of length 200), separately simulate the population response vectors ($\vec{r}$) for $\sigma_i = 0.1$, $\sigma_i = 0.2$, $\sigma_i = 0.5$, $\sigma_i = 1$, $\sigma_i = 2$, and $\sigma_i = 5$. In each case, assume additive Gaussian noise with zero mean and 1/20 standard deviation. Calculate MLE estimates of the stimulus $x_{ML}$ based on each population response separately. Compare the mean and standard deviation of error in stimulus estimation for the various $\sigma_i$ values. Is it better to have narrow or wide tuning curves?
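The comparison across tuning widths can reuse a grid-search ML decoder in a loop, as in the sketch below. Uniform sampling of the stimulus, the grid step of 0.01, the random seed, and the helper name ml_error_stats are assumptions of this example.

```python
import numpy as np

mu = np.linspace(-10, 10, 21)
noise_sd = 1 / 20

def tuning(x, sigma, A=1.0):
    """Mean responses, shape (number of stimuli, number of neurons)."""
    x = np.atleast_1d(x)
    return A * np.exp(-(x[:, None] - mu) ** 2 / (2 * sigma ** 2))

def ml_error_stats(sigma, x_true, grid, rng):
    """Simulate noisy responses for one width and return error mean and std."""
    R = tuning(x_true, sigma) + rng.normal(0, noise_sd, size=(len(x_true), len(mu)))
    F = tuning(grid, sigma)
    # Minimizing the squared error maximizes the Gaussian log-likelihood.
    sse = ((R[:, None, :] - F[None, :, :]) ** 2).sum(axis=2)
    err = x_true - grid[np.argmin(sse, axis=1)]
    return err.mean(), err.std()

rng = np.random.default_rng(0)
x_true = rng.uniform(-5, 5, size=200)          # uniform sampling assumed
grid = np.linspace(-5, 5, 1001)
stats = {s: ml_error_stats(s, x_true, grid, rng) for s in (0.1, 0.2, 0.5, 1, 2, 5)}
for s, (m, sd) in stats.items():
    print(s, m, sd)
```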
Question 4. [25 points]
Response patterns elicited by 'face' and 'building' stimuli have been recorded across 1626 voxels in ventral-temporal cortex. The experimental data are provided in hw4_data3.mat. This file contains an array stype, identifying the category of each stimulus as 1 (face) or 2 (building), and a matrix vresp, representing the responses to 181 stimuli across 1626 voxels. Answer the questions below.
a) Perform multi-dimensional scaling analysis on vresp with a Euclidean distance metric. Build an LDA-based classifier using a two-dimensional MDS representation of the data. Measure classification accuracy using leave-one-out cross validation. Show a scatter plot of vresp in MDS space, labeling responses for the two stimulus categories with different colors. On the same figure, plot the decision boundary of the classifier with a line.
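A two-class LDA classifier with leave-one-out cross validation can be sketched in NumPy as follows. The sketch assumes a shared (pooled) covariance and class priors estimated from the training labels; the function names and the toy data standing in for the two-dimensional MDS coordinates are hypothetical.

```python
import numpy as np

def lda_fit(X, y):
    """Two-class LDA (labels 1 and 2). Class 1 is predicted where X @ w + b > 0."""
    X1, X2 = X[y == 1], X[y == 2]
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    # Pooled within-class covariance.
    Sw = ((X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)) / (len(X) - 2)
    w = np.linalg.solve(Sw, m1 - m2)
    b = -0.5 * w @ (m1 + m2) + np.log(len(X1) / len(X2))
    return w, b

def lda_predict(X, w, b):
    return np.where(X @ w + b > 0, 1, 2)

def loo_accuracy(X, y):
    """Leave-one-out accuracy: refit without sample i, then test on it."""
    hits = 0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        w, b = lda_fit(X[keep], y[keep])
        hits += int(lda_predict(X[i:i + 1], w, b)[0] == y[i])
    return hits / len(y)

# Toy 2-D data standing in for the MDS coordinates of `vresp`.
rng = np.random.default_rng(7)
X = np.vstack([rng.normal(size=(60, 2)) + [3, 0], rng.normal(size=(60, 2)) - [3, 0]])
y = np.repeat([1, 2], 60)
acc = loo_accuracy(X, y)
w, b = lda_fit(X, y)                   # boundary: w[0]*x + w[1]*y + b = 0
print(acc)
```

In two dimensions the decision boundary is the line w[0]*x + w[1]*y + b = 0, which can be drawn over the scatter plot by solving for y across the range of x (provided w[1] is nonzero).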
b) Perform multi-dimensional scaling analysis on vresp with a correlation distance metric. Build an LDA-based classifier using a two-dimensional MDS representation of the data. Measure classification accuracy using leave-one-out cross validation. Show a scatter plot of vresp in MDS space, labeling responses for the two stimulus categories with different colors. On the same figure, plot the decision boundary of the classifier with a line. Compare your results to part a.
c) Perform multi-dimensional scaling analysis on vresp with a correlation distance metric. Build separate LDA-based classifiers using MDS representations in [1:1:5] dimensions. Measure classification accuracy using leave-one-out cross validation. Show a bar plot of classification accuracy as a function of the number of MDS dimensions. During display, zoom in on the appropriate range of classification accuracies to clearly show differences between classifiers (e.g., [95 100]). How many dimensions are sufficient?