
Computational Statistics Homework 2 Solution



This homework is due on Feb. 7, 2019.




Please write your team members' names if you collaborate.







 
Medical image estimation.




Suppose $x_i$, $i = 1, \ldots, n$, are independent Poisson random variables with

$$P(x_i = k) = \frac{e^{-\lambda_i} \lambda_i^k}{k!},$$

with unknown means $\lambda_i$. The variables $x_i$ represent the number of times that one of $n$ possible independent events occurs during a certain period. In emission tomography, they may represent the number of photons emitted by $n$ sources.




We consider an experiment designed to determine the means $\lambda_i$. The experiment involves $m$ detectors. If event $i$ occurs, it is detected by detector $j$ with probability $p_{ji}$. We assume the probabilities $p_{ji}$ are given (with $p_{ji} \ge 0$ and $\sum_{j=1}^m p_{ji} \le 1$). The total number of events recorded by detector $j$ is denoted by $y_j$,

$$y_j = \sum_{i=1}^n y_{ji}, \qquad j = 1, \ldots, m.$$













Formulate the maximum likelihood estimation problem of estimating the means $\lambda_i$, based on observed values of $y_j$, $j = 1, \ldots, m$. Does the likelihood function have a unique maximizer? (Hint: the variables $y_{ji}$ have Poisson distributions with means $p_{ji} \lambda_i$. The sum of $n$ independent Poisson variables with means $\lambda_1, \ldots, \lambda_n$ has a Poisson distribution with mean $\lambda_1 + \cdots + \lambda_n$.)
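The hint implies that each detector count is Poisson with mean equal to the sum over sources of the detection probability times the source mean, so the log-likelihood can be evaluated directly. The following is a minimal Python sketch under that reading, assuming the counts are independent across detectors and that the detection probabilities are stored row-wise in a nested list `P`, with `P[j][i]` standing for the probability that event i is seen by detector j. The function name and the toy numbers are illustrative and not part of the assignment.

```python
import math

def poisson_loglik(lam, P, y):
    """Log-likelihood of detector counts y given source means lam.

    Assumes y[j] ~ Poisson(sum_i P[j][i] * lam[i]), independently over j.
    """
    total = 0.0
    for p_row, y_j in zip(P, y):
        mean_j = sum(p * l for p, l in zip(p_row, lam))
        # log of exp(-mean) * mean**y / y!, using lgamma(y + 1) = log(y!)
        total += y_j * math.log(mean_j) - mean_j - math.lgamma(y_j + 1)
    return total

# Toy example (made-up numbers): m = 2 detectors, n = 2 sources.
P = [[0.5, 0.2],
     [0.3, 0.6]]
y = [4, 7]
ll_near = poisson_loglik([5.0, 8.0], P, y)    # detector means close to y
ll_far = poisson_loglik([50.0, 80.0], P, y)   # detector means far from y
```

Maximizing this function over nonnegative `lam` is one way to state the estimation problem numerically; the term `lgamma(y_j + 1)` does not depend on `lam` and could be dropped for optimization.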




 
Logistic regression.




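Given $n$ observations $(x_i, y_i)$, $i = 1, \ldots, n$, with $x_i \in \mathbb{R}^p$ and $y_i \in \{0, 1\}$, and parameters $a \in \mathbb{R}^p$ and $b \in \mathbb{R}$, consider the log-likelihood function for logistic regression:

$$\ell(a, b) = \sum_{i=1}^n \left\{ y_i \log h(x_i; a, b) + (1 - y_i) \log\bigl(1 - h(x_i; a, b)\bigr) \right\}$$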




 
(a) Derive the Hessian $H$ of this function and show that $H$ is negative semidefinite (this implies that $\ell$ is concave and has no local maxima other than the global one).
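A derivation can be sanity-checked numerically. The sketch below assumes the usual logistic model, in which h(x; a, b) is the sigmoid of a^T x + b (the text does not define h, so this is an assumption), and builds the Hessian with respect to (a, b) as minus the sum of h_i (1 - h_i) z_i z_i^T, where z_i is x_i with a 1 appended. It then evaluates the quadratic form v^T H v for random vectors v on made-up data; all names are illustrative.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def hessian(X, a, b):
    """Hessian of the logistic log-likelihood with respect to (a, b).

    Assumes h(x; a, b) = sigmoid(a^T x + b), which gives
    H = -sum_i h_i (1 - h_i) z_i z_i^T with z_i = (x_i, 1).
    """
    d = len(a) + 1
    H = [[0.0] * d for _ in range(d)]
    for x in X:
        z = list(x) + [1.0]
        h = sigmoid(sum(ak * xk for ak, xk in zip(a, x)) + b)
        w = h * (1.0 - h)
        for r in range(d):
            for c in range(d):
                H[r][c] -= w * z[r] * z[c]
    return H

def quad_form(H, v):
    n = len(v)
    return sum(v[r] * H[r][c] * v[c] for r in range(n) for c in range(n))

# Made-up data: 20 points in R^2.
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(20)]
H = hessian(X, a=[0.3, -0.7], b=0.1)

# Largest value of v^T H v over 200 random directions; it should not be positive.
worst = max(
    quad_form(H, [random.gauss(0, 1) for _ in range(3)]) for _ in range(200)
)
```

A random-direction check like this is only evidence, not a proof; the proof follows from writing v^T H v as a sum of nonpositive terms.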




 
(b) Use the data files logit-x.dat and logit-y.dat from Tsquare, which contain the predictors $x_i \in \mathbb{R}^2$ and responses $y_i \in \{0, 1\}$, respectively, for the logistic regression problem. Implement Newton's method for optimizing $\ell(a, b)$ and apply it to fit a logistic regression model to the data. Initialize Newton's method with $a = 0$, $b = 0$. Plot the value of the log-likelihood function versus iterations. (You may use load logit-x.dat to load the data.)




What are the coefficients $a$ and $b$ from your fit?
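The Newton iteration for this problem can be sketched in plain Python as follows. This is a sketch under stated assumptions, not the reference solution: it assumes h is the sigmoid of a^T x + b, it stacks (a, b) into one vector `theta` by appending a constant 1 to every x_i, and it runs on synthetic data because logit-x.dat and logit-y.dat are not reproduced here. The helper names (`solve`, `newton`, `loglik`) and the `step` parameter are illustrative.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loglik(X, y, theta):
    total = 0.0
    for x, yi in zip(X, y):
        h = sigmoid(sum(t * v for t, v in zip(theta, x)))
        total += yi * math.log(h) + (1 - yi) * math.log(1.0 - h)
    return total

def solve(A, b):
    """Solve A s = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    s = [0.0] * n
    for r in reversed(range(n)):
        s[r] = (M[r][n] - sum(M[r][k] * s[k] for k in range(r + 1, n))) / M[r][r]
    return s

def newton(X, y, step=1.0, iters=15):
    """Newton ascent on the log-likelihood; step=1.0 is the pure Newton update."""
    d = len(X[0])
    theta = [0.0] * d                      # a = 0, b = 0
    history = [loglik(X, y, theta)]
    for _ in range(iters):
        grad = [0.0] * d
        neg_hess = [[0.0] * d for _ in range(d)]
        for x, yi in zip(X, y):
            h = sigmoid(sum(t * v for t, v in zip(theta, x)))
            for r in range(d):
                grad[r] += (yi - h) * x[r]
                for c in range(d):
                    neg_hess[r][c] += h * (1.0 - h) * x[r] * x[c]
        delta = solve(neg_hess, grad)      # Newton direction: (-H)^{-1} grad
        theta = [t + step * dl for t, dl in zip(theta, delta)]
        history.append(loglik(X, y, theta))
    return theta, history

# Synthetic stand-in for the course data (NOT logit-x.dat / logit-y.dat).
random.seed(1)
X, y = [], []
for _ in range(200):
    x1, x2 = random.gauss(0, 1), random.gauss(0, 1)
    p = sigmoid(1.0 * x1 - 2.0 * x2 + 0.5)
    X.append([x1, x2, 1.0])                # trailing 1 carries the intercept b
    y.append(1 if random.random() < p else 0)

theta, history = newton(X, y)              # theta = [a_1, a_2, b]
```

The list `history` holds the log-likelihood after each iteration and is what would be plotted against the iteration index. Calling `newton` with other values of `step` is one way to explore how the step size affects convergence.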







 
(c) Find a step size that gives you convergence, and another, larger step size for which your algorithm diverges.




 
Locally weighted linear regression.




Consider a linear regression problem in which we want to weight different training examples differently. Specifically, suppose we want to minimize

$$J(\theta) = \frac{1}{2} \sum_{i=1}^n w_i \left(\theta^T x_i - y_i\right)^2.$$




In class, we have worked out what happens for the case where all the weights are the same. In this problem, we will generalize some of those ideas to the weighted setting, and also implement the locally weighted linear regression algorithm.




(a) Show that $J(\theta)$ can also be written as

$$J(\theta) = (X\theta - y)^T W (X\theta - y)$$

for an appropriate diagonal matrix $W$, matrix $X$ and vector $y$. State clearly what these matrices and vectors are.
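One candidate answer can be checked numerically before writing it up. The sketch below assumes that the rows of X are the x_i^T, that y stacks the y_i, and that W is diagonal with entries w_i / 2 (the factor 1/2 from the sum is absorbed into W). The numbers are made up and the function names are illustrative.

```python
def J_sum(theta, X, y, w):
    """J(theta) = 1/2 * sum_i w_i (theta^T x_i - y_i)^2."""
    return 0.5 * sum(
        wi * (sum(t * v for t, v in zip(theta, xi)) - yi) ** 2
        for xi, yi, wi in zip(X, y, w)
    )

def J_matrix(theta, X, y, w):
    """(X theta - y)^T W (X theta - y) with the assumed W = diag(w_i / 2)."""
    r = [sum(t * v for t, v in zip(theta, xi)) - yi for xi, yi in zip(X, y)]
    n = len(r)
    W = [[w[i] / 2.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    Wr = [sum(W[i][j] * r[j] for j in range(n)) for i in range(n)]
    return sum(r[i] * Wr[i] for i in range(n))

# Toy values: three examples, intercept column plus one feature.
X = [[1.0, 2.0], [1.0, -1.0], [1.0, 0.5]]
y = [1.0, 0.0, 2.0]
w = [0.5, 2.0, 1.0]
theta = [0.3, -0.2]

gap = abs(J_sum(theta, X, y, w) - J_matrix(theta, X, y, w))
```

If the two expressions agree up to rounding for several choices of `theta`, the proposed W, X and y are at least consistent with the sum form.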




 
(b) Suppose we have samples $(x_i, y_i)$, $i = 1, \ldots, n$, of $n$ independent examples, but in which the $y_i$'s were observed with different variances, and

$$p(y_i \mid x_i; \theta) = \frac{1}{\sqrt{2\pi}\,\sigma_i} \exp\left(-\frac{(y_i - \theta^T x_i)^2}{2\sigma_i^2}\right),$$

i.e. $y_i$ has mean $\theta^T x_i$ and variance $\sigma_i^2$ (where the $\sigma_i^2$ are fixed, known constants). Show that finding the maximum likelihood estimate of $\theta$ reduces to solving a weighted linear regression problem. State clearly what the $w_i$'s are in terms of the $\sigma_i^2$'s.
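A quick numerical check can support the reduction. The sketch below assumes the Gaussian density with mean theta^T x_i and standard deviation sigma_i, and tries the candidate weights w_i = 1 / sigma_i^2 in the weighted cost J(theta) = 1/2 * sum_i w_i (theta^T x_i - y_i)^2. If the negative log-likelihood and J differ only by a constant that does not depend on theta, the two problems have the same minimizer. Data values and names are made up for illustration.

```python
import math

def neg_loglik(theta, X, y, sigma):
    """Negative Gaussian log-likelihood with per-example standard deviations."""
    total = 0.0
    for xi, yi, si in zip(X, y, sigma):
        resid = yi - sum(t * v for t, v in zip(theta, xi))
        total += resid ** 2 / (2.0 * si ** 2) + math.log(math.sqrt(2.0 * math.pi) * si)
    return total

def J(theta, X, y, w):
    """Weighted least-squares cost 1/2 * sum_i w_i (theta^T x_i - y_i)^2."""
    return 0.5 * sum(
        wi * (sum(t * v for t, v in zip(theta, xi)) - yi) ** 2
        for xi, yi, wi in zip(X, y, w)
    )

X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [0.1, 0.9, 2.3]
sigma = [0.5, 1.0, 2.0]
w = [1.0 / s ** 2 for s in sigma]          # candidate weights (assumption to verify)

# The gap between the two objectives at two unrelated parameter vectors.
gap_1 = neg_loglik([0.0, 1.0], X, y, sigma) - J([0.0, 1.0], X, y, w)
gap_2 = neg_loglik([3.0, -2.0], X, y, sigma) - J([3.0, -2.0], X, y, w)
```

Equal gaps at different parameter vectors indicate that the difference is the theta-independent normalization term.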




 
(c) Use the data files rx.dat and ry.dat, which contain the predictors $x_i$ and responses $y_i$, respectively, for our problem. Implement gradient descent for the (unweighted) linear regression that we derived in class on this dataset, and plot on the same figure the data and the straight line resulting from your fit. (Remember to include the intercept term.)
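A minimal sketch of batch gradient descent for a line with an intercept is given below. It runs on synthetic data because rx.dat and ry.dat are not reproduced here, and it assumes the cost is the mean squared error divided by two; the scaling used in class may differ. The learning rate, iteration count and names are illustrative choices, and plotting is left out to keep the sketch dependency-free.

```python
import random

def gradient_descent(xs, ys, lr=0.05, iters=2000):
    """Batch gradient descent on J = 1/(2n) * sum_i (t0 + t1 * x_i - y_i)^2."""
    t0, t1 = 0.0, 0.0                      # intercept and slope
    n = len(xs)
    for _ in range(iters):
        resid = [t0 + t1 * x - y for x, y in zip(xs, ys)]
        g0 = sum(resid) / n                             # dJ/dt0
        g1 = sum(r * x for r, x in zip(resid, xs)) / n  # dJ/dt1
        t0 -= lr * g0
        t1 -= lr * g1
    return t0, t1

# Synthetic stand-in for the course data: y = 2 + 0.5 x plus small noise.
random.seed(0)
xs = [random.uniform(-3, 3) for _ in range(100)]
ys = [2.0 + 0.5 * x + random.gauss(0, 0.1) for x in xs]

t0, t1 = gradient_descent(xs, ys)
```

The fitted line is `t0 + t1 * x`; on the real data it would be drawn over a scatter plot of the points.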




 
(d) Implement locally weighted linear regression on this dataset, using gradient descent, and plot on the same figure the data and the line resulting from your fit. Use the following weights:

$$w_i = \exp\left(-x_i^2 / 20\right).$$

Plot $J(\theta)$ versus iterations.
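The weighted variant can be sketched the same way, recording the cost at every iteration so that it can be plotted. The sketch assumes the weights are exp(-x_i^2 / 20) (with the minus sign restored), uses the cost J(theta) = 1/2 * sum_i w_i (theta^T x_i - y_i)^2 without further normalization, and again runs on synthetic data rather than rx.dat and ry.dat. The learning rate is an illustrative value chosen small enough for this toy data.

```python
import math
import random

def cost(t0, t1, xs, ys, ws):
    """J(theta) = 1/2 * sum_i w_i (t0 + t1 * x_i - y_i)^2."""
    return 0.5 * sum(w * (t0 + t1 * x - y) ** 2 for x, y, w in zip(xs, ys, ws))

def weighted_gd(xs, ys, ws, lr=0.002, iters=300):
    """Gradient descent on the weighted cost; returns the fit and the cost history."""
    t0, t1 = 0.0, 0.0
    history = [cost(t0, t1, xs, ys, ws)]
    for _ in range(iters):
        resid = [t0 + t1 * x - y for x, y in zip(xs, ys)]
        g0 = sum(w * r for w, r in zip(ws, resid))
        g1 = sum(w * r * x for w, r, x in zip(ws, resid, xs))
        t0 -= lr * g0
        t1 -= lr * g1
        history.append(cost(t0, t1, xs, ys, ws))
    return (t0, t1), history

# Synthetic stand-in for the course data.
random.seed(0)
xs = [random.uniform(-3, 3) for _ in range(100)]
ys = [2.0 + 0.5 * x + random.gauss(0, 0.1) for x in xs]
ws = [math.exp(-x ** 2 / 20.0) for x in xs]   # weights as read from the problem statement

theta, history = weighted_gd(xs, ys, ws)
```

Plotting `history` against its index gives the requested curve of the cost versus iterations.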




 
Exponential family and Fisher information.




A PDF $f(x \mid \theta)$ of a random variable is said to be from an exponential family if we can write

$$f(x \mid \theta) = g(x)\, e^{\beta(\theta) + h(x)\gamma(\theta)}$$

for some $g(x)$, $\beta(\theta)$, $h(x)$ and $\gamma(\theta)$.







 
(a) Show that the Bernoulli, Binomial, Poisson, Exponential and Gaussian distributions all belong to the exponential family. Their PDFs are given by




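Bernoulli: $f(x \mid p) = p^x (1 - p)^{1 - x}$, $x \in \{0, 1\}$

Binomial: $f(x \mid n, p) = \binom{n}{x} p^x (1 - p)^{n - x}$, $x \in \{0, 1, \ldots, n\}$

Poisson: $f(x \mid \lambda) = e^{-\lambda} \lambda^x / x!$, $x \in \{0, 1, \ldots\}$

Exponential: $f(x \mid \lambda) = \lambda e^{-\lambda x}$, $x \ge 0$

Gaussian: $f(x \mid \mu, \Sigma) = \dfrac{1}{\sqrt{(2\pi)^p}\,\sqrt{|\Sigma|}}\, e^{-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)}$, $x \in \mathbb{R}^p$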
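A proposed factorization can be checked numerically for one of the cases. The sketch below takes the Poisson distribution and writes the two parameter-dependent functions of the exponential-family form as beta and gamma (these names are assumed here for illustration), with the candidate choice g(x) = 1/x!, beta(lambda) = -lambda, h(x) = x and gamma(lambda) = log(lambda). It compares g(x) * exp(beta + h(x) * gamma) with the PMF computed directly.

```python
import math

def poisson_pmf(x, lam):
    """Poisson PMF written directly: exp(-lam) * lam**x / x!."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

def poisson_expfam(x, lam):
    """The same PMF assembled from candidate exponential-family pieces."""
    g = 1.0 / math.factorial(x)      # g(x)
    beta = -lam                      # beta(lambda)
    h = x                            # h(x)
    gamma = math.log(lam)            # gamma(lambda)
    return g * math.exp(beta + h * gamma)

# Largest disagreement over x = 0, ..., 19 at an arbitrary rate.
max_gap = max(abs(poisson_pmf(x, 3.5) - poisson_expfam(x, 3.5)) for x in range(20))
```

The same pattern (write the four pieces, reassemble, compare) can be reused for the other distributions in the list.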


 
(b) Find the Fisher information for the Bernoulli distribution.
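A closed-form answer can be checked against the definition. The sketch below uses the definition of Fisher information as the expected squared derivative of log f(x | p) with respect to p, takes the expectation exactly over x in {0, 1}, and compares it with the candidate closed form 1 / (p (1 - p)). The function name and the test value p = 0.3 are illustrative.

```python
def fisher_information_bernoulli(p):
    """E[(d/dp log f(x | p))^2], with the expectation taken exactly over x in {0, 1}."""
    info = 0.0
    for x in (0, 1):
        prob = p ** x * (1 - p) ** (1 - x)
        # Derivative of x * log(p) + (1 - x) * log(1 - p) with respect to p.
        score = x / p - (1 - x) / (1 - p)
        info += prob * score ** 2
    return info

p = 0.3
numeric = fisher_information_bernoulli(p)
closed_form = 1.0 / (p * (1.0 - p))        # candidate closed form to verify
```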




 
House price dataset.




The HOUSES dataset contains a collection of recent real estate listings in San Luis Obispo county and around it. The dataset is provided in RealEstate.csv.




The dataset contains the following fields:




MLS: Multiple listing service number for the house (unique ID).




Location: city/town where the house is located. Most locations are in San Luis Obispo county and northern Santa Barbara county (Santa Maria-Orcutt, Lompoc, Guadelupe, Los Alamos), but there are some out-of-area locations as well.




Price: the most recent listing price of the house (in dollars).




Bedrooms: number of bedrooms.




Bathrooms: number of bathrooms.




Size: size of the house in square feet.




Price/SQ.ft: price of the house per square foot.




Status: type of sale. Three types are represented in the dataset: Short Sale, Foreclosure and Regular.




Fit a linear regression model to predict Price using the remaining factors (except Status), for each of the three types of sales: Short Sale, Foreclosure and Regular, respectively.
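One way to organize the per-status fits is sketched below using only the standard library. Several things here are assumptions: the header names are taken from the field list above and may not match RealEstate.csv exactly, only the numeric fields are used as predictors (MLS and Location are left out for brevity, and how to treat them is a modeling choice), and the rows are synthetic with arbitrary values rather than real listings. The helper names are illustrative.

```python
import csv
import io
import random

NUMERIC = ["Bedrooms", "Bathrooms", "Size", "Price/SQ.ft"]  # assumed header names

def solve(A, b):
    """Solve A s = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    s = [0.0] * n
    for r in reversed(range(n)):
        s[r] = (M[r][n] - sum(M[r][k] * s[k] for k in range(r + 1, n))) / M[r][r]
    return s

def fit_ols(rows):
    """Least squares of Price on an intercept plus NUMERIC, via the normal equations."""
    X = [[1.0] + [float(r[c]) for c in NUMERIC] for r in rows]
    y = [float(r["Price"]) for r in rows]
    d = len(X[0])
    XtX = [[sum(x[i] * x[j] for x in X) for j in range(d)] for i in range(d)]
    Xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(d)]
    return solve(XtX, Xty)

def fit_by_status(csv_text):
    """Group rows by Status and fit one model per group."""
    groups = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups.setdefault(row["Status"], []).append(row)
    return {status: fit_ols(rows) for status, rows in groups.items()}

# Synthetic rows (NOT the real dataset): Price is an exact linear function
# of the predictors, with a different coefficient vector for each status.
random.seed(0)
true_beta = {
    "Short Sale": [50000, 10000, 5000, 100, 200],
    "Foreclosure": [30000, 8000, 4000, 90, 150],
    "Regular": [70000, 12000, 6000, 120, 250],
}
lines = ["MLS,Location,Price,Bedrooms,Bathrooms,Size,Price/SQ.ft,Status"]
for status, beta in true_beta.items():
    for k in range(30):
        feats = [random.randint(1, 5), random.randint(1, 3),
                 random.randint(800, 3000), random.randint(100, 400)]
        price = beta[0] + sum(b * f for b, f in zip(beta[1:], feats))
        lines.append(f"{k},Somewhere,{price},{feats[0]},{feats[1]},{feats[2]},{feats[3]},{status}")

models = fit_by_status("\n".join(lines))   # status -> [intercept, coefficients...]
```

With the real file, the text passed to `fit_by_status` would come from reading RealEstate.csv. Solving the normal equations directly is the simplest option for a sketch; a library least-squares routine is usually preferable numerically.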










































