1. (10 points) Use the Matlab function randn() to generate a data sample of N points drawn from a Gaussian distribution with mean $\mu_{\text{true}} = 10$ and standard deviation $\sigma_{\text{true}} = 4$. Consider the problem of using the data to get an estimate $\hat{\mu}$ of this Gaussian mean, assuming it is unknown, when the standard deviation $\sigma_{\text{true}}$ is known.
Consider using one of the following two prior distributions on the mean: (i) a Gaussian prior with mean $\mu_{\text{prior}} = 10.5$ and standard deviation $\sigma_{\text{prior}} = 1$ and (ii) a uniform prior over $(9.5, 11.5)$.
Consider various sample sizes $N = 5, 10, 20, 40, 60, 80, 100, 500, 10^3, 10^4$. For each sample size N, repeat the following experiment $M \geq 100$ times: generate the data, get the maximum likelihood estimate $\hat{\mu}^{\text{ML}}$, get the maximum-a-posteriori estimates $\hat{\mu}^{\text{MAP1}}$ and $\hat{\mu}^{\text{MAP2}}$, and measure the relative errors $|\hat{\mu} - \mu_{\text{true}}| / \mu_{\text{true}}$ for all three estimates.
Plot a single graph that shows the relative errors for each value of N as a box plot (use the Matlab boxplot() function), for each of the three estimates.
Interpret what you see in the graph. (i) What happens to the error as N increases? (ii) Which of the three estimates would you prefer, and why?
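As a rough illustration of one run of this experiment, the following sketch computes the three estimates and their relative errors. It is written in Python (standard library only) rather than Matlab, and the function names `estimates` and `relative_error` are hypothetical. It relies on two standard results rather than anything stated in the problem: with a Gaussian prior and known variance the MAP estimate is the precision-weighted mean, and with a uniform prior the MAP estimate is the sample mean restricted to the prior's support.

```python
import random
import statistics

MU_TRUE, SIGMA_TRUE = 10.0, 4.0      # values stated in the problem
MU_PRIOR, SIGMA_PRIOR = 10.5, 1.0    # Gaussian prior (i)
LO, HI = 9.5, 11.5                   # support of the uniform prior (ii)


def estimates(data):
    """Return (ML, MAP under Gaussian prior, MAP under uniform prior)."""
    n = len(data)
    ml = statistics.fmean(data)  # ML estimate of a Gaussian mean is the sample mean
    # Gaussian prior: the posterior is Gaussian, so its mode is the
    # precision-weighted average of the sample mean and the prior mean.
    map1 = (ml * n * SIGMA_PRIOR**2 + MU_PRIOR * SIGMA_TRUE**2) / (
        n * SIGMA_PRIOR**2 + SIGMA_TRUE**2
    )
    # Uniform prior: the posterior is the likelihood restricted to the
    # interval, so the sample mean is clamped to it (boundary treated as closed).
    map2 = min(max(ml, LO), HI)
    return ml, map1, map2


def relative_error(est):
    return abs(est - MU_TRUE) / MU_TRUE


rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (5, 100, 10_000):
    data = [rng.gauss(MU_TRUE, SIGMA_TRUE) for _ in range(n)]
    print(n, [round(relative_error(e), 4) for e in estimates(data)])
```

Repeating the inner step M times per sample size and collecting the errors gives the data needed for the box plot.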
2. (10 points) Use the Matlab function rand() to generate a data sample of N points from the uniform distribution on $(0, 1)$. Transform the resulting data x to generate a transformed data sample where each datum $y := -(1/\lambda) \log(x)$ with $\lambda = 5$. The transformed data y will have some distribution with parameter $\lambda$; what is its analytical form? Use a Gamma prior on the parameter $\lambda$, where the Gamma distribution has parameters $\alpha = 5.5$ and $\beta = 1$.
Consider various sample sizes $N = 5, 10, 20, 40, 60, 80, 100, 500, 10^3, 10^4$. For each sample size N, repeat the following experiment $M \geq 100$ times: generate the data, get the maximum likelihood estimate $\hat{\lambda}^{\text{ML}}$, get the Bayesian estimate as the posterior mean $\hat{\lambda}^{\text{PosteriorMean}}$, and measure the relative errors $|\hat{\lambda} - \lambda_{\text{true}}| / \lambda_{\text{true}}$ for both estimates.
Derive a formula for the posterior mean.
Plot a single graph that shows the relative errors for each value of N as a box plot (use the Matlab boxplot() function), for both the estimates.
Interpret what you see in the graph. (i) What happens to the error as N increases? (ii) Which of the two estimates would you prefer, and why?
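The sketch below shows one way the data generation and the two estimates might look, in Python (standard library only) rather than Matlab. It rests on assumptions not stated in the problem: that the transformed data follow an exponential distribution with rate $\lambda$ (the standard inverse-transform result), and that the Gamma prior's second parameter $\beta$ is a rate rather than a scale. Under those assumptions the posterior is again a Gamma distribution, with shape $\alpha + N$ and rate $\beta + \sum_i y_i$. The function names are hypothetical.

```python
import math
import random

LAMBDA_TRUE = 5.0
ALPHA, BETA = 5.5, 1.0  # Gamma prior; BETA is ASSUMED to be a rate parameter


def sample_y(n, rng):
    """Draw n values y = -(1/lambda) * log(x) with x uniform on (0, 1]."""
    # 1 - rng.random() lies in (0, 1], which avoids log(0).
    return [-math.log(1.0 - rng.random()) / LAMBDA_TRUE for _ in range(n)]


def ml_estimate(y):
    """ML estimate of an exponential rate: N divided by the sum of the data."""
    return len(y) / sum(y)


def posterior_mean(y):
    """Mean of the Gamma(ALPHA + N, rate BETA + sum(y)) posterior."""
    return (ALPHA + len(y)) / (BETA + sum(y))


rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (5, 100, 10_000):
    y = sample_y(n, rng)
    errors = [abs(est - LAMBDA_TRUE) / LAMBDA_TRUE
              for est in (ml_estimate(y), posterior_mean(y))]
    print(n, [round(e, 4) for e in errors])
```

If $\beta$ is instead meant as a scale parameter, the posterior rate becomes $1/\beta + \sum_i y_i$; with $\beta = 1$ the two conventions coincide.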
3. (5 points) Consider a 2-dimensional data sample (assuming an extremely large sample size N) such that the data are drawn from a uniform distribution on a circle (i.e., a ring; the boundary of a disc, but not its interior) with center at the origin and radius r. Suppose you decide to fit a multivariate (2-dimensional) Gaussian distribution to this data by maximizing the likelihood function. The multivariate Gaussian is $P(x; \mu, C) := \frac{1}{\sqrt{(2\pi)^d |C|}} \exp\left(-0.5 (x - \mu)^\top C^{-1} (x - \mu)\right)$, where, for our case, dimension $d = 2$, $x$ and $\mu$ are vectors of size $d \times 1$, $C$ is a matrix of size $d \times d$, and $|C|$ is the determinant of $C$.
Derive the mathematical formula, in terms of r, for the estimated mean and the estimated covariance matrix. You may use http://www.ee.ic.ac.uk/hp/staff/dmb/matrix/calculus.html to compute derivatives of the likelihood function with respect to the vector parameter $\mu$ and matrix parameter $C$.
Where is the mode of this Gaussian situated within $\mathbb{R}^2$? Do you think this Gaussian fits the data well? Is it a good model? Why or why not?
Generate a large sample that is uniformly distributed on a circle centered at the origin with radius r. Compute the maximum likelihood estimates for the mean and covariance and report them, along with the sample size. Do they match the theoretically predicted values?
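A minimal sketch of this numerical check, in Python (standard library only) rather than Matlab, might look as follows. The radius and sample size are hypothetical choices, as are the function names. The sketch samples the angle uniformly on $[0, 2\pi)$ and uses the usual ML formulas for a Gaussian (sample mean, and covariance normalized by N rather than N - 1). For comparison, integrating over a uniform angle gives a mean of zero and a covariance of $(r^2/2) I$.

```python
import math
import random


def sample_on_circle(n, r, rng):
    """Draw n points uniformly on the circle of radius r centered at the origin."""
    pts = []
    for _ in range(n):
        t = rng.uniform(0.0, 2.0 * math.pi)  # uniform angle gives uniform arc length
        pts.append((r * math.cos(t), r * math.sin(t)))
    return pts


def ml_mean_cov(pts):
    """ML estimates of a 2-D Gaussian: sample mean and covariance divided by N."""
    n = len(pts)
    mx = sum(p[0] for p in pts) / n
    my = sum(p[1] for p in pts) / n
    cxx = sum((p[0] - mx) ** 2 for p in pts) / n
    cyy = sum((p[1] - my) ** 2 for p in pts) / n
    cxy = sum((p[0] - mx) * (p[1] - my) for p in pts) / n
    return (mx, my), ((cxx, cxy), (cxy, cyy))


R = 2.0       # hypothetical radius
N = 100_000   # hypothetical sample size
mean, cov = ml_mean_cov(sample_on_circle(N, R, random.Random(0)))
print("N =", N)
print("mean:", mean)
print("covariance:", cov)
print("r^2 / 2 =", R * R / 2)
```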
4. (10 points) Suppose random variable X has a uniform distribution over $(0, \theta)$, where the parameter $\theta$ is unknown. Consider a Pareto distribution prior on $\theta$, with a scale parameter $\theta_m > 0$ and a shape parameter $\alpha > 1$, as $P(\theta) \propto (\theta_m / \theta)^{\alpha}$ for $\theta \geq \theta_m$ and $P(\theta) = 0$ otherwise.
Find the maximum-likelihood estimate $\hat{\theta}^{\text{ML}}$ and the maximum-a-posteriori estimate $\hat{\theta}^{\text{MAP}}$. Does $\hat{\theta}^{\text{MAP}}$ tend to $\hat{\theta}^{\text{ML}}$ as the sample size tends to infinity? Is this desirable or not?
Find an estimator of the mean of the posterior distribution, $\hat{\theta}^{\text{PosteriorMean}}$.
Does $\hat{\theta}^{\text{PosteriorMean}}$ tend to $\hat{\theta}^{\text{ML}}$ as the sample size tends to infinity? Is this desirable or not?
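To explore these limits numerically, the following Python sketch (standard library only) evaluates the three estimators on simulated data. The values of the true parameter, the prior scale, and the prior shape are hypothetical, since the problem gives none, and the function names are illustrative. The closed forms follow from taking the prior exactly as written above, $P(\theta) \propto (\theta_m/\theta)^{\alpha}$: the posterior is then proportional to $\theta^{-(N + \alpha)}$ for $\theta \geq c$, with $c = \max(\max_i x_i, \theta_m)$, and its mean is finite only when $N + \alpha > 2$. These are a sketch to check against your own derivation, not a substitute for it.

```python
import random

THETA_TRUE = 3.0   # hypothetical true parameter
THETA_M = 1.0      # hypothetical prior scale
ALPHA = 2.0        # hypothetical prior shape (> 1)


def ml_theta(x):
    """ML estimate for uniform(0, theta): the largest observation."""
    return max(x)


def map_theta(x, theta_m):
    """Posterior mode: posterior decreases in theta, so it peaks at its lower bound."""
    return max(max(x), theta_m)


def posterior_mean_theta(x, theta_m, alpha):
    """Mean of a density proportional to theta^-(N + alpha) on theta >= c."""
    k = len(x) + alpha
    if k <= 2:
        raise ValueError("posterior mean is infinite unless N + alpha > 2")
    c = max(max(x), theta_m)
    return c * (k - 1) / (k - 2)


rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (5, 100, 10_000):
    x = [rng.uniform(0.0, THETA_TRUE) for _ in range(n)]
    print(n, ml_theta(x), map_theta(x, THETA_M),
          posterior_mean_theta(x, THETA_M, ALPHA))
```

Trying a prior scale larger than the true parameter (for example a hypothetical value of 4.0) is a useful way to see when the MAP estimate and the ML estimate stay apart regardless of sample size.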