Multivariate Gaussian RV Generator via Metropolis-Hastings and Gibbs Sampling
A Multivariate Gaussian RV generator has to generate i.i.d. (d × 1) vectors
x, whose (d × 1) mean is µ and whose (d × d) covariance-matrix is Σ. That is,
$$
x \sim \underbrace{\frac{1}{\sqrt{(2\pi)^{d}\,\det(\Sigma)}}\; e^{-\frac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu)}}_{=\,f(x)} \qquad (1)
$$
We write the above expression in short-hand as x ∼ N (x, µ, Σ).
In this programming assignment we will write two C++ programs that can produce a set of vectors $\{x_i\}_{i=1}^{\infty}$, where $x_i \sim N(x, \mu, \Sigma)$. The approach that uses the Cholesky decomposition of Σ (see section 1.2.1 of Probability and Statistics Review, for example) would not work if d is very large. We will cover two (of many) approaches that overcome this issue using concepts from the theory of Markov-Chain Monte Carlo (MCMC) methods. The first approach uses the Metropolis-Hastings Algorithm; the second approach uses Gibbs Sampling.
1 Part 1: Metropolis-Hastings Algorithm
The Proposal Distribution is the d-dimensional Multivariate Gaussian with zero-mean (i.e. µ = 0) and Unit-Covariance (i.e. Σ = I). Assume that right now we have generated $\{x_i\}_{i=1}^{j}$, where $x_i \sim N(x, \mu, \Sigma)$. We can generate a candidate RV $\widehat{x} = x_j + y$, where $y \sim N(y, 0, I)$ (which can be generated by d-many calls to get_gaussian()).
Since the Proposal Distribution is symmetric, following equation 2 of section 4 of my notes on the material in Chapter 4 of the text, we accept $\widehat{x}$ with probability

$$
p(x_j, \widehat{x}) = \min\left(1, \frac{f(\widehat{x})}{f(x_j)}\right) = \min\left(1, \frac{e^{-\frac{1}{2}(\widehat{x}-\mu)^{T}\Sigma^{-1}(\widehat{x}-\mu)}}{e^{-\frac{1}{2}(x_j-\mu)^{T}\Sigma^{-1}(x_j-\mu)}}\right). \qquad (2)
$$
If $\widehat{x}$ is rejected, the process is repeated with no change. If $\widehat{x}$ is accepted, then $\{x_i\}_{i=1}^{j+1} \leftarrow \{x_i\}_{i=1}^{j} \cup \{\widehat{x}\}$, and the process repeats with $j \leftarrow (j+1)$. That is, we present $\widehat{x}$ as the (j+1)-th RV $x_{j+1} \sim N(x, \mu, \Sigma)$.
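As an illustration, the accept/reject step above can be sketched for the d = 2 case (the same dimension used for verification below). The mean, covariance, and function names here are hypothetical placeholders chosen for the example, not the values from hint.cpp:

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <utility>

// Hypothetical target parameters for d = 2: mu = (1, 2),
// Sigma = [[1.0, 0.5], [0.5, 1.0]] (example values only).
static const double MU[2] = {1.0, 2.0};
static const double SINV[2][2] = {{ 4.0/3.0, -2.0/3.0},
                                  {-2.0/3.0,  4.0/3.0}};  // Sigma^{-1}

// log f(x) up to an additive constant; only the ratio f(xhat)/f(x_j)
// appears in equation (2), so the normalizing constant cancels.
double log_f(const double x[2]) {
    double d0 = x[0] - MU[0], d1 = x[1] - MU[1];
    return -0.5 * (d0 * (SINV[0][0] * d0 + SINV[0][1] * d1) +
                   d1 * (SINV[1][0] * d0 + SINV[1][1] * d1));
}

// Run n MH steps starting from (0, 0); return the chain's sample mean.
std::pair<double, double> mh_sample_mean(int n, unsigned seed) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> gauss(0.0, 1.0);      // proposal N(0, I)
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    double x[2] = {0.0, 0.0}, sum0 = 0.0, sum1 = 0.0;
    for (int j = 0; j < n; ++j) {
        // Candidate xhat = x_j + y, y ~ N(y, 0, I): d-many Gaussian calls.
        double xhat[2] = {x[0] + gauss(gen), x[1] + gauss(gen)};
        // Accept with probability min(1, f(xhat)/f(x_j)), in log-space.
        if (std::log(unif(gen)) < log_f(xhat) - log_f(x)) {
            x[0] = xhat[0];
            x[1] = xhat[1];
        }
        // On rejection x is unchanged and the current point repeats.
        sum0 += x[0]; sum1 += x[1];
    }
    return {sum0 / n, sum1 / n};
}
```

Because the proposal is symmetric, the Hastings correction cancels and only the ratio of target densities remains; comparing in log-space avoids floating-point underflow when d grows.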
2 Part 2: Gibbs Sampling
For this part, you are going to use Gibbs Sampling to generate Multivariate
Gaussian RVs. The C++ STL has a Univariate Gaussian Generator. That is,
it can generate i.i.d. scalars x ∼ N (µ, σ), with mean µ and standard-deviation
σ.
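For reference, the STL generator referred to here is std::normal_distribution in &lt;random&gt;. A minimal sketch of drawing i.i.d. scalars x ∼ N(µ, σ), with example values of µ and σ (not the assignment's parameters):

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Draw n i.i.d. scalars x ~ N(mu, sigma) with the C++ STL generator
// and return their sample mean.
double sample_mean_univariate(double mu, double sigma, int n, unsigned seed) {
    std::mt19937 gen(seed);
    // Note: the second parameter is the standard deviation sigma,
    // not the variance sigma^2, matching the N(mu, sigma) convention above.
    std::normal_distribution<double> gauss(mu, sigma);
    double sum = 0.0;
    for (int i = 0; i < n; ++i) sum += gauss(gen);
    return sum / n;
}
```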
The following web page derives the marginal- and conditional distributions of a Multivariate Gaussian Distribution. Suppose the (d × 1) vector x is partitioned as

$$
x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix},
$$

where $x_1$ is $p \times 1$, $x_2$ is $q \times 1$, with $p + q = d$, and where

$$
x \sim N(x, \mu, \Sigma); \quad \mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}; \quad \text{and} \quad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}.
$$
It follows that $\Sigma^{T} = \Sigma$. Then, it is known that the conditional distribution of $x_i$ given the values of $x_j$ is also Normally distributed,

$$
(x_i \mid x_j) \sim N(x_i, (\mu_i \mid \mu_j), (\Sigma_i \mid \Sigma_j)),
$$

where

$$
(\mu_i \mid \mu_j) = \mu_i + \Sigma_{ij}\,\Sigma_{jj}^{-1}\,(x_j - \mu_j)
$$

and

$$
(\Sigma_i \mid \Sigma_j) = \Sigma_{ii} - \Sigma_{ji}^{T}\,\Sigma_{jj}^{-1}\,\Sigma_{ji}.
$$
In your implementation of a Multivariate Gaussian RV generator that uses Gibbs Sampling for this programming assignment, I ask that you assume p = d − 1 and q = 1 in the above interpretation. We know that (x2 | x1) is essentially a Univariate Gaussian (with known mean and variance, which can be computed using the above formula). By making a call to the C++ STL Univariate Gaussian RV generator, you can find a (random) value for x2. The following YouTube video describes the MCMC part of the algorithm. Watch it before you start this programming assignment.
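To make the q = 1 case concrete, here is a sketch of computing the conditional mean and variance of x2 given x1 for d = 2 from the formulas above. The numbers used in the example are made up for illustration, not the assignment's parameters:

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// For d = 2 with p = 1, q = 1, the scalar conditional (x2 | x1) has
//   mean     = mu2 + (S21 / S11) * (x1 - mu1)
//   variance = S22 - S21 * S21 / S11
// per the conditional-distribution formulas above.
std::pair<double, double> conditional_of_x2(double mu1, double mu2,
                                            double s11, double s21, double s22,
                                            double x1) {
    double mean = mu2 + (s21 / s11) * (x1 - mu1);
    double var  = s22 - s21 * s21 / s11;
    return {mean, var};  // pass sqrt(var) as sigma to the STL generator
}
```

For example, with µ = (0, 0), Σ11 = Σ22 = 2, Σ21 = 1, and x1 = 1, this gives conditional mean 0.5 and conditional variance 1.5.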
The Programming Assignment
For the first part of the programming assignment you will write a Multivariate Gaussian Generator using the MCMC-MH algorithm. I have provided a hint for the first part on Compass. Although the code is meant to run for high dimensions, I am verifying it for the case when d = 2. You will do the same. The code runs from the command line: the first argument is the number of trials, the second is the name of the CSV file that contains the experimental 2D-PDF, and the third is the name of the CSV file that contains the theoretical 2D-PDF. You can plot the two figures in MATLAB, which is used in lieu of a formal verification of the correctness of the code. Figures 1 and 2 show a sample run on my Mac.
For the second part of the programming assignment you will write a Multivariate Gaussian Generator using Gibbs Sampling. This routine takes as input
Figure 1: Sample run of the MCMC-MH based Multivariate Gaussian RV Gen- erator; d = 2 for this illustration; no of trials = 10,000,000.
Figure 2: A comparison of the experimentally observed PDF/histogram plot
(on the left) vs. the theoretical PDF (on the right) for the trial shown in figure
1 (no of trials = 10,000,000).
the previous sample (i.e. Previous value); and returns a d-dimensional Multivariate Gaussian RV x, where

$$
x_i = \begin{cases} \text{Previous value}_i & \text{if } i \neq \text{index} \\ \sim N(\mu, \sigma) & \text{if } i = \text{index; and } \mu \text{ and } \sigma \text{ are carefully selected.} \end{cases}
$$
Keep in mind, for this to work properly you have to cycle through all d dimensions. That is, index ∈ {1, 2, . . . , d} in a cyclic fashion.
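Putting the pieces together for d = 2, a sweep that cycles index through both coordinates might look like the following sketch. The mean, covariance, and function names are hypothetical, not those of hint.cpp:

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <utility>

// Hypothetical 2D target for the sketch: mu = (1, 2), Sigma = [[2, 1], [1, 2]].
// Each Gibbs sweep resamples x[0] given x[1], then x[1] given x[0],
// drawing from the univariate conditional derived above.
std::pair<double, double> gibbs_sample_mean(int sweeps, unsigned seed) {
    const double mu[2] = {1.0, 2.0};
    const double S[2][2] = {{2.0, 1.0}, {1.0, 2.0}};
    std::mt19937 gen(seed);
    double x[2] = {0.0, 0.0}, sum0 = 0.0, sum1 = 0.0;
    for (int t = 0; t < sweeps; ++t) {
        for (int index = 0; index < 2; ++index) {   // cycle through dimensions
            int other = 1 - index;
            // Conditional mean and std-dev of x[index] given x[other].
            double cmean = mu[index] +
                (S[index][other] / S[other][other]) * (x[other] - mu[other]);
            double csd = std::sqrt(S[index][index] -
                S[index][other] * S[index][other] / S[other][other]);
            std::normal_distribution<double> gauss(cmean, csd);
            x[index] = gauss(gen);   // only this coordinate changes
        }
        sum0 += x[0]; sum1 += x[1];
    }
    return {sum0 / sweeps, sum1 / sweeps};
}
```

Note that, unlike the MH generator, every draw here is accepted; each coordinate update is an exact sample from its univariate conditional.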
In my hint.cpp I am verifying the correctness of the code by generating a 2D Multivariate Gaussian. The code should be self-explanatory. A sample run of the code is shown in figure 3. The experimental PDF is shown on the left of figure 4, while the theoretical PDF is shown on the right of figure 4. Your code should be verifiable for 2D (although it will be written for d-dimensions).
Figure 3: Sample run of the Gibbs Sampling based Multivariate Gaussian RV Generator; d = 2 for this illustration; no of trials = 10,000,000.
Figure 4: A comparison of the experimentally observed PDF/histogram plot (on the left) vs. the theoretical PDF (on the right) for the trial shown in figure 3 (no of trials = 10,000,000).