
Programming Assignment #4 Solution

Multivariate Gaussian RV Generator via Metropolis-Hastings and Gibbs Sampling
A Multivariate Gaussian RV generator has to generate i.i.d. (d × 1) vectors x, whose (d × 1) mean is µ and whose (d × d) covariance matrix is Σ. That is,

$$x \;\sim\; \underbrace{\frac{1}{\sqrt{(2\pi)^d \det(\Sigma)}}\; e^{-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)}}_{=f(x)} \qquad (1)$$

We write the above expression in short-hand as x ∼ N(x, µ, Σ).

In this programming assignment we will write two C++ programs that can produce a set of vectors $\{x_i\}_{i=1}^{\infty}$, where $x_i \sim N(x, \mu, \Sigma)$. The approach that uses the Cholesky decomposition of Σ (see section 1.2.1 of Probability and Statistics Review, for example) would not work if d is very large. We will cover two (of many) approaches that overcome this issue using concepts from the theory of Markov-Chain Monte Carlo (MCMC) methods. The first approach uses the Metropolis-Hastings Algorithm; the second approach uses Gibbs Sampling.

 

 

1  Part 1: Metropolis-Hastings Algorithm

 

The Proposal Distribution is the d-dimensional Multivariate Gaussian with zero-mean (i.e. µ = 0) and Unit-Covariance (i.e. Σ = I). Assume that for right now we have generated $\{x_i\}_{i=1}^{j}$, where $x_i \sim N(x, \mu, \Sigma)$. We can generate candidate RVs $\widehat{x} = x_j + y$, where $y \sim N(y, 0, I)$ (which can be generated by d-many calls to get_gaussian()). Since the Proposal Distribution is symmetric, following equation 2 of section 4 of my notes on the material in Chapter 4 of the text, we accept $\widehat{x}$ with probability



$$p(\widehat{x}, x_j) = \min\left\{1,\; \frac{f(\widehat{x})}{f(x_j)}\right\} = \min\left\{1,\; \frac{e^{-\frac{1}{2}(\widehat{x}-\mu)^T \Sigma^{-1}(\widehat{x}-\mu)}}{e^{-\frac{1}{2}(x_j-\mu)^T \Sigma^{-1}(x_j-\mu)}}\right\}. \qquad (2)$$

If $\widehat{x}$ is rejected, the process is repeated with no change. If $\widehat{x}$ is accepted, then $\{x_i\}_{i=1}^{j+1} \leftarrow \{x_i\}_{i=1}^{j} \cup \{\widehat{x}\}$, and the process repeats with $j \leftarrow (j+1)$. That is, we present $\widehat{x}$ as the (j+1)-th RV $x_{j+1} \sim N(x, \mu, \Sigma)$.

 

 

2    Part 2:  Gibbs Sampling

 

For this part, you are going to use Gibbs Sampling to generate Multivariate Gaussian RVs. The C++ STL has a Univariate Gaussian Generator. That is, it can generate i.i.d. scalars $x \sim N(\mu, \sigma)$, with mean µ and standard-deviation σ.

The following web page¹ derives the marginal- and conditional-distributions of a Multivariate Gaussian Distribution. Suppose the (d × 1) vector x is partitioned as

$$x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$$

where $x_1$ is p × 1 and $x_2$ is q × 1, with p + q = d, and where

$$x \sim N(x, \mu, \Sigma); \quad \mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}; \quad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}.$$

 

It follows that $\Sigma^T = \Sigma$. Then, it is known that the conditional distribution of $x_i$ given the value of $x_j$ is also Normally distributed:

$$(x_i \mid x_j) \sim N(x_i, (\mu_i \mid \mu_j), (\Sigma_i \mid \Sigma_j))$$

where

$$(\mu_i \mid \mu_j) = \mu_i + \Sigma_{ij}\, \Sigma_{jj}^{-1} (x_j - \mu_j)$$

and

$$(\Sigma_i \mid \Sigma_j) = \Sigma_{ii} - \Sigma_{ji}^{T}\, \Sigma_{jj}^{-1}\, \Sigma_{ji}.$$

In your implementation of a Multivariate Gaussian RV generator that uses Gibbs Sampling for this programming assignment, I ask that you assume p = d − 1 and q = 1 in the above interpretation. We know that (x2 | x1) is essentially a Univariate Gaussian (with known mean and variance, which can be computed using the above formulae). By making a call to the C++ STL Univariate Gaussian RV generator, you can find a (random) value for x2. The following YouTube video describes the MCMC-part of the algorithm. Watch it before you start this programming assignment.
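The conditional-sampling step can be sketched for the d = 2 verification case, where the formulae above reduce to scalars. This is a sketch under my own naming (`gibbs_draw`, `gibbs_sweep` are not from hint.cpp), using the STL univariate generator the text refers to:

```cpp
#include <cmath>
#include <random>

// Draw x_i from its conditional given the other coordinate x_j (d = 2 case).
// From the formulas above, with scalar blocks:
//   conditional mean     = mu_i + (Sigma_ij / Sigma_jj) * (x_j - mu_j)
//   conditional variance = Sigma_ii - Sigma_ij^2 / Sigma_jj
double gibbs_draw(double x_j, double mu_i, double mu_j,
                  double s_ii, double s_ij, double s_jj,
                  std::mt19937_64& gen) {
    double mean = mu_i + (s_ij / s_jj) * (x_j - mu_j);
    double sd   = std::sqrt(s_ii - s_ij * s_ij / s_jj);
    std::normal_distribution<double> gauss(mean, sd);  // C++ STL univariate Gaussian
    return gauss(gen);
}

// One full Gibbs sweep: cycle index through both coordinates, always
// conditioning on the most recently updated value of the other coordinate.
void gibbs_sweep(double x[2], const double mu[2], const double Sigma[2][2],
                 std::mt19937_64& gen) {
    x[0] = gibbs_draw(x[1], mu[0], mu[1], Sigma[0][0], Sigma[0][1], Sigma[1][1], gen);
    x[1] = gibbs_draw(x[0], mu[1], mu[0], Sigma[1][1], Sigma[1][0], Sigma[0][0], gen);
}
```

For general d with p = d − 1 and q = 1, the same two formulae apply with Σ_jj a (d − 1) × (d − 1) block, so each coordinate update requires a solve against that block rather than a scalar division.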

 

The Programming Assignment

 

For the first part of the programming assignment you will write a Multivariate Gaussian Generator using the MCMC-MH algorithm. I have provided a hint for the first part on Compass. Although the code is meant to run for high dimensions, I am verifying it for the case when d = 2; you will do the same. The code runs on the command-line: the first argument is the number of trials, the second is the name of the CSV file that contains the experimental 2D-PDF, and the third is the name of the CSV file that contains the theoretical 2D-PDF. You can plot the two figures in MATLAB, which is used in lieu of a formal verification of the correctness of the code. Figures 1 and 2 show a sample run on my Mac.

For the second part of the programming assignment you will write a Multivariate Gaussian Generator using Gibbs Sampling. This routine takes as input

 

1 Save  a small  typo.

 

 

 

Figure 1: Sample run of the MCMC-MH based Multivariate Gaussian  RV Gen- erator;  d = 2 for this illustration; no of trials = 10,000,000.

 

 

 

Figure 2: A comparison of the experimentally observed PDF/histogram plot (on the left) vs. the theoretical PDF (on the right) for the trial shown in figure 1 (no of trials = 10,000,000).

 

 

 

 

the previous sample (i.e. Previous value), and returns a d-dimensional Multivariate Gaussian RV x, where

$$x_i = \begin{cases} \text{Previous value}_i & \text{if } i \neq \text{index} \\ \sim N(\mu, \sigma) & \text{if } i = \text{index, where } \mu \text{ and } \sigma \text{ are carefully selected.} \end{cases}$$

Keep in mind, for this to work properly you have to cycle through all d dimensions. That is, index ∈ {1, 2, . . . , d} in a cyclic fashion.

In my hint.cpp I am verifying the correctness of the code by generating a 2D Multivariate Gaussian. The code should be self-explanatory. A sample run of the code is shown in figure 3. The experimental PDF is shown on the left of figure 4, while the theoretical PDF is shown on the right of figure 4. Your code should be verifiable for 2D (although it will be written for d dimensions).

 

 

 

Figure  3:  Sample  run  of the  Gibbs  Sampling  based  Multivariate Gaussian  RV Generator; d = 2 for this illustration; no of trials = 10,000,000.

 

Figure 4: A comparison of the experimentally observed PDF/histogram plot (on the left) vs. the theoretical PDF (on the right) for the trial shown in figure 3 (no of trials = 10,000,000).
