Problem Set III Solution


For this problem set, use the same data as last time, in the ASCII file yogurt 2018.txt. The data consist of observations on a number of households making multiple yogurt purchases. Each time, a household purchases one of three brands. The five variables in the data set are: (i) the household id, running from 1 to 430; (ii) the choice made by the household, running from 1 to 3; (iii) the price, in cents, of yogurt brand 1 faced by that household when it made its decision; (iv) the price, in cents, of yogurt brand 2; (v) the price, in cents, of yogurt brand 3.

Let $j$ index the choice, running from 1 to 3, $t$ index the purchase, running from 1 to $T_i$, and $i$ index the household. The number of purchases made by each household differs. For example, the first two purchases come from household 1, the next two from household 2, and the next eight from household 3.
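To make the data layout concrete, the following Python sketch reads the five columns and counts the purchases $T_i$ per household. It assumes the file is whitespace-delimited with no header row; the function name `load_purchases` and the toy rows below are hypothetical and are not taken from the actual data.

```python
import io
import numpy as np

def load_purchases(source):
    """Read rows of (household id, choice, price1, price2, price3).

    `source` may be a filename or a file-like object. Assumes
    whitespace-separated numeric columns and no header row.
    """
    data = np.loadtxt(source, ndmin=2)
    household = data[:, 0].astype(int)   # household id
    choice = data[:, 1].astype(int)      # chosen brand, 1 to 3
    price = data[:, 2:5]                 # prices in cents, one column per brand
    return household, choice, price

# Made-up rows for illustration only (not the real data).
toy = io.StringIO(
    "1 2 60 55 70\n"
    "1 1 50 55 70\n"
    "2 3 60 65 40\n"
    "2 3 62 65 45\n"
    "3 1 48 65 70\n"
)
household, choice, price = load_purchases(toy)

# T_i: number of purchases observed for each household i.
ids, T = np.unique(household, return_counts=True)
print(dict(zip(ids.tolist(), T.tolist())))
```

With the real file, passing its path instead of the `io.StringIO` object should work the same way, provided the layout assumption holds.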

We focus on a discrete choice model where the utility for individual i associated with choice j, in purchase t is

$$U_{ijt} = \alpha_j + \beta \times P_{ijt} + \varepsilon_{ijt},$$

where $P_{ijt}$ is the price of brand $j$ for household $i$ at purchase time $t$. We assume the $\varepsilon_{ijt}$ are independent across time, choice, and household, each normally distributed with mean zero and unit variance. Normalize $\alpha_1 = 0$, so that there are three free parameters: $\alpha_2$, $\alpha_3$, and $\beta$.
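As a rough illustration of how this model generates choices, the Python sketch below draws the latent utilities and records the brand with the highest utility at each purchase. The rule that the household picks the utility-maximizing brand is the usual reading of such a model and is assumed here; the function name `simulate_choices` and the example prices are hypothetical.

```python
import numpy as np

def simulate_choices(price, alpha2, alpha3, beta, seed=0):
    """Simulate brand choices (1 to 3) from the latent-utility model.

    price: array of shape (N, 3), one row per purchase.
    Assumes the chosen brand is the one with the largest utility.
    """
    rng = np.random.default_rng(seed)
    price = np.asarray(price, dtype=float)
    alpha = np.array([0.0, alpha2, alpha3])        # alpha_1 normalized to 0
    eps = rng.standard_normal(price.shape)         # independent N(0, 1) errors
    utility = alpha + beta * price + eps           # U = alpha_j + beta * P + eps
    return utility.argmax(axis=1) + 1              # back to 1-based brand labels

# Made-up prices: brand 1 is much cheaper than brands 2 and 3.
example_price = np.tile([0.0, 10.0, 10.0], (100, 1))
example_choice = simulate_choices(example_price, alpha2=0.0, alpha3=0.0, beta=-1.0)
print(np.bincount(example_choice, minlength=4)[1:])  # counts for brands 1, 2, 3
```

With a negative $\beta$ and a large price gap, almost all simulated purchases should go to the cheap brand, which is a quick sanity check on the sign convention.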

Use independent prior distributions for $\alpha_2$, $\alpha_3$, and $\beta$ that are normal with mean zero and variance $\sigma^2 = 100$. Report the posterior means and standard deviations using Markov chain Monte Carlo methods. Use the Gibbs sampler with data augmentation. The steps are:

Imbens, Problem Set III, MGTECON640/ECON292 Fall ’18

1. Start with starting values $\beta = 0$, $\alpha_2 = \alpha_3 = 0$.

2. Impute all $N \times 3$ missing latent utilities in a data augmentation step given the parameters, where $N$ is the total number of purchases.

3. Draw from the posterior distribution of $(\alpha_2, \alpha_3, \beta)$ given the latent utilities, using the normal linear model with $3N$ observations, 3 regressors, and a known variance of 1.

4. Repeat steps 2 and 3 1,000,000 times, and report the averages and standard deviations of the last 500,000 draws.

5. Use other starting values and repeat the exercise. Assess convergence based on the evidence from the two chains.
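The steps above can be sketched in Python as follows. This is one possible implementation under stated assumptions, not a reference solution: the latent utilities are imputed by rejection sampling (draw all three utilities from their normal distribution and keep the draw only if the observed choice has the largest utility), which is simple but can be slow when a chosen brand has a very small choice probability. The parameter draw uses the standard conjugate result for a normal linear model with known unit variance and a $N(0, \sigma^2 I)$ prior: the posterior is normal with covariance $(X'X + I/\sigma^2)^{-1}$ and mean $(X'X + I/\sigma^2)^{-1} X'U$. The function name `gibbs_probit`, the synthetic data, and the short chain length are all hypothetical choices made so the example runs quickly.

```python
import numpy as np

def gibbs_probit(choice, price, n_iter, start=(0.0, 0.0, 0.0),
                 prior_var=100.0, seed=0):
    """Gibbs sampler with data augmentation; returns draws of (alpha2, alpha3, beta)."""
    rng = np.random.default_rng(seed)
    chosen = np.asarray(choice) - 1                 # 0-based index of chosen brand
    price = np.asarray(price, dtype=float)
    n = len(chosen)
    # Regressors for the 3N stacked rows: 1{j=2}, 1{j=3}, price.
    X = np.zeros((n, 3, 3))
    X[:, 1, 0] = 1.0
    X[:, 2, 1] = 1.0
    X[:, :, 2] = price
    X = X.reshape(3 * n, 3)
    # Posterior covariance is fixed because the error variance is known (= 1).
    post_cov = np.linalg.inv(X.T @ X + np.eye(3) / prior_var)
    theta = np.array(start, dtype=float)
    draws = np.empty((n_iter, 3))
    for it in range(n_iter):
        # Data augmentation: utilities given parameters and observed choices.
        mu = (X @ theta).reshape(n, 3)
        U = np.empty((n, 3))
        todo = np.arange(n)
        while todo.size:                            # redraw rows until accepted
            cand = mu[todo] + rng.standard_normal((todo.size, 3))
            ok = cand.argmax(axis=1) == chosen[todo]
            U[todo[ok]] = cand[ok]
            todo = todo[~ok]
        # Parameter draw: normal linear model posterior given the utilities.
        post_mean = post_cov @ (X.T @ U.reshape(-1))
        theta = rng.multivariate_normal(post_mean, post_cov)
        draws[it] = theta
    return draws

# Synthetic data with known parameters (not the yogurt data).
rng = np.random.default_rng(1)
n = 600
price = rng.uniform(0.0, 2.0, size=(n, 3))
truth = np.array([0.5, -0.5, -1.0])                 # alpha2, alpha3, beta
util = np.array([0.0, truth[0], truth[1]]) + truth[2] * price
choice = (util + rng.standard_normal((n, 3))).argmax(axis=1) + 1

# Two chains from different starting values; discard the first half of each.
chain_a = gibbs_probit(choice, price, n_iter=1500, start=(0.0, 0.0, 0.0), seed=10)
chain_b = gibbs_probit(choice, price, n_iter=1500, start=(1.0, -1.0, 0.5), seed=11)
keep_a, keep_b = chain_a[750:], chain_b[750:]
print("chain A mean:", keep_a.mean(axis=0), "sd:", keep_a.std(axis=0))
print("chain B mean:", keep_b.mean(axis=0), "sd:", keep_b.std(axis=0))
```

If the two chains have converged, their retained means and standard deviations should be close to each other; a large gap between them would suggest running the sampler longer. For the actual exercise, the chain length and burn-in would be set to the values given in the steps above.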
