Pencil-and-paper execution of EM on coin toss Solution




This assignment concerns the scenario considered in the slides, where a coin Z is tossed to choose between one of two other coins A and B. The chosen coin is then tossed N times. The whole procedure is repeated D times. From the complete data set it would be easy to determine the probability of Z indicating A and, for each of A and B, their probabilities of turning up heads. In the hidden-variable variant, the data record only what the N coin tosses of the chosen coin yielded on each trial, but you don't know which coin was being tossed: the outcome of Z is hidden.



Suppose the following data set, where there are just 2 trials, and the chosen coin is tossed just 2 times (this was considered in the slides):




trial   Z   tosses of chosen coin
1       ?   H   H
2       ?   T   T



Below there is a detailed working through of a first iteration of the EM procedure for estimating parameters applied to this. For the assignment you have to give a similar working through of the second iteration.




Suppose the following notation:

θA      = P(Z = a)
θB      = P(Z = b)
θH|A    = P(h|a), ie. the head prob of coin A
θT|A    = P(t|a), ie. the tail prob of coin A
θH|B    = P(h|b), ie. the head prob of coin B
θT|B    = P(t|b), ie. the tail prob of coin B
#(d, h) = num of heads in trial d
#(d, t) = num of tails in trial d



If Xd is a particular trial – ie. the outcomes of the N tosses of a chosen coin – the probability of the version where the chosen coin was A is given by

P(Z = a, Xd) = P(Z = a) × P(h|a)^#(d,h) × P(t|a)^#(d,t)
             = θA × θH|A^#(d,h) × θT|A^#(d,t)


and likewise the probability of the version where the chosen coin was B is given by

P(Z = b, Xd) = P(Z = b) × P(h|b)^#(d,h) × P(t|b)^#(d,t)
             = θB × θH|B^#(d,h) × θT|B^#(d,t)












From these joint probability formulas we can work out the conditional probabilities for the hidden variable:

P(Z = a|Xd) = P(Z = a, Xd) / Σc P(Z = c, Xd)

P(Z = b|Xd) = P(Z = b, Xd) / Σc P(Z = c, Xd)

In the slides we used the notation γd(Z) for this, where d is the index of the data item.




On the particular data set at hand the joint probability formulae are particularly simple:

P(Z = a, X1) = θA × θH|A^2
P(Z = b, X1) = θB × θH|B^2
P(Z = a, X2) = θA × θT|A^2
P(Z = b, X2) = θB × θT|B^2

and thus the conditional probabilities are:








γ1(a) = θA × θH|A^2 / (θA × θH|A^2 + θB × θH|B^2)

γ1(b) = θB × θH|B^2 / (θA × θH|A^2 + θB × θH|B^2)

γ2(a) = θA × θT|A^2 / (θA × θT|A^2 + θB × θT|B^2)

γ2(b) = θB × θT|B^2 / (θA × θT|A^2 + θB × θT|B^2)



To carry out an EM estimation of the parameters given the data we need some initial setting of the parameters. We will suppose this is:




θA = 1/2,   θB = 1/2
θH|A = 3/4, θT|A = 1/4
θH|B = 1/2, θT|B = 1/2
















ITERATION 1




For each piece of data we first have to compute the conditional probabilities of the hidden variable given the data:

d = 1 : p(Z = A, HH) = 0.5 × 0.75 × 0.75 = 0.28125
d = 1 : p(Z = B, HH) = 0.5 × 0.5 × 0.5 = 0.125
d = 1 : → sum = 0.40625
d = 1 : → γ1(A) = 0.28125/0.40625 = 0.692308
d = 1 : → γ1(B) = 0.125/0.40625 = 0.307692

d = 2 : p(Z = A, TT) = 0.5 × 0.25 × 0.25 = 0.03125
d = 2 : p(Z = B, TT) = 0.5 × 0.5 × 0.5 = 0.125
d = 2 : → sum = 0.15625
d = 2 : → γ2(A) = 0.03125/0.15625 = 0.2
d = 2 : → γ2(B) = 0.125/0.15625 = 0.8
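These E-step numbers can be checked with a few lines of Python (a sketch; the variable names are mine, not from the slides):

```python
# E-step: joint probabilities and responsibilities (gammas) for one trial.
theta_A, theta_B = 0.5, 0.5   # initial P(Z = a), P(Z = b)
h_A, t_A = 0.75, 0.25         # head/tail probs of coin A
h_B, t_B = 0.5, 0.5           # head/tail probs of coin B

def gammas(n_heads, n_tails):
    """Return (gamma_a, gamma_b) for a trial with the given toss counts."""
    joint_a = theta_A * h_A**n_heads * t_A**n_tails
    joint_b = theta_B * h_B**n_heads * t_B**n_tails
    total = joint_a + joint_b
    return joint_a / total, joint_b / total

print(gammas(2, 0))   # trial 1 (HH): approx (0.692308, 0.307692)
print(gammas(0, 2))   # trial 2 (TT): approx (0.2, 0.8)
```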



Armed with these γ values we now treat each data item Xd as if it splits into two versions: one filling in Z as a, with 'count' γd(a), and one filling in Z as b, with 'count' γd(b).




We then go through this virtual corpus accumulating counts of certain kinds of event. For the events of the hidden variable being Z = a and Z = b we get




E(A) = γ1(a) + γ2(a) = 0.692308 + 0.2 = 0.892308

E(B) = γ1(b) + γ2(b) = 0.307692 + 0.8 = 1.10769




Then we need to go through the Z = a cases and count the types of coin toss, and likewise for the Z = b cases:




E(A, H) = Σd γd(a) #(d, h) = 0.692308 × 2 + 0.2 × 0 = 1.38462

E(A, T) = Σd γd(a) #(d, t) = 0.692308 × 0 + 0.2 × 2 = 0.4

E(B, H) = Σd γd(b) #(d, h) = 0.307692 × 2 + 0.8 × 0 = 0.615385

E(B, T) = Σd γd(b) #(d, t) = 0.307692 × 0 + 0.8 × 2 = 1.6




Then from these 'expected' counts we re-estimate the parameters:

est(θA)   = E(A)/2 = 0.892308/2 = 0.446154

est(θB)   = E(B)/2 = 1.10769/2 = 0.553846

est(θH|A) = E(A, H)/Σx E(A, x) = 1.38462/(1.38462 + 0.4) = 1.38462/1.78462 = 0.775862

est(θT|A) = E(A, T)/Σx E(A, x) = 0.4/(1.38462 + 0.4) = 0.4/1.78462 = 0.224138

est(θH|B) = E(B, H)/Σx E(B, x) = 0.615385/(0.615385 + 1.6) = 0.615385/2.21538 = 0.277778

est(θT|B) = E(B, T)/Σx E(B, x) = 1.6/(0.615385 + 1.6) = 1.6/2.21538 = 0.722222
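The expected counts and re-estimates above can be reproduced numerically (a sketch; the γ values are hard-coded from the E-step, and the variable names are mine):

```python
# M-step: pool the fractional counts from the two trials and renormalise.
g1a, g1b = 0.692308, 0.307692   # gammas for trial 1 (HH)
g2a, g2b = 0.2, 0.8             # gammas for trial 2 (TT)

E_A, E_B = g1a + g2a, g1b + g2b
E_AH = g1a * 2 + g2a * 0        # expected heads tossed with coin A
E_AT = g1a * 0 + g2a * 2        # expected tails tossed with coin A
E_BH = g1b * 2 + g2b * 0
E_BT = g1b * 0 + g2b * 2

theta_A  = E_A / 2              # 2 trials, so E(A) + E(B) = 2
theta_B  = E_B / 2
theta_HA = E_AH / (E_AH + E_AT)
theta_TA = E_AT / (E_AH + E_AT)
theta_HB = E_BH / (E_BH + E_BT)
theta_TB = E_BT / (E_BH + E_BT)
print(theta_A, theta_HA, theta_HB)   # approx 0.446154, 0.775862, 0.277778
```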



Note the denominator 2 in the re-estimation formula for θA. We could have written the denominator as E(A) + E(B), but this is Σd γd(a) + Σd γd(b) = Σd [γd(a) + γd(b)] = Σd 1 = 2.
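To check a hand-worked second iteration, the whole E-step plus M-step can be wrapped in one function and applied repeatedly (a sketch; since tail probabilities are 1 minus head probabilities, only the head parameters are tracked):

```python
def em_iteration(params, trials):
    """One EM iteration. params = (tA, tB, hA, hB); trials = list of (#heads, #tails)."""
    tA, tB, hA, hB = params
    EA = EB = EAH = EAT = EBH = EBT = 0.0
    for nh, nt in trials:
        # E-step: responsibilities for this trial
        ja = tA * hA**nh * (1 - hA)**nt
        jb = tB * hB**nh * (1 - hB)**nt
        ga, gb = ja / (ja + jb), jb / (ja + jb)
        # accumulate expected counts
        EA += ga; EB += gb
        EAH += ga * nh; EAT += ga * nt
        EBH += gb * nh; EBT += gb * nt
    # M-step: renormalise
    return (EA / len(trials), EB / len(trials),
            EAH / (EAH + EAT), EBH / (EBH + EBT))

trials = [(2, 0), (0, 2)]              # HH and TT
params = (0.5, 0.5, 0.75, 0.5)         # initial setting
params = em_iteration(params, trials)
print(params)   # iteration 1: approx (0.446154, 0.553846, 0.775862, 0.277778)
params = em_iteration(params, trials)  # iteration 2, as asked for in the assignment
```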








































































































