$24
Suppose that a Mars rover is wandering in a region which is modeled as a grid of width 12 and height 8, as shown in Fig 1. We do not know the exact location of the rover over time. Instead, we only get some noisy observations about the rover from a sensor. In this lab, we use hidden Markov model to track the rover’s movement over time.
The rover’s position at time i = 0; 1; 2; : : : is modeled as a random vector (xi; yi) 2 f0; 1; : : : ; 11g
f0; 1; : : : ; 7g. For example, (x2; y2) = (5; 4) means that at time step 2, the rover is in column 5, row 4
illustrated as a blue circle in Fig 1. The movement of the rover is quite predictable. At each time step, it makes one of the ve actions: it stays put, goes left, goes up, goes right, or goes down.
0 1 2 3 4 5 6 7 8 91011
A
Figure 1: A wandering rover (blue circle) in a grid of width 12 and height 8.
The action of the rover at any time i depends on its previous action as well as its current location. Given that the rover’s current location is not at the boundary of the grid, its action is simple: if the rover’s previous action was a movement (left, up, right, or down), the rover moves in the same direction with probability 0:9 and stays put with probability 0.1; if the rover’s previous action was to stay, it stays again with probability 0:2, and moves with each direction (left, up, right, or down) with probability 0:2. The rover’s action can be shown by a transition diagram in Fig. 2a.
In the case that the rover is on the boundary of the grid, the rover’s behavior should adjust such that it will not go outside the region, and meanwhile the behavior should also be consistent with the non-boundary case. For example, when the rover is in location A in Fig. 1, the rover cannot go higher. Therefore, there are only four possible actions: it stays, goes left, goes right, or goes down. Based on its previous action, those probabilities may need to be re-normalized such that they sum to 1. Speci cally, if the rover’s previous action was to go left, the rover moves in the same direction with probability 0:9 and stays put with probability 0.1; if the rover’s previous action was to go up, it stays at A with probability 1 (due to re-normalization); if the rover’s previous action was to stay, it stays again with probability 0:25, and moves with each direction (left, right, or down) with probability 0:25 (due to re-normalization). The resulting transition diagram is depicted in Fig. 2b.
Since the rover’s behavior at any time i depends on its previous action as well as its current location, we model the rover’s hidden state zi at time i as a super variable that includes both the rover’s location (xi; yi) and its most recent action ai, i.e., zi = ((xi; yi); ai), where ai is a random variable that takes the value from
1
0.9
up
up
0.2
0.1
1
0.9
0.2
0.9
1/4
0.1
0.2
0.9
0.1
1/4
0.9
left
stay
right
left
stay
right
0.2
0.1
1/4
0.1
0.1
0.2
1/4
0.9
down
down
(a) Transition diagram if the rover is not on the boundary
(b) Transition diagram if the rover is at A
Figure 2: Transition diagrams of rover’s behavior
fstay, left, up, right, downg. Pay attention that although we use subscript i in ai, it actually represents the action of the rover at time i 1 (most recent action).
Hidden state of the rover, blue circle indi-cates the current location (xi; yi), and the ar-row indicates the most recent action ai.
Observation (^xi; y^i), whose value is taken from one of the ve possible locations (green circles) with equal probability, given the true location shown in the left gure.
Figure 3: Hidden state and observation.
We do not directly observe the rover’s hidden state zi = ((xi; yi); ai) as shown in Fig. 3a. Instead, we have access to a noisy sensor that puts a uniform distribution on the valid grid positions within one grid cell of the rover’s current true position, as shown in Fig. 3b. Note that when the rover is on the boundary, the possible locations of the observation should adjust such that the observation is in the region. We do not directly observe the rover’s action from the sensor, either. In words, at time i the observation is represented by random variable (^xi; y^i) 2 f0; 1; : : : ; 11g f0; 1; : : : ; 7g, and (^xi; y^i) is uniformly distributed over the possible locations determined by zi.
Lastly, we assume that the rover’s initial position (x0; y0) is equally likely to be any of the grid locations, and its initial action a0 is stay.
Download hmm.zip under Files/Lab3 on Quercus and unzip the le. File rover.py contains functions for gen-erating the initial distribution, the transition probabilities given a current hidden state, and the observation probabilities given a current hidden state. Thus you do not need to re-write these. File test.txt contains the
2
data for Question 1. File test missing.txt contains the data for Questions 2, 3, 4, and 5. In both test.txt and test missing.txt, the rst three columns correspond to the hidden states, and the last two columns correspond to the observations.
To help you understand what the code does, we provide a visualization tool in inference.py, which can be turned on by setting enable graphics to True. Note that whether you use the visualization will not a ect the grading. When you run inference.py, three panes will pop up. The left pane shows the true state of the rover, including location and the most recent action, represented by an arrow. The middle pane shows the observed position of the rover, with red signifying the missing data. The right pane shows the estimated state of the rover as well as the marginal distribution by grey level. After you implement your inference algorithms, the right pane will automatically show the results.
Please follow the questions below and complete the le inference.py. This is the only le you need to modify. Questions
1. (a) Write down the formulas of the forward-backward algorithm to compute the marginal distribution p(zij(^x0; y^0); : : : ; (^xN 1; y^N 1)) for i = 0; 1; : : : ; N 1. The formulas include the initializations of the forward and backward messages, the recursion relations of the messages, and the computation of the marginal distribution based on the messages.
Complete function forward backward in le inference.py to implement the forward-backward algo-rithm. Now use the data (^x0; y^0); : : : ; (^x99; y^99) in test.txt to determine the marginal distribution of zi at time i = 99, i.e., p(z99j(^x0; y^0); : : : ; (^x99; y^99)). Only include states with non-zero probability in your answer.
Sanity check: The marginal distribution of the state at time i = 1 is
p(z1j(^x0; y^0); : : : ; (^x99; y^99)) =
(
0:5
if z1
= ((6; 5); right):
(1)
0:5
if z1
= ((6; 5); down);
Some of the observations were lost when they were transmitted from Mars to earth. Modify function forward backward so that it can handle missing observations. In a list of observations, a missing ob-servation is represented by None in inference.py. Now use the data in test missing.txt to determine the marginal distribution at time i = 30, i.e., p(z30j(^x0; y^0); : : : ; (^x99; y^99)) with missing observations. Only include states with non-zero probability in your answer.
Sanity check: The mode of this marginal distribution p(z30j(^x0; y^0); : : : ; (^x99; y^99)) should be ((6,7),right)
with probability 0:91304.
(a) Instead of computing the marginal distributions, we now seek the most likely sequence of the states via the Viterbi algorithm. Write down the formulas of the Viterbi algorithm using zi and (^xi; y^i); i = 0; : : : ; N 1. The formulas include the initialization of the messages and the recursion of the messages in the Viterbi algorithm.
Complete the function Viterbi in le inference.py to implement the Viterbi algorithm. Your im-plementation should be able to handle missing observations. Use the data in test missing.txt to determine the last 10 hidden states of the most likely sequence (i.e., i = 90; 91; 92; : : : ; 99) based on the MAP estimate obtained from the algorithm.
Sanity check: For the MAP sequence, the last 3 states, i.e., the states at i = 97; 98; 99 are: ((8,7),left), ((7,7),left), ((6,7),left).
Let z~i; i = 0; 1; : : : ; 99 be the estimate obtained from Question 3 by Viterbi algorithm, which corre-sponds to the most probable sequence given the observations:
~
~
arg max p(z
; : : : ; z
99j
(^x ; y^ ); : : : ; (^x
; y^ )):
(2)
fz0
; : : : ; z99g =
z0;:::;z99
0
0 0
99
99
3
Let zi; i = 0; 1; : : : ; 99 be the set of the states obtained from Question 2 by forward-backward algorithm, which are individually the most probable at each time step, corresponding to the maximization of the marginal distribution in Question 2, i.e.,
zi = arg max p(zij(^x0; y^0); : : : ; (^x99; y^99)); i = 0; : : : ; 99: (3)
zi
To compare z~i and zi, we let zi; i = 0; 1; : : : ; 99 be the true hidden states, and de ne the error probabilities of z~i and zi, respectively, as
99
I(z~i = zi)
P~e = 1
Pi=0
100
;
(4)
99
I(zi = zi)
Pe = 1
Pi=0
100
;
(5)
where I( ) is the indicator function, i.e., I(X) = 1, if X is true; otherwise I(X) = 0. Please compute
~
and compare Pe and Pe for the data in test missing.txt.
Although the states zi; i = 0; 1; : : : ; 99, obtained from (3), are individually the most probable states,
the sequence z0; z1; : : : ; z99 may not representant a valid sequence. By a valid sequence, we mean that
the rover can behave physically as the sequence z0; z1; : : : ; z99 as described by the Markov transition diagram of Fig. 2. For example,
.
.
(6)
.
zi = ((1,1),stay)
(7)
zi+1 = ((1,1),left)
(8)
.
.
(9)
.
is not a valid sequence because the transition from state ((1,1),stay) at time step i to state ((1,1),left) at time step i + 1 is impossible according to the transition model. Please check z0; z1; : : : ; z99 in Question 4 to see whether or not it is a valid sequence. If not, please nd a small segment zi; zi+1 that violates the transition model for some time step i.
We thank Prof. Greg Wornell of MIT in creating this lab.
4