Homework 11 Solution

General Instructions: Put your answers to the following problems into a PDF document and submit it as an attachment under Content -> Homework 11 for the course CptS 440 Pullman (all sections of CptS 440 and 540 are merged under the CptS 440 Pullman section) on the Blackboard Learn system by the above deadline. You may submit multiple times, but we will grade only the most recent entry submitted before the deadline.

1. Consider the 2x2 Wumpus world on the right, where the agent starts in (1,1) facing Right, the Wumpus is in (2,1), and the gold is in (2,2). We will represent the agent’s state as [x,y,o], where (x,y) is the agent’s location, and o is the agent’s orientation (Up, Down, Left, Right). The agent has three possible actions: TurnLeft, TurnRight and GoForward. GoForward always works (i.e., moves the agent to the location it is facing, or bumps into a wall and stays in the same location). However, TurnLeft and TurnRight only work 80% of the time; the other 20% of the time the agent’s orientation does not change. If the agent enters terminal state [2,2,*], it receives reward +1000. If the agent enters terminal state [2,1,*], it receives reward –1000. The agent receives a reward of –1 for all other states.

a. Given the following policy, compute the utility of each non-terminal state, using the equation on slide 58 of the lecture notes, where γ = 0.9. Show your work.

| State | Action |
|---|---|
| [1,1,Right] | TurnLeft |
| [1,1,Up] | GoForward |
| [1,1,Left] | TurnRight |
| [1,1,Down] | TurnRight |
| [1,2,Right] | GoForward |
| [1,2,Up] | TurnRight |
| [1,2,Left] | TurnRight |
| [1,2,Down] | TurnLeft |
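
As a rough check on part (a), the fixed-policy utilities can be approximated by iterating the standard policy-evaluation update U(s) = R(s) + γ Σ P(s'|s,π(s)) U(s'). This is only a sketch: slide 58 is not reproduced here, so the exact form of the equation is an assumption, as are the conventions that a terminal state's utility equals its reward, that (1,2) lies above (1,1), and that TurnLeft rotates counterclockwise. All names in the code are illustrative.

```python
GAMMA = 0.9            # discount factor from the problem statement
TURN_SUCCESS = 0.8     # turns work 80% of the time
STEP_REWARD = -1.0     # reward for every non-terminal state
# Assumption: a terminal state's utility is simply its reward.
TERMINAL_UTILITY = {(2, 2): 1000.0, (2, 1): -1000.0}

# Assumption: TurnLeft rotates counterclockwise, TurnRight clockwise.
LEFT_OF = {"Up": "Left", "Left": "Down", "Down": "Right", "Right": "Up"}
RIGHT_OF = {new: old for old, new in LEFT_OF.items()}
# Assumption: y increases upward, so (1,2) is above (1,1).
MOVE = {"Up": (0, 1), "Down": (0, -1), "Left": (-1, 0), "Right": (1, 0)}

POLICY = {
    (1, 1, "Right"): "TurnLeft",
    (1, 1, "Up"): "GoForward",
    (1, 1, "Left"): "TurnRight",
    (1, 1, "Down"): "TurnRight",
    (1, 2, "Right"): "GoForward",
    (1, 2, "Up"): "TurnRight",
    (1, 2, "Left"): "TurnRight",
    (1, 2, "Down"): "TurnLeft",
}


def transitions(state, action):
    """Return a list of (probability, next_state) pairs."""
    x, y, o = state
    if action == "GoForward":
        dx, dy = MOVE[o]
        nx, ny = x + dx, y + dy
        if not (1 <= nx <= 2 and 1 <= ny <= 2):
            nx, ny = x, y  # bumped into a wall, stay in place
        return [(1.0, (nx, ny, o))]
    new_o = LEFT_OF[o] if action == "TurnLeft" else RIGHT_OF[o]
    return [(TURN_SUCCESS, (x, y, new_o)), (1.0 - TURN_SUCCESS, state)]


def evaluate_policy(sweeps=1000):
    """Iterate the fixed-policy update until it has effectively converged."""
    U = {s: 0.0 for s in POLICY}
    for _ in range(sweeps):
        for s, a in POLICY.items():
            expected = 0.0
            for p, s2 in transitions(s, a):
                cell = (s2[0], s2[1])
                expected += p * TERMINAL_UTILITY.get(cell, U.get(s2, 0.0))
            U[s] = STEP_REWARD + GAMMA * expected
    return U


if __name__ == "__main__":
    for state, utility in evaluate_policy().items():
        print(state, round(utility, 3))
```

Because each turn action can leave the state unchanged, the utility of a turning state appears on both sides of its own equation; a hand solution would solve that linear equation directly rather than iterate.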

b. Using temporal difference Q-learning (equation on slide 64 of the lecture notes), compute the Q values for Q([1,1,Right],TurnLeft), Q([1,1,Up],GoForward), Q([1,2,Up],TurnRight), and Q([1,2,Right],GoForward) after each of five executions of the action sequence: TurnLeft, GoForward, TurnRight, GoForward (starting from [1,1,Right] for each sequence). You may assume α = 1, γ = 0.9, and all Q values for non-terminal states are initially zero. Show your work.
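
The bookkeeping in part (b) can be sketched in a few lines, assuming the textbook update Q(s,a) ← Q(s,a) + α(R(s) + γ max Q(s',a') − Q(s,a)). Slide 64 is not reproduced here, so that form is an assumption. The sketch also assumes that every turn in the sequence succeeds, that the value used for the terminal state [2,2,*] is its reward of +1000, and that the max ranges over all three actions, including untried ones that are still zero. All names are illustrative.

```python
from collections import defaultdict

ALPHA = 1.0             # learning rate from the problem statement
GAMMA = 0.9             # discount factor from the problem statement
STEP_REWARD = -1.0      # reward R(s) of each non-terminal state
TERMINAL_VALUE = 1000.0 # assumption: value of terminal [2,2,*] is its reward
ACTIONS = ("TurnLeft", "TurnRight", "GoForward")

# Assumption: both turns succeed, so each run visits the same four transitions.
EPISODE = [
    ((1, 1, "Right"), "TurnLeft", (1, 1, "Up")),
    ((1, 1, "Up"), "GoForward", (1, 2, "Up")),
    ((1, 2, "Up"), "TurnRight", (1, 2, "Right")),
    ((1, 2, "Right"), "GoForward", (2, 2, "Right")),
]
TERMINAL = (2, 2, "Right")


def run_q_learning(episodes=5):
    """Return the four Q values of interest after each episode."""
    Q = defaultdict(float)  # all non-terminal Q values start at zero
    history = []
    for _ in range(episodes):
        for s, a, s2 in EPISODE:
            if s2 == TERMINAL:
                best_next = TERMINAL_VALUE
            else:
                best_next = max(Q[(s2, b)] for b in ACTIONS)
            Q[(s, a)] += ALPHA * (STEP_REWARD + GAMMA * best_next - Q[(s, a)])
        history.append({(s, a): Q[(s, a)] for s, a, _ in EPISODE})
    return history


if __name__ == "__main__":
    for i, snapshot in enumerate(run_q_learning(), start=1):
        print("after sequence", i)
        for key, value in snapshot.items():
            print("  ", key, round(value, 3))
```

Under these assumptions the reward propagates back one state per sequence, so the value for [1,1,Right] only changes on the fourth run.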

2. Given the following bigram model, compute the probability of the sentence “the agent ate the wumpus”. Show your work.

| Word 1 | Word 2 | Frequency |
|---|---|---|
| the | wumpus | 1,000 |
| wumpus | ate | 500 |
| ate | the | 10,000 |
| the | agent | 5,000 |
| agent | ate | 100 |
| agent | shot | 500 |
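
One way to organize the computation is sketched below. It assumes each conditional probability P(w2 | w1) is estimated as the bigram frequency divided by the total frequency of all listed bigrams that start with w1, and that no start-of-sentence term is included; the lecture notes may use a different convention. The function and variable names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction

# Frequencies copied from the table in the problem.
BIGRAM_COUNTS = {
    ("the", "wumpus"): 1000,
    ("wumpus", "ate"): 500,
    ("ate", "the"): 10000,
    ("the", "agent"): 5000,
    ("agent", "ate"): 100,
    ("agent", "shot"): 500,
}


def bigram_probability(sentence):
    """Multiply P(w2 | w1) over consecutive word pairs in the sentence.

    Assumption: P(w2 | w1) = count(w1, w2) / sum of counts of listed
    bigrams starting with w1, with no start-of-sentence factor.
    """
    totals = defaultdict(int)
    for (w1, _), count in BIGRAM_COUNTS.items():
        totals[w1] += count

    words = sentence.split()
    prob = Fraction(1)
    for w1, w2 in zip(words, words[1:]):
        count = BIGRAM_COUNTS.get((w1, w2), 0)
        if count == 0:
            return Fraction(0)  # unseen bigram, no smoothing applied
        prob *= Fraction(count, totals[w1])
    return prob


if __name__ == "__main__":
    p = bigram_probability("the agent ate the wumpus")
    print(p, float(p))
```

Using exact fractions keeps each factor visible (for example 5000/6000 for “the agent”), which matches how the work would be shown by hand.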

3. Given the lexicon on slide 23 of the lecture notes and the grammar on slide 24 of the lecture notes, show all possible parse trees of each of the following sentences. If there is no parse, then just state “No parse”.
a. “the wumpus in 1 3 is smelly”

b. “the wumpus is smelly in 2 3”

c. “the wumpus and the agent eat”

4. Given the HMM for the [m] phoneme on slide 40 of the lecture notes, compute the probability of each possible path through the HMM for the sequence of frame features C1,C3,C4,C6. Show your work.

5. CptS 540 Students Only. The Stanford Parser is available at nlp.stanford.edu:8080/parser/. For each of the sentences in problem 3, show the parse tree obtained by the Stanford Parser. You may just copy-and-paste the Parse result into your homework submission; no need to draw the parse tree. But make sure the indentation is preserved.

a. “the wumpus in 1 3 is smelly”