1. For the environment shown to the right, the agent ran six episodes from the start state A to one of the terminal states (C, D, and E). The episodes are listed below:
Episode #1:
state = A, action = R, new state = C, reward = +10
Episode #2:
state = A, action = L, new state = B, reward = 0
state = B, action = R, new state = E, reward = -1000
Episode #3:
state = A, action = L, new state = B, reward = 0
state = B, action = L, new state = D, reward = +200
Episode #4:
state = A, action = L, new state = B, reward = 0
state = B, action = R, new state = E, reward = -100
Episode #5:
state = A, action = R, new state = C, reward = +25
Episode #6:
state = A, action = L, new state = B, reward = 0
state = B, action = L, new state = D, reward = +400
Your task is to build the Q-table from these results. The Q-table has two states (A and B) and two actions per state (L and R).
Use learning rate = 0.5 and discount factor = 1. All entries of the Q-table are zero initially.
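One way to check a hand-computed table is to replay the logged transitions in order. The sketch below assumes the standard one-step Q-learning update, Q(s,a) ← Q(s,a) + α [r + γ max Q(s',·) − Q(s,a)], and assumes that terminal states (C, D, E) contribute a value of zero. The grouping of the ten transitions into six episodes is a reading of the log in which every episode starts at A and ends in a terminal state. The names `episodes`, `replay`, and `q_table` are illustrative only.

```python
# Replay the six logged episodes with one-step Q-learning.
ALPHA, GAMMA = 0.5, 1.0            # learning rate and discount factor from the problem
ACTIONS = ("L", "R")
TERMINAL = {"C", "D", "E"}         # assumption: terminal states have value 0

# Each transition is (state, action, new_state, reward).
episodes = [
    [("A", "R", "C", 10)],
    [("A", "L", "B", 0), ("B", "R", "E", -1000)],
    [("A", "L", "B", 0), ("B", "L", "D", 200)],
    [("A", "L", "B", 0), ("B", "R", "E", -100)],
    [("A", "R", "C", 25)],
    [("A", "L", "B", 0), ("B", "L", "D", 400)],
]

def replay(episodes, alpha=ALPHA, gamma=GAMMA):
    """Apply the Q-learning update to every transition, in logged order."""
    q = {(s, a): 0.0 for s in ("A", "B") for a in ACTIONS}  # all entries start at zero
    for episode in episodes:
        for s, a, s2, r in episode:
            # Best value obtainable from the next state (zero if it is terminal).
            best_next = 0.0 if s2 in TERMINAL else max(q[(s2, b)] for b in ACTIONS)
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
    return q

q_table = replay(episodes)
for (s, a), value in sorted(q_table.items()):
    print(f"Q({s}, {a}) = {value}")
```

Because the update for (A, L) bootstraps from the current estimate for B, the order of the episodes matters; shuffling them generally gives a different table.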
2. For each sentence below, determine whether it is valid, unsatisfiable, or neither. Briefly explain your answers. You can also use the equivalence rules or truth tables to prove your answers.
Smoke ⇒ Smoke
Smoke ⇒ Fire
(Smoke ⇒ Fire) ⇒ (¬Smoke ⇒ ¬Fire)
Smoke ∨ Fire ∨ ¬Fire
((Smoke ∧ Heat) ⇒ Fire) ⇔ ((Smoke ⇒ Fire) ∨ (Heat ⇒ Fire))
Big ∨ Dumb ∨ (Big ⇒ Dumb)
(Big ∧ Dumb) ∨ ¬Dumb
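A truth table can be enumerated mechanically to double-check an answer. The following is a minimal sketch, not a required method: a sentence is valid if it is true in every row, unsatisfiable if it is true in no row, and neither otherwise. The helper names `implies` and `classify` and the three sample sentences are assumptions chosen for illustration (the last one is a made-up contradiction, not one of the exercise sentences).

```python
from itertools import product

def implies(p, q):
    """Material implication: p => q."""
    return (not p) or q

def classify(sentence, symbols):
    """Evaluate `sentence` on every truth assignment to `symbols`."""
    results = [
        sentence(**dict(zip(symbols, values)))
        for values in product([True, False], repeat=len(symbols))
    ]
    if all(results):
        return "valid"
    if not any(results):
        return "unsatisfiable"
    return "neither"

# Illustrative sentences, each paired with the symbols it mentions.
examples = {
    "Smoke => Smoke": (lambda Smoke: implies(Smoke, Smoke), ["Smoke"]),
    "Smoke => Fire": (lambda Smoke, Fire: implies(Smoke, Fire), ["Smoke", "Fire"]),
    "Smoke and not Smoke": (lambda Smoke: Smoke and not Smoke, ["Smoke"]),
}

for text, (fn, syms) in examples.items():
    print(f"{text}: {classify(fn, syms)}")
```

The table has 2^n rows for n symbols, which is fine for the two or three symbols used here.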
3. Consider the snapshot (shown to the right) of the Minesweeper game. Let A, B, C, and D be four propositional symbols representing the existence of mines in their locations.
[Figure: Minesweeper snapshot with cells labeled A, B, C, and D]
(a) Write down propositional-logic sentences according to the game rules.
(b) Convert the sentences to CNF.
(c) Determine the truth values of A, B, C, and D by repeatedly applying the resolution inference rule.
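Resolution on CNF clauses can be sketched in a few lines. The code below is an illustration of resolution refutation for a single-literal query, under loud assumptions: the clause set `kb` is hypothetical and does not encode the actual snapshot, literals are plain strings with `~` marking negation, and the helper names `negate`, `resolve`, and `entails` are made up for this example.

```python
from itertools import combinations

def negate(lit):
    """Flip a literal: 'A' <-> '~A'."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    """Return all non-tautological resolvents of two clauses (frozensets of literals)."""
    resolvents = []
    for lit in c1:
        if negate(lit) in c2:
            merged = (c1 - {lit}) | (c2 - {negate(lit)})
            # A clause containing both X and ~X is always true and can be dropped.
            if not any(negate(l) in merged for l in merged):
                resolvents.append(frozenset(merged))
    return resolvents

def entails(kb, query):
    """KB entails the literal `query` iff KB plus the negated query derives the empty clause."""
    clauses = set(kb) | {frozenset({negate(query)})}
    while True:
        new = set()
        for c1, c2 in combinations(clauses, 2):
            for r in resolve(c1, c2):
                if not r:            # empty clause: contradiction found
                    return True
                new.add(r)
        if new <= clauses:           # nothing new can be derived
            return False
        clauses |= new

# Hypothetical clauses (NOT the real snapshot): exactly one mine among A and B,
# exactly one mine among B and C, and C known to be clear.
kb = [
    frozenset({"A", "B"}), frozenset({"~A", "~B"}),
    frozenset({"B", "C"}), frozenset({"~B", "~C"}),
    frozenset({"~C"}),
]

for q in ("B", "~A", "A"):
    print(q, entails(kb, q))
```

For the real exercise, the clauses in `kb` would be replaced by the CNF sentences derived from the numbers visible in the snapshot.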
4. Here you will practice inference in the Wumpus world using first-order logic. Some notes about the model and symbols:
Treat Wumpus as a constant object. (There is only one Wumpus.)
Use variables to represent the squares.
Predicates that are unary relations of the squares: Breezy, Smelly, Pit.
Predicates that are binary relations of the squares: Adjacent.
Use the function symbol Position for the square occupied by an object. (Of course, the only object that can occupy a square is Wumpus.)
Rewrite the following rules of the Wumpus world as FOL sentences. You can use Breezy, Smelly, Pit, Safe, At, and Adjacent as the predicates. (Adjacent is a relation between two squares. At is a relation between an object and a square. Treat Wumpus as a constant object.)
A square with no Wumpus and no pit is safe. (This is the definition of the predicate Safe.)
A square is breezy if and only if it is adjacent to a pit.
A square is smelly if and only if it is adjacent to the Wumpus.
The Wumpus can only be in a square with no pit.
Given the following information, prove that the Wumpus is at square (2,2) using resolution in FOL. You need to convert all the relevant sentences to CNF first.
The square (1,1) is safe. The square (2,1) is smelly. The square (1,2) is smelly.
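Before writing the resolution proof, the target conclusion can be sanity-checked by brute force. The sketch below is not a resolution proof; it simply enumerates the squares where a single Wumpus could be, given the three facts. It assumes a 4x4 grid with coordinates starting at (1,1) (the problem does not state the grid size) and assumes adjacency means sharing an edge. The names `SIZE`, `adjacent`, and `candidates` are illustrative.

```python
from itertools import product

SIZE = 4  # assumption: 4x4 grid, squares indexed from (1, 1)
squares = list(product(range(1, SIZE + 1), repeat=2))

def adjacent(a, b):
    """Two squares are adjacent when they share an edge (no diagonals)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1

smelly = {(2, 1), (1, 2)}  # given: these squares are smelly
safe = {(1, 1)}            # given: this square is safe, so it holds no Wumpus

# The single Wumpus must be adjacent to every smelly square and cannot sit on a safe square.
candidates = [
    w for w in squares
    if w not in safe and all(adjacent(s, w) for s in smelly)
]
print(candidates)
```

Only squares adjacent to both (2,1) and (1,2) survive the smell constraint, and the safety of (1,1) rules out one of them, which mirrors the structure the resolution proof needs to follow.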