Homework 2 :Question 1: Minimax

Starting from:

~~$35~~

$29

Home

Question 1: Minimax

Consider the zero-sum game tree shown below. Triangles that point up, such as at the top node (root), represent choices for the maximizing player; triangles that point down represent choices for the minimizing player. Outcome values for the maximizing player are listed for each leaf node, represented by the values in squares at the bottom of the tree. Assuming both players act optimally, carry out the minimax search algorithm. Write the values for the letter nodes in your “answer.txt” like A1B2C3D4. Please pay attention to the format of your answer, otherwise the online judge may give wrong grades.

Question 2: Expectiminimax

Consider the game tree shown below. As in the previous problem, triangles that point up, such as the top node (root), represent choices for the maximizing player; triangles that point down represent choices for the minimizing player. The circular nodes represent chance nodes in which each of the possible actions may be taken with equal probability. The square nodes at the bottom represent leaf nodes. Assuming both players act optimally, carry out the expectiminimax search algorithm. Write the values for the letter nodes in your “answer.txt” like A15B12C6D3…G12. Please pay attention to the format of your answer, otherwise the online judge may give wrong grades.

Question 3: Unknown Leaf Value

Consider the following game tree, where one of the leaves has an unknown payoff, x. Player 1 moves first, and attempts to maximize the value of the game.

Each of the next 3 questions asks you to write a constraint on x specifying the set of values it can take. In your constraints, you can use the letter x, integers, and the symbols > and <. If x has no possible values, write 'None' . If x can take on all values, write 'All'. As an example, if you think x can take on all values larger than 16, you should enter x >16.

1) Assume Player 2 is a minimizing agent and Player 1 knows this. For what values of x is Player 1 guaranteed to choose Action 1?

2) Assume Player 2 chooses actions at random with each action having equal probability and Player 1 knows this. For what values of x is Player 1 guaranteed to choose Action 1?

3) Denote the minimax value of the tree as the value of the root when Player 1 is the maximizer and Player 2 is the minimizer. Denote the expectimax value of the tree as the value of the root when Player 1 is the maximizer and Player 2 chooses actions at random with equal probability. For what values of x is the minimax value of the tree worth more than the expectimax value of the tree?

4) Is it possible to have a game, where the minimax value is strictly larger than the expectimax value?

Question 4: Alpha-Beta Pruning

Consider the game tree shown below. Triangles that point up, such as at the top node (root), represent choices for the maximizing player; triangles that point down represent choices for the minimizing player. Assuming both players act optimally, use alpha-beta pruning to find the value of the root node. The search goes from left to right; when choosing which child to visit first, choose the left-most unvisited child.

In your “answer.txt” file, please write the values for the letter nodes in one line. And then write the leaf nodes that don't get visited due to pruning in the following line. Your answer to this question may look like:

A1B2C3D4

Hint: Note that the value of a node where pruning occurs is not necessarily the

maximum or minimum> (depending< on which node) of its children. When you prune on

conditions or , assume that the value of the node is .

Question 5: Possible Pruning

Assume we run − pruning, expanding successors from left to right, on a game with tree as shown in the figures below.

Hint: Perhaps the simplest check is as follows: pruning of children of a minimizer node is possible (for some assignment to the terminal nodes), when both of the following

are met: (i) the value of another child of has already been determined, conditions(ii)somewhere on the path from to the root node, there is a maximizer node

any such alternative option for the maximizer had a higher value than the value of the "another child" of for which the value was already determined.

For the following two parts, which of the statements are true? Please write your answer to each part in a line, like

A. There exists an assignment of utilities to the terminal nodes such that no pruning will be achieved (shown in Figure (a)).

B. There exists an assignment of utilities to the terminal nodes such that the pruning shown in Figure (b) will be achieved.

C. There exists an assignment of utilities to the terminal nodes such that the pruning shown in Figure (c) will be achieved.

D. None of the above.

A. There exists an assignment of utilities to the terminal nodes such that the pruning shown in Figure (d) will be achieved.

B. There exists an assignment of utilities to the terminal nodes such that the pruning shown in Figure (e) will be achieved.

C. There exists an assignment of utilities to the terminal nodes such that the pruning shown in Figure (f) will be achieved.

D. None of the above.

Question 6: Suboptimal Strategies

Player MAX and player MIN are playing a zero-sum game with a finite number of possible moves. MAX calculates the minimax value of the root to be . You may assume that at every turn, each player has at least 2 possible actions. You may also assume that a different sequence of moves will always lead to a different score (i.e., no two terminal nodes have the same score).

1) Which of the following statements are true?

to the minimax strategy, the outcome of the game could be less than .

A. Assume MIN is playing sub-optimally at every turn. If MAX plays according

B. Assume MIN is playing sub-optimally at every turn, but MAX does not know this. The outcome of the game could be larger than (i.e. better for MAX).

2) For this question, assume that MIN is playing randomly (with a uniform distribution) at every turn, and MAX knows this. Then which of the following statements are true?

A. There exists a policy for MAX such that MAX can guarantee a better outcome than .

B. There exists a policy for MAX such that MAX's expected outcome is better than .

C. To maximize his or her expected outcome, MAX should play according to the minimax strategy (i.e. the strategy that assumes MIN is playing optimally).

3) Which of the following statements are true?

than .

A. There exists a policy for MAX such that MAX can guarantee a better outcome

B. There exists a policy for MAX such that MAX's expected outcome is better than .

C. To maximize his or her expected outcome, MAX should play according to the minimax strategy (i.e. the strategy that assumes MIN is playing optimally).

4) Which of the following statements are true?

A. Assume MIN is playing sub-optimally at every turn, and MAX knows exactly how MIN will play. There exists a policy for MAX to guarantee a better

outcome than .

is playing sub-optimally at every turn

. MAX following the

Assume MIN

minimax policy will guarantee a better outcome than

Question 7: Rationality of Utilities

each of the

= [0.25, ; 0.3, ; 0.35, ; 0.1, ]

= 2

( )=5

the utility of this lottery,

( )=4

( )=3

( )

Consider a lottery

( )

, where the utility values of

Consider

lottery

where

and

outcomes are

. What is

is a

1 = [0.25, ; 0.75, 2]

( )=6

2 =

[0.5, ; 0.5, ]

( )=4 ( )=0

and

. Assuming

Assume

( ,1)

, where

rational

lottery, and

. What is the utility of the

≻≻

= [0.5, ; 0.5, ]

≻

first lottery,

≻

preferences, which of the following statements are guaranteed to be true?

≻

Question 8: Certainty Equivalent Values

Consider the utility function shown below.

i.e., what is

such

[0.6, $0; 0.4, $100]

Under the above utility function, what is the certainty equivalent monetary value in

Hint: Keep

($ ) = ([0.6, $0; 0.4, $100])

dollars ($) of the lottery

( +(1−))

that

([, ;1−, ])

in mind that

is not equal to

of radius 1,

Our Pacman board now has food pellets of 3 different sizes – pellet

of radius 2 and

of radius 3. In different moods, Pacman 1has different

preferences among these pellets. In each of the following questions, you are given

Pacman's preference for the different pellets. From among the options pick the utility

( )

functions that are consistent with Pacman's preferences, where each utility function

is given as a function of the pellet radius, and is defined over non-negative values

∼2∼3

of .

( )

= 0

( )

= 3

( )

=2 +4

( )

= −

( )

= 2

( )

= − 2

( )

= √

( )

= −√

≺2≺3

Irrational preferences

B. ( )=0

( )=3

( ) =

( )=2 +4

( ) = −

()= 2

( )

= −

( )

= √

( )=−√

≻2≻3

Irrational preferences

B. ( )=0

( )=3

( ) =

( )=2 +4

( ) = −

()= 2

( )

= −

( )

= √

( )=−√

Irrational preferences

(1≺ 2

≺ 3)

∧ ( 2

≺ (50 − 50 lottery among 1

and 3

))

( )

= 0

( )=3

( )

=2 +4

( )

= −

( )

= 2

( )

= −

( )

= √

( )=−√

∧ ( 2

≻ (50 − 50 lottery among 1

and 3

))

(1≻ 2

≻ 3)

Irrational preferences

B. ( )=0

( )=3

( )

=2 +4

( )

= −

( )

= 2

( )

= −

( )

= √

( )=−√

≺ 3) ∧ ((50 − 50 lottery among 2 and 3) ≺ (50 −

( 1≺ 2)∧( 2

Irrational preferences

50 lottery among 1 and 2))

( )

= 0

( )

= 3

( )

=2 +4

( )

= −

( )

= 2

( )

= −

( )

= √

( )=−√

Irrational preferences

7) Which of the following would be a utility function for a risk-seeking preference?

That is, for which utility(s) would Pacman prefer entering a lottery for a random

( )=0

over receiving a pellet of size ?

food pellet, with expected size

( )=3

( ) =

( )=2 +4

( ) = −

( ) =

( )

= −

(

)

= √

( )=−√