Artificial Intelligence Homework 3

Question 1: Designing a Bayesian Network [15]

Marv the cat is having a bad day. His brother Harry ate all the food set out by their owner, Shannon, so Marv has to find a way to feed himself. He can try to catch a fish in the lake outside, and if he succeeds, he’ll eat it. If it’s a hot day, Marv will be sluggish, so he’s less likely to catch a fish. Marv can also try to steal Shannon’s sandwich, which does not depend on the outside temperature. However, even if he succeeds in stealing the sandwich, he might not get to eat it (for example, Shannon may notice and snatch it back). Finally, if Marv manages to eat at least something, he might feel content, in spite of everything. However, if it’s hot out, he is less likely to feel content in general.

Consider the Boolean variables: H (it’s a hot day), C (Marv is content), E (Marv eats at least one item), F (Marv catches a fish), S (Marv steals the sandwich).

    a. Draw a Bayesian network for this domain. Only include the Boolean variables listed above, so your network should have 5 nodes.

    b. Suppose the probability that Marv catches the fish is x when it’s hot, and y when it is not. Give the conditional probability table associated with F.

    c. Suppose that if Marv catches a fish, he will eat it with probability 1, and if he successfully steals the sandwich, he will eat it with probability 0.5. If he fails at both hunting and stealing, then he will not eat anything. Give the conditional probability table associated with E.

    d. Suppose Marv is content. Write down the expression for the probability he caught a fish, in terms of the various conditional probabilities in the network.
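
One possible reading of part c is a noisy-OR: each item Marv obtains is eaten with its own probability, and E is true if at least one item is eaten. The sketch below builds the table for E under that reading, assuming the two eating events are independent given F and S (an assumption, not something the problem states). The names `p_eats` and `cpt_e` are illustrative.

```python
from itertools import product

# Probability that each obtained item is actually eaten (from part c).
P_EAT_GIVEN_FISH = 1.0
P_EAT_GIVEN_SANDWICH = 0.5


def p_eats(fish: bool, sandwich: bool) -> float:
    """P(E = true | F = fish, S = sandwich) under a noisy-OR reading.

    Assumption: eating the fish and eating the sandwich are independent
    events once F and S are known.
    """
    p_eats_nothing = 1.0
    if fish:
        p_eats_nothing *= 1.0 - P_EAT_GIVEN_FISH
    if sandwich:
        p_eats_nothing *= 1.0 - P_EAT_GIVEN_SANDWICH
    return 1.0 - p_eats_nothing


def cpt_e() -> dict:
    """Map each (F, S) combination to P(E = true | F, S)."""
    return {(f, s): p_eats(f, s) for f, s in product([True, False], repeat=2)}


if __name__ == "__main__":
    for (f, s), p in cpt_e().items():
        print(f"F={f!s:5} S={s!s:5} P(e | F, S) = {p}")
```

Because the fish is always eaten once caught, the independence assumption only matters in the row where both F and S are true, and there the result is 1 either way.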
Question 2: Inference in Bayesian Networks [25]
Consider the following Bayesian network:


    R: rush hour
    B: bad weather
    A: accident
    T: traffic jam
    S: sirens






We denote random variables with capital letters (e.g., R for “rush hour”), and the binary outcomes with lowercase letters (e.g., r and ¬r for “it is rush hour” and “it is not rush hour,” respectively).

The network has the following parameters:

P(b) = 0.4
P(r) = 0.2

P(t|r,b,a) = 0.95
P(t|r,¬b,a) = 0.9
P(t|r,b,¬a) = 0.88
P(t|r,¬b,¬a) = 0.83
P(t|¬r,b,a) = 0.6
P(t|¬r,b,¬a) = 0.3
P(t|¬r,¬b,a) = 0.7
P(t|¬r,¬b,¬a) = 0.05

P(s|a) = 0.92
P(s|¬a) = 0.3

P(a|b) = 0.65
P(a|¬b) = 0.25

Compute the following terms using basic axioms of probability and the conditional independence properties encoded in the above graph.
    a. P(a, ¬r)

    b. P(b, a)

For the query P(b|a):

    c. Use Bayes Ball to determine the set of nodes that can be pruned from the graph.

    d. Compute P(b|a) using the simplifications determined in Part c.
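
As a way to check hand computations, the parameters above can be encoded and queried by brute-force enumeration. This is a minimal sketch, not the intended solution method. It assumes the structure implied by the listed conditional probabilities (B and R have no parents, A depends on B, T depends on R, B and A, S depends on A); the function names `joint`, `prob` and `conditional` are illustrative.

```python
from itertools import product

P_B = 0.4
P_R = 0.2
P_A_GIVEN_B = {True: 0.65, False: 0.25}
P_S_GIVEN_A = {True: 0.92, False: 0.3}
# Keys are (r, b, a); values are P(t | r, b, a).
P_T_GIVEN_RBA = {
    (True, True, True): 0.95,
    (True, False, True): 0.9,
    (True, True, False): 0.88,
    (True, False, False): 0.83,
    (False, True, True): 0.6,
    (False, True, False): 0.3,
    (False, False, True): 0.7,
    (False, False, False): 0.05,
}
VARIABLES = ("r", "b", "a", "t", "s")


def bern(p_true, value):
    """Probability of a binary outcome given P(outcome is true)."""
    return p_true if value else 1.0 - p_true


def joint(r, b, a, t, s):
    """Joint probability under the assumed factorization
    P(r) P(b) P(a|b) P(t|r,b,a) P(s|a)."""
    return (bern(P_R, r) * bern(P_B, b) * bern(P_A_GIVEN_B[b], a)
            * bern(P_T_GIVEN_RBA[(r, b, a)], t) * bern(P_S_GIVEN_A[a], s))


def prob(**evidence):
    """P(evidence), summing the joint over all unspecified variables."""
    total = 0.0
    for values in product([True, False], repeat=len(VARIABLES)):
        world = dict(zip(VARIABLES, values))
        if all(world[name] == value for name, value in evidence.items()):
            total += joint(**world)
    return total


def conditional(query, given):
    """P(query | given) from the definition of conditional probability."""
    return prob(**query, **given) / prob(**given)
```

For example, `prob(a=True, r=False)` evaluates the term in part a, and `conditional({"b": True}, {"a": True})` evaluates the query in part d. Enumeration ignores the pruning from part c, so it is only useful as a cross-check.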
Question 3: Variable Elimination [25]

For the Bayesian network in Question 2, compute the MAP result of the query P(T|b) using variable elimination with the following elimination order: B=b, R, A, S, T.

Clearly explain each step. For each of the intermediate factors created, explain what probabilistic function it represents.
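
The mechanics of variable elimination can be sketched with three factor operations: restrict to evidence, multiply, and sum out. The code below is a minimal illustration under the same assumed structure as the Question 2 parameters (B and R without parents, A depending on B, T on R, B and A, S on A). The factor representation (a tuple of variable names plus a table) and all function names are choices made for this sketch.

```python
from itertools import product

BOOLS = (True, False)
# Keys are (r, b, a); values are P(t | r, b, a).
P_T = {
    (True, True, True): 0.95, (True, False, True): 0.9,
    (True, True, False): 0.88, (True, False, False): 0.83,
    (False, True, True): 0.6, (False, True, False): 0.3,
    (False, False, True): 0.7, (False, False, False): 0.05,
}


def bern(p_true, value):
    return p_true if value else 1.0 - p_true


def make_factor(variables, fn):
    """A factor is (variable names, table keyed by value tuples)."""
    table = {vals: fn(*vals) for vals in product(BOOLS, repeat=len(variables))}
    return (tuple(variables), table)


def restrict(factor, var, value):
    """Fix var to an observed value and drop it from the factor."""
    variables, table = factor
    if var not in variables:
        return factor
    i = variables.index(var)
    kept = {v[:i] + v[i + 1:]: p for v, p in table.items() if v[i] == value}
    return (variables[:i] + variables[i + 1:], kept)


def multiply(f, g):
    """Pointwise product over the union of both factors' variables."""
    (fv, ft), (gv, gt) = f, g
    variables = fv + tuple(v for v in gv if v not in fv)
    table = {}
    for vals in product(BOOLS, repeat=len(variables)):
        world = dict(zip(variables, vals))
        table[vals] = (ft[tuple(world[v] for v in fv)]
                       * gt[tuple(world[v] for v in gv)])
    return (variables, table)


def sum_out(factor, var):
    """Marginalize var out of the factor."""
    variables, table = factor
    i = variables.index(var)
    summed = {}
    for vals, p in table.items():
        key = vals[:i] + vals[i + 1:]
        summed[key] = summed.get(key, 0.0) + p
    return (variables[:i] + variables[i + 1:], summed)


def eliminate(factors, var):
    """Multiply every factor that mentions var, then sum var out."""
    touching = [f for f in factors if var in f[0]]
    rest = [f for f in factors if var not in f[0]]
    if not touching:
        return rest
    combined = touching[0]
    for f in touching[1:]:
        combined = multiply(combined, f)
    return rest + [sum_out(combined, var)]


def posterior_t_given_b():
    """P(T | b) with the order B=b, R, A, S, T (assumed structure)."""
    factors = [
        make_factor(["b"], lambda b: bern(0.4, b)),
        make_factor(["r"], lambda r: bern(0.2, r)),
        make_factor(["a", "b"], lambda a, b: bern(0.65 if b else 0.25, a)),
        make_factor(["t", "r", "b", "a"], lambda t, r, b, a: bern(P_T[(r, b, a)], t)),
        make_factor(["s", "a"], lambda s, a: bern(0.92 if a else 0.3, s)),
    ]
    factors = [restrict(f, "b", True) for f in factors]
    for var in ("r", "a", "s"):
        factors = eliminate(factors, var)
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    total = sum(result[1].values())
    return {vals[0]: p / total for vals, p in result[1].items()}
```

Calling `posterior_t_given_b()` returns a dictionary over the two values of T; `max(post, key=post.get)` on that dictionary picks the MAP value. Printing the intermediate factor after each `eliminate` call is one way to see which probabilistic function each step produces.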
Question 4: Learning with Bayesian Networks [35]

Consider the following Bayesian network. Assume that the variables are distributed according to Bernoulli distributions.












    a. We are given the following dataset with 129 samples, from which we will estimate the parameters of the model.


    A    B    C    D    # Instances
    0    0    0    0    22
    0    0    0    1    5
    0    0    1    0    18
    0    0    1    1    3
    0    1    0    0    14
    0    1    0    1    2
    0    1    1    0    9
    0    1    1    1    10
    1    0    0    0    12
    1    0    0    1    0
    1    0    1    0    8
    1    0    1    1    0
    1    1    0    0    0
    1    1    0    1    9
    1    1    1    0    13
    1    1    1    1    4

    i. Enumerate the parameters that must be learned. Specify the parameter name and the probability that it represents (i.e., for each parameter, write something of the form <parameter name> = P(<event>)).

    ii. Give the maximum likelihood estimate for each parameter.

    iii. Give the maximum a posteriori (MAP) estimate for each parameter after applying Laplace smoothing.
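
Both estimates reduce to ratios of counts from the table. The sketch below shows the pattern for a Bernoulli parameter P(child = 1 | parents). The parent sets depend on the network structure, so they are left as arguments; the function names and the add-one form of Laplace smoothing (one pseudo-count for each of the two outcomes) are assumptions of this sketch.

```python
# Dataset rows: (A, B, C, D) -> number of instances.
COUNTS = {
    (0, 0, 0, 0): 22, (0, 0, 0, 1): 5, (0, 0, 1, 0): 18, (0, 0, 1, 1): 3,
    (0, 1, 0, 0): 14, (0, 1, 0, 1): 2, (0, 1, 1, 0): 9, (0, 1, 1, 1): 10,
    (1, 0, 0, 0): 12, (1, 0, 0, 1): 0, (1, 0, 1, 0): 8, (1, 0, 1, 1): 0,
    (1, 1, 0, 0): 0, (1, 1, 0, 1): 9, (1, 1, 1, 0): 13, (1, 1, 1, 1): 4,
}
NAMES = ("A", "B", "C", "D")


def count(**fixed):
    """Number of instances matching the given values, e.g. count(A=1, B=0)."""
    return sum(
        n for row, n in COUNTS.items()
        if all(row[NAMES.index(name)] == value for name, value in fixed.items())
    )


def mle(child, parents=None):
    """Maximum likelihood estimate of P(child = 1 | parents)."""
    parents = parents or {}
    return count(**{child: 1}, **parents) / count(**parents)


def laplace(child, parents=None, alpha=1.0):
    """Smoothed estimate: alpha pseudo-counts for each of the two outcomes."""
    parents = parents or {}
    return (count(**{child: 1}, **parents) + alpha) / (count(**parents) + 2 * alpha)
```

For instance, `mle("A")` is the estimate for a variable with no parents, and `mle("B", {"A": 1})` is the estimate for one row of a table in which A is a parent of B (a hypothetical parent set used only to show the call).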

    b. Assume that in addition to the data in the table above, you are given the following incomplete data instances:


          A    B    C    D    # Instances
    S1    1    ?    1    ?    10
    S2    1    1    ?    0    10

We will apply the (soft) EM algorithm on these instances. Initialize the model using your parameter estimates from Part a., Subpart ii. (i.e., use the MLE).

    i. Show the computation of the first E-step, providing the weights for each possible assignment of the incomplete data for each sample.

    ii. What are the parameters obtained for the first M-step? Weight each of the samples from the original dataset and the 20 new samples equally (i.e., you now have 149 samples).

    iii. Show the computation of the second E-step.
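
The E-step for one incomplete instance amounts to scoring every completion of the missing values under the current model and normalizing. The sketch below shows that computation for an arbitrary joint function. `toy_joint` and the values in `TOY_THETA` are made up for illustration (a fully independent model, not the network or the estimates from this question); in practice the joint would be the product of the network's conditional probabilities.

```python
from itertools import product

# Placeholder parameters P(X = 1) for a fully independent toy model.
# These numbers are invented for the example, not estimated from the data.
TOY_THETA = {"A": 0.5, "B": 0.6, "C": 0.5, "D": 0.25}


def toy_joint(**assignment):
    """Joint probability of a complete 0/1 assignment under the toy model."""
    p = 1.0
    for name, value in assignment.items():
        p *= TOY_THETA[name] if value == 1 else 1.0 - TOY_THETA[name]
    return p


def e_step_weights(joint, observed):
    """Posterior weight of each completion of a partially observed instance.

    `observed` maps variable names to 0, 1, or None (missing). The result
    maps each completed row (in the order of `observed`) to its weight.
    """
    names = list(observed)
    missing = [name for name in names if observed[name] is None]
    scores = {}
    for fill in product([0, 1], repeat=len(missing)):
        full = dict(observed)
        full.update(zip(missing, fill))
        scores[tuple(full[name] for name in names)] = joint(**full)
    total = sum(scores.values())
    return {row: p / total for row, p in scores.items()}


if __name__ == "__main__":
    # Same missing pattern as S1: A = 1, C = 1, with B and D unobserved.
    s1 = {"A": 1, "B": None, "C": 1, "D": None}
    for row, weight in e_step_weights(toy_joint, s1).items():
        print(row, round(weight, 4))
```

In the M-step, each weight is multiplied by the number of instances with that missing pattern (10 for S1 and 10 for S2) and added to the counts from the complete data before re-estimating the parameters.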
