$24
All questions have multiple-choice answers ([a], [b], [c], ...). You can collaborate with others, but do not discuss the selected or excluded choices in the answers. You can consult books and notes, but not other people’s solutions. Your solutions should be based on your own work. De nitions and notation follow the lectures.
Note about the homework
The goal of the homework is to facilitate a deeper understanding of the course material. The questions are not designed to be puzzles with catchy answers. They are meant to make you roll up your sleeves, face uncertainties, and ap-proach the problem from di erent angles.
The problems range from easy to di cult, and from practical to theoretical. Some problems require running a full experiment to arrive at the answer.
The answer may not be obvious or numerically close to one of the choices, but one (and only one) choice will be correct if you follow the instructions precisely in each problem. You are encouraged to explore the problem further by experimenting with variations on these instructions, for the learning bene t.
You are also encouraged to take part in the forum http://book.caltech.edu/bookforum
where there are many threads about each homework set. We hope that you will contribute to the discussion as well. Please follow the forum guidelines for posting answers (see the \BEFORE posting answers" announcement at the top there).
c 2012-2015 Yaser Abu-Mostafa. All rights reserved. No redistribution in any format. No translation or derivative products without written permission.
1
2 2N
Generalization Error
The modi ed Hoe ding Inequality provides a way to characterize the general-ization error with a probabilistic bound
P [jEin(g) Eout(g)j ] 2M e 2 2N
for any 0. If we set = 0:05 and want the probability bound 2M e
to be at most 0:03, what is the least number of examples N (among the given choices) needed for the case M = 1?
500
1000
1500
2000
More examples are needed.
Repeat for the case M = 10.
500
1000
1500
2000
More examples are needed.
Repeat for the case M = 100.
500
1000
1500
2000
More examples are needed.
Break Point
As shown in class, the (smallest) break point for the Perceptron Model in the two-dimensional case (R2) is 4 points. What is the smallest break point for the Perceptron Model in R3? (i.e., instead of the hypothesis set consisting of separating lines, it consists of separating planes.)
2
4
5
6
7
8
Growth Function
Which of the following are possible formulas for a growth function mH(N):
i)
1
+ N
iv) 2bN=2c
ii)
1
+ N +
N2
v) 2N
p
N
N
M
iii)
P
c
i
ib=1
m = 0 when m M.
where buc is the biggest integer u, and
i, v
i, ii, v
i, iv, v
i, ii, iii, v
i, ii, iii, iv, v
Fun with Intervals
Consider the \2-intervals" learning model, where h: R ! f 1; +1g and h(x) =
+1 if the point is within either of two arbitrarily chosen intervals and 1 oth-erwise. What is the (smallest) break point for this hypothesis set?
3
4
5
6
7
Which of the following is the growth function mH (N) for the \2-intervals" hypothesis set?
3
[a]
N+1
+ 1
[b]
N
2
4
+1
[c]
4
+
N
2
+ 1
N+1
+1
[d]
N+1
+
N+1
+
N+1
+
N+1
+ 1
4
3
2
1
None of the above
Now, consider the general case: the \M-intervals" learning model. Again h : R ! f 1; +1g, where h(x) = +1 if the point falls inside any of M arbitrarily chosen intervals, otherwise h(x) = 1. What is the (smallest) break point of this hypothesis set?
M
M + 1
M2
2M +1
2M 1
Convex Sets: The Triangle
Consider the \triangle" learning model, where h : R2 ! f 1; +1g and h(x) =
+1 if x lies within an arbitrarily chosen triangle in the plane and 1 otherwise. Which is the largest number of points in R2 (among the given choices) that can be shattered by this hypothesis set?
1
3
5
7
9
Non-Convex Sets: Concentric Circles
Compute the growth function mH(N) for the learning model made up of two concentric circles in R2. Speci cally, H contains the functions which are +1 for
a2 x21 + x22 b2
and 1 otherwise. The growth function is
4
[a]
N + 1
[b]
N+1
+ 1
[c]
N+1
+ 1
2
3
2N2+1
None of the above
5