Provide credit to any sources other than the course staff that helped you solve the problems.
This includes all students you talked to regarding the problems.
You may look up definitions/basics online (e.g., Wikipedia, Stack Exchange, etc.).
Problem 1. (10 points). Suppose AdaBoost is run on $n$ training examples, and suppose that on each round the weighted training error $\epsilon_t$ of the $t$-th weak hypothesis is at most $\frac{1}{2} - \gamma$, for some number $\gamma > 0$. Show that after $T > \frac{\ln n}{2\gamma^2}$ rounds of AdaBoost the final combined classifier has zero training error!
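Hint: one possible route uses the standard bound from the usual AdaBoost analysis (not restated in this handout), which controls the training error of the final combined classifier $H$ by
\[
\frac{1}{n}\sum_{i=1}^{n} \mathbf{1}\{H(x_i) \neq y_i\} \;\le\; \prod_{t=1}^{T} 2\sqrt{\epsilon_t(1-\epsilon_t)} \;\le\; \left(1 - 4\gamma^2\right)^{T/2} \;\le\; e^{-2\gamma^2 T},
\]
so it suffices to check that $e^{-2\gamma^2 T} < 1/n$ whenever $T > \frac{\ln n}{2\gamma^2}$: a training error strictly below $1/n$ must be exactly zero.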
Problem 2. (10 points). Recall bagging. Starting from a training set $S$ of size $n$, we create $m$ bootstrap training sets $S_1, \ldots, S_m$, each of size $n$, by sampling with replacement from $S$.
1. For a bootstrap sample $S_i$, what is the expected fraction of the training set that does not appear at all in $S_i$? As $n \to \infty$, what does this fraction approach?
2. Let $m > 2 \ln n$ and $n \to \infty$. Show that the expected number of training examples in $S$ that appear in at least one $S_i$ is more than $n - 1$.
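A minimal simulation sketch for sanity-checking your answer to part 1 (assuming Python with NumPy; the function name missing_fraction is illustrative, not from the course materials):

import numpy as np

def missing_fraction(n, rng):
    """Draw one bootstrap sample of size n from {0, ..., n-1} and
    return the fraction of original indices that never appear."""
    sample = rng.integers(0, n, size=n)  # sampling with replacement
    return 1.0 - np.unique(sample).size / n

rng = np.random.default_rng(0)
n = 10_000
estimate = np.mean([missing_fraction(n, rng) for _ in range(200)])
print(f"empirical missing fraction: {estimate:.4f}")  # should be close to (1 - 1/n)^n
print(f"e^-1 for comparison:        {np.exp(-1):.4f}")

The expected missing fraction is $(1 - 1/n)^n$, which tends to $e^{-1}$ as $n \to \infty$; the simulation only spot-checks that derivation, it does not replace it.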
Problem 3. (10 points). The tanh function is $\tanh(y) = (e^y - e^{-y})/(e^y + e^{-y})$. Consider the function $\tanh(w_0 + w_1 x_1 + w_2 x_2)$, with five inputs and a scalar output.
1. Draw the computational graph of the function.
2. What is the derivative of $\tanh(y)$ with respect to $y$?
3. Suppose $(w_0, w_1, w_2, x_1, x_2) = (-2, 3, 1, 2, 3)$. Compute the forward function values and the back-propagation of gradients.
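A minimal sketch (in Python, which the attached notebook presumably uses) for checking the forward pass and the back-propagated gradients of part 3 numerically:

import math

# Input values from part 3 of the problem.
w0, w1, w2, x1, x2 = -2.0, 3.0, 1.0, 2.0, 3.0

# Forward pass: y = w0 + w1*x1 + w2*x2, then out = tanh(y).
y = w0 + w1 * x1 + w2 * x2  # -2 + 6 + 3 = 7
out = math.tanh(y)

# Backward pass: d tanh(y)/dy = 1 - tanh(y)^2; the chain rule then
# distributes this local gradient to each input of the linear sum.
dy = 1.0 - out ** 2
grads = {
    "w0": dy * 1.0,  # dy/dw0 = 1
    "w1": dy * x1,   # dy/dw1 = x1
    "w2": dy * x2,   # dy/dw2 = x2
    "x1": dy * w1,   # dy/dx1 = w1
    "x2": dy * w2,   # dy/dx2 = w2
}
print(f"y = {y}, tanh(y) = {out:.6f}")
print(grads)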
Problem 4. (30 points). Please see attached notebook for details.