Assignment #3 Solution

Note 1: Your submission header must follow the format shown in the rounded rectangle above.

Note 2: Homework is to be done individually. You may discuss the homework problems with your fellow students, but you are NOT allowed to copy – either in part or in whole – anyone else’s answers.

Note 3: Your deliverable should be a .pdf file submitted through Gradescope by the deadline. Do not forget to assign a page to each of your answers when making a submission. In addition, your source code (.py files) should be added to an online repository (e.g., GitHub) so that it can be downloaded and executed later.

Note 4: All submitted materials must be legible. Figures/diagrams must have good quality.
Note 5: Please use and check the Canvas discussion for further instructions, questions, answers, and hints.

    1. [15 points] Consider the 1-D dataset below and the following bootstrap samples (bagging rounds 1 to 5) randomly generated during the bagging process. Show how a bagging algorithm can perfectly classify this data by drawing and writing the decision stumps for each round, the summary table of the trained decision stumps, and the combination table of your base classifiers with the final predictions. Hint: you might need to test alternative but equally accurate decision stumps on your training set to reach maximum accuracy.

Dataset:
  x:  1   2   3   4   5
  y: -1  -1   1   1  -1


Round 1:  x = 1, 1, 2, 4, 5







Round 2:  x = 3, 3, 4, 4, 5







Round 3:  x = 1, 2, 2, 5, 5







Round 4:  x = 1, 3, 4, 4, 5







Round 5:  x = 1, 2, 3, 3, 4


    2. [12 points] Consider the different 1-D dataset below and the following rounds 1 to 3 randomly generated during the boosting process. Show how a boosting algorithm can perfectly classify this data by drawing and writing the decision stumps and weights for each round, the summary table of the trained decision stumps, and the combination table of your base classifiers with the weighted final predictions. Hint: there is a single best (most accurate) decision stump for each round.







Dataset:
  x:  1   2   3   4   5
  y:  1   1  -1  -1   1


Round 1:  x = 1, 2, 3, 4, 4






Round 2:  x = 5, 5, 5, 5, 5








Round 3:  x = 3, 3, 4, 4, 5



    3. [15 points] Complete the Python program (bagging_random_forest.py) that will read the file optdigits.tra (3,823 samples), which includes training instances of optically recognized handwritten digits. Read the file optdigits.names to get detailed information about this dataset. Also, check the files optdigits-orig.tra and optdigits-orig.names to see the original format of this data and how it was transformed to speed up the learning process (pre-processing phase). Your goal is to build a base classifier using a single decision tree, an ensemble classifier that combines multiple decision trees, and a Random Forest classifier to recognize those digits. To test the accuracy of those distinct models, you will use the file optdigits.tes (1,797 samples).
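As a rough sketch of the comparison this question asks for, the snippet below uses scikit-learn's built-in digits data (derived from the same UCI optdigits source) as a stand-in, since the course files are not reproduced here; the actual bagging_random_forest.py must read optdigits.tra and optdigits.tes instead, and the hyperparameters shown are illustrative assumptions:

```python
# Compare a single decision tree, a bagged ensemble of trees, and a random
# forest on digit data, as a stand-in for the optdigits.tra/.tes workflow.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "single tree":   DecisionTreeClassifier(random_state=0),
    "bagged trees":  BaggingClassifier(DecisionTreeClassifier(),
                                       n_estimators=20, random_state=0),
    "random forest": RandomForestClassifier(n_estimators=20, random_state=0),
}
accs = {name: m.fit(X_tr, y_tr).score(X_te, y_te) for name, m in models.items()}
for name, acc in accs.items():
    print(f"{name}: {acc:.3f}")
```

Typically the two ensembles clearly beat the single tree on this data, which is the effect the assignment wants you to observe.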

    4. [20 points] Say you are given the training dataset shown below. This is a binary classification task in which the instances are described by two integer-valued attributes.

(Training dataset figure not reproduced here: black and yellow circles plotted on two integer-valued attributes.)
    a. [2 points] Draw the decision boundary and its parallel hyperplanes for a linear SVM with maximum margin (hard margin formulation) and identify the support vectors.

    b. [2 points] If a black circle is added as a training sample in the position (7,5), does this affect the previously learned decision boundary? Explain why.

    c. [2 points] If a yellow circle is added as a training sample in the position (4,2), does this affect the previously learned decision boundary? Explain why.

    d. [2 points] If a black circle is added as a test sample in the position (7,5), will this sample be classified correctly according to the previously learned decision boundary? Explain why.

    e. [2 points] If a black circle is added as a test sample in the position (6,4), will this sample be classified correctly according to the previously learned decision boundary? Explain why.

    f. [2 points] If a yellow circle is added as a test sample in the position (4,2), will this sample be classified correctly according to the previously learned decision boundary? Explain why.

    g. [2 points] If a yellow circle is added as a test sample in the position (5,3), will this sample be classified correctly according to the previously learned decision boundary? Explain why.

    h. [2 points] If a black circle is added as a test sample in the position (5,3), will this sample be classified correctly according to the previously learned decision boundary? Explain why.

    i. [2 points] If a yellow circle is added as a test sample in the position (6,4), will this sample be classified correctly according to the previously learned decision boundary? Explain why.

    j. [2 points] If a black circle is added as a training sample in the position (4,4), how will this affect the decision boundary when C = 1 and when C = ∞? Consider the soft-margin formulation.
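Part (j) can be explored numerically. Since the assignment's figure is not reproduced here, the points below are hypothetical stand-ins; the sketch only illustrates the general soft-margin effect that lowering C can widen (and never narrows) the margin relative to a near-hard-margin fit:

```python
import numpy as np
from sklearn.svm import SVC

# HYPOTHETICAL stand-in for the figure's points (labels chosen for illustration)
X = np.array([[1, 1], [2, 1], [1, 2],     # one class (say, yellow circles)
              [5, 5], [6, 5], [5, 6],     # the other class (black circles)
              [4, 4]])                    # the newly added training sample
y = np.array([-1, -1, -1, 1, 1, 1, -1])   # new point labeled with the -1 class

margins = {}
for C in (1.0, 1e6):                      # C = 1 vs. an approximation of C = inf
    clf = SVC(kernel="linear", C=C).fit(X, y)
    margins[C] = 2 / np.linalg.norm(clf.coef_[0])   # width of the margin
    print(f"C={C:g}: margin width = {margins[C]:.3f}, "
          f"{len(clf.support_)} support vectors")
```

For your answer, reason about whether the added point lies inside the margin of the original boundary: with C = ∞ no violation is tolerated, while a finite C trades margin violations against margin width.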


    5. [11 points] Consider the following 1-dimensional data with two classes:
(1-D dataset figure not reproduced here: points on a number line, each labeled + or −.)
            a. [3 points] Find the decision boundary of a linear SVM on this data (hard-margin formulation) and identify the support vectors (write the coordinates in your answers).

            b. [4 points] Find the solution parameters w and b for this linear SVM and the width of the margin. Hint: place the identified support vectors (positive and negative) into the formula y (w · x + b) = 1, since you know this formula holds for them.

            c. [4 points] Suppose we remove the point (1,+) from this training set and train the SVM again. Find the new values of the solution parameters w and b and the width of the margin.
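For parts (b) and (c), the 1-D arithmetic from the hint has a simple closed form, sketched below; the support-vector positions used are hypothetical placeholders (the dataset figure is not reproduced here), so substitute the ones you identify in part (a):

```python
# For a 1-D hard-margin SVM whose nearest opposite-class points (the support
# vectors) are x_neg and x_pos, the conditions w*x_pos + b = +1 and
# w*x_neg + b = -1 determine w and b directly.
def svm_1d(x_neg, x_pos):
    w = 2 / (x_pos - x_neg)                  # subtract the two conditions
    b = -(x_pos + x_neg) / (x_pos - x_neg)   # back-substitute into either one
    margin = 2 / abs(w)                      # margin width = x_pos - x_neg
    return w, b, margin

# HYPOTHETICAL support vectors, for illustration only:
w, b, margin = svm_1d(x_neg=3.0, x_pos=5.0)
# sanity check: the support-vector conditions hold
assert abs(w * 5.0 + b - 1) < 1e-9 and abs(w * 3.0 + b + 1) < 1e-9
```

Removing a point, as in part (c), changes the answer only if that point was a support vector; otherwise w, b, and the margin are unchanged.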


    6. [12 points] The quadratic kernel K(x, y) = (x · y + 1)² should be equivalent to mapping each x into a six-dimensional space where

Φ(x) = (x1², x2², √2 x1x2, √2 x1, √2 x2, 1)

for the case where x = (x1, x2). Demonstrate this equivalence by answering the following questions while using the data points a = (1,2) and b = (2,4).
        a. Φ(a)

        b. Φ(b)

        c. Φ(a) · Φ(b)

        d. K(a, b). Hint: your answers for (c) and (d) should be the same. By using the kernel function, the SVM “cheats” and performs significantly fewer calculations. This is known as the “kernel trick”.
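The equivalence in (c) and (d) can be verified numerically for a = (1,2) and b = (2,4):

```python
import math

def phi(x1, x2):
    """Explicit six-dimensional feature map for the quadratic kernel."""
    r2 = math.sqrt(2)
    return (x1**2, x2**2, r2 * x1 * x2, r2 * x1, r2 * x2, 1)

def K(a, b):
    """Quadratic kernel computed directly in the input space."""
    return (a[0] * b[0] + a[1] * b[1] + 1) ** 2

a, b = (1, 2), (2, 4)
dot = sum(p * q for p, q in zip(phi(*a), phi(*b)))
print(dot, K(a, b))   # both equal 121
```

The kernel evaluates one 2-D dot product and squares it, instead of building two 6-D vectors and taking their dot product, which is the saving the “kernel trick” refers to.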


    7. [15 points] Complete the Python program (svm.py) that will also read the file optdigits.tra to build multiple SVM classifiers. You will simulate a grid search, trying to find which combination of four SVM hyperparameters (c, degree, kernel, and decision_function_shape) leads to the best prediction performance. To test the accuracy of those distinct models, you will also use the file optdigits.tes. You should update and print the accuracy, together with the hyperparameters, whenever it improves.
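Such a grid search can be sketched as follows, again using scikit-learn's digits data as a stand-in for the course files (svm.py must read optdigits.tra/.tes instead, and the grid values below are illustrative assumptions):

```python
# Grid search over four SVC hyperparameters, printing only when the test
# accuracy improves, as the question requires.
from itertools import product
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

best_acc = 0.0
for c, degree, kernel, shape in product([1, 10], [1, 2],
                                        ["linear", "rbf", "poly"],
                                        ["ovo", "ovr"]):
    clf = SVC(C=c, degree=degree, kernel=kernel, decision_function_shape=shape)
    acc = clf.fit(X_tr, y_tr).score(X_te, y_te)
    if acc > best_acc:          # report each new best combination
        best_acc = acc
        print(f"acc={acc:.4f}  C={c} degree={degree} "
              f"kernel={kernel} shape={shape}")
```

Note that `degree` only matters for the `poly` kernel, so some grid cells are redundant; a fuller search would also vary `gamma`.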


Important Note: Answers to all questions should be written clearly and concisely, and each answer must be unmistakably delineated. You may resubmit multiple times until the deadline (the last submission will be considered).

NO LATE ASSIGNMENTS WILL BE ACCEPTED. ALWAYS SUBMIT WHATEVER YOU HAVE COMPLETED FOR PARTIAL CREDIT BEFORE THE DEADLINE!
