5. In the following figure, there are different SVMs with different decision boundaries. The training data is labeled as $z_i \in \{-1, 1\}$, represented as circles and squares respectively. Support vectors are drawn in solid circles. Determine which of the scenarios described below matches one of the 6 plots (note that one of the plots does not match any scenario). Each scenario should be matched to a unique plot. Explain your reason for matching each figure to each scenario. (10 pts)
(a) A soft-margin linear SVM with C = 0.02
(b) A soft-margin linear SVM with C = 20
(c) A hard-margin kernel SVM with $K(x_i, x_j) = x_i^T x_j + (x_i^T x_j)^2$
(d) A hard-margin kernel SVM with $K(x_i, x_j) = \exp(-5\|x_i - x_j\|^2)$
(e) A hard-margin kernel SVM with $K(x_i, x_j) = \exp(-\frac{1}{5}\|x_i - x_j\|^2)$
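To get a feel for how the three kernels in scenarios (c), (d), and (e) behave, the following minimal sketch evaluates them on a few points. The function names and the sample points are hypothetical and chosen only for illustration; the kernel symbol and the negative signs in the exponents are assumed from the standard polynomial and Gaussian kernel forms.

```python
import math

def dot(u, v):
    """Inner product x_i^T x_j of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def squared_distance(u, v):
    """Squared Euclidean distance ||x_i - x_j||^2."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def poly_kernel(u, v):
    """Scenario (c): x_i^T x_j + (x_i^T x_j)^2."""
    s = dot(u, v)
    return s + s ** 2

def gaussian_kernel(u, v, coeff):
    """Scenarios (d) and (e): exp(-coeff * ||x_i - x_j||^2)."""
    return math.exp(-coeff * squared_distance(u, v))

# Hypothetical sample points (not taken from the figure).
x_i = [1.0, 2.0]
x_j = [3.0, 4.0]
x_near = [1.0, 2.5]

print(poly_kernel(x_i, x_j))                 # 11 + 11^2 = 132.0
print(gaussian_kernel(x_i, x_near, 5.0))     # coefficient from scenario (d)
print(gaussian_kernel(x_i, x_near, 1 / 5))   # coefficient from scenario (e)
```

For the same pair of points, the Gaussian kernel with coefficient 5 returns a smaller value than the one with coefficient 1/5, because its similarity falls off faster with distance.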
6. Programming Part: Multi-class and Multi-Label Classification Using Support Vector Machines
(a) Download the Anuran Calls (MFCCs) Data Set from: https://archive.ics.uci.edu/ml/datasets/Anuran+Calls+%28MFCCs). Choose 70% of the data randomly as the training set.
(b) Each instance has three labels: Families, Genus, and Species. Each of the labels has multiple classes. We wish to solve a multi-class and multi-label problem. One of the most important approaches to multi-label classification is to train a classifier for each label. We first try this approach:
Homework 6 EE 559, Instructor: Mohammad Reza Rajati
i. Research exact match and Hamming score/loss methods for evaluating multi-label classification and use them in evaluating the classifiers in this problem.
ii. Train an SVM for each of the labels, using Gaussian kernels and one-versus-all classifiers. Determine the weight of the SVM penalty and the width of the Gaussian kernel using 10-fold cross-validation.² You are welcome to try to solve the problem with both normalized³ and raw attributes and report the results. (15 pts)
iii. Repeat 6(b)ii with L1-penalized SVMs.⁴ Remember to normalize the attributes. (10 pts)
iv. Repeat 6(b)iii by using SMOTE or any other method you know to remedy class imbalance. Report your conclusions about the classifiers you trained. (10 pts)
v. Extra Practice: Study the Classifier Chain method and apply it to the above problem.
vi. Extra Practice: Research how confusion matrices, precision, recall, ROC, and AUC are defined for multi-label classification and compute them for the classifiers you trained above.
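The evaluation metrics named in item i can be sketched as follows. This is a minimal illustration, assuming each instance's labels are stored as a tuple (Family, Genus, Species) and that the Hamming score is taken as one minus the Hamming loss; definitions vary across references, so check the one you adopt. The function names and the placeholder label values ("F1", "G1", ...) are hypothetical and are not taken from the dataset.

```python
def exact_match_ratio(y_true, y_pred):
    """Fraction of instances whose predicted labels are ALL correct."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if tuple(t) == tuple(p))
    return matches / len(y_true)

def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that are predicted incorrectly."""
    wrong = 0
    total = 0
    for t, p in zip(y_true, y_pred):
        for true_label, pred_label in zip(t, p):
            total += 1
            if true_label != pred_label:
                wrong += 1
    return wrong / total

# Hypothetical (Family, Genus, Species) labels for four instances.
y_true = [("F1", "G1", "S1"), ("F1", "G2", "S2"),
          ("F2", "G3", "S3"), ("F2", "G3", "S4")]
y_pred = [("F1", "G1", "S1"), ("F1", "G2", "S3"),
          ("F2", "G3", "S3"), ("F1", "G1", "S1")]

loss = hamming_loss(y_true, y_pred)
print(exact_match_ratio(y_true, y_pred))  # 2 of 4 instances fully correct: 0.5
print(loss)                               # 4 of 12 label slots wrong
print(1 - loss)                           # Hamming score under the assumption above
```

Exact match is the stricter of the two: an instance with one wrong label out of three counts as a complete miss, whereas the Hamming loss charges it only one third.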
² How to choose parameter ranges for SVMs? One can use wide ranges for the parameters and a fine grid (e.g., 1000 points) for cross-validation; however, this method may be computationally expensive. An alternative way is to train the SVM with very large and very small parameters on the whole training data and find very large and very small parameters for which the training accuracy is not below a threshold (e.g., 70%). Then one can select a fixed number of parameters (e.g., 20) between those points for cross-validation. For the penalty parameter, usually one has to consider increments in $\log_{10}(\lambda)$. For example, if one found that the accuracy of a support vector machine will not be below 70% for $\lambda = 10^{-3}$ and $\lambda = 10^{6}$, one has to choose $\log_{10}(\lambda) \in \{-3, -2, \ldots, 4, 5, 6\}$. For the Gaussian kernel parameter, one usually chooses linear increments, e.g., $\sigma \in \{0.1, 0.2, \ldots, 2\}$. When both $\sigma$ and $\lambda$ are to be chosen using cross-validation, combinations of very small and very large $\lambda$'s and $\sigma$'s that keep the accuracy above a threshold (e.g., 70%) can be used to determine the ranges for $\sigma$ and $\lambda$. Please note that these are very rough rules of thumb, not general procedures.
³ It seems that this dataset is already normalized!
⁴ The convention is to use the L1 penalty with a linear kernel.
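The rule of thumb in footnote 2 (logarithmic steps for the penalty parameter, linear steps for the kernel width) can be sketched as a small grid builder. The end points below are the illustrative ones from the footnote, not tuned values, and the variable and function names are hypothetical. The conversion to a coefficient `gamma` assumes the Gaussian kernel is parameterized as exp(-||x - x'||^2 / (2 sigma^2)); other references use a different convention, so adjust it to match your library.

```python
import itertools

# Penalty parameter: log10(lambda) in {-3, -2, ..., 5, 6}, i.e. 10 values.
lambda_exponents = range(-3, 7)
lambdas = [10.0 ** k for k in lambda_exponents]

# Kernel width: sigma in {0.1, 0.2, ..., 2.0}, i.e. 20 values.
# round() removes floating-point noise such as 0.30000000000000004.
sigmas = [round(0.1 * i, 1) for i in range(1, 21)]

def sigma_to_gamma(sigma):
    """Map a width sigma to the coefficient in exp(-gamma * ||x - x'||^2).

    Assumes the kernel exp(-||x - x'||^2 / (2 * sigma^2)).
    """
    return 1.0 / (2.0 * sigma ** 2)

# Every (lambda, sigma) pair to be scored with 10-fold cross-validation.
grid = list(itertools.product(lambdas, sigmas))

print(len(lambdas), len(sigmas), len(grid))  # 10 20 200
```

With 10-fold cross-validation, this grid already implies 2000 model fits per label, which is why the footnote suggests narrowing the ranges before searching.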