Homework #1 Solution


Please submit your homework report to iLMS before 23:59 on April 22. A link to a shared Google document, 'hw1 demo registration', will be announced later so that you can reserve a time slot for an individual demonstration with the TA. You are encouraged to consult or collaborate with other students while solving the homework problems; however, you are required to turn in your own version of the report and programs, written in your own words and with supporting materials. Copying will not be tolerated.

Aim

Please identify each of a large number of black-and-white rectangular pixel displays as one of the 26 capital letters of the English alphabet. You have to use the various classification models taught in class up to Chapter Five of the textbook. In addition to the necessary data preprocessing, such as partitioning the dataset into separate training and test datasets and scaling, you are required to investigate the effectiveness of feature selection/extraction. You may apply new methods or use new packages to improve the classification performance, but if you do so, you have to give a brief introduction to the key concepts and provide the necessary citations, instead of just copying, pasting, or importing. However, in this assignment you are not allowed to use any neural-network-related models (e.g., multilayer perceptron, LeNet, etc.). If any neural-network-related method is applied, you will receive no credit. Once an algorithm package is merged or imported into your code, please list the package link in your references and describe its mathematical concepts in your report, followed by the reason for adopting it.
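The partitioning and scaling step mentioned above might look like the following minimal sketch. It assumes scikit-learn and uses random integers as a stand-in for the 16 features; the 80/20 split ratio and the random seed are illustrative choices, not part of the assignment.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 260 instances, 16 features, 26 classes.
rng = np.random.default_rng(0)
X = rng.integers(0, 16, size=(260, 16)).astype(float)
y = np.repeat(np.arange(26), 10)

# Stratify so that every class appears in both partitions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Fit the scaler on the training partition only, then reuse it on the test partition.
scaler = StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)
```

Fitting the scaler on the training partition alone keeps test-set statistics out of the model, which matters when the reported accuracy is meant to reflect unseen data.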

Dataset Description

The dataset can be downloaded from the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/Letter+Recognition.


There are 20000 instances in the dataset. Each instance has 16 features and one class label.

You can find more information about the dataset on the web page above.
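As a rough illustration of reading such a file, the sketch below assumes a headerless comma-separated layout with the class label in the first column followed by the 16 features; please check the layout against the dataset page before relying on it. The column names, the helper `load_letters`, and the two sample rows are placeholders for illustration only.

```python
import io

import pandas as pd

# Hypothetical column names: one label column followed by 16 numeric features.
COLUMNS = ["label"] + [f"f{i}" for i in range(1, 17)]


def load_letters(source):
    """Read comma-separated rows (label first, then 16 features) into X and y."""
    df = pd.read_csv(source, header=None, names=COLUMNS)
    X = df.drop(columns="label").to_numpy(dtype=float)
    y = df["label"].to_numpy()
    return X, y


# Two illustrative rows in the assumed layout, used in place of the downloaded file.
sample = io.StringIO(
    "T,2,8,3,5,1,8,13,0,6,6,10,8,0,8,0,8\n"
    "I,5,12,3,7,2,10,5,5,4,13,3,9,2,8,4,10\n"
)
X, y = load_letters(sample)
```

Passing a file path instead of the in-memory `sample` works the same way, since `pd.read_csv` accepts both.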

Submission Format

You have to submit a compressed file, hw1_studentID.zip, which contains the following files:

hw1_studentID.ipynb: detailed report, Python code, results, discussion, and mathematical descriptions;

hw1_studentID.tplx: extra LaTeX-related settings, including the bibliography;

hw1_studentID.bib: citations in the "bibtex" format;

hw1_studentID.pdf: the PDF version of your report, exported from your ipynb with

%% jupyter nbconvert --to latex --template hw1_studentID.tplx hw1_studentID.ipynb

%% pdflatex hw1_studentID.tex

%% bibtex hw1_studentID

%% pdflatex hw1_studentID.tex

%% pdflatex hw1_studentID.tex

Other files or folders in a workable path hierarchy to your jupyter notebook (ipynb).

Coding Guidelines

For the purpose of the individual demonstration with the TA, you are required to create a function in your Jupyter notebook, as specified below, to reduce the data dimensionality, learn a classification model, and evaluate the performance of the learned model.

hw1_studentID_handwritten(in_x, in_label, mode, feature_engr, f_para, classification, c_para, other_para)

- in_x: [string] csv file or a folder path for the handwritten letter image data.

- in_label: [string] csv file or a folder path, which contains the labels of the corresponding instances in in_x.

- mode: [string] 'featengr' for reducing the data dimensionality by feature engineering; 'training' for building models; 'test' for using the built model to evaluate performance.


- feature_engr: [None or string] described in Report Requirement.

- f_para: [None or numpy array] default None, declaring the necessary parameter(s) for feature selection/extraction.

- classification: [None or string] described in Report Requirement.

- c_para: [None or numpy array] default None, declaring the necessary parameter(s) for classification.

- other_para: [None or numpy array] default None, declaring the necessary parameter(s) for your program other than the ones for feature_engr and classification.

When mode='test', please dump the results to the following files:

- hw1_studentID_results.csv: one column with the header 'label';

- hw1_studentID_performance.txt: the performance (accuracy) in '%'. Output only one number of type "float", without any extra string.
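One possible shape for such a function is sketched below. It is only a skeleton under stated assumptions: the input CSV files have no header row, only the 'PCA' and 'KNN' keywords are handled, the first entry of f_para and c_para carries the number of components and neighbors, and the fitted model is passed between modes through a pickle file whose name (`hw1_studentID_model.pkl`) is hypothetical. The helper `_build` and the default values are likewise illustrative.

```python
import pickle

import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical file used to pass the fitted model from 'training' to 'test'.
MODEL_FILE = "hw1_studentID_model.pkl"


def _build(feature_engr, f_para, classification, c_para):
    """Assemble scaler -> optional PCA -> optional KNN (only two keywords sketched)."""
    steps = [StandardScaler()]
    if feature_engr == "PCA":
        n_components = int(f_para[0]) if f_para is not None else 8  # assumed default
        steps.append(PCA(n_components=n_components))
    elif feature_engr is not None:
        raise ValueError(f"unsupported feature_engr: {feature_engr!r}")
    if classification == "KNN":
        n_neighbors = int(c_para[0]) if c_para is not None else 5  # assumed default
        steps.append(KNeighborsClassifier(n_neighbors=n_neighbors))
    elif classification is not None:
        raise ValueError(f"unsupported classification: {classification!r}")
    return make_pipeline(*steps)


def hw1_studentID_handwritten(in_x, in_label, mode, feature_engr=None, f_para=None,
                              classification=None, c_para=None, other_para=None):
    # Assumption: both CSV files have no header row; the label file has one column.
    X = pd.read_csv(in_x, header=None).to_numpy(dtype=float)
    y = pd.read_csv(in_label, header=None).iloc[:, 0].to_numpy()

    if mode == "featengr":
        # Return the reduced representation of the input data.
        return _build(feature_engr, f_para, None, None).fit_transform(X, y)

    if mode == "training":
        if classification is None:
            raise ValueError("'training' mode needs a classification keyword")
        model = _build(feature_engr, f_para, classification, c_para).fit(X, y)
        with open(MODEL_FILE, "wb") as fh:
            pickle.dump(model, fh)
        return model

    if mode == "test":
        with open(MODEL_FILE, "rb") as fh:
            model = pickle.load(fh)
        pred = model.predict(X)
        # One column with header 'label', as the assignment requests.
        pd.DataFrame({"label": pred}).to_csv("hw1_studentID_results.csv", index=False)
        accuracy = 100.0 * float(np.mean(pred == y))
        # A single float, without any extra words.
        with open("hw1_studentID_performance.txt", "w") as fh:
            fh.write(str(accuracy))
        return accuracy

    raise ValueError(f"unknown mode: {mode!r}")
```

Storing the scaler, the feature step, and the classifier in one pipeline means the 'test' mode applies exactly the transformations learned during 'training' without re-specifying them.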

Report Requirement

List names of packages used in your program;

Describe the keywords in the arguments of your function hw1_studentID_handwritten(in_x, in_label, mode, feature_engr, f_para, classification, c_para, other_para):

- a list of feature_engr methods, for example

None: (default) no feature engineering (selection/extraction)

'L1': L1-regularization feature selection

'SFS': sequential feature selection

'Forest': assessing feature importance with random forest

'PCA': principal component analysis

'GKPCA': Gaussian kernel principal component analysis

'LDA': linear discriminant analysis, and so on;

- a list of classification methods, for example

None: used when mode = 'featengr'

'SVM': support vector machine

'GKSVM': Gaussian kernel support vector machine

'logReg': logistic regression

'Perceptron': perceptron

'KNN': K-nearest neighbors

'Decision': decision tree


'Forest': random forest, and so on;

For better explanation, draw flowcharts of the methods or procedures used in the program;

Describe the mathematical concepts of any new algorithms or models employed as well as the roles they play in your feature selection/extraction or classification task in Markdown cells [?];

Compare the performance of the different classifiers with and without feature selection/extraction.
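The example keywords listed above could be mapped to scikit-learn estimators roughly as follows. This is one possible mapping, not a prescribed one: the choice of an L1-penalized linear SVM for 'L1', of KNN as the estimator inside sequential selection, the default of k = 8 retained features or components, and the helper name `build_pipeline` are all assumptions made for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA, KernelPCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel, SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression, Perceptron
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Each factory returns a fresh, unfitted transformer; k is the number of
# features or components to keep (ignored by 'L1', where sparsity decides).
FEATURE_ENGR = {
    "L1": lambda k: SelectFromModel(LinearSVC(penalty="l1", dual=False, C=0.1)),
    "SFS": lambda k: SequentialFeatureSelector(
        KNeighborsClassifier(), n_features_to_select=k
    ),
    "Forest": lambda k: SelectFromModel(
        RandomForestClassifier(random_state=0), max_features=k, threshold=-np.inf
    ),
    "PCA": lambda k: PCA(n_components=k),
    "GKPCA": lambda k: KernelPCA(n_components=k, kernel="rbf"),
    "LDA": lambda k: LinearDiscriminantAnalysis(n_components=k),
}

# Each factory returns a fresh, unfitted classifier with default settings.
CLASSIFIERS = {
    "SVM": lambda: LinearSVC(),
    "GKSVM": lambda: SVC(kernel="rbf"),
    "logReg": lambda: LogisticRegression(max_iter=1000),
    "Perceptron": lambda: Perceptron(),
    "KNN": lambda: KNeighborsClassifier(),
    "Decision": lambda: DecisionTreeClassifier(random_state=0),
    "Forest": lambda: RandomForestClassifier(random_state=0),
}


def build_pipeline(feature_engr=None, classification=None, k=8):
    """Chain scaling, an optional feature step, and an optional classifier."""
    steps = [StandardScaler()]
    if feature_engr is not None:
        steps.append(FEATURE_ENGR[feature_engr](k))
    if classification is not None:
        steps.append(CLASSIFIERS[classification]())
    return make_pipeline(*steps)
```

For example, `build_pipeline("PCA", "KNN", k=8)` returns an unfitted pipeline that can be trained with `.fit(X_train, y_train)`. Note that LDA can keep at most one component fewer than the number of classes.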

5.1 Basic Requirement

Use the original grayscale image data without any feature selection/extraction to do classification. Then compare the results with those obtained after feature selection (such as L1 regularization, sequential feature selection, or feature-importance assessment with a random forest) or feature extraction (such as PCA, kernel PCA, or LDA) is applied.
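A comparison of this kind might be organized as in the sketch below. It runs on synthetic data from `make_classification` rather than the letter dataset, and the choice of KNN as the classifier, the component counts, and the split ratio are illustrative assumptions; the printed accuracies therefore say nothing about the real task.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in (assumption): 16 features, 4 classes.
X, y = make_classification(
    n_samples=600, n_features=16, n_informative=8, n_redundant=4,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Same classifier each time; only the feature step changes.
candidates = {
    "none": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "PCA": make_pipeline(
        StandardScaler(), PCA(n_components=8), KNeighborsClassifier()
    ),
    "LDA": make_pipeline(
        StandardScaler(), LinearDiscriminantAnalysis(n_components=3),
        KNeighborsClassifier(),
    ),
}

scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: test accuracy = {scores[name]:.3f}")
```

Holding the classifier fixed while varying only the feature step makes it easier to attribute any accuracy difference to the feature selection/extraction method itself.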

All the classifiers taught in class should be investigated and compared in terms of performance. For SVM, you should investigate both linear SVM and kernel SVM. Also, for the perceptron, logistic regression, and SVM classifiers, you should investigate their stochastic gradient descent (SGD) versions provided in scikit-learn to handle large datasets [?][?].
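In scikit-learn, the SGD versions of these three classifiers are available through `SGDClassifier` by changing its `loss` argument. The sketch below trains each one in mini-batches with `partial_fit` on synthetic data; the batch size of 100, the single pass over the data, and the dictionary name `SGD_LOSSES` are illustrative assumptions. The logistic loss is spelled "log_loss" from scikit-learn 1.1 onward and "log" in older releases, so the name is chosen from the installed version.

```python
import numpy as np
import sklearn
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.preprocessing import StandardScaler

# Pick the logistic-loss name that matches the installed scikit-learn version.
_major, _minor = (int(p) for p in sklearn.__version__.split(".")[:2])
LOG_LOSS = "log_loss" if (_major, _minor) >= (1, 1) else "log"

# Classifier keyword -> SGDClassifier loss.
SGD_LOSSES = {"Perceptron": "perceptron", "logReg": LOG_LOSS, "SVM": "hinge"}

# Synthetic stand-in (assumption): 16 features, 3 classes.
X, y = make_classification(
    n_samples=600, n_features=16, n_informative=8, n_redundant=4,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
X_std = StandardScaler().fit_transform(X)
classes = np.unique(y)

accuracy = {}
for keyword, loss in SGD_LOSSES.items():
    clf = SGDClassifier(loss=loss, random_state=0)
    # Feed the data in mini-batches, as one would for a dataset too large for memory.
    for start in range(0, len(X_std), 100):
        batch = slice(start, start + 100)
        clf.partial_fit(X_std[batch], y[batch], classes=classes)
    accuracy[keyword] = clf.score(X_std, y)
    print(f"{keyword} (SGD, loss={loss}): training accuracy = {accuracy[keyword]:.3f}")
```

For data that fits in memory, calling `clf.fit(X_std, y)` instead of the batch loop lets scikit-learn run several epochs with its own stopping criterion.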

If you apply new methods or use new packages to improve the classification performance, you have to give a brief introduction to the key concepts and provide the necessary citations/links, instead of just copying, pasting, or importing.
