Homework #6: Feedforward CNN Design and Its Application to the MNIST Dataset Solution

General Instructions:

Read the Homework Guidelines for information about homework programming, write-up, and submission. If you make any assumptions about a problem, please state them clearly in your report.

Do not copy sentences directly from any listed reference or online source. Written reports and source code are subject to verification for plagiarism. You need to understand the USC policy on academic integrity and the penalties for cheating and plagiarism. These rules will be strictly enforced.

You may use code that is available online on GitHub.

In this homework, please read and understand the paper “Interpretable Convolutional Neural Networks via Feedforward Design” [1] introduced by Professor Kuo.

1) Understanding of feedforward-designed convolutional neural networks (FF-CNNs) (15%)

An FF-CNN consists of two modules in cascade: 1) the construction of conv layers using the Saab (Subspace approximation with adjusted bias) transform; and 2) the construction of fully-connected (FC) layers using the multi-stage linear least-squares regressor (LSR).
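The second module can be illustrated with a minimal NumPy sketch of a single least-squares regression stage. It is only an illustration under stated assumptions: the function names `fit_lsr` and `apply_lsr` are hypothetical, the features are synthetic, and the targets are one-hot vectors of the true labels. How the targets of the intermediate FC stages are formed is described in [1] and is not reproduced here.

```python
import numpy as np

def fit_lsr(features, targets):
    """Closed-form least-squares fit of targets ~ [features, 1] @ weights."""
    ones = np.ones((features.shape[0], 1))
    design = np.hstack([features, ones])          # append a bias column
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return weights                                # shape: (dim + 1, num_outputs)

def apply_lsr(features, weights):
    """Apply a fitted LSR stage to a batch of feature vectors."""
    ones = np.ones((features.shape[0], 1))
    return np.hstack([features, ones]) @ weights

# Synthetic stand-in for conv-layer features (assumption: 3 classes, 8 dims).
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=200)
centers = 3.0 * rng.normal(size=(3, 8))
features = centers[labels] + rng.normal(size=(200, 8))

one_hot = np.eye(3)[labels]                       # regression targets
weights = fit_lsr(features, one_hot)
predicted = apply_lsr(features, weights).argmax(axis=1)
train_accuracy = float((predicted == labels).mean())
```

No iterative optimization is involved: each stage is solved in one shot, which is the main contrast with backpropagation-based training.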

Summarize FF-CNNs with a flow chart and explain it in your own words.

Explain the similarities and differences between FF-CNNs and backpropagation-designed CNNs (BP-CNNs).

Do not copy any sentences from [1] or other papers directly. It is plagiarism. The scores will depend on your degree of understanding.

2) Image reconstructions from Saab coefficients (35%)

Apply Saab transforms to images in the MNIST dataset [2].

Compute the Saab coefficients (you can use the online source code [3] or implement it yourself) of the four handwritten digit images shown in Figure 1, and implement the reconstruction algorithm (write your own code) to transform the Saab coefficients back to images.

To evaluate the reconstruction results, you need to show the reconstructed images and compute PSNR scores between original images and reconstructed images.
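The PSNR computation can be sketched as follows. The peak value of 255 assumes 8-bit grayscale images; if the images are normalized to [0, 1], `peak=1.0` would be the appropriate choice. The function name `psnr` is just an illustrative choice.

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    original = np.asarray(original, dtype=np.float64)
    reconstructed = np.asarray(reconstructed, dtype=np.float64)
    mse = np.mean((original - reconstructed) ** 2)
    if mse == 0:
        return float("inf")                       # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))
```

A higher PSNR means the reconstruction is closer to the original; a perfect reconstruction gives an infinite score.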

Architecture setting:

In this problem, you should use a two-stage Saab transform where the spatial size of the transform kernels is 4x4. The stride of each stage is 4 (non-overlapping). Thus, at the output, the dimension of the Saab coefficients of an image should be 2x2xN, where N is the number of transform kernels in the second stage. You need to evaluate four different settings (different kernel numbers in each stage) and discuss your results.
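The following is a simplified sketch of this two-stage, non-overlapping setup, not the reference implementation in [3]. It makes several assumptions that should be checked against [1]: the input is 32x32 (for example, zero-padded MNIST digits), since 32 / 4 / 4 = 2 matches the stated 2x2xN output; the bias term of the Saab transform is omitted; and the AC kernels come from an eigendecomposition of the DC-removed patches without removing the dataset mean. All function names are hypothetical, and random arrays stand in for real images.

```python
import numpy as np

def extract_patches(images, size=4):
    """(n, H, W, C) -> (n, H/size, W/size, size*size*C), non-overlapping."""
    n, h, w, c = images.shape
    x = images.reshape(n, h // size, size, w // size, size, c)
    x = x.transpose(0, 1, 3, 2, 4, 5)
    return x.reshape(n, h // size, w // size, size * size * c)

def merge_patches(patches, size=4):
    """Inverse of extract_patches."""
    n, gh, gw, d = patches.shape
    c = d // (size * size)
    x = patches.reshape(n, gh, gw, size, size, c)
    x = x.transpose(0, 1, 3, 2, 4, 5)
    return x.reshape(n, gh * size, gw * size, c)

def fit_kernels(images, num_kernels, size=4):
    """One DC kernel plus (num_kernels - 1) AC kernels, all orthonormal.

    Assumes enough patches that the DC-removed data has rank dim - 1.
    """
    patches = extract_patches(images, size)
    flat = patches.reshape(-1, patches.shape[-1])
    dim = flat.shape[1]
    dc = np.full((1, dim), 1.0 / np.sqrt(dim))
    residual = flat - flat.mean(axis=1, keepdims=True)   # remove patch mean
    cov = residual.T @ residual / flat.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)               # ascending order
    order = np.argsort(eigvals)[::-1]
    ac = eigvecs[:, order[:num_kernels - 1]].T
    return np.vstack([dc, ac])

def forward(images, kernels, size=4):
    """Project every patch onto the kernels (bias omitted in this sketch)."""
    return extract_patches(images, size) @ kernels.T

def inverse(coeffs, kernels, size=4):
    """Rebuild patches from coefficients using the orthonormal kernels."""
    return merge_patches(coeffs @ kernels, size)

rng = np.random.default_rng(0)
images = rng.random((50, 32, 32, 1))       # stand-in for padded digits

k1 = fit_kernels(images, num_kernels=6)    # stage 1 kernels
c1 = forward(images, k1)                   # (50, 8, 8, 6)
k2 = fit_kernels(c1, num_kernels=16)       # stage 2 kernels
c2 = forward(c1, k2)                       # (50, 2, 2, 16)

recon = inverse(inverse(c2, k2), k1)       # back to (50, 32, 32, 1)
```

Because the kernels in each stage are orthonormal, the inverse is a multiplication by the same kernel matrix. Keeping fewer kernels discards the low-energy components, so reconstruction quality is expected to drop as the kernel counts shrink.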

Professor C.-C. Jay Kuo Page 1 of 2
EE 569 Digital Image Processing: Homework #6

Figure 1

3) Handwritten digit recognition using ensembles of feedforward design (50%)

In this problem, you will apply an FF-CNN to handwritten digit recognition. Train an FF-CNN using the 60,000 training images from the MNIST dataset. Adopt the LeNet-5-like architecture where the filter numbers of the first and second conv layers and the first and second FC layers are 6, 16, 120 and 80, respectively. The spatial size of the transform kernels is 5x5 and the stride is 1 for each conv layer. To reduce the spatial dimension, a max-pooling layer is adopted after each conv layer.
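As a quick sanity check on the layer dimensions, the feature-map sizes can be traced with a few lines of Python. The assignment does not state the input size, padding, or pooling window, so the values here are assumptions: a 32x32 input (zero-padded MNIST), no padding inside the conv layers, 2x2 non-overlapping max-pooling, and a final 10-way output layer.

```python
def conv_out(size, kernel=5, stride=1):
    """Output width of a conv layer without padding."""
    return (size - kernel) // stride + 1

def pool_out(size, window=2):
    """Output width of non-overlapping pooling."""
    return size // window

size = 32                         # assumed input width and height
shapes = []
for channels in (6, 16):          # filter numbers of the two conv layers
    size = conv_out(size)
    shapes.append(("conv", size, size, channels))
    size = pool_out(size)
    shapes.append(("pool", size, size, channels))

flat_dim = size * size * 16       # input dimension of the first FC layer
fc_dims = [flat_dim, 120, 80, 10] # 10 output classes assumed
```

Under these assumptions the spatial size goes 32, 28, 14, 10, 5, so the first FC layer receives a 400-dimensional vector.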

Report the training and testing classification accuracy of an individual FF-CNN on the MNIST dataset.

One way to improve the performance is to build an ensemble of FF-CNNs. Train ten different FF-CNNs and combine their results following the method in [4]. Diversity is the key to a successful ensemble, and paper [4] introduces three strategies for increasing the diversity of an ensemble of FF-CNNs, which you can refer to. Explain and justify your strategies for generating different FF-CNNs in the ensemble, and report the training and testing classification accuracy of your ensemble system.
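One generic way to combine several base classifiers is stacking: concatenate their output decision vectors and fit a second-level regressor on top. The sketch below illustrates that idea only; it is not claimed to reproduce the fusion procedure of [4], which should be consulted for the actual method. The names `fit_stacker` and `predict_stacker` are hypothetical, and the base-classifier outputs are simulated as noisy one-hot vectors.

```python
import numpy as np

def _design(decision_vectors):
    """Concatenate per-model decision vectors and append a bias column."""
    stacked = np.hstack(decision_vectors)         # (n, num_models * classes)
    return np.hstack([stacked, np.ones((stacked.shape[0], 1))])

def fit_stacker(decision_vectors, labels, num_classes):
    """Least-squares fit from concatenated decisions to one-hot labels."""
    targets = np.eye(num_classes)[labels]
    weights, *_ = np.linalg.lstsq(_design(decision_vectors), targets, rcond=None)
    return weights

def predict_stacker(decision_vectors, weights):
    """Predicted class index for each sample."""
    return (_design(decision_vectors) @ weights).argmax(axis=1)

# Simulated outputs of ten base classifiers (assumption: noisy one-hot).
rng = np.random.default_rng(0)
num_classes, num_models, n = 10, 10, 500
labels = rng.integers(0, num_classes, size=n)
one_hot = np.eye(num_classes)[labels]
decision_vectors = [one_hot + rng.normal(scale=0.8, size=one_hot.shape)
                    for _ in range(num_models)]

weights = fit_stacker(decision_vectors, labels, num_classes)
ensemble_pred = predict_stacker(decision_vectors, weights)
single_acc = float((decision_vectors[0].argmax(axis=1) == labels).mean())
ensemble_acc = float((ensemble_pred == labels).mean())  # training accuracy only
```

In a real experiment the stacker would be fitted on training-set decision vectors and evaluated on the test set. The gain depends on how different the errors of the base classifiers are, which is why diversity matters.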

Error analysis: Compare the classification error cases arising from BP-CNNs (use the best result from your HW#5) and FF-CNNs. What percentage of the errors are the same? What percentage are different? Explain your observations. Also, propose ideas to improve BP-CNNs, FF-CNNs, or both, and justify your proposal. There is no need to implement your proposed ideas.
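The overlap between the two error sets can be computed directly from the predicted labels. The sketch below is one possible way to do it; the assignment does not say which denominator to use for the percentages, so the shared errors are reported relative to each model's own errors and to the union of both error sets. The function name `error_overlap` and the tiny label arrays are illustrative only.

```python
import numpy as np

def error_overlap(labels, pred_a, pred_b):
    """Count errors of two classifiers and how many samples both get wrong."""
    labels, pred_a, pred_b = (np.asarray(x) for x in (labels, pred_a, pred_b))
    err_a = pred_a != labels
    err_b = pred_b != labels
    shared = int((err_a & err_b).sum())
    union = int((err_a | err_b).sum())
    n_a, n_b = int(err_a.sum()), int(err_b.sum())
    return {
        "errors_a": n_a,
        "errors_b": n_b,
        "shared": shared,
        "shared_pct_of_a": 100.0 * shared / max(n_a, 1),
        "shared_pct_of_b": 100.0 * shared / max(n_b, 1),
        "shared_pct_of_union": 100.0 * shared / max(union, 1),
    }

# Toy example: model A errs on samples 3 and 4, model B on samples 2 and 4.
labels = [0, 1, 2, 3, 4, 5]
pred_bp = [0, 1, 2, 0, 0, 5]
pred_ff = [0, 1, 0, 3, 0, 5]
stats = error_overlap(labels, pred_bp, pred_ff)
```

Samples that only one model misclassifies are the interesting ones for the discussion, since they hint at complementary strengths.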

References

[1] Kuo, C.-C. J., Zhang, M., Li, S., Duan, J., & Chen, Y. (2019). Interpretable convolutional neural networks via feedforward design. Journal of Visual Communication and Image Representation.

[2] MNIST: http://yann.lecun.com/exdb/mnist/

[3] https://github.com/davidsonic/Interpretable_CNN

[4] Chen, Y., Yang, Y., Wang, W., & Kuo, C.-C. J. (2019). Ensembles of feedforward-designed convolutional neural networks. arXiv preprint arXiv:1901.02154.
