Problem Set 1 (Efficient Python/NumPy Programming) Solution

Starting from:

~~$30~~

$24

Home

Introduction

For computational e ciency of typical operations in machine learning applications, it is very bene cial to use NumPy arrays together with vectorized commands, instead of explicit for loops. The vectorized commands are better optimized, and bring the performance of Python code (and similarly e.g. for Matlab) closer to lower level languages like C. In this exercise, you are asked to write e cient implementations for three small problems that are typical for the eld of machine learning.

Getting Started

Follow the Python setup tutorial provided on our github repository here:

github.com/epfml/ML course/tree/master/labs/ex01/python setup tutorial.md

After you are set up, clone or download the repository, and start by lling in the template notebooks in the folder /labs/ex01, for each of the 3 tasks below.

To get more familiar with vector and matrix operations using NumPy arrays, it is also recommended to go through the npprimer.ipynb notebook in the same folder.

Note: The following three exercises could be solved by for-loops. While that's ok to get started, the goal of this exercise sheet is to use the more e cient vectorized commands instead:

Useful Commands

We give a short overview over some commands that prove useful for writing vectorized code. You can read the full documentation and examples by issuing help(func). At the beginning: import numpy as np

a * b, a / b: element-wise multiplication and division of matrices (arrays) a and b

a.dot(b): matrix-multiplication of two matrices a and b

a.max(0): nd the maximum element for each column of matrix a (note that NumPy uses zero-based indices, while Matlab uses one-based)

a.max(1): nd the maximum element for each row of matrix a

np.mean(a), np.std(a): compute the mean and standard deviation of all entries of a

a.shape: return the array dimensions of a

a.shape[k]: return the size of array a along dimension k

np.sum(a, axis=k): sum the elements of matrix a along dimension k

linalg.inv(a): returns the inverse of a square matrix a

A broader tutorial can be found here: http://www.engr.ucsb.edu/~shell/che210d/numpy.pdf

For users who were more familiar with Matlab, a nice comparison of the analogous functions can be found here:

https://docs.scipy.org/doc/numpy-dev/user/numpy-for-matlab-users.html

1

0.9

0.8

0.7

0.6

0.5

0.4

0.3

0.2

0.1

0

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

Figure 1: Two sets of points in the plane. The circles are a subset of the dots and have been perturbed randomly.

Task A: Matrix Standardization

The di erent dimensions or features of a data sample often show di erent variances. For some subsequent operations, it is a bene cial preprocessing step to standardize the data, i.e. subtract the mean and divide by the standard deviation for each dimension. After this processing, each dimension has zero mean and unit variance. Note that this is not equivalent to data whitening, which additionally de-correlates the dimensions (by means of a coordinate rotation).

Write a function that accepts data matrix x 2 Rn d as input and outputs the same data after normalization. n is the number of samples, and d the number of dimensions, i.e. rows contain samples and columns features.

Task B: Pairwise Distances in the Plane

One application of machine learning to computer vision is interest point tracking. The location of corners in an image is tracked along subsequent frames of a video signal (see Figure 1 for a synthetic example). In this context, one is often interested in the pairwise distance of all points in the rst frame to all points in the second frame. Matching points according to minimal distance is a simple heuristic that works well if many interest points are found in both frames and perturbations are small.

Write a function that accepts two matrices P 2 Rp 2; Q 2 Rq 2 as input, where each row contains the (x; y) coordinates of an interest point. Note that the number of points (p and q) do not have to be equal. As output, compute the pairwise distances of all points in P to all points in Q and collect them in matrix D. Element Di;j is the Euclidean distance of the i-th point in P to the j-th point in Q.

Task C: Likelihood of a Data Sample

In this exercise, you are not required to understand the statistics and machine learning concepts described here yet. The goal here is just to practically implement the assignment of data to two given distributions, in Python.

A subtask of many machine learning algorithms is to compute the likelihood p(xnj ) of a sample xn for a given density model with parameters . Given k models, we now want to assign xn to the model for which the likelihood is maximal: an = arg maxm p(xn j m), where m = 1; : : : ; k. Here m = ( m; m) are the parameters of the m-th density model ( m 2 Rd is the mean, and m is the so called covariance matrix).

We ask you to implement the assignment step for the two model case, i.e. k = 2. As input, your function receives a set of data examples xn 2 Rd (indexed by 1 n N) as well as the two sets of parameters 1 = ( 1; 1) and 2 = ( 2; 2) of two given multivariate Gaussian distributions:

p(xn j ; ) = (2 )d=2j j1=2
exp
2
(xn) 1(xn)
:
1

1

j j is the determinant of and 1 its inverse. Your function must return the 'most likely' assignment an 2 f1; 2g for each input point n, where an = 1 means that xn has been assigned to model 1. In other words in the case that an = 1, it holds that p(xn j 1; 1) p(xn j 2; 2).

2
Theory Questions

In addition to the practical exercises you do in the labs, as for example above, we will in future labs also provide you some theory oriented questions, to prepare you for the nal exam. As the rest of the exercises, it is not mandatory to solve them, but we would recommend that you at least look at - and try - some of them during the semester. From last year's experience, many students where surprised by the heavy theoretical focus of the nal exam after having worked on the two very practical projects. Do not fall for this trap! Passing the course require acquiring both a practical and a theoretical understanding of the material, and those exercises should help you with the latter.

However, please note that the di culty of these exercises might not be of the same level as the exam and are not enough by themselves; you should read the additional material given at the end of the lectures and do the exercises in the recommended books - see The course info sheet.

Note that we will try to, but might not, provide solutions.

This week, as we just started the course, there are no exercises. You should refresh your mind of the prerequisites, especially on the following topics.

Make sure your linear algebra is fresh in memory, especially { Matrix manipulation (Multiplication, Transpose, Inverse) { Ranks, Linear independence

{ Eigenvalues and Eigenvectors

You can use the following resources to help you get up to speed if needed.

{ The Linear Algebra handout

{ Gilbert Strang's Introduction to Linear Algebra. Some chapters are available online, and the book (along with many other textbooks on linear algebra) is available at the EPFL Library.

If it has been long since your last calculus class, make sure you know how to handle Gradients. You can nd a quick summary and useful identities in The Matrix Cookbook.

For probability and statistics, you should at least know about { Conditional and joint probability distributions

{ Bayes theorem

{ Random variables, independence, variance, expectation { The Gaussian distribution

If you need a refresh, check Chapter 2 in Pattern Recognition and Machine Learning by Christopher Bishop, available at the EPFL Library.

3