
Multiple Instance Learning: image classification

In this exercise we will build an image classifier using a Multiple Instance Learning (MIL) approach. To keep things simple, we will use a relatively tiny dataset, with simple features and a simple classifier.




We will try to distinguish images of apples from images of bananas. Below are two example images, the first containing an apple, the second containing a banana. Notice that the backgrounds in the images are quite similar.




Note that it is very important to explain how you arrive at an answer. For some routine questions you do not need to give an answer; for those I will put "No answer needed" in the question.

To make an image classifier based on MIL, we have to take several steps:




1. Define the instances that constitute a bag. Here we will make use of a mean-shift image segmentation to segment an image into subparts.





















2. Define the features that characterize each instance. Here we will just use the (average) red, green and blue color.



3. Define the MIL classifier. Here we will use a naive approach, that is, a standard classifier trained on the individual instances.



4. Define the combination rule that combines the predicted labels of the individual instances into a predicted label for the bag.



In the coming exercise we will go through these steps one by one.




Note: for the mean-shift algorithm and the Liknon classifier (also called the L1 support vector classifier) I supplied Matlab functions. If you want to use another programming language, feel free, but you will have to find (or make) your own implementations.







The Naive MIL classifier



(No answer needed.) Get the data sival_apple_banana.zip and some additional Matlab functions additionalcode.zip from Brightspace. The data should contain two folders, one with apple images, the other with banana images. The additional code contains a function to segment an image, im_meanshift, and a function to convert a cell array of bags to a Prtools dataset, bags2dataset.



Implement a script that reads all images from a given directory. You can use the Matlab functions dir and imread.
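
A minimal sketch of such a script; the directory name and the file extension are assumptions, adjust them to the actual data:

    % Read all images from one directory; the path and the
    % '*.jpg' pattern are assumptions.
    imdir = fullfile('sival_apple_banana', 'apple');
    files = dir(fullfile(imdir, '*.jpg'));
    ims = cell(numel(files), 1);
    for i = 1:numel(files)
        ims{i} = imread(fullfile(imdir, files(i).name));
    end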




Next, implement a function extractinstances that segments an image using the mean-shift algorithm (via im_meanshift), computes the average red, green and blue color per segment, and returns the resulting features in a small data matrix.
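
A possible sketch, assuming that the supplied im_meanshift(im, width) returns a per-pixel segment label map; its actual interface may differ:

    function x = extractinstances(im, width)
    % EXTRACTINSTANCES One instance (average R,G,B) per mean-shift
    % segment, returned as the rows of x. Assumes im_meanshift
    % returns a per-pixel segment label map.
    seg = im_meanshift(im, width);
    im  = double(im);
    segid = unique(seg(:));
    x = zeros(numel(segid), 3);
    for k = 1:numel(segid)
        mask = (seg == segid(k));
        for c = 1:3
            chan = im(:, :, c);
            x(k, c) = mean(chan(mask));  % average color of segment k
        end
    end
    end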



Notice that the number of segments you obtain depends on a width parameter that you have to supply to im_meanshift. Set this parameter such that the background and the foreground are segmented reasonably well. Would it be better to oversegment or to undersegment the images? What value of the width parameter did you use?




Create a function gendatmilsival that creates a MIL dataset, by going through all apple and banana images, extracting the instances per image, and storing them in a Prtools dataset with bags2dataset. Note that, in addition to the class labels, the bag identifiers are also stored in the dataset. If you are interested, you can retrieve them using bagid = getident(a,'milbag').
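
A hedged sketch of gendatmilsival, assuming that bags2dataset accepts a cell array of bags plus a vector of bag labels; the directory names, file pattern and width value are illustrative:

    function a = gendatmilsival
    % GENDATMILSIVAL Build a MIL dataset from the apple and
    % banana images. bags2dataset is assumed to take the cell
    % array of bags and one label per bag.
    width = 20;                       % mean-shift width, to be tuned
    classes = {'apple', 'banana'};
    bags = {}; baglabels = [];
    for c = 1:numel(classes)
        files = dir(fullfile('sival_apple_banana', classes{c}, '*.jpg'));
        for i = 1:numel(files)
            im = imread(fullfile('sival_apple_banana', classes{c}, files(i).name));
            bags{end+1}      = extractinstances(im, width); %#ok<AGROW>
            baglabels(end+1) = c;                           %#ok<AGROW>
        end
    end
    a = bags2dataset(bags, baglabels');
    end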




How many bags did you obtain? How many features do the instances have? How many instances are there per bag? Make a scatterplot to see whether the instances from the two classes are somewhat separable.
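
For the scatterplot, Prtools' scatterd can be used on two of the three color features, for instance:

    figure;
    scatterd(a(:, [1 2]));   % average red vs. average green, marked per class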




Create a function combineinstlabels that accepts a list of labels, and outputs a single label obtained by majority voting.
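
A minimal sketch, assuming the labels arrive as the rows of a label matrix (as returned by labeld):

    function lab = combineinstlabels(labels)
    % COMBINEINSTLABELS Majority vote over a list of labels,
    % given as the rows of a label matrix.
    [u, ~, idx] = unique(labels, 'rows'); % distinct labels + membership
    votes = accumarray(idx, 1);           % count votes per distinct label
    [~, win] = max(votes);                % majority (ties: first maximum)
    lab = u(win, :);
    end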



Now we are almost ready to classify images... First we have to train a classifier; let's use a Fisher classifier for this. Then apply the trained classifier to each instance in a bag, classify the instances (using labeld), and combine the label outputs (using your combineinstlabels) to get a bag label.
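
A sketch of this pipeline, assuming a is the dataset from gendatmilsival and the bag identifiers are numeric (here the classifier is tested on its own training data; see the next question):

    % Naive MIL: train a Fisher classifier on the instances, then
    % majority-vote the instance labels within each bag.
    w = a * fisherc;                 % instance-level Fisher classifier
    instlab = labeld(a * w);         % predicted label per instance
    bagid = getident(a, 'milbag');   % bag identifier per instance
    ids = unique(bagid);
    baglab = cell(numel(ids), 1);
    for i = 1:numel(ids)
        sel = (bagid == ids(i));     % instances belonging to bag i
        baglab{i} = combineinstlabels(instlab(sel, :));
    end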



How many apple images are misclassified as banana? And vice versa? Why is this error estimate not trustworthy? Estimate the classification error in a trustworthy way!




Invent at least two ways in which you might improve the performance of this classifier. Argue why each may improve the performance.






MILES



The classifier that we used in the previous section was very simple. In this section we implement one of the most successful MIL classifiers, called MILES. Also have a look at the article "MILES: Multiple-instance learning via embedded instance selection" by Yixin Chen, Jinbo Bi, and James Ze Wang, IEEE Transactions on Pattern Analysis and Machine Intelligence, (2006): 1931-1947.




Implement a function bagembed that represents a bag of instances B_i by a feature vector m(B_i), using equation (7) from the article.
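
A sketch of bagembed, assuming equation (7) is the instance-based embedding whose k-th feature is the bag-to-instance similarity s(x^k, B_i) = max_j exp(-||x_ij - x^k||^2 / sigma^2); check this against the article:

    function m = bagembed(bag, instances, sigma)
    % BAGEMBED Embed bag B_i in the instance space of MILES:
    % m(k) = max over the bag's instances of
    %        exp(-||x_ij - x^k||^2 / sigma^2).
    % bag:       ni x d matrix, the instances of B_i
    % instances: n x d matrix, all training instances x^1..x^n
    n = size(instances, 1);
    m = zeros(1, n);
    for k = 1:n
        d2 = sum((bag - instances(k, :)).^2, 2); % squared distances to x^k
        m(k) = max(exp(-d2 / sigma^2));          % closest instance dominates
    end
    end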



How large will this feature vector m(B_i) become for our apple-banana problem?

























Make a Prtools dataset with the vectors m(B_i) and their corresponding labels y_i. Choose a sensible value for σ such that not all numbers become 0 or 1. Train on this large dataset an L1 support vector classifier (or, more correctly called, LIKNON): liknonc.
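
A hedged sketch of this step, reusing the cell array of bags and labels from the gendatmilsival sketch above; the σ value is illustrative and the exact call of the supplied liknonc may differ:

    % Embed every bag against the pool of all training instances
    % and train LIKNON on the embedded dataset.
    instances = vertcat(bags{:});    % all instances x^1..x^n
    sigma = 30;                      % illustrative; tune so m(B_i) is not all 0/1
    M = zeros(numel(bags), size(instances, 1));
    for i = 1:numel(bags)
        M(i, :) = bagembed(bags{i}, instances, sigma);
    end
    b = prdataset(M, baglabels');    % 'dataset' in older Prtools versions
    w = liknonc(b);                  % supplied LIKNON trainer; call may differ
    e = b * w * testc;               % apparent (training) error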



Test the LIKNON classifier on this dataset: how many errors does this classifier make? Is this classifier better than the naive MIL classifier trained in the previous section? What can you do to make this MILES classifier perform better?






Another MIL classifier



Finally, implement your own MIL classifier. Any classifier may do, except for the Naive approach[1] and the MILES classifier (obviously). It may be something you invented yourself, or some classifier from the literature.



Explain which MIL classifier you are implementing, give the code, and compare its performance with that of the Naive classifier (i.e. the Fisher classifier with a majority vote) and with that of the MILES classifier.









































































[1] No, just replacing the Fisher classifier in the first section is not enough!




