Project 3, Part C: Video Mosaicing

Instructions

• Video Mosaicing is a team project. The maximum size of a team is two students.
  – A team is not allowed to collaborate with another team. Only one member of each team should submit the code for this part.
  – The maximum number of late days allowed for this part of the project is the average of the late days of the two students in the team. The late days used will be subtracted from each student's individual tally of late days.
• If you prefer, you can do this entire part on your own. However, completion of the project is MANDATORY, and no extra credit will be offered if you do it individually.
• You must make one submission on Canvas. We recommend that you include a README.txt file in your submission to help us execute your code correctly.
  – Place your code and resulting videos for Part C into a folder named "Video_Mosaicing". Submit this as a zip file named <Group_Number>_Project3C.zip.
• Your submission folder should include the following:
  – your .m or .py scripts for the required functions
  – .m or .py demo scripts for generating the video mosaic
  – any additional .m or .py files with helper functions for your code, e.g., Harris corner detectors or SIFT features
  – the input videos you use
  – the resulting video mosaic
  – a .pdf document containing the results of corner detection (red dots), adaptive non-maximal suppression (red dots), and post-RANSAC matching (outliers in blue dots) for at least five distinct frames, along with any additional features of your implementation and references to third-party code
• This handout provides instructions for two versions of the code: MATLAB and Python. You are free to select either one for this project.
• Feel free to create your own functions as needed to modularize the code. For MATLAB, ensure that each function is in a separate file and that all files are in the same directory. For Python, put all helper functions in a helper.py file and import it in the scripts that need them.
• Start early! If you get stuck, please post your questions on Piazza or come to office hours!
• Follow the submission guidelines and conventions strictly! The grading scripts will break if the guidelines aren't followed.

1 Video Mosaicing

You will create a video mosaic from multiple videos (at least three) taken from three cellphone cameras.

1.1 Capture and Sync Videos

For this section, you and your teammate will need two other people to help capture multiple videos of the same scene, which you will use to create the video mosaic. You should capture three videos of the same scene from three phones (ideally the same model) in landscape mode. Please ensure that you keep the cameras level. The overlap between adjacent videos should be around 30% to 40%.

You can get creative with the scene you capture, but to keep it simple, film a person walking across the views of the three cameras.

To ensure that you stitch the same frame across the three videos, you will have to sync them. A crude way to do this is to start recording on all three cameras at the same time. A better way is to include an audio or visual cue common to all three videos, so that you can trim them appropriately later using a video editor.

We recommend shooting all the videos at the lowest resolution possible in order to speed up mosaicing.

1.2 Feature Detection and Matching

In this section, you will detect features in an image frame and find the best matching features in other frames. These features should be reasonably invariant to translation and rotation.

Feature Detection:

In this part, you will identify features in an image using Harris corners or SIFT features. We recommend using detectHarrisFeatures or detectMinEigenFeatures in MATLAB, or corner_harris (from scikit-image) in Python; a minimal sketch follows the function specification below.

Complete the following function: [cimg]=corner_detector(img)

– (INPUT) img: H × W matrix representing the grayscale input frame
– (OUTPUT) cimg: H × W matrix representing the corner-metric matrix for the image
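For the Python route, a minimal sketch of corner_detector, assuming skimage.feature.corner_harris and a float grayscale frame:

    from skimage.feature import corner_harris

    def corner_detector(img):
        # corner_harris returns an H x W Harris response (corner-metric) map.
        return corner_harris(img.astype(float))

The resulting response map can be passed directly to the anms function described next.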

Adaptive Non-Maximal Suppression:

Loop through all the feature points and, for each one, compare its corner strength to that of every other feature point. Keep track of the minimum distance to a feature point of larger magnitude, where "larger" means that this point's strength is less than 0.9 times the other's. After you have computed this minimum suppression radius for each point, sort the interest points by descending radius and take the top N; a sketch follows the function specification below.

Complete the following function: [x,y,rmax]=anms(cimg,max_pts)

– (INPUT) cimg: H × W matrix representing the corner-metric matrix
– (INPUT) max_pts: the desired number of corners
– (OUTPUT) x: N × 1 matrix representing the column coordinates of the corners
– (OUTPUT) y: N × 1 matrix representing the row coordinates of the corners
– (OUTPUT) rmax: suppression radius used to obtain max_pts corners
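A direct O(N²) Python sketch of this procedure; the 0.9 factor follows the text above, while the cutoff used to pick candidate corners is an illustrative choice:

    import numpy as np

    def anms(cimg, max_pts):
        # Candidate corners: responses above a small fraction of the maximum
        # (the 0.01 cutoff is an illustrative choice).
        ys, xs = np.nonzero(cimg > 0.01 * cimg.max())
        strength = cimg[ys, xs]

        # For each corner, distance to the nearest sufficiently stronger
        # corner, i.e. one with strength[i] < 0.9 * strength[j].
        radii = np.full(len(xs), np.inf)
        for i in range(len(xs)):
            stronger = strength[i] < 0.9 * strength
            if stronger.any():
                d2 = (xs[stronger] - xs[i]) ** 2 + (ys[stronger] - ys[i]) ** 2
                radii[i] = np.sqrt(d2.min())

        # Keep the max_pts corners with the largest suppression radii.
        keep = np.argsort(-radii)[:max_pts]
        return xs[keep], ys[keep], radii[keep].min()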

Feature Descriptors:

Now that you have identified points of interest, the next step is to compute a descriptor for the feature centered at each interest point. This descriptor is the representation you will use to compare features in different images and decide whether they match.

Given an oriented interest point, you should sample an 8 × 8 patch of pixels around the sub-pixel location of the interest point, using a spacing of s = 5 pixels between samples. After sampling, normalize the descriptor vector so that its mean is 0 and its standard deviation is 1. It is important to sample these patches from a larger 40 × 40 window so that the descriptor is effectively a large, blurred one; a sketch of this sampling is given below.
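A hypothetical Python helper for this sampling (the name, signature, and integer-coordinate sampling are simplifications; the handout asks for sub-pixel locations). The frame is blurred first so the coarse 5-pixel grid behaves like sampling from a smoothed 40 × 40 window, and corners are assumed to lie at least 20 pixels from the border:

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def feat_desc(img, xs, ys, patch=8, spacing=5):
        # Blur so that sparse sampling approximates a big, smoothed window.
        blurred = gaussian_filter(img.astype(float), sigma=spacing / 2.0)
        half = patch * spacing // 2                        # 20 px on each side
        descs = []
        for x, y in zip(xs, ys):
            rows = np.arange(y - half, y + half, spacing)  # 8 row samples
            cols = np.arange(x - half, x + half, spacing)  # 8 column samples
            d = blurred[np.ix_(rows, cols)].ravel()
            # Normalize to zero mean and unit standard deviation.
            descs.append((d - d.mean()) / (d.std() + 1e-8))
        return np.stack(descs, axis=1)                     # 64 x N matrix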

Complete the corresponding descriptor function for your chosen language.

RANSAC:

We suggest 1000 iterations, a minimum consensus of 10, and an error threshold of 0.5. This means making 1000 attempts to fit a homography to 4 randomly chosen point correspondences, accepting a fit only if at least 10 other transformed points fall within 0.5 pixels (half a pixel) of their actual correspondence points. We strongly recommend that you play around with these values; a sketch is given below.
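A Python sketch using these defaults, assuming you already have matched point arrays pts1 and pts2 (N × 2, in (x, y) order); the function names are illustrative, and the homography is fit with a standard DLT solve:

    import numpy as np

    def est_homography(src, dst):
        # Direct linear transform: each correspondence gives two rows.
        A = []
        for (x, y), (u, v) in zip(src, dst):
            A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
            A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
        _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
        H = Vt[-1].reshape(3, 3)
        return H / H[2, 2]

    def ransac_homography(pts1, pts2, n_iter=1000, thresh=0.5, consensus=10):
        best_H, best_inliers = None, np.zeros(len(pts1), dtype=bool)
        homog = np.hstack([pts1, np.ones((len(pts1), 1))])   # N x 3
        for _ in range(n_iter):
            idx = np.random.choice(len(pts1), 4, replace=False)
            H = est_homography(pts1[idx], pts2[idx])
            proj = homog @ H.T
            proj = proj[:, :2] / proj[:, 2:3]     # back to pixel coordinates
            inliers = np.linalg.norm(proj - pts2, axis=1) < thresh
            # Keep the model with the largest consensus set (at least 10).
            if inliers.sum() >= consensus and inliers.sum() > best_inliers.sum():
                best_H, best_inliers = H, inliers
        return best_H, best_inliers

For better numerical stability, you can normalize the point coordinates before the DLT solve.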

1.3 Frame Mosaicing

Once you have the homographies, you will need to warp the images. Figure out how large the final stitched image will be and the absolute displacement of each frame within the panorama. You should warp the first and third images onto the second (center) image. Map the pixels of the warped image back to pixels in the input image so that you don't end up with holes in your result. You can use imwarp in MATLAB, or scipy.ndimage.geometric_transform or scipy.ndimage.interpolation.map_coordinates in Python, for this; a sketch of the inverse mapping is given below.
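As an illustration, a grayscale frame can be warped onto a target canvas with map_coordinates by mapping every output pixel back through the inverse homography (this sketch assumes the canvas size has already been computed, e.g. by transforming the frame corners):

    import numpy as np
    from scipy.ndimage import map_coordinates

    def inverse_warp(img, H, out_h, out_w):
        # Map each output pixel back into the source frame (no holes).
        H_inv = np.linalg.inv(H)
        rows, cols = np.indices((out_h, out_w))
        pts = np.stack([cols.ravel(), rows.ravel(), np.ones(rows.size)])
        src = H_inv @ pts
        src_x, src_y = src[0] / src[2], src[1] / src[2]
        # Bilinear interpolation; samples outside the frame become 0.
        out = map_coordinates(img.astype(float), [src_y, src_x],
                              order=1, cval=0.0)
        return out.reshape(out_h, out_w)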

You can stitch three (or more) frames to make a mosaic. You should composite the mosaic using Gradient Domain Blending; a simpler baseline for early testing is sketched after the function specification below. Complete the following function:

[img_mosaic]=mymosaic(img_input)

– (INPUT) img_input: M × N cell array, where M is the total number of frames in the video and N is the number of input videos (three in this case)
– (OUTPUT) img_mosaic: M × 1 cell vector representing the stitched image mosaic for every frame
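The required blend is gradient-domain, but a feathered composite is a useful stand-in while you get the rest of the pipeline working. This sketch (the names and the valid-pixel masks are assumptions) weights each pre-warped frame by its distance to the edge of its mask:

    import numpy as np
    from scipy.ndimage import distance_transform_edt

    def feather_composite(warped_frames, masks):
        # warped_frames: grayscale frames already warped to a shared canvas.
        # masks: boolean arrays marking each frame's valid (non-hole) pixels.
        acc = np.zeros(warped_frames[0].shape, dtype=float)
        wsum = np.zeros_like(acc)
        for img, mask in zip(warped_frames, masks):
            w = distance_transform_edt(mask)   # weight grows away from seams
            acc += w * img
            wsum += w
        return acc / np.maximum(wsum, 1e-8)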

1.4 Video Mosaicing

Once you have the mosaic for each frame of the video, combine them to obtain a mosaiced video in .avi or .mp4 format. Complete the following function:

[video_mosaic]=myvideomosaic(img_mosaic)

– (INPUT) img_mosaic: M × 1 cell vector representing the stitched image mosaic for every frame
– (OUTPUT) video_mosaic: video file in either .avi or .mp4 format
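A minimal Python sketch using imageio (an assumption; cv2.VideoWriter or MATLAB's VideoWriter work equally well), where img_mosaic holds same-sized uint8 frames:

    import imageio

    def myvideomosaic(img_mosaic, out_path="video_mosaic.mp4", fps=30):
        # Append each stitched frame to an .mp4 file.
        writer = imageio.get_writer(out_path, fps=fps)
        for frame in img_mosaic:
            writer.append_data(frame)
        writer.close()
        return out_path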

2 Extra Credit

The following tasks are for extra credit. Implementing any or all of them is optional.

• Generate creative video mosaics
• Project your mosaic onto a cylinder or sphere
• Add multi-scale processing for corner and feature detection
• Add rotation invariance to the feature descriptors
• Create your own feature descriptor; you will need to compare it with the other descriptors
• Use SIFT features
• Implement a method that beats the "ratio test" for deciding whether a feature is a valid match
• Incorporate Graphcut Textures for blending image frames and compare with Poisson Blending
