Assignment 1 Solution


1 Perspective Projection

Humans can estimate the spatial layout of a scene from a single image. Can computers also do this? In computer vision, we have made some progress towards solving this problem (i.e. recovering depth/spatial layout from a single image; interested students may want to have a look at [1]). There are various methods through which spatial layout can be recovered from a single image. Some of these methods rely on the knowledge of perspective projections and vanishing points. Here, we will further our understanding of perspective projections.

1. Show that the vanishing points of lines on a plane lie on the vanishing line of the plane.
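One way to set up the argument is sketched below. It assumes a pinhole camera at the origin with the image plane at $Z = f$; the symbols $f$, $\mathbf{n}$, and $\mathbf{d}$ are introduced here for illustration and do not appear in the problem statement.

```latex
% Plane with unit normal n = (n_x, n_y, n_z); a line in the plane has direction d with n . d = 0.
\begin{aligned}
\text{vanishing point of direction } \mathbf{d}:\quad
  & (x_v, y_v) = \left( f\,\frac{d_x}{d_z},\; f\,\frac{d_y}{d_z} \right), \\
\mathbf{n}\cdot\mathbf{d} = 0 \;\Rightarrow\quad
  & n_x x_v + n_y y_v + n_z f = \frac{f}{d_z}\,(\mathbf{n}\cdot\mathbf{d}) = 0 .
\end{aligned}
```

Under these assumptions every such vanishing point satisfies $n_x x + n_y y + n_z f = 0$, which is the image of the plane through the origin parallel to the given plane, i.e. its vanishing line.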

2. Show that, under typical conditions, the silhouette of a sphere of radius $r$ with center $(X, 0, Z)$ under planar perspective projection (XY is the image plane, and the center of projection is at the origin) is an ellipse of eccentricity $X/\sqrt{X^2 + Z^2 - r^2}$. Are there circumstances under which the projection could be a parabola or hyperbola? (Hint: See the definition of conic sections at: http://en.wikipedia.org/wiki/Conic_section#Features)
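The eccentricity formula can be spot-checked numerically. The sketch below assumes the image plane is $z = 1$ (parallel to the XY plane), that $Z > 0$, and that the center of projection lies outside the sphere; the function name `silhouette_eccentricity` is hypothetical.

```python
import math


def silhouette_eccentricity(X, Z, r):
    """Eccentricity of the sphere silhouette on the (assumed) image plane z = 1.

    A ray from the origin with direction (x, y, 1) is tangent to the sphere when
        (x*X + Z)**2 = (x**2 + y**2 + 1) * K,   with K = X**2 + Z**2 - r**2,
    which rearranges to the axis-aligned conic
        (Z**2 - r**2) * x**2 + K * y**2 - 2*X*Z*x + (K - Z**2) = 0.
    """
    K = X**2 + Z**2 - r**2
    if K <= 0:
        raise ValueError("the center of projection must lie outside the sphere")
    A = Z**2 - r**2  # coefficient of x^2
    C = K            # coefficient of y^2 (C >= A, so the major axis is along x)
    if A <= 0:
        # In this setup A == 0 gives a parabola and A < 0 a hyperbola.
        raise ValueError("silhouette is not an ellipse for Z <= r")
    # For an axis-aligned ellipse the squared axis ratio is b^2 / a^2 = A / C.
    return math.sqrt(1.0 - A / C)


def closed_form(X, Z, r):
    """The expression from the problem statement (with |X| so the result is non-negative)."""
    return abs(X) / math.sqrt(X**2 + Z**2 - r**2)
```

Comparing the two functions for a few values of `X`, `Z`, and `r` with `Z > r` is a quick sanity check on a derivation; the error branch shows where the ellipse case ends in this setup.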

3. An observer of height $h$ is standing on a ground plane looking straight ahead. We want to calculate the accuracy with which she will be able to estimate the depth $Z$ of points on the ground plane, assuming that she can visually discriminate angles to within 1°. Derive a formula relating the depth error $\delta Z$ to $Z$. For simplicity, consider only points straight ahead of the observer ($x = 0$). Given a value of $Z$ (say 10 m), your formula should be able to predict $\delta Z$.
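As a rough numerical illustration, the sketch below uses a first-order error model: if $\theta$ is the angle of the ground point below the horizon, then $\tan\theta = h/Z$ and $|dZ/d\theta| = (Z^2 + h^2)/h$. The eye height of 1.7 m is an assumption, not a value given in the problem, and `depth_error` is a hypothetical name.

```python
import math


def depth_error(Z, h, dtheta_deg=1.0):
    """First-order depth error for a ground point at depth Z seen from height h.

    Uses |dZ/dtheta| = (Z**2 + h**2) / h, obtained from Z = h / tan(theta).
    """
    return (Z**2 + h**2) / h * math.radians(dtheta_deg)


# Assumed eye height of 1.7 m; the error grows roughly quadratically with Z.
for Z in (2.0, 10.0, 50.0):
    print(f"Z = {Z:5.1f} m  ->  dZ ~ {depth_error(Z, 1.7):.2f} m")
```

Under these assumptions the predicted error at $Z = 10$ m is about 1.06 m. For a 1° uncertainty the linearization is only approximate at large depths, so the numbers are indicative rather than exact.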

[1] Criminisi, Antonio, Ian Reid, and Andrew Zisserman. "Single view metrology." Proceedings of the Seventh IEEE International Conference on Computer Vision. Vol. 1. IEEE, 1999.

2 Rotations

Rotations are routinely encountered in both Computer Vision and Graphics. One application where rotations turn out to be very handy is rendering images of an object from different viewpoints. Rodrigues' formula provides an elegant way to deal with rotations. Wikipedia can serve as a good reference.

1. Rodrigues' formula converts a rotation by an angle $\phi$ about an axis of rotation $\hat{s}$ (a unit vector) into a rotation matrix. Taking the cross product with $\hat{s}$ is equivalent to multiplying by a skew-symmetric matrix $S \in \mathbb{R}^{3 \times 3}$. In other words, $\hat{s} \times b = Sb$ for any $b \in \mathbb{R}^3$. What is $S$ in terms of $\hat{s}$?

2. Rodrigues showed that the matrix exponential of $\phi S$ produces a rotation by $\phi$ about the axis of rotation $\hat{s}$. In other words, $R = \exp(\phi S)$. Use this to derive Rodrigues' formula, stated below.

$$R = I + (\sin\phi)\,S + (1 - \cos\phi)\,S^2$$
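A sketch of how the derivation can go, relying on the identity $S^3 = -S$, which holds because $\hat{s}$ is a unit vector:

```latex
\begin{aligned}
S^3 &= -S, \qquad S^4 = -S^2, \qquad S^5 = S, \;\dots \\
\exp(\phi S) &= I + \phi S + \frac{\phi^2}{2!} S^2 + \frac{\phi^3}{3!} S^3 + \frac{\phi^4}{4!} S^4 + \cdots \\
 &= I + \Bigl(\phi - \frac{\phi^3}{3!} + \frac{\phi^5}{5!} - \cdots\Bigr) S
      + \Bigl(\frac{\phi^2}{2!} - \frac{\phi^4}{4!} + \cdots\Bigr) S^2 \\
 &= I + (\sin\phi)\,S + (1 - \cos\phi)\,S^2 .
\end{aligned}
```

The two bracketed series are the Taylor expansions of $\sin\phi$ and $1 - \cos\phi$, which gives the stated formula.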

3. Write a Matlab or Python function for computing the matrix $R$ corresponding to a rotation by $\phi$ about the axis vector $\hat{s}$. Choose a random unit vector $\hat{s}$ and a random initial point with unit magnitude. Plot the axis of rotation, along with the point after rotations of $\phi \in \{0, \tfrac{\pi}{12}, \tfrac{\pi}{8}, \tfrac{\pi}{6}, \tfrac{\pi}{4}, \tfrac{\pi}{2}, \pi, \tfrac{3\pi}{2}\}$. Do this for a few pairs of axes and points and include this in your report.
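A minimal Python sketch of such a function, using NumPy. The names `skew` and `rodrigues` are hypothetical, the angle list is one reading of the set in the problem statement, and plotting is left out; the rotated points in `rotated` could be passed to any 3D plotting routine.

```python
import numpy as np


def skew(s):
    """Skew-symmetric matrix S such that S @ b equals np.cross(s, b)."""
    sx, sy, sz = s
    return np.array([[0.0, -sz, sy],
                     [sz, 0.0, -sx],
                     [-sy, sx, 0.0]])


def rodrigues(s, phi):
    """Rotation matrix for an angle phi (radians) about the axis s."""
    s = np.asarray(s, dtype=float)
    s = s / np.linalg.norm(s)  # normalize in case s is not exactly unit length
    S = skew(s)
    return np.eye(3) + np.sin(phi) * S + (1.0 - np.cos(phi)) * (S @ S)


# Random unit axis and random unit starting point (seed fixed for repeatability).
rng = np.random.default_rng(0)
s = rng.normal(size=3)
s /= np.linalg.norm(s)
p = rng.normal(size=3)
p /= np.linalg.norm(p)

angles = [0, np.pi / 12, np.pi / 8, np.pi / 6, np.pi / 4, np.pi / 2, np.pi, 3 * np.pi / 2]
rotated = np.array([rodrigues(s, phi) @ p for phi in angles])  # one row per angle
```

Every row of `rotated` should keep unit length and the same dot product with `s` as the original point, which is a convenient check before plotting.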

4. Find the eigenvalues of $R$ and their corresponding eigenvectors, and verify them analytically. Express the eigenvectors in terms of unit vectors $\hat{u}$, $\hat{v}$, $\hat{s}$, where $\hat{u} \times \hat{v} = \hat{s}$.
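A sketch of the expected structure, assuming $\hat{u}$, $\hat{v}$, $\hat{s}$ form a right-handed orthonormal triple (so that $\hat{s} \times \hat{u} = \hat{v}$ and $\hat{s} \times \hat{v} = -\hat{u}$):

```latex
\begin{aligned}
R\,\hat{s} &= \hat{s}, \\
R\,\hat{u} &= \cos\phi\,\hat{u} + \sin\phi\,\hat{v}, \qquad
R\,\hat{v} = -\sin\phi\,\hat{u} + \cos\phi\,\hat{v}, \\
R\,(\hat{u} \mp i\,\hat{v}) &= e^{\pm i\phi}\,(\hat{u} \mp i\,\hat{v}).
\end{aligned}
```

Under this assumption the eigenvalues are $1$, $e^{i\phi}$, and $e^{-i\phi}$, with eigenvectors $\hat{s}$, $\hat{u} - i\hat{v}$, and $\hat{u} + i\hat{v}$ (up to scale).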

5. Analytically verify the formula $\cos\phi = \tfrac{1}{2}(\operatorname{trace}(R) - 1)$. Hint: Eigenvalues are your friend.
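The trace identity and the eigenvalues can also be spot-checked numerically before attempting the analytic argument. The sketch below rebuilds $R$ from Rodrigues' formula for an arbitrary axis and angle; the helper name `rodrigues` and the chosen values are illustrative only.

```python
import numpy as np


def rodrigues(s, phi):
    """Rotation matrix for an angle phi (radians) about the axis s."""
    s = np.asarray(s, dtype=float)
    s = s / np.linalg.norm(s)
    S = np.array([[0.0, -s[2], s[1]],
                  [s[2], 0.0, -s[0]],
                  [-s[1], s[0], 0.0]])
    return np.eye(3) + np.sin(phi) * S + (1.0 - np.cos(phi)) * (S @ S)


phi = 0.7
R = rodrigues([1.0, 2.0, 2.0], phi)

# Eigenvalues sorted by imaginary part: exp(-i*phi), 1, exp(+i*phi).
eigenvalues = np.linalg.eigvals(R)
eigenvalues = eigenvalues[np.argsort(eigenvalues.imag)]
expected = np.array([np.exp(-1j * phi), 1.0, np.exp(1j * phi)])

# The trace is the sum of the eigenvalues, 1 + 2*cos(phi).
cos_from_trace = 0.5 * (np.trace(R) - 1.0)
print(np.allclose(eigenvalues, expected), np.isclose(cos_from_trace, np.cos(phi)))
```

A numerical check like this does not replace the analytic verification, but it catches sign errors in $S$ early.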

6. Write a function in rot_to_ax_phi.m or rot_to_ax_phi.py for computing the axis of rotation $\hat{s}$ and the angle $\phi$ from the matrix $R$. Include the function in your report.
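One possible sketch of this function in Python. It takes $\phi$ from the trace identity and the axis from the antisymmetric part of $R$, since $R - R^\top = 2\sin\phi\,S$. The tolerance `1e-8` and the handling of the degenerate cases $\phi \approx 0$ and $\phi \approx \pi$ are assumptions of this sketch, not requirements from the assignment.

```python
import numpy as np


def rot_to_ax_phi(R):
    """Return (s, phi): unit rotation axis and angle in [0, pi] for a rotation matrix R."""
    R = np.asarray(R, dtype=float)
    cos_phi = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    phi = np.arccos(cos_phi)

    # R - R^T = 2 sin(phi) S, so these differences equal sin(phi) * s.
    w = 0.5 * np.array([R[2, 1] - R[1, 2],
                        R[0, 2] - R[2, 0],
                        R[1, 0] - R[0, 1]])
    sin_phi = np.linalg.norm(w)
    if sin_phi > 1e-8:
        return w / sin_phi, phi

    if cos_phi > 0:
        # phi ~ 0: R is the identity and the axis is undefined; return an arbitrary one.
        return np.array([1.0, 0.0, 0.0]), 0.0

    # phi ~ pi: R + I = 2 s s^T, so a column of (R + I) / 2 is proportional to s.
    M = 0.5 * (R + np.eye(3))
    k = int(np.argmax(np.diag(M)))
    return M[:, k] / np.sqrt(M[k, k]), np.pi
```

Because the angle is returned in $[0, \pi]$, a rotation by $3\pi/2$ about $\hat{s}$ comes back as a rotation by $\pi/2$ about $-\hat{s}$, which describes the same matrix. At $\phi = \pi$ the sign of the axis is ambiguous.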

3 Make yourself famous!

Find a real image of your choosing (e.g. a portrait of yourself) as your source image. Pick another image to map onto (e.g. your hometown). Pick at least 5 planar surfaces of interest (e.g. building façades, posters, billboards, ground) in /images/times_square.jpg and a few planar surfaces for your target image. Mark the corner coordinates for each. They should be distributed across the image and have varying normal directions in 3D. You will be projecting your source image onto Times Square and your target image!

Given points $\{u_i\}$, $\{v_i\}$, where $u_i$ is a corner in your chosen image and $v_i$ is the corresponding corner in the Times Square image, both in homogeneous coordinates, we would like to find the transformation $v_i = T(u_i)$. The function $T : \mathbb{R}^3 \to \mathbb{R}^3$ is described as follows:

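$$
v_i = \begin{bmatrix} v_{ix} \\ v_{iy} \\ 1 \end{bmatrix}
    = \begin{bmatrix} V_{ix}/V_{iz} \\ V_{iy}/V_{iz} \\ 1 \end{bmatrix},
\quad \text{where} \quad
\begin{bmatrix} V_{ix} \\ V_{iy} \\ V_{iz} \end{bmatrix}
    = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & 1 \end{bmatrix}
      \begin{bmatrix} u_{ix} \\ u_{iy} \\ 1 \end{bmatrix}
    = H u_i
$$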

Include the following in your report:

1. Derive the least squares solution for $H^* = \arg\min_H \sum_{i=1}^{4} \|T(u_i) - v_i\|^2$ when using (a) a 2D affine transform (translation + rotation), and (b) a homography.
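A common way to make the homography case linear is sketched below. It multiplies each correspondence through by $V_{iz}$, so it minimizes an algebraic residual rather than exactly the objective above; treating that as acceptable is an assumption of this sketch. The vector $\mathbf{h}$, the matrix $A$, and the vector $\mathbf{b}$ are names introduced here.

```latex
% Each correspondence (u_i, v_i) contributes two rows, with
% h = (h_{11}, h_{12}, h_{13}, h_{21}, h_{22}, h_{23}, h_{31}, h_{32})^T:
\begin{pmatrix}
u_{ix} & u_{iy} & 1 & 0 & 0 & 0 & -u_{ix} v_{ix} & -u_{iy} v_{ix} \\
0 & 0 & 0 & u_{ix} & u_{iy} & 1 & -u_{ix} v_{iy} & -u_{iy} v_{iy}
\end{pmatrix}
\mathbf{h}
=
\begin{pmatrix} v_{ix} \\ v_{iy} \end{pmatrix}
% Stacking all rows gives A h = b, with least squares solution
% h^* = (A^T A)^{-1} A^T b.
```

For the affine case $h_{31} = h_{32} = 0$, so the last two columns drop out, six unknowns remain, and the stacked system is linear in the original objective without any rescaling.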

2. What constraints are placed on $H$ for the affine transform? How about for the homography?

3. Is the affine transform able to exactly transform the points from one image to the other? Why or why not? How about for the homography?

4. Implement both affine and homography transforms in Matlab or Python to map your chosen image onto the planar surfaces. In the report, you should include the following functions as well as the resulting images (saved as times_square_affine[homography].jpg and myimage_affine[homography].jpg) and any observations.

H = affine_solve(u,v), H = homography_solve(u,v), where u,v are 2xN matrices representing N corresponding points. Functions should return H, a 3x3 matrix.

v = homography_transform(u,H), where u is a 2xN matrix and H is a 3x3 matrix. Function should return v, a 2xN matrix.

Matrices in Python should be represented as 2-D numpy arrays. Only use basic matrix operations to write these functions (e.g. multiplication, addition, concatenation, dot products, reshaping, etc.). Do not use a pre-written pseudo-inverse function. For reference, example results are shown in /images/times_square_affine_efros.jpg and /images/times_square_homography_efros.jpg.
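A minimal Python sketch of the three functions, using the signatures given above. It builds a stacked linear system and solves the normal equations $A^\top A\,\mathbf{h} = A^\top \mathbf{b}$ with `np.linalg.solve`; whether that call counts as a "basic matrix operation" under the rules above is an assumption, and it could be replaced by a hand-written elimination. The helper `_solve_normal_equations` is hypothetical, and `affine_solve` fits a general six-parameter affine map.

```python
import numpy as np


def _solve_normal_equations(A, b):
    """Least squares solution of A h = b via the normal equations."""
    return np.linalg.solve(A.T @ A, A.T @ b)


def affine_solve(u, v):
    """Fit a 3x3 affine matrix H (last row 0 0 1) mapping 2xN points u to v."""
    n = u.shape[1]
    A = np.zeros((2 * n, 6))
    b = np.zeros(2 * n)
    for i in range(n):
        ux, uy = u[:, i]
        A[2 * i] = [ux, uy, 1, 0, 0, 0]
        A[2 * i + 1] = [0, 0, 0, ux, uy, 1]
        b[2 * i], b[2 * i + 1] = v[:, i]
    h = _solve_normal_equations(A, b)
    return np.vstack([h.reshape(2, 3), [0.0, 0.0, 1.0]])


def homography_solve(u, v):
    """Fit a 3x3 homography H with H[2, 2] = 1 mapping 2xN points u to v."""
    n = u.shape[1]
    A = np.zeros((2 * n, 8))
    b = np.zeros(2 * n)
    for i in range(n):
        ux, uy = u[:, i]
        vx, vy = v[:, i]
        A[2 * i] = [ux, uy, 1, 0, 0, 0, -ux * vx, -uy * vx]
        A[2 * i + 1] = [0, 0, 0, ux, uy, 1, -ux * vy, -uy * vy]
        b[2 * i], b[2 * i + 1] = vx, vy
    h = _solve_normal_equations(A, b)
    return np.append(h, 1.0).reshape(3, 3)


def homography_transform(u, H):
    """Apply H to 2xN points u and return the 2xN result after dividing by V_z."""
    U = np.vstack([u, np.ones((1, u.shape[1]))])  # to homogeneous coordinates
    V = H @ U
    return V[:2] / V[2]
```

With exactly four correspondences in general position the homography system is square and the fit is exact; with more points it becomes a least squares fit of the linearized equations. Warping the image itself (sampling pixels through `H`) is not covered by this sketch.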

5. Similar to 3.4, implement both affine and homography transforms in Matlab or Python to rectify the computer screen in /images/computer_screen.jpg and the black/white floor shown in /images/the_flagellation.jpg. For the computer screen, mark four points at the four corners of the screen; for the floor, mark as many points as you deem appropriate for computing the transformations.

Instructions

This assignment is to be done individually.

Please submit the assignment using bCourses. Upload the following files:

A PDF file. The top of the first page should contain your name, student ID, and date of submission. The file should contain answers to all questions and all supporting images. Questions should be answered in order. Each problem should be on a new page.

A tar/zip file, containing any code you wrote for the assignment.

The HW is due on: Sunday, Feb 4, 2018, 11:55pm.
