Homework 2 Robot Perception and Decision-Making Solution

Starting from:

~~$30~~

$24

Home

Extended Kalman Filter with a Nonlinear Observation Model

Consider the scenario depicted in Figure 1 where a robot tries to catch a y that it tracks visually with its cameras. To catch the y, the robot needs to estimate the 3D position pt 2 R3

Figure 1

and linear velocity t 2 R3 of the y with respect to its camera coordinate system. The y is moving randomly in a way that can be modelled by a discrete time double integrator:

pt+1 = pt + t t
(1a)
t+1 = 0:8 t + tat
(1b)

where the constant velocity value describes the average velocity value over t and is just an approximation of the true process. Variations in the y’s linear velocity are caused by random, immeasurable accelerations at. As the accelerations are not measurable, we treat it as the process noise, w = tat, and we model it as a realization of a normally-distributed white-noise random vector with zero mean and covariance Q: w N(0; Q). The covariance is given by Q = diag(0:05; 0:05; 0:05)

The vision system of the robot consists of (unfortunately) only one camera. With the camera, the robot can observe the y and receive noisy measurements z 2 R2 which are the pixel coordinates (u; v) of the projection of the y onto the image. We model this projection mapping y’s 3D location to pixels as the observation model h:

zt = h(xt) + vt (1c)

where x = (p; s)T and v is a realization of the normally-distributed, white-noise observation noise vector: v N(0; R). The covariance of the measurement noise is assumed constant of value, R = diag(5; 5).

We assume a known 3x3 camera intrinsic matrix:

2
500
0
320
0
3

K =
0
500
240
0
(1d)

4
0
0
1
0
5

Let t = 0:1s. Find the system matrix A for the process model (Implement your answer in the system matrix function in Q1.py).

De ne the observation model h in terms of the camera parameters (Implement your answer in the observation function in Q1.py).

Initially, the y is sitting on the ngertip of the robot when it is noticing it for the rst time. Therefore, the robot knows the y’s initial position from forward kinematics to be at p0 = (0:5; 0; 5:0)T (resting velocity). Simulate in Python the 3D trajectory that the y takes as well as the measurement process. This requires generating random acceleration noise and observation noise. Simulate for 100 time steps. Attach a plot of the generated trajectories and the corresponding measurements. Please add your code to the report.

Find the Jacobian H of the observation model with respect to the y’s state x. (Imple-ment your answer of H in function observation state jacobian in Q1.py.)

Now let us run an Extended Kalman Filter to estimate the position and velocity of the y relative to the camera. You can assume the aforementioned initial position and the following initial error covariance matrix: P0 = diag(0:1; 0:1; 0:1). The measurements can be found in data/Q1E measurement.npy. Plot the mean and error ellipse of the predicted measurements over the true measurements. Plot the means and error ellipsoids of the estimated positions over the true trajectory of the y. The true states are in data/Q1E state.npy

Discuss the di erence in magnitude of uncertainty in the di erent dimensions of the state.

From Monocular to Stereo Vision

Now let us assume that our robot got an upgrade: Someone installed a stereo camera and calibrated it. Let us assume that this stereo camera is perfectly manufactured, i.e., the two cameras are perfectly parallel with a baseline of b = 0:2. The camera intrinsics are the same as before in Question 1.

Now the robot receives as measurement z a pair of pixel coordinates in the left image (uL; vL) and right image (uR; vR) of the camera. Since our camera system is perfectly parallel, we will assume a measurement vector z = (uL; vL; dL) where dL is the disparity between the projection of the y on the left and right image. We de ne the disparity to be positive. The y’s states are represented in the left camera’s coordinate system.

Find the observation model h in terms of the camera parameters (Implement your answer in function observation in Q2.py).

Find the Jacobian H of the observation model with respect to the y’s state x. (Imple-ment H in function observation state jacobian in Q2.py)

What is the new observation noise covariance matrix R? Assume the noise for (uL; vL), and (uR; vR) are independent and same distribution with Q1, respectively. (Implement R in function observation noise covariance in Q2.py).

Now let us run an Extended Kalman Filter to estimate the position and velocity of the y relative to the left camera. You can assume the same initial position and the initial error covariance matrix as in the previous questions. Plot the means and and error ellipses of the predicted measurements over the true measurement trajectory in both the left and right images. The measurements can be found in data/Q2D measurement.npy. Plot the means and error ellipsoids of the estimated positions over the true trajectory of the y. The true states are in data/Q2D state.npy Include these plots here.

Discuss the advantages of using (uL; vL; dL) over (uL; vL; uR; vR)

Particle Filter with Nonlinear Process Model and Linear Observation Model Now assume that the robot moves on to visually tracking an object that it’s being manipulated with its own end-e ector. Therefore, it does not only need to estimate the 3D position and velocity but also its orientation and angular velocity relative to the camera, and the predictions will be strongly a ected by robot’s actions.

We assume that we can model the object’s motion again by a discrete time double integrator:

Tt+1
= e[ t] tTt
(3a)
t+1
= t + tut + t t
(3b)

where the constant velocity value describes the average velocity value over t and is just an approximation of the true process. Variations in the objects twist velocity are caused by robot actions that we directly obtain as accelerations on the object, ut, and additional randomness in the motion is modeled as additional accelerations w = t t. We model w as a realization of a normally-distributed white-noise random vector with zero mean and covariance Q: w N(0; Q). Q is a diagonal matrix, Q = diag(0:01; 0:001; 0:001; 0:0; 0:005; 0:0025).

The robot receives noisy pose estimates of the object relative to the camera frame. We assume the measurements is the true pose T perturbed with noise in the tangential Lie algebra space. I.e., the noise, represented in exponential coordinates, is a realization of a normally-distributed white-noise random vector with zero mean and covariance R = diag(0:1; 0:1; 0:1; 0:05; 0:05; 0:05). The likelihood of a pose given an observation is thus de ned by the Gaussian likelihood in 6D of the di erence between the pose and the observation (in exponential coordinates).

Assume the initial pose of the object wrt. the camera is (q0 = (w = 1; 0; 0; 0); p0 = (1:0; 2:0; 5:0)). The rst 100 actions of the robot (i.e. the acceleration u) are given in le data/Q3 data/q3 commands.npy. Simulate the 6D trajectory that the object takes in Python. Generate plot of this trajectories where you represent the object as a 3D coordinate frame in space. In your report, please paste your code for the simulate function in Q3.py.

Now let us run a Particle Filter to estimate the pose and twist of the object relative to the camera. You can assume the aforementioned initial position and zero initial pose uncertainty. The measurements from the camera can be found in

data/Q3 data/q3 measurements.npy. Plot the most likely pose at each time step as a 3D coordinate frame. Include these plots here. In the report, please paste your code for the particle f ilter function in Q3.py.

Linear Kalman Filter with a Learned Inverse Observation Model

Now the robot is trying to catch a ball. So far, we assumed that there was some vision module that would detect the object in the image and thereby provide a noisy observation. In this part of the assignment, let us learn such a detector from annotated training data and treat the resulting detector as a sensor.

If we assume the same process as in the rst task, but we have a measurement model that observes directly noisy 3D locations of the ball, we end up with a linear model whose state can be estimated with a Kalman lter. (For this problem, you should re-use code from Questions 1 and 2, so we will not be supplying starter code. You might nd Q4 helper.py helpful for loading the data).

In the folder data/Q4A data you will nd a training set of 1000 images in the folder training set and the le Q4A positions train.npy that contains the ground truth 3D position of the red ball in the image. Design a detector that performs regression from the images to the 3D position. Use standard Deep Learning methodology for training this detector with the training set data. The test set has 100 images in the folder testing set. The le Q4A positions test.npy contains the corresponding ground truth. Describe the architecture here and the rationale behind it. In the write-up, post your code, your hyperparemeters (learning rate, batch-size, etc), and how many epochs you trained for. Report the training and test set mean squared error in your write-up.

In the folder data/Q4B data you will nd a set of 1000 images that show a new tra-jectory of the red ball. Run your linear Kalman Filter using this sequence of images as input, where your learned model provides the noisy measurements. Tune a constant measurement noise covariance appropriately, assuming it is a zero mean Gaussian and the covariance matrix is a diagonal matrix. Plot the resulting estimated trajectory from the images, along with the detections and the ground truth trajectory (which can be found in the le Q4B positions gt.npy). Please paste your code in the report.

Because the images are quite noisy and the red ball may be partially or completely oc-cluded, your detector is likely to produce some false detections. In the folder data/Q4D data you will nd a set of 1000 images that show a trajectory of the red ball where some im-ages are blank (as if the ball is occluded by a white object). Discuss what happens if you do not reject these outliers but instead use them to update the state estimate. Like the last question, run your linear Kalman Filter using this sequence of images with occlusion as input. Plot the resulting estimated trajectory from images and the ground truth trajectory. Also plot the 3-D trajectory in 2-D (x vs. z) and (y vs. z) to better visualize what happens to your lter.

Design an outlier detector and use the data from data/Q4D data. Plot the 2D projec-tion of the observation trajectory, assuming the camera of Question 1, and label the outliers you have detected in the plot (with a di erent color). Also plot the ground truth projection of the trajectory (found in le Q4D positions gt.npy). Explain how you implemented your outlier detector and post your code in your write-up. Hint: Your observation model predicts where your measurement is expected to occur and its uncer-tainty.