Deliverable: Report submitted to Gradescope by Wednesday, October 2nd, 23:59. Your report should be a PDF containing the following:
Page 1: Solution to 1(a)
Page 2: Solution to 1(b)
Pages 3 onwards: PDF rendering of completed notebook provided in 2
1. Theoretical Exercises
(a) Consider the following optimal control problem for a linear system with additive noise and quadratic cost:
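$$\min_{x,u} \; \mathbb{E}\left[\sum_{t=0}^{T-1} x_t^\top Q x_t + u_t^\top R u_t\right] + \mathbb{E}\left[x_T^\top Q x_T\right]$$
$$\text{s.t.} \quad x_{t+1} = A x_t + B u_t + w_t, \quad \forall t = 0, 1, 2, \ldots, T-1$$
where the $w_t$ are independent random vectors with $\mathbb{E}[w_t] = 0$ and $\mathbb{E}[w_t w_t^\top] = \Sigma_w$.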
Find an LQR-like sequence of matrix updates that computes the optimal cost-to-go at all times and the optimal feedback controller at all times. Describe the expected cost incurred in excess of the expected cost in the case when there is no noise.
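For reference, the standard noise-free finite-horizon LQR backward recursion, which is the baseline the question compares against, can be sketched in NumPy as follows. This is a minimal sketch of the case $w_t = 0$ only, not a solution to (a). The function names `lqr_backward_pass` and `rollout_cost` and the double-integrator matrices in the usage note are illustrative assumptions, not part of the assignment or of lqr.ipynb.

```python
import numpy as np

def lqr_backward_pass(A, B, Q, R, T):
    """Noise-free finite-horizon LQR.

    Returns gains K[0..T-1] (with u_t = K[t] @ x_t) and cost-to-go
    matrices P[0..T] (with J_t(x) = x^T P[t] x).
    """
    P = [None] * (T + 1)
    K = [None] * T
    P[T] = Q  # terminal cost x_T^T Q x_T
    for t in range(T - 1, -1, -1):
        # K_t = -(R + B^T P_{t+1} B)^{-1} B^T P_{t+1} A
        K[t] = -np.linalg.solve(R + B.T @ P[t + 1] @ B, B.T @ P[t + 1] @ A)
        # P_t = Q + K_t^T R K_t + (A + B K_t)^T P_{t+1} (A + B K_t)
        A_cl = A + B @ K[t]
        P[t] = Q + K[t].T @ R @ K[t] + A_cl.T @ P[t + 1] @ A_cl
    return K, P

def rollout_cost(A, B, Q, R, K, x0):
    """Total cost of applying u_t = K[t] @ x_t from x0 with no noise."""
    x, cost = x0, 0.0
    for K_t in K:
        u = K_t @ x
        cost += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u
    return cost + x @ Q @ x  # add the terminal cost
```

As a sanity check on a hypothetical double integrator, `A = np.array([[1., 1.], [0., 1.]])`, `B = np.array([[0.], [1.]])`, `Q = np.eye(2)`, `R = np.eye(1)`, the value `x0 @ P[0] @ x0` should agree with `rollout_cost(A, B, Q, R, K, x0)` up to floating-point error.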
(b) Now consider a linear system with multiplicative noise and quadratic cost:
$$\min_{x,u} \; \mathbb{E}\left[x_T^\top Q x_T\right]$$
$$\text{s.t.} \quad x_{t+1} = A x_t + (B + W_t) u_t, \quad \forall t = 0, 1, 2, \ldots, T-1$$
Here $Q \in \mathbb{R}^{n_X \times n_X}$, $A \in \mathbb{R}^{n_X \times n_X}$, and $B \in \mathbb{R}^{n_X \times n_U}$ are given and fixed. $W_t \in \mathbb{R}^{n_X \times n_U}$, $t = 0, 1, \ldots, T-1$, are independent random matrices with $\mathbb{E}[W_t] = 0$. Higher-order expectations involving $W_t$ will show up. These higher-order expectations are not assumed to be zero, and you should just keep them around, i.e., there is no need to try to simplify them (which is not possible anyway unless additional assumptions are made).
Find an LQR-like sequence of matrix updates that computes the optimal cost-to-go at all times and the optimal feedback controller at all times.
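One way to check a candidate controller for (b) numerically is to estimate the expected terminal cost by Monte Carlo rollouts. The sketch below assumes, purely for illustration, that the entries of $W_t$ are i.i.d. zero-mean Gaussians with standard deviation `sigma`; the problem itself only specifies $\mathbb{E}[W_t] = 0$. The name `estimate_terminal_cost` and all of its parameters are hypothetical.

```python
import numpy as np

def estimate_terminal_cost(A, B, Q, gains, x0, sigma, n_samples=1000, seed=0):
    """Monte Carlo estimate of E[x_T^T Q x_T] under u_t = gains[t] @ x_t.

    Assumed noise model (not from the assignment): each entry of W_t is
    drawn i.i.d. from N(0, sigma^2), so E[W_t] = 0.
    """
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_samples):
        x = x0.copy()
        for K_t in gains:
            u = K_t @ x
            W = sigma * rng.standard_normal(B.shape)  # fresh sample each step
            x = A @ x + (B + W) @ u
        total += x @ Q @ x
    return total / n_samples
```

Comparing this estimate with the value predicted by your matrix updates, $x_0^\top P_0 x_0$, for several values of `sigma` gives a quick consistency check. The two should agree only up to sampling error, which shrinks as `n_samples` grows.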
2. Programming Exercise
See lqr.ipynb and follow the instructions within, which will walk you through each question. But first, set up your environment by following this link to install Anaconda and this link for MuJoCo installation. If you do not already have a key, please see Piazza to obtain one. After installation, navigate to your homework directory, create your environment via conda env create -f environment.yml, run jupyter notebook within the environment to start a server, and get started on the implementation.