
Homework #8 Solution

PART A: Theory and Algorithms    [100 points]    * See PART B Programming Assignment on page 3
Please write your full name clearly on the first page. Submit a single PDF file.
Please provide brief but complete explanations, using diagrams where necessary, and suitably using your own words.  While presenting calculations, explain the variables and.

Please refer to Ch. 1 and Ch. 2 of Dr. Sutton's book only as needed:
http://incompleteideas.net/book/bookdraft2017nov5.pdf
Then answer the following:

1. Consider the use case (application) of a robot driving a car. In this context, what is RL? How can the ADP and TD methods be used for this? What about the Active RL method?    [30 points]

2. Based on Fig. 21.9 in Ch. 21 of the textbook.    [50 points]
For the problem shown in Fig. 21.9 (balancing a long pole on a moving cart):
    a. Construct a Q-Learning representation and explain it as an Active RL problem. Show the details of the policy and the transitions, and explain why it is an Active RL problem.

3. Answer Question 21.1 on page 858 of our textbook (Russell & Norvig). This requires a Python implementation.    [80 points]

4. Implement a Q-Learning algorithm similar to the one in this tutorial:
https://www.learndatasci.com/tutorials/reinforcement-q-learning-scratch-python-openai-gym/
but apply it to the maze problem we learned in class (see Q-Learning Example.docx), and validate your implementation using this data set.    [140 points]
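
As a starting point for Question 4, the sketch below shows one way tabular Q-Learning might be set up for a grid maze. It is only an illustration under stated assumptions: the 4x4 layout in MAZE, the rewards (+10 at the goal, -1 per step), the hyperparameters, and the helper names (find, step, train, greedy_path) are all hypothetical. They are not taken from Q-Learning Example.docx or from the tutorial, so the class maze and its data set would need to be substituted.

```python
import random

# Hypothetical 4x4 maze (NOT the class maze): 'S' start, 'G' goal, '#' wall.
MAZE = [
    "S.#.",
    ".#..",
    "...#",
    "#..G",
]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def find(ch):
    """Return the (row, col) of the first cell containing ch."""
    for r, row in enumerate(MAZE):
        c = row.find(ch)
        if c != -1:
            return (r, c)
    raise ValueError(ch)


def step(state, action):
    """Apply an action; moving into a wall or off the grid leaves the state unchanged."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    off_grid = not (0 <= nr < len(MAZE) and 0 <= nc < len(MAZE[0]))
    if off_grid or MAZE[nr][nc] == "#":
        nr, nc = r, c
    done = MAZE[nr][nc] == "G"
    reward = 10.0 if done else -1.0  # assumed reward scheme
    return (nr, nc), reward, done


def train(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Epsilon-greedy tabular Q-Learning; returns the learned Q-table."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS)
         for r in range(len(MAZE)) for c in range(len(MAZE[0]))}
    start = find("S")
    for _ in range(episodes):
        state = start
        for _ in range(200):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))  # explore
            else:
                action = max(range(len(ACTIONS)), key=lambda a: q[state][a])  # exploit
            nxt, reward, done = step(state, action)
            # Q-Learning update: bootstrap from the best action in the next state.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=50):
    """Follow the greedy policy from the start and return the visited cells."""
    state = find("S")
    path = [state]
    for _ in range(max_steps):
        action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
        state, _, done = step(state, action)
        path.append(state)
        if done:
            break
    return path
```

Calling `q = train()` and then `greedy_path(q)` returns the sequence of cells the learned policy visits from start to goal, which can be compared against the expected route through the maze.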
            
