Reinforcement Learning Assignment 3 Solution


• Introduction

The goal of this assignment is to experiment with model-free control, including on-policy learning (Sarsa) and off-policy learning (Q-learning). To build a deep understanding of the principles of these two iterative approaches and the differences between them, you will implement both Sarsa and Q-learning and apply each to the Cliff Walking example.
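For reference, the two update rules can be written side by side. This is the standard textbook form, not something specified by the assignment; the step size $\alpha$ and discount factor $\gamma$ are symbols assumed here for illustration (for an undiscounted task, $\gamma = 1$).

```latex
% Sarsa (on-policy): bootstraps from the action A_{t+1} actually selected
Q(S_t, A_t) \leftarrow Q(S_t, A_t) + \alpha \left[ R_{t+1} + \gamma\, Q(S_{t+1}, A_{t+1}) - Q(S_t, A_t) \right]

% Q-learning (off-policy): bootstraps from the greedy action in S_{t+1}
Q(S_t, A_t) \leftarrow Q(S_t, A_t) + \alpha \left[ R_{t+1} + \gamma \max_{a} Q(S_{t+1}, a) - Q(S_t, A_t) \right]
```

The only difference is the bootstrap term: Sarsa evaluates the policy it is actually following, while Q-learning evaluates the greedy policy regardless of how actions are chosen.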

• Cliff Walking

Figure 1: Cliff Walking

Consider the gridworld shown in Figure 1. This is a standard undiscounted, episodic task, with a start state (S), a goal state (G), and the usual actions causing movement up, down, right, and left. The reward is -1 on all transitions except those into the region marked "The Cliff". Stepping into this region incurs a reward of -100 and sends the agent instantly back to the start.
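The description above might be turned into an environment along the following lines. This is a minimal sketch, not a reference solution: the 4x12 grid, the positions of S and G in the bottom corners, the cliff occupying the cells between them, and the rule that moves off the grid leave the agent in place are all assumptions taken from the usual textbook layout, since the figure is not reproduced here. The class name `CliffWalkingEnv` and its methods are hypothetical.

```python
class CliffWalkingEnv:
    """Cliff Walking gridworld sketch (the 4x12 layout is an assumption)."""

    # Action index -> (row delta, column delta): up, down, right, left
    ACTIONS = [(-1, 0), (1, 0), (0, 1), (0, -1)]

    def __init__(self, rows=4, cols=12):
        self.rows, self.cols = rows, cols
        self.start = (rows - 1, 0)          # assumed: S in the bottom-left corner
        self.goal = (rows - 1, cols - 1)    # assumed: G in the bottom-right corner
        self.state = self.start

    def is_cliff(self, pos):
        """Assumed cliff region: bottom row, strictly between S and G."""
        r, c = pos
        return r == self.rows - 1 and 0 < c < self.cols - 1

    def reset(self):
        self.state = self.start
        return self.state

    def step(self, action):
        """Apply an action and return (next_state, reward, done)."""
        dr, dc = self.ACTIONS[action]
        # Assumed: a move that would leave the grid keeps the agent in place.
        r = min(max(self.state[0] + dr, 0), self.rows - 1)
        c = min(max(self.state[1] + dc, 0), self.cols - 1)
        nxt = (r, c)
        if self.is_cliff(nxt):
            # Reward -100 and back to the start; the episode continues.
            self.state = self.start
            return self.state, -100, False
        self.state = nxt
        return nxt, -1, nxt == self.goal
```

A typical interaction is `env = CliffWalkingEnv()`, then `state = env.reset()` followed by repeated calls to `env.step(action)` until `done` is true.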

• Experiment Requirements

Programming language: python3

You should build the Cliff Walking environment and search for the optimal travel path using Sarsa and Q-learning, respectively.
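One possible shape for the two learners is sketched below. It is only an illustration under stated assumptions: the 4x12 grid, the step size `alpha=0.5`, the episode count, the zero initialisation of the action values, and all function names (`step`, `epsilon_greedy`, `train`) are choices made here, not requirements of the assignment. The block carries a compact transition function so that it runs on its own.

```python
import random
from collections import defaultdict

ROWS, COLS = 4, 12                          # assumed grid size
START, GOAL = (ROWS - 1, 0), (ROWS - 1, COLS - 1)
MOVES = [(-1, 0), (1, 0), (0, 1), (0, -1)]  # up, down, right, left

def step(state, action):
    """Return (next_state, reward, done) for the assumed cliff layout."""
    r = min(max(state[0] + MOVES[action][0], 0), ROWS - 1)
    c = min(max(state[1] + MOVES[action][1], 0), COLS - 1)
    if r == ROWS - 1 and 0 < c < COLS - 1:  # stepped into the cliff
        return START, -100, False
    return (r, c), -1, (r, c) == GOAL

def epsilon_greedy(Q, state, epsilon, rng):
    """Random action with probability epsilon, else greedy with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.randrange(len(MOVES))
    best = max(Q[state])
    return rng.choice([a for a, v in enumerate(Q[state]) if v == best])

def train(method, episodes=500, alpha=0.5, gamma=1.0, epsilon=0.1, seed=0):
    """Run tabular Sarsa (method="sarsa") or Q-learning (any other value)."""
    rng = random.Random(seed)
    Q = defaultdict(lambda: [0.0] * len(MOVES))
    returns = []
    for _ in range(episodes):
        state, total, done = START, 0, False
        action = epsilon_greedy(Q, state, epsilon, rng)
        while not done:
            nxt, reward, done = step(state, action)
            if method == "sarsa":
                # On-policy: bootstrap from the action that will actually be taken.
                nxt_action = epsilon_greedy(Q, nxt, epsilon, rng)
                bootstrap = Q[nxt][nxt_action]
            else:
                # Off-policy: bootstrap from the greedy value of the next state.
                bootstrap = max(Q[nxt])
            target = reward + (0.0 if done else gamma * bootstrap)
            Q[state][action] += alpha * (target - Q[state][action])
            if method != "sarsa":
                # Q-learning picks its behaviour action after the update.
                nxt_action = epsilon_greedy(Q, nxt, epsilon, rng)
            state, action = nxt, nxt_action
            total += reward
        returns.append(total)
    return Q, returns
```

Calling `train("sarsa")` and `train("q_learning")` with the same seed gives two per-episode return curves that can be plotted against each other, and the greedy path can be read off each returned table by following the highest-valued action from the start state.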

Different settings for ε lead to different amounts of exploration during policy updates. Try several values of ε (e.g. ε = 0.1 and ε = 0) to investigate their impact on performance.
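To see what a given ε means for action selection, it can help to compute the ε-greedy action probabilities directly. The sketch below assumes the common formulation in which, with probability ε, an action is drawn uniformly from all actions (including the greedy one); the function name `epsilon_greedy_probs` and the sample action values are made up for illustration.

```python
def epsilon_greedy_probs(q_values, epsilon):
    """Probability of each action under epsilon-greedy selection.

    Assumes uniform exploration over all actions and an even split of
    the greedy probability mass among tied best actions.
    """
    n = len(q_values)
    best = max(q_values)
    greedy = [a for a, v in enumerate(q_values) if v == best]
    return [
        epsilon / n + ((1 - epsilon) / len(greedy) if a in greedy else 0.0)
        for a in range(n)
    ]

# Hypothetical action values for one state (up, down, right, left).
q = [-3.0, -1.0, -2.0, -5.0]
for eps in (0.1, 0.0):
    print(eps, epsilon_greedy_probs(q, eps))
```

With ε = 0 all probability mass sits on the greedy action, so the agent explores only through its initial value estimates; with ε = 0.1 each non-greedy action keeps a small probability ε/4 of being chosen in every state, including the states next to the cliff.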


• Report and Submission

Your report and source code should be compressed into a single archive named "studentID+name".

The files should be submitted on Canvas before Apr. 24, 2020.
