Final Course Project Solution

Applied Deep Reinforcement Learning


Description

The goal of the Final Course Project is to explore advanced methods and/or applications in reinforcement learning. You will be expected to prepare a proposal, milestone report, final report, and final presentation. All projects should evaluate novel ideas that pertain to deep RL or its applications. The project must involve reinforcement learning algorithms. You are encouraged to use your ongoing research work as a project in this course, provided that this work relates to deep reinforcement learning. You may discuss the topic of your final project with course staff by email, by private message on Piazza, or in office hours. If you are not sure about the topic, we encourage you to speak with us. There are a few directions, each with its own checkpoint.

Multiagent RL

Build and solve multiagent tasks, including but not limited to agent communication, transportation problems, and multi-agent cooperation. Any of these could lead to a research project. Steps:

    1. Building a multiagent environment from scratch (this can be an extension of your work from Assignment 1)

    2. Solving the environment using any tabular method

    3. Solving the environment using any deep RL method (DQN, DDQN, AC, A2C, DDPG, TRPO, PPO, etc.) and comparing the results

Checkpoints:

Solving the environment using any tabular methods
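
As a rough illustration of steps 1 and 2 above, the sketch below builds a tiny two-agent grid environment and trains one tabular Q-learning table per agent (independent Q-learning). Everything in it is an assumption made for illustration and not part of the assignment: the `TwoAgentGrid` class, the 3x3 layout, the shared reward values, and the hyperparameters. Agents may occupy the same cell, and a timeout is treated as a terminal state to keep the code short.

```python
import random
from collections import defaultdict


class TwoAgentGrid:
    """Hypothetical 3x3 cooperative grid: each agent must reach its own goal."""

    # (row delta, col delta): right, left, down, up
    ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]

    def __init__(self, size=3, max_steps=20):
        self.size = size
        self.max_steps = max_steps
        self.goals = [(size - 1, size - 1), (0, size - 1)]

    def reset(self):
        self.pos = [(0, 0), (self.size - 1, 0)]
        self.t = 0
        return tuple(self.pos)  # joint state: both agents' positions

    def step(self, actions):
        self.t += 1
        for i, a in enumerate(actions):
            if self.pos[i] == self.goals[i]:
                continue  # an agent that reached its goal stays there
            dr, dc = self.ACTIONS[a]
            r = min(max(self.pos[i][0] + dr, 0), self.size - 1)
            c = min(max(self.pos[i][1] + dc, 0), self.size - 1)
            self.pos[i] = (r, c)
        solved = all(p == g for p, g in zip(self.pos, self.goals))
        reward = 10.0 if solved else -1.0  # shared team reward
        done = solved or self.t >= self.max_steps
        return tuple(self.pos), reward, done


def train(episodes=2000, alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Independent Q-learning: one table per agent, keyed by the joint state."""
    rng = random.Random(seed)
    env = TwoAgentGrid()
    n_actions = len(env.ACTIONS)
    q = [defaultdict(lambda: [0.0] * n_actions) for _ in range(2)]
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            actions = []
            for i in range(2):
                if rng.random() < epsilon:  # explore
                    actions.append(rng.randrange(n_actions))
                else:  # exploit the current estimate
                    values = q[i][state]
                    actions.append(values.index(max(values)))
            next_state, reward, done = env.step(actions)
            for i in range(2):
                bootstrap = 0.0 if done else gamma * max(q[i][next_state])
                td_error = reward + bootstrap - q[i][state][actions[i]]
                q[i][state][actions[i]] += alpha * td_error
            state = next_state
    return env, q


def greedy_return(env, q):
    """Run one episode with both agents acting greedily; return the team return."""
    state, done, total = env.reset(), False, 0.0
    while not done:
        actions = [q[i][state].index(max(q[i][state])) for i in range(2)]
        state, reward, done = env.step(actions)
        total += reward
    return total


if __name__ == "__main__":
    env, q = train()
    print("greedy team return:", greedy_return(env, q))
```

In this layout the shortest joint solution takes four steps, so the best possible team return is 7. The greedy return after training gives one number to compare against a deep RL agent in step 3.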

RC Cars

Setting up the simulator, training the cars in the simulator, applying results on the real RC cars. These steps may require prior knowledge in robotics or autonomous vehicles.

Steps:

    1. Install and explore the DeepRacer RC cars simulator

    2. Check existing solutions and apply any RL methods to teach the car to drive in the simulator



    3. Apply the learned policy to the real RC cars (with the ultimate goal of making a car move forward using only the RL algorithm)

Checkpoints:

Apply any RL methods to teach the car to drive in the simulator
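
For the simulator checkpoint, most of the design effort usually goes into the reward signal. The sketch below assumes that the simulator is AWS DeepRacer, whose console expects a Python function named `reward_function(params)`; the handout does not confirm which simulator is meant. It uses three of the documented input parameters (`all_wheels_on_track`, `track_width`, `distance_from_center`). The linear shaping and the `1e-3` floor are illustrative choices, not a recommended design.

```python
def reward_function(params):
    """Reward staying on the track and close to the center line."""
    if not params["all_wheels_on_track"]:
        return 1e-3  # near-zero reward once the car leaves the track

    half_width = 0.5 * params["track_width"]
    # 1.0 on the center line, falling linearly to 0.0 at the track edge.
    closeness = max(0.0, 1.0 - params["distance_from_center"] / half_width)
    return float(max(closeness, 1e-3))
```

A function like this can be checked offline by calling it with a hand-built dictionary before any training time is spent in the simulator.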

Exploring Deep RL Algorithms

Explore recent advances in RL. This may include solving ANY of the below environments using deep RL algorithms.

Possible environments include:

Google Research Football Environment [blog post, github, includes participation in the tournament]

MALMO (platform built on top of Minecraft) [github]

Robotics by OpenAI [details, blog post]

Atari by OpenAI [details]

Steps:

    1. Set up the environment

    2. Check the existing baseline methods applied to solve it

    3. Apply deep RL to improve the results

Checkpoints:

Check existing baseline methods applied to solve it
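
When checking baselines and later claiming an improvement, it helps to score every policy on the same set of seeded episodes. The sketch below shows one way to do that. `CoinFlipEnv` is a hypothetical stand-in environment, not one of the environments listed above, and the `reset`/`step` interface is an assumption; a real project would wrap the chosen environment instead.

```python
import random
import statistics


class CoinFlipEnv:
    """Hypothetical stand-in: guess the outcome of a biased coin for 10 steps."""

    def __init__(self, seed):
        self.rng = random.Random(seed)

    def reset(self):
        self.t = 0
        return 0  # single dummy observation

    def step(self, action):
        self.t += 1
        coin = 1 if self.rng.random() < 0.7 else 0
        reward = 1.0 if action == coin else 0.0
        return 0, reward, self.t >= 10


def make_random_policy(seed):
    """Baseline policy: pick one of the two actions uniformly at random."""
    rng = random.Random(seed)
    return lambda obs: rng.randrange(2)


def evaluate(policy, seeds):
    """Return (mean, sample std) of episode returns over one episode per seed."""
    returns = []
    for seed in seeds:
        env = CoinFlipEnv(seed)
        obs, done, total = env.reset(), False, 0.0
        while not done:
            obs, reward, done = env.step(policy(obs))
            total += reward
        returns.append(total)
    return statistics.mean(returns), statistics.stdev(returns)


if __name__ == "__main__":
    seeds = range(20)
    print("random baseline:", evaluate(make_random_policy(0), seeds))
    print("always guess 1: ", evaluate(lambda obs: 1, seeds))
```

Because both policies face the same coin sequence for a given seed, differences in the reported mean come from the policy rather than from luck in the environment.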

You can also propose your own topic, in which case you will be given an individual checkpoint.

If you get interesting results, we encourage you to share your project publicly, for example by participating in the CSE Demo Days or other events, so it helps to choose a topic that you are genuinely interested in.

If you do not know what to choose, go with Exploring Deep RL Algorithms on Atari by OpenAI.

If you want to propose your own topic, please talk to Alina [avereshc@buffalo.edu].

Registering your team

Deadline: March 10

Google Form link will be added later.

Writing the proposal

Deadline: March 13

The project proposal should be a one-page, single-spaced extended abstract motivating and outlining the project you plan to complete. Your proposal should have the following structure:

    1. Topic

    2. Objective. Explain the objective of the project and why that objective is relevant and important.


    3. Related Work. Briefly review the most relevant prior work, and highlight where those works fall short of meeting the objectives described above.

    4. Technical Outline. Explain your approach at a high level, making clear the novel technical contribution. State which environment and algorithm you plan to use.

Before you submit it, your proposal must be approved by a member of the course staff.

Submitting the checkpoint

Deadline: April 10

Each direction has its own expectations for the midpoint checkpoint. If you do your own project, the checkpoint must be agreed on with course staff at the proposal stage.

Submitting the Project

Deadline: May 1

Complete your project in either a Jupyter Notebook or a Python script. In your report, include:

The main motivation of your project (Why is it important/novel?)

Preliminary materials (Discuss the algorithms and any background information the reader needs to know)

Implementation details

Your results

Present your work

Presentation Days: will be added later

Present your work during the Presentation Days. Registration slots will be available about a week before the presentation dates. The whole team should present the work. Note: your presentation should reflect the work you have submitted. If you take part in CSE Demo Days, you will give a short presentation during that day.


Presentation details

Length: 10 mins + followup questions

Presentation Templates: UB branded ppt templates or UB CSE PowerPoint template

Suggested presentation structure:

– Project Title / Team’s Name / Course / Date [1 slide]

– Project Description [1 slide]

– Background [max 2 slides]

– Implementation [max 2 slides]

– Results (Graphs & Any Visuals) [max 4 slides]

– Key Observations / Summary [1 slide]

– Thank you Page [1 slide]

Important Information

This project can be done in a team (up to three people) or individually. The standing policy of the Department is that all students involved in an academic integrity violation (e.g. plagiarism in any way, shape, or form) will receive an F grade for the course.

Late Days Policy

If you are working in a team, the team's late-day allowance is the largest number of late days that any one teammate has left. For example, if one teammate has 3 days left and another has 5, the team can use 5 late days without penalty. Please note that the final submission of the project has a hard deadline.

Important Dates

March 10, 11:59pm - Register your team

March 13, 11:59pm - Approve your project proposal with any of the course staff

April 10, 11:59pm - Checkpoint is due

May 3, 11:59pm - Project is due