1 Problem
1.1 Description
As you encountered in the first project, replicating previously published results can be an interesting and challenging task. You learned that researchers often leave out important details, which forces you to perform extra experimentation to produce the right results.
For this project, you will be reading "Correlated Q-Learning" by Amy Greenwald and Keith Hall [GHS03]. You are then asked to replicate the results found in Figure 3 (parts a-d). You can use any programming language and libraries you choose.
1.2 Procedure
Read the paper.
Develop a system to replicate the experiment found in section "5. Soccer Game"
- This will include the soccer game environment
- This will include agents capable of Correlated-Q, Foe-Q, Friend-Q, and Q-learning
Run the experiment found in section "5. Soccer Game"
- Collect the data necessary to reproduce all the graphs in Figure 3
Create graphs demonstrating
- The Q-value difference for all agents
- Anything else you may think appropriate
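The Q-value difference in the graphs above can be logged as the absolute change of one monitored Q-value at each update. The following is a minimal Python sketch under stated assumptions: the names `QDiffLog` and `q_learning_update` are hypothetical, the update shown is the standard tabular Q-learning rule rather than anything prescribed by this assignment, and the state-action pair to monitor, along with all parameter values, must be taken from the paper.

```python
class QDiffLog:
    """Collects (iteration, |Q_new - Q_old|) pairs for one monitored Q entry."""

    def __init__(self):
        self.iterations = []
        self.diffs = []

    def record(self, iteration, old_value, new_value):
        self.iterations.append(iteration)
        self.diffs.append(abs(new_value - old_value))


def q_learning_update(q_table, key, reward, next_value, alpha, gamma):
    """Apply one tabular update in place and return (old, new) values.

    Assumed rule: Q <- (1 - alpha) * Q + alpha * (reward + gamma * next_value).
    `next_value` is the value of the next state as defined by the agent type
    (for plain Q-learning, the max over the agent's own actions).
    """
    old = q_table.get(key, 0.0)
    new = (1.0 - alpha) * old + alpha * (reward + gamma * next_value)
    q_table[key] = new
    return old, new


# Toy usage with made-up numbers; "s0" and "S" are hypothetical labels.
q_table = {}
monitored = ("s0", "S")
log = QDiffLog()
for t in range(3):
    old, new = q_learning_update(
        q_table, monitored, reward=1.0, next_value=0.0, alpha=0.5, gamma=0.9
    )
    log.record(t, old, new)
```

Plotting `log.iterations` against `log.diffs` would give one curve per agent type. For the multi-agent learners, the key would likely need to include the joint action of both players rather than a single action.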
We’ve created a private Georgia Tech GitHub repository for your code. Push your code to the personal repository found here: https://github.gatech.edu/gt-omscs-rldm
The quality of the code is not graded, so you don’t have to spend countless hours adding comments, etc. However, it will be examined by the TAs.
Make sure to include a README.md file for your repository
- Include thorough and detailed instructions on how to run your source code in the README.md
- If you work in a notebook, like Jupyter, include an export of your code in a .py file along with your notebook
- The README.md file should be placed in the project 3 folder in your repository.
You will be penalized by 25 points if you:
- Do not have any code or do not submit your full code to the GitHub repository
- Do not include the git hash for your last commit in your paper
Write a paper describing your agent and the experiments you ran
- Include the hash for your last commit to the GitHub repository in the header on the first page of your paper.
{ Correlated-Q
- Make sure your graphs are legible and you cite sources properly. While it is not required, we recommend you use a conference paper format. For example: https://www.ieee.org/conferences/publishing/templates.html
- 5 pages maximum. Really, you will lose points for longer papers.
- Describe the game.
- Describe the experiments/algorithms replicated: implementation, outcome, etc.
- Explain your experiments.
- The paper should include your graphs and discussions regarding them.
- Discuss your results in the context of the game and the algorithm.
  * How well do they match? Significant differences?
  * Justify your results. Why do they make sense?
  * What was the purpose of the experiment, i.e., what is the significance of your results?
- Describe any problems/pitfalls you encountered (e.g. unclear parameters, contradictory descriptions of the procedure to follow, results that differ wildly from the published results).
  * What steps did you take to overcome them?
  * What assumptions did you make, and how do you justify them?
- Save this paper in PDF format.
- Submit to Canvas!
Using a Deep RL library instead of providing your own work will earn you a 0 grade on the project and you will be reported for violating the Honor Code.
1.3 Resources
1.3.1 Lectures
Lesson 11A: Game Theory
Lesson 11B: Game Theory Reloaded
Lesson 11C: Game Theory Revolutions
1.3.2 Readings
Greenwald-Hall-2003.pdf [GHS03]
1.4 Submission Details
The due date is indicated on the Canvas page for this assignment. Make sure you have set your
timezone in Canvas to ensure the deadline is accurate.
Due Date: Indicated as "Due" on Canvas
Late Due Date [20 point penalty per day]: Indicated as "Until" on Canvas
The submission consists of:
- Your written report in PDF format (make sure to include the git hash of your last commit)
- Your source code in your personal repository on Georgia Tech’s private GitHub
To complete the assignment, submit your written report to Project 2 under your Assignments on Canvas:
https://gatech.instructure.com
You may submit the assignment as many times as you wish up to the due date, but we will only consider your last submission for grading purposes. Late submissions will receive a cumulative 20 point penalty per day. That is, any project submitted after midnight AOE on the due date gets a 20 point penalty, any project submitted after midnight AOE the following day gets a 40 point penalty, and so on. No project will receive a score less than zero, no matter what the penalty. Any projects more than 4 days late and any unsubmitted projects will receive a 0.
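The late-penalty arithmetic above can be sketched as a small Python function. This is an illustration, not an official grading tool: the function name `late_adjusted_score` is hypothetical, and `days_late` is assumed to be the number of AOE midnights that have passed since the deadline (0 for an on-time submission, 1 for a submission after midnight AOE on the due date, and so on).

```python
def late_adjusted_score(raw_score, days_late):
    """Return the score after the cumulative 20-point-per-day late penalty.

    Assumptions: `days_late` is a non-negative whole number of days, and a
    project more than 4 days late scores 0, as does any penalized score
    that would otherwise drop below zero.
    """
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    if days_late > 4:
        return 0
    return max(0, raw_score - 20 * days_late)


# Made-up example: a report worth 90 points submitted 2 days late.
example = late_adjusted_score(90, 2)
```

Under these assumptions, a submission that is even one second past the deadline counts as one day late.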
Note: Late is late. It does not matter if you are 1 second, 1 minute, or 1 hour late. If Canvas marks your assignment as late, you will be penalized. Additionally, if you resubmit your project and your last submission is late, you will incur the penalty corresponding to the time of your last submission.
Finally, if you have received an exception from the Dean of Students for a personal or medical emergency, we will consider accepting your project up to 7 days after the initial due date with no penalty. Students requiring more time should consider withdrawing from the course (if possible) or taking an incomplete for this semester, as we will not be able to grade their project.
1.5 Grading and Regrading
When your assignments, projects, and exams are graded, you will receive feedback explaining your errors (and your successes!) in some level of detail. This feedback is for your benefit, both on this assignment and for future assignments. It is considered a part of your learning goals to internalize this feedback. This is one of many learning goals for this course, such as: understanding game theory, random variables, and noise.
If you are convinced that your grade is in error in light of the feedback, you may request a regrade within a week of the grade and feedback being returned to you. A regrade request is only valid if it includes an explanation of where the grader made an error. Send a private Piazza post to only Miguel Morales and Timothy Bail. In the Summary add "[Request] Regrade Project 2". In the Details add sufficient explanation as to why you think the grader made a mistake. Be concrete and specific. We will not consider requests that do not follow these directions.
It is important to note that because we consider your ability to internalize feedback a learning goal, we also assess it. This ability is considered 10% of each assignment. We default to assigning you full credit. If you request a regrade and do not receive at least 5 points as a result of the request, you will lose those 10 points.
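The regrade rule above can be worked through as a short Python sketch. The function name `score_after_regrade` is hypothetical, and the sketch assumes that any points gained from the regrade are still applied even when they fall short of the 5-point threshold; the text above does not state this explicitly.

```python
FEEDBACK_CREDIT = 10  # points tied to internalizing feedback (from the text)
MIN_REGRADE_GAIN = 5  # minimum gain needed to keep that credit (from the text)


def score_after_regrade(original_score, points_gained):
    """Return the score after a regrade request.

    Assumption: gained points always count; the 10-point feedback credit
    is lost only when the gain is below 5 points.
    """
    if points_gained >= MIN_REGRADE_GAIN:
        return original_score + points_gained
    return original_score + points_gained - FEEDBACK_CREDIT


# Made-up example: a regrade that recovers only 3 points on an 80.
example = score_after_regrade(80, 3)
```

Under this reading, a regrade request only pays off when the grader's error is worth at least 5 points.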
References
[GHS03] Amy Greenwald, Keith Hall, and Roberto Serrano. "Correlated Q-learning". In: ICML. Vol. 20. 1. 2003, p. 242.