Project 3: Virtual Memory Simulator

Starting from:

$30

Home

- Your code files (in C, C++, Java, Python, or Perl) for the virtual memory simulator
- `p3-writeup.md` (the writeup in [GitHub Markdown](https://guides.github.com/features/mastering-markdown/))

Project overview
----------------

In class, we have discussed various page replacement algorithms that an Operating System implementer may choose to use. We have also discussed local and global page replacement policies. In particular, in local page replacement, each process is allocated a certain number of physical memory frames, and when a page is to be evicted, the victim page is selected among the pages of the same process. In this project, you will simulate the **Least Recently Used (LRU)** and **Optimal (OPT)** replacement algorithms by reading traces of memory references that were generated by two processes while running on a 32-bit system and collecting relevant metrics. Because it is local page replacement algorithm, each process has its own frames (i.e., a percentage of the total available physical memory). While simulating the algorithm, you will collect statistics about its performance, such as the number of page faults that occur and the number of dirty frames that had to be written back to disk. When you are done with your program, you will write up your results and provide a graph that compares the performance of the two algorithms under various ways to split physical memory up between the two processes. The page size, the total number of frames, and the memory split will be command-line arguments to the execution of your program.

_You may write your program in C/C++, Java, Perl, or Python as long as it runs on thoth.cs.pitt.edu with the “./vsim” command specified below._

Project Details
---------------

Your virtual memory simulator, called vmsim, takes the following command line arguments:
```shell
./vmsim -a <opt|lru> –n <numframes> -p <pagesize in KB> -s <memory split> <tracefile>
```
The memory split between the two processes is provided in the form a:b, where a and b are positive integers > 0 that represent the ratios of each process's memory allocation. For example, a 1:2 split means that the second process gets twice as many frames as the first process (i.e., the first process gets one third and the second two thirds). The sum of a and b evenly divided the total number of frames.

The program/simulator will run through the memory references of the trace file and decide the action taken for each address (memory hit, page fault with no eviction, page fault and evict clean page, page fault and evict dirty page).
When the trace is over (that is, after dealing with all the memory references for both simulated processes), `vmsim` prints out summary statistics **in the following format (as you know, Gradescope grading requires very strict formatting)**:
```
Algorithm: LRU
Number of frames: 8
Page size: 8 KB
Total memory accesses: %d
Total page faults: %d
Total writes to disk: %d
```

Implementation
-------------------

We are providing three sample memory traces. The traces are available within this repository and at `/u/OSLab/original/` in the files `1.trace.gz`, `2.trace.gz`, and `3.trace.gz`. We will use more trace files to test your program when grading; these traces are for you to start testing your program.
Each trace is gzip compressed, so you will have to copy each trace to your directory under `/u/OSLab/USERNAME/` and then decompress it like:
```
gunzip 1.trace.gz
```
Your simulator takes a command-line argument (among others) that specifies the trace file that will be used to compute the output statistics. The trace file will specify all the data memory accesses that occurred while running two processes. Each line in the trace file will specify a new memory reference and the process that made the access. Each line in the trace will therefore have the following three fields:

- Access Type: A single character indicating whether the access is a load ('l') or a store ('s'). The ‘s’ mode modifies the address and sets the dirty bit to true.
- Address: A 32-bit integer (in unsigned hexadecimal format) specifying the memory address that is being accessed. For example, "0xff32e100" specifies that memory address 4281524480 (in decimal) is accessed.
- Process: Either 0 or 1, representing which of the two processes made the memory access.
Fields on the same line are separated by a single space. Example trace lines look like:
```
l 0x012ff200 0
s 0xfe7fefc8 1
```
If you are writing in C/C++, you may parse each line with the following code:
```c
unsigned int address;
char mode;
int process;

fscanf(file, "%c %x %d", &mode, &addr, &process);
```
Important Notes
--------------------
1. As mentioned in class, when a dirty page is evicted, it has to be written to disk; if the page is clean, the evicted page is just abandoned.
2. If you are using Python, name your file vmsim and add the following shebang line at the beginning of the file: `#!/usr/bin/env python`
3. Implementing OPT in a naïve fashion will lead to unacceptable performance. It should not take more than 5 minutes to run your program.
4. **In case of a tie in the OPT algorithm, break the tie by selecting the least recently used page.**
5. If you are using Java, the autograder has a script that will detect your main class and run it as `.\vmsim ...`; you don't have to worry about that.

Write Up
-------------
Using one or more graphs, describe in a document the resulting page fault statistics for the following memory splits: 1:1, 1:3, 3:1, 3:5, 5:3, 7:9, 9:7. Run your simulation with the following memory configurations:

- a total of 16 frames and a page size of 4 KB
- a total of 1024 frames and a page size of 4KB
- a total of 16 frames and a page size of 4 MB
- a total of 1024 frames and a page size of 4 MB

Use OPT as the baseline for your comparisons. For example, you can have a graph for each of the memory configurations and within each graph 14 curves (two per memory split: one for OPT and one for LRU). Other more (or less) creative ideas for plotting the comparisons are welcome.

Extra Credit (10 points)
-------------------------
You can do the following tasks for extra credits.

Read 64-bit memory trace files. Each line in the trace will therefore have the following three fields:
Access Type: A single character indicating whether the access is a load ('l') or a store ('s'). The ‘s’ mode modifies the address and sets the dirty bit to true.

- Address: A 64-bit integer (in unsigned hexadecimal format) specifying the memory address that is being accessed. For example, "0x7894ff32e100" specifies that memory address 132581332017408 (in decimal) is accessed.
- Process: Either 0 or 1, representing which of the two processes made the memory access.
Fields on the same line are separated by a single space. Example trace lines look like:
```
l 0x4567ab12ff200 0
s 0x111234ffe7fefc8 1
```
If you are writing in C/C++, you may parse each line with the following code:
```c
unsigned long int address;
char mode;
int process;

fscanf(file, "%c %lx %d", &mode, &addr, &process);
```
We are providing three sample 64-bit memory traces. The traces are available within this repository and at `/u/OSLab/original/` in the files `1-64.trace.gz`, `2-64.trace.gz`, and `3-64.trace.gz`. We will use more trace files to test your program when grading; these traces are for you to start testing your program.

File Backups
------------

The `/u/OSLab/` partition and your home folder on thoth are **not** part of AFS space. Thus, any files you modify under these folders are not backed up. If there is a catastrophic disk failure, all of your work will be irrecoverably lost. As such, it is our recommendation that you:

**Commit and push all the files you change to your GitHub repository frequently!**

**BE FOREWARNED:** Loss of work not backed up is not grounds for an extension.

Submission
----------

You need to submit onto Gradescope:
- Your well-commented program’s source
- The writeup (named `p3-writeup.md`) detailing the results of your simulation as described above

**The autograder should be able to run your program as:**
```shell
./vmsim -a <opt|lru> –n <numframes> -p <pagesize in KB> -s <memory split> <tracefile>
```

Grading rubric
--------------

|Item|Grade|
|----|-----|
|Test cases on the autograder|80%|
|Comments and style|10%|
|Report|10%|
|Extra Credit|10%|

**If the autograder score is below 40 points (out of 80), a partial grade will be assigned using manual grading of your code. The maximum points that you can get in this manual grading is 50 points. Again, non-compiling code gets **zero** points.**

Please note that the score that you get from the autograder is not your final score. We still do manual grading. We may discover bugs and mistakes that were not caught by the test scripts and take penalty points off. Please use the autograder (or the local test harness `make test`) only as a tool to get immediate feedback and discover bugs in your program. Please note that certain bugs (e.g, deadlocks) in your program may or may not manifest themselves when the test cases are run on your program. It may be the case that the same exact code fails in some tests then the same code passes the same tests when resubmitted. The fact that a test once fails should make you debug your code, not simply resubmit and hope that the situation that caused the bug won't happen the next time around.