The goal of this assignment is to trace the execution of the following code fragment through an out-of-order processor pipeline on a cycle-by-cycle basis. During each cycle of execution, you need to show the state of the key processor structures used to support out-of-order execution according to the instructions given below. PLEASE TYPE YOUR ANSWERS AND USE THE ANSWER FORMAT PROVIDED BELOW. WE WILL NOT ACCEPT HAND-WRITTEN ANSWERS FOR THIS ASSIGNMENT.
Code fragment:
I1: DIV R1, R2, R3
I2: STORE R4, R5, #50
I3: ADD R7, R1, R5
I4: OR R1, R5, R2
I5: LOAD R4, R6, #25
I6: SUB R3, R1, R4
Assume that two execution units (ALUs) are available: one to execute the DIV instruction, and one to execute all other instructions. Assume that the execution latency of the DIV instruction is 7 cycles, and that the execution latency of all other instructions is 1 cycle. Assume that all memory accesses take a single cycle and that the LOAD instruction (I5) loads from the same address that the STORE instruction (I2) writes to. Further assume that this code is executed on an out-of-order pipeline with register renaming and that the following pipeline stages are used: fetch, decode, renaming, scheduling, execution (one or more cycles), memory access, writeback, and commit. Assume that all registers are maintained inside a physical register file and that a commit-time rename table is used to detect which physical register holds the mapping of each ISA register at each instruction’s commit.
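As a starting point for the trace, it can help to list the true (read-after-write) register dependences in the code fragment, since these determine which instructions must wait for a producer to finish. The sketch below is only an illustration. It assumes that the first operand of DIV, ADD, OR, SUB, and LOAD is the destination register and that STORE only reads its registers; the assignment does not state the operand order, so treat this as an assumption. The names PROGRAM and raw_dependences are hypothetical.

```python
# Each entry: (label, opcode, destination register or None, source registers).
# ASSUMPTION: the first operand is the destination, except for STORE,
# which is assumed to read both of its register operands.
PROGRAM = [
    ("I1", "DIV",   "R1", ["R2", "R3"]),
    ("I2", "STORE", None, ["R4", "R5"]),
    ("I3", "ADD",   "R7", ["R1", "R5"]),
    ("I4", "OR",    "R1", ["R5", "R2"]),
    ("I5", "LOAD",  "R4", ["R6"]),
    ("I6", "SUB",   "R3", ["R1", "R4"]),
]


def raw_dependences(program):
    """Return (producer, consumer, register) for every true dependence."""
    last_writer = {}  # logical register -> label of its most recent writer
    deps = []
    for label, _opcode, dest, sources in program:
        for reg in sources:
            if reg in last_writer:
                deps.append((last_writer[reg], label, reg))
        if dest is not None:
            last_writer[dest] = label
    return deps


for producer, consumer, reg in raw_dependences(PROGRAM):
    print(f"{consumer} reads {reg} produced by {producer}")
```

Under these assumptions, I3 is the only instruction that reads the result of the 7-cycle DIV, while I6 reads R1 from I4 and R4 from I5. The sketch covers register dependences only; the memory dependence between the STORE (I2) and the LOAD (I5) stated above must be tracked separately through the load/store queue.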
Assume that the initial state of the front-end rename table is such that logical register R0 maps to physical register P0, R1 maps to P1, R2 maps to P2, … R31 maps to P31 for all logical registers. Assume that 40 physical registers are available and that physical registers 32 to 40 are in the free list.
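The renaming step implied by this initial state can be sketched in a few lines. This is a minimal illustration, not the required answer format. It takes the free list literally as P32 through P40, assumes free registers are handed out in ascending order, and assumes that the first operand of every instruction except STORE is the destination. The names PROGRAM and rename are hypothetical.

```python
from collections import deque

# (label, opcode, destination register or None, source registers).
# ASSUMPTION: first operand is the destination; STORE has no destination.
PROGRAM = [
    ("I1", "DIV",   "R1", ["R2", "R3"]),
    ("I2", "STORE", None, ["R4", "R5"]),
    ("I3", "ADD",   "R7", ["R1", "R5"]),
    ("I4", "OR",    "R1", ["R5", "R2"]),
    ("I5", "LOAD",  "R4", ["R6"]),
    ("I6", "SUB",   "R3", ["R1", "R4"]),
]


def rename(program, num_logical=32, free_range=range(32, 41)):
    """Rename in program order; return renamed code, final table, free list."""
    table = {f"R{i}": f"P{i}" for i in range(num_logical)}  # R0->P0 ... R31->P31
    free_list = deque(f"P{i}" for i in free_range)          # P32 ... P40
    renamed = []
    for label, opcode, dest, sources in program:
        # Sources are looked up before the destination mapping is overwritten.
        phys_sources = [table[reg] for reg in sources]
        phys_dest = None
        if dest is not None:
            phys_dest = free_list.popleft()  # ASSUMPTION: lowest free register first
            table[dest] = phys_dest
        renamed.append((label, opcode, phys_dest, phys_sources))
    return renamed, table, list(free_list)


renamed, table, free_list = rename(PROGRAM)
for entry in renamed:
    print(entry)
print("Free list after renaming:", free_list)
```

Under these assumptions, the two writes to R1 (I1 and I4) receive different physical registers, so I3 reads the register allocated to I1 while I6 reads the one allocated to I4.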
Trace the execution of the above code through this pipeline on a cycle-by-cycle basis and determine how many cycles are required to execute this code fragment. For each cycle, show the state of the front-end rename table, the commit-time rename table, the instruction queue, the reorder buffer, the load/store queue, and the free list of registers. Also indicate which pipeline stage each instruction is in at every cycle. Explicitly show how the physical registers are deallocated!
An example for the first cycle is given below. You can simply replicate it for the subsequent cycles and fill in the structures with the new values. Please provide your answers such that each cycle is shown on a separate page. If nothing changes in a given cycle from the previous cycle, you can skip that cycle in your answer. (Yes, there will be quite a few pages in your submission.)
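One common way to deallocate physical registers with a commit-time rename table is sketched below. When an instruction commits, the physical register that the commit-time table previously held for its destination is released, and the table is updated to the new mapping. This is an assumed scheme offered for illustration; follow the scheme presented in your course if it differs. The destination assignments in COMMITS (P32 for I1, P33 for I3, and so on) assume that free registers are allocated in ascending order starting at P32 and that the first operand of each non-STORE instruction is its destination. The names COMMITS and commit_in_order are hypothetical.

```python
# (label, logical destination or None, physical destination or None),
# listed in program order, which is also commit order.
# ASSUMPTION: destinations were renamed to P32, P33, ... in ascending order.
COMMITS = [
    ("I1", "R1", "P32"),
    ("I2", None, None),   # STORE writes no register
    ("I3", "R7", "P33"),
    ("I4", "R1", "P34"),
    ("I5", "R4", "P35"),
    ("I6", "R3", "P36"),
]


def commit_in_order(commits, num_logical=32):
    """Return the final commit-time table and the register freed per commit."""
    commit_table = {f"R{i}": f"P{i}" for i in range(num_logical)}
    freed = []
    for label, logical, physical in commits:
        if logical is None:
            continue  # nothing to deallocate for this instruction
        # The old committed mapping is no longer architecturally visible.
        freed.append((label, commit_table[logical]))
        commit_table[logical] = physical
    return commit_table, freed


commit_table, freed = commit_in_order(COMMITS)
for label, register in freed:
    print(f"Commit of {label} returns {register} to the free list")
```

Under these assumptions, the commit of I4 releases the register that I1 allocated for R1, because that value is superseded once I4's write to R1 becomes architectural.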
Cycle 1:
Pipeline Stages (and instructions in them):
Pipeline Stage      Instructions in this Stage
Fetch               I1
Decode              Empty
Rename              Empty
Scheduling          Empty
Execution           Empty
Memory Access       Empty
Writeback           Empty
Commit              Empty
Front End Rename Table:
Logical Register    Physical Register
R0                  P0
R1                  P1
R2                  P2
R3                  P3
R4                  P4
R5                  P5
R6                  P6
R7                  P7
Commit Time Rename Table