Instructions
1. There are 6 questions for a total of 10 points.
2. Please create an electronic document with your answers.
3. There is no need to include the question itself. However, you MUST include the question number and sub-part index, if any. Example: 9(b)
4. Please create a PDF document, hw2.pdf, and upload it to the Canvas assignment page by the due date.
5. NO handwritten documents are accepted.
6. NO LATE SUBMISSIONS.
1. For a non-pipelined implementation of the data and control path in the following diagram for a processor implementing CS147DV, show the control signal logic values at the different phases of the processor executing the following instructions. You need to construct 10 tables similar to Table 1 (you may use hexadecimal for multi-bit signals; leave a cell blank if it is a don't-care). Assume that the memory with read=0, write=0 is in the hold configuration (it holds the previous read data) and that read=1, write=1 causes electrical isolation of the memory (HiZ). [5 pts]
I. add r3, r2, r1
II. srl r4, r3, 0x3a
III. jr r20
IV. muli r3, r4, 0xa5a5
V. andi r3, r4, 0xa5a5
VI. lui r5, 0x5a5a
VII. bneq r2, r3, 0xa5a5; // r2 = 0x1; r3 = 0x1
VIII. lw r24, r6, 0x5a5a
IX. jal 0x2a55aa5
X. push
Table 1: Sample Control Signal Table
CTRL Sig Name IF ID/RF EXE MEM WB
pc_load
pc_sel_1
pc_sel_2
pc_sel_3
mem_r
mem_w
r1_sel_1
reg_r
reg_w
wa_sel_1
wa_sel_2
wa_sel_3
wd_sel_1
wd_sel_2
wd_sel_3
sp_load
op1_sel_1
op2_sel_1
op2_sel_2
op2_sel_3
op2_sel_4
alu_oprn
ma_sel_1
dmem_r
dmem_w
md_sel_1
2. Regarding performance and CPI, answer the following. [1 pt]
(a) Consider two processors, P1 and P2, with the same instruction set. This ISA has five classes of instructions, listed in Table 2 with their corresponding CPI. P1 has a clock rate of 5 GHz and P2 has a clock rate of 3 GHz. Assuming an equal distribution of each class of instructions in a benchmark program, what is the performance of P1 and P2 in terms of instructions per second?
(b) If type A instructions occur three times as often, and type B instructions twice as often, as each of types C, D, and E (C, D, and E are equally distributed), which processor is faster and by how much?
Table 2: CPI information
Class   CPI-P1   CPI-P2
A       1        3
B       2        1
C       3        1
D       4        4
E       3        5
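As a reminder of how an instruction mix feeds into throughput, the following is a minimal sketch assuming that performance is measured as clock rate divided by the weighted-average CPI. The function names and the weights dictionary are illustrative only and are not part of the assignment.

```python
def average_cpi(cpi_by_class, weights):
    """Weighted-average CPI; weights are relative frequencies of each class."""
    total_weight = sum(weights.values())
    return sum(cpi_by_class[c] * weights[c] for c in cpi_by_class) / total_weight


def instructions_per_second(clock_hz, cpi_by_class, weights):
    """Throughput = clock rate / average cycles per instruction."""
    return clock_hz / average_cpi(cpi_by_class, weights)


# CPI values copied from Table 2.
CPI_P1 = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 3}
CPI_P2 = {"A": 3, "B": 1, "C": 1, "D": 4, "E": 5}

# Equal distribution: every class gets the same relative weight.
EQUAL_MIX = {c: 1 for c in "ABCDE"}

ips_p1 = instructions_per_second(5e9, CPI_P1, EQUAL_MIX)
ips_p2 = instructions_per_second(3e9, CPI_P2, EQUAL_MIX)
print(f"P1: {ips_p1:.3e} instr/s, P2: {ips_p2:.3e} instr/s")
```

A different mix is expressed by changing the relative weights passed in; the weights do not need to sum to 1 because the function normalizes them.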
3. Consider a program P, which runs on a 2 GHz machine M in 30 seconds. The compiler optimizes P by replacing some instances of multiplying a value by 4 (x ← x * 4) with 2 instructions that add x to itself (x ← x + x; x ← x + x). Call the new optimized program P'. The CPI of a multiplication is 8 and the CPI of an add is 2. After recompiling, the new program runs in 20 seconds on machine M. How many multiplications did the compiler replace? [1 pt]
4. A computer architect needs to design the pipeline of a new microprocessor. An example workload of 108 instructions is used to design the pipeline. Each instruction takes 250 ps to finish. How long does it take to execute this workload on a non-pipelined processor? The pipelined implementation uses 30 pipeline stages. Assuming a perfect pipeline (i.e., no hazards), how much speedup is achieved compared to the non-pipelined design? [1 pt]
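The timing model behind this question can be sketched as below. It assumes the stages are perfectly balanced (each stage takes the instruction latency divided by the stage count) and that pipeline registers add no overhead; neither assumption is stated in the question. The function names are illustrative.

```python
def non_pipelined_time_ps(n_instructions, instr_latency_ps):
    """Each instruction runs to completion before the next one starts."""
    return n_instructions * instr_latency_ps


def pipelined_time_ps(n_instructions, instr_latency_ps, stages):
    """Ideal pipeline: fill for `stages` cycles, then one result per cycle."""
    cycle_ps = instr_latency_ps / stages  # assumes perfectly balanced stages
    return (stages + n_instructions - 1) * cycle_ps


def speedup(n_instructions, instr_latency_ps, stages):
    """Ratio of non-pipelined to pipelined execution time."""
    return (non_pipelined_time_ps(n_instructions, instr_latency_ps)
            / pipelined_time_ps(n_instructions, instr_latency_ps, stages))


# Example call; substitute the workload size from the question.
n = 108
print(non_pipelined_time_ps(n, 250), speedup(n, 250, 30))
```

As the instruction count grows, the fill time becomes negligible and the speedup approaches the number of stages.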
5. Show the forwarding paths needed to execute the following four instructions in a 5-stage pipeline as discussed in class. [1 pt]
add r8, r4, r6
sub r7, r2, r8
lw r7, r7, 0x1000
add r8, r2, r7
6. In a 5-stage pipelined processor, identify all the data dependencies in the following code. Which dependencies are data hazards that will be resolved via forwarding? Which dependencies are data hazards that will cause a stall? [1 pt]
add r3, r2, r4
sub r5, r1, r3
lw r6, r3, 0x2000
add r7, r3, r6
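One way to check an answer mechanically is to scan for read-after-write (RAW) dependencies. The sketch below is hypothetical and rests on several assumptions that should be compared against the pipeline discussed in class: the first operand of every instruction is its destination, the remaining register operands are sources, the register file is written before it is read within a cycle (so a producer three or more instructions earlier causes no hazard), and only a load followed immediately by its consumer needs a stall.

```python
import re


def parse(line):
    """Split 'op dest, src, src_or_imm' into (op, dest, [source registers])."""
    op, rest = line.split(None, 1)
    operands = [tok.strip() for tok in rest.split(",")]
    dest = operands[0]
    # Immediates such as 0x2000 are not registers and are skipped.
    sources = [tok for tok in operands[1:] if re.fullmatch(r"r\d+", tok)]
    return op, dest, sources


def raw_dependencies(program):
    """Return (producer_idx, consumer_idx, register, kind) for each RAW pair."""
    parsed = [parse(line) for line in program]
    found = []
    for j, (_, _, sources) in enumerate(parsed):
        for src in sources:
            # Walk backwards to the most recent writer of this register.
            for i in range(j - 1, -1, -1):
                op_i, dest_i, _ = parsed[i]
                if dest_i == src:
                    distance = j - i
                    if distance >= 3:
                        kind = "no hazard"   # assumed write-before-read
                    elif op_i == "lw" and distance == 1:
                        kind = "stall"       # load-use case
                    else:
                        kind = "forward"
                    found.append((i, j, src, kind))
                    break
    return found


program = [
    "add r3, r2, r4",
    "sub r5, r1, r3",
    "lw r6, r3, 0x2000",
    "add r7, r3, r6",
]
for dep in raw_dependencies(program):
    print(dep)
```

The sketch reports only RAW dependencies on the most recent writer; it does not look for write-after-read or write-after-write pairs.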