Homework #2 Solution

Instructions

1.    There are 6 questions totaling 10 points.

2.    Please create an electronic document with your answers.

3.    There is no need to include the question itself. However, you MUST include the question number and sub-part index, if any. Example: 9(b)

4.    Please create a PDF document named hw2.pdf and upload it to the Canvas assignment page by the due date.

5.    NO handwritten documents are accepted.

6.    NO LATE SUBMISSION.

 

1.   For a non-pipelined implementation of the data and control path in the following diagram, for a processor implementing CS147DV, show the control signal logic values at each phase of the processor executing the following instructions. You need to construct 10 tables similar to Table 1 (you may use hexadecimal for multi-bit bus signals; leave a cell blank if it is a don't-care). Assume that the memory with read=0, write=0 is in the hold configuration (it holds the previous read data) and that read=1, write=1 causes electrical isolation of the memory (HiZ). [5 pts]

 

I.   add r3, r2, r1

 

II.   srl r4, r3, 0x3a

 

III.    jr r20

 

IV.  muli r3, r4, 0xa5a5

 

V.  andi r3, r4, 0xa5a5

 

VI.   lui r5, 0x5a5a

 

VII.  bneq r2, r3, 0xa5a5; // r2 = 0x1; r3 = 0x1

 

VIII.    lw r24, r6, 0x5a5a

 

IX.   jal 0x2a55aa5

 

X.  push

 

 

 

 

Table 1: Sample Control Signal Table

 

CTRL Sig Name   IF      ID/RF   EXE     MEM     WB
pc_load
pc_sel_1
pc_sel_2
pc_sel_3
mem_r
mem_w
r1_sel_1
reg_r
reg_w
wa_sel_1
wa_sel_2
wa_sel_3
wd_sel_1
wd_sel_2
wd_sel_3
sp_load
op1_sel_1
op2_sel_1
op2_sel_2
op2_sel_3
op2_sel_4
alu_oprn
ma_sel_1
dmem_r
dmem_w
md_sel_1

 

 

2.   Regarding performance and CPI, answer the following. [1 pt]

(a)  Consider two processors, P1 and P2, with the same instruction set. There are five classes of instructions in this ISA, as shown in Table 2 with the corresponding CPI. P1 has a clock rate of 5GHz and P2 has a clock rate of 3GHz. Assuming an equal distribution of each class of instructions in a benchmark program, what is the performance of P1 and P2 in terms of instructions per second?

(b)  If type A instructions occur three times as often, and type B instructions twice as often, as each of types C, D, and E (C, D, and E are equally distributed), which processor is faster and by how much?

 

 

Table 2: CPI information

 

Class   CPI-P1   CPI-P2
A       1        3
B       2        1
C       3        1
D       4        4
E       3        5

 

3.   Consider program P, which runs on a 2GHz machine M in 30 seconds. The compiler optimizes P by replacing some instances of multiplying a value by 4 (x ← x * 4) with 2 instructions that add x to itself (x ← x + x; x ← x + x). Call the optimized program P'. The CPI of multiplication is 8 and the CPI of add is 2. After recompiling, the new program runs in 20 seconds on machine M. How many multiplications were replaced by the compiler? [1 pt]
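
One way to set up question 3 is as a cycle-count balance, sketched below in Python. It assumes the only difference between P and P' is the replaced multiplications, so every cycle saved comes from swapping one multiply for two adds. The function name and parameter names are hypothetical.

```python
def multiplications_replaced(clock_hz, time_before_s, time_after_s,
                             cpi_mul, cpi_add, adds_per_mul=2):
    """Number of multiplies replaced, from the total cycles saved.

    Assumes all of the run-time difference comes from the replacement.
    """
    cycles_saved = clock_hz * (time_before_s - time_after_s)
    # Each replacement removes one multiply and adds `adds_per_mul` adds.
    saved_per_replacement = cpi_mul - adds_per_mul * cpi_add
    return cycles_saved // saved_per_replacement


if __name__ == "__main__":
    n = multiplications_replaced(
        clock_hz=2_000_000_000, time_before_s=30, time_after_s=20,
        cpi_mul=8, cpi_add=2,
    )
    print(f"multiplications replaced: {n:,}")
```

With the numbers in the question, 10 seconds at 2GHz is 2 x 10^10 cycles saved, and each replacement saves 8 - 2 x 2 = 4 cycles.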

 

4.   A computer architect needs to design the pipeline of a new microprocessor. An example workload of 108 instructions is used to design the pipeline. Each instruction takes 250ps to finish. How long does it take to execute this workload on a non-pipelined processor? The pipelined implementation uses 30 pipeline stages. Assuming a perfect pipeline (i.e., no hazards), how much speedup is achieved compared to the non-pipelined design?  [1 pt]
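
The timing in question 4 can be sketched in Python as follows. The sketch assumes the 250ps instruction latency is split evenly across the stages, that there is no pipeline-register overhead, and that the pipeline needs `stages - 1` extra cycles to fill; these are modeling assumptions, and the function names are hypothetical. The instruction count is passed as a parameter and the example uses the value as printed in the question.

```python
from fractions import Fraction


def nonpipelined_time_ps(n_instructions, latency_ps):
    """Each instruction runs to completion before the next one starts."""
    return n_instructions * latency_ps


def pipelined_time_ps(n_instructions, latency_ps, stages):
    """Ideal pipeline: one stage time per cycle, plus the fill time."""
    stage_time = Fraction(latency_ps, stages)
    return (n_instructions + stages - 1) * stage_time


def pipeline_speedup(n_instructions, latency_ps, stages):
    """Ratio of non-pipelined to pipelined execution time."""
    return (Fraction(nonpipelined_time_ps(n_instructions, latency_ps))
            / pipelined_time_ps(n_instructions, latency_ps, stages))


if __name__ == "__main__":
    n, latency, stages = 108, 250, 30
    print("non-pipelined:", nonpipelined_time_ps(n, latency), "ps")
    print("pipelined:    ", float(pipelined_time_ps(n, latency, stages)), "ps")
    print("speedup:      ", float(pipeline_speedup(n, latency, stages)))
```

As the instruction count grows, the fill time becomes negligible and the speedup in this model approaches the number of stages.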

 

5.   Show the forwarding paths needed to execute the following four instructions in a 5-stage pipeline as discussed in class. [1 pt]

 

add r8, r4, r6
sub r7, r2, r8
lw  r7, r7, 0x1000
add r8, r2, r7

 

 

 

6.   In a 5-stage pipelined processor, identify all the data dependencies in the following code. Which dependencies are data hazards that will be resolved via forwarding? Which dependencies are data hazards that will cause a stall? [1 pt]

 

add r3, r2, r4
sub r5, r1, r3
lw  r6, r3, 0x2000
add r7, r3, r6
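
The dependency analysis in question 6 can be sketched in Python as below. This is only an illustration under stated assumptions: it tracks read-after-write (RAW) dependencies only, assumes a classic 5-stage pipeline (IF, ID, EX, MEM, WB) with full forwarding and a register file that is written before it is read within a cycle, and treats the first operand of each instruction as the destination. The pipeline discussed in class may differ; the `Instr` type and function names are hypothetical.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    op: str
    dest: str
    srcs: Tuple[str, ...]


# The four instructions of question 6 (immediates omitted; they are not registers).
PROGRAM = [
    Instr("add", "r3", ("r2", "r4")),
    Instr("sub", "r5", ("r1", "r3")),
    Instr("lw",  "r6", ("r3",)),
    Instr("add", "r7", ("r3", "r6")),
]


def raw_dependencies(program):
    """Return (producer_index, consumer_index, register) for each RAW dependency."""
    deps = []
    for j, consumer in enumerate(program):
        for src in consumer.srcs:
            # Walk backwards to the most recent earlier writer of this register.
            for i in range(j - 1, -1, -1):
                if program[i].dest == src:
                    deps.append((i, j, src))
                    break
    return deps


def classify(program, producer, consumer):
    """Classify a RAW dependency under the assumed pipeline model."""
    distance = consumer - producer
    if program[producer].op == "lw" and distance == 1:
        return "stall, then forward"   # load-use: data is not ready until after MEM
    if distance <= 2:
        return "forward"               # value comes from a pipeline register
    return "no hazard"                 # register file already holds the value


if __name__ == "__main__":
    for i, j, reg in raw_dependencies(PROGRAM):
        print(f"I{i + 1} -> I{j + 1} on {reg}: {classify(PROGRAM, i, j)}")
```

The same helper can be pointed at other instruction sequences by editing `PROGRAM`; write-after-read and write-after-write dependencies would need additional checks.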
