Thursday, 14 May 2020

Homework on Textbook Sections 4.1 to 4.6

For Questions 1 to 11: Consider the processor datapath and control shown in Figure 4.17 of the textbook, executing the following instruction, located at address 0x0000000000204014 in the instruction memory:
sd x7, 0x18(x24)

Question 1: What is the value of the Branch control signal?

Solution:


0

Question 2: What is the value of the MemRead control signal?

Solution:


0

Question 3: What is the value of the MemtoReg control signal?

Solution:


X

Question 4: What is the value of the ALUOp control signal, expressed in binary?

Solution:


00

Question 5: What is the value of the MemWrite control signal?

Solution:


1

Question 6: What is the value of the ALUSrc control signal?

Solution:


1

Question 7: What is the value of the RegWrite control signal?

Solution:


0

Question 8: What is the output value of the ALU Control block, expressed in binary?

Solution:


0010
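The answers to Questions 1 to 8 can be cross-checked against the textbook's control-settings table. The following is a minimal sketch of that lookup, with the rows transcribed from memory of the textbook's table; the names `CONTROL`, `FIELDS` and `alu_control` are illustrative only and do not come from the textbook.

```python
# Main control settings per instruction class; None stands for don't-care (X).
FIELDS = ("ALUSrc", "MemtoReg", "RegWrite", "MemRead", "MemWrite", "Branch", "ALUOp")
CONTROL = {
    "R-format": (0, 0,    1, 0, 0, 0, 0b10),
    "ld":       (1, 1,    1, 1, 0, 0, 0b00),
    "sd":       (1, None, 0, 0, 1, 0, 0b00),
    "beq":      (0, None, 0, 0, 0, 1, 0b01),
}

def alu_control(alu_op):
    """ALU control output for the cases where ALUOp alone decides the operation."""
    if alu_op == 0b00:
        return 0b0010  # add, used for the ld/sd address calculation
    if alu_op == 0b01:
        return 0b0110  # subtract, used for the beq comparison
    raise ValueError("ALUOp 10 also depends on the funct fields")

signals = dict(zip(FIELDS, CONTROL["sd"]))
print(signals)
print(format(alu_control(signals["ALUOp"]), "04b"))  # 0010
```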

Question 9: What is the output value of the ALU source multiplexer, expressed in hex?

Solution:


0x0000000000000018
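Since ALUSrc is 1, the multiplexer passes the sign-extended immediate from Imm Gen. As a rough check, the sketch below assembles the sd instruction in the S-type format and then reassembles the immediate from its two fields; the helper names `encode_sd` and `s_type_imm` are made up for this illustration.

```python
def encode_sd(rs2, rs1, imm):
    """Assemble an S-type sd instruction word (opcode 0100011, funct3 011)."""
    imm &= 0xFFF  # keep the 12-bit immediate
    return (((imm >> 5) << 25) | (rs2 << 20) | (rs1 << 15)
            | (0b011 << 12) | ((imm & 0x1F) << 7) | 0b0100011)

def s_type_imm(word):
    """Rejoin imm[11:5] and imm[4:0], then sign-extend from 12 to 64 bits."""
    imm = ((word >> 25) << 5) | ((word >> 7) & 0x1F)
    if imm & 0x800:  # bit 11 is the sign bit
        imm -= 0x1000
    return imm & 0xFFFFFFFFFFFFFFFF

word = encode_sd(rs2=7, rs1=24, imm=0x18)  # sd x7, 0x18(x24)
print(f"0x{s_type_imm(word):016x}")  # 0x0000000000000018
```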

Question 10: What is the value of the Zero output flag of the ALU?

Solution:

0

Question 11: What is the output value of the branch target adder, expressed in hex?

Solution:


0x0000000000204044
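The branch target adder computes its sum regardless of whether the instruction is a branch: it adds the PC to the immediate shifted left by one bit. A minimal sketch of the arithmetic, with variable names chosen for illustration:

```python
pc = 0x0000000000204014   # address of the sd instruction
imm = 0x18                # sign-extended immediate from Imm Gen

# Shift-left-1 followed by the adder, truncated to 64 bits.
target = (pc + (imm << 1)) & 0xFFFFFFFFFFFFFFFF
print(f"0x{target:016x}")  # 0x0000000000204044
```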

For Questions 12 and 13: Suppose the blocks in Figure 4.17 of the textbook have latencies shown in the following table.

 Block          Latency
 PC read        20ps
 PC setup       15ps
 I-Mem          250ps
 Add            70ps
 Shift-left-1   5ps
 Mux            20ps
 Regs read      150ps
 Regs setup     15ps
 Imm Gen        10ps
 ALU            100ps
 D-Mem          350ps
 Control        80ps
 ALU Ctrl       30ps

Question 12: What is the minimum clock period for this processor (in ps)?

Solution:


905
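One way to arrive at this figure is to trace signal arrival times along the path of a load. The sketch below assumes that ld is the slowest instruction and that, at the ALU source multiplexer, only the input actually selected (the immediate) counts toward the delay. The variable names are illustrative.

```python
LAT = {"PC read": 20, "PC setup": 15, "I-Mem": 250, "Add": 70,
       "Shift-left-1": 5, "Mux": 20, "Regs read": 150, "Regs setup": 15,
       "Imm Gen": 10, "ALU": 100, "D-Mem": 350, "Control": 80, "ALU Ctrl": 30}

instr = LAT["PC read"] + LAT["I-Mem"]      # instruction bits available
ctrl = instr + LAT["Control"]              # control signals available
alu_in1 = instr + LAT["Regs read"]         # rs1 value reaches the ALU
alu_in2 = max(instr + LAT["Imm Gen"], ctrl) + LAT["Mux"]  # immediate via ALUSrc mux
alu_op = ctrl + LAT["ALU Ctrl"]            # ALU operation code available
alu_out = max(alu_in1, alu_in2, alu_op) + LAT["ALU"]
mem_out = alu_out + LAT["D-Mem"]           # loaded data available
period = mem_out + LAT["Mux"] + LAT["Regs setup"]  # MemtoReg mux, then register write
print(period)  # 905
```

The register read (420ps after the clock edge) arrives later than the immediate and the ALU control code, so it sets the start time of the ALU.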

Question 13: Adding a multiplier to the ALU results in 100ps additional latency. However, it reduces the number of instructions needed in a program, since multiplications no longer need to be emulated in software. What fraction of the original instruction count must the reduced instruction count be to match the performance of the original processor?

Solution:


0.9005
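Execution time is instruction count times CPI times clock period, and CPI is 1 in the single-cycle design, so the break-even instruction count is the ratio of the two clock periods. A short sketch, assuming the ALU lies on the critical path so that the period grows by the full 100ps:

```python
old_period = 905               # ps, from Question 12
new_period = old_period + 100  # ps, ALU with multiplier

# Equal execution time: new_count * new_period == old_count * old_period
fraction = old_period / new_period
print(round(fraction, 4))  # 0.9005
```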

For Questions 14 to 16: A 5-stage pipelined version of the RISC-V processor has the following latencies for the stages, including the overhead for pipeline registers between stages:

 IF      ID      EX      MEM     WB
 500ps   300ps   400ps   600ps   100ps

Question 14: What is the maximum clock frequency (in GHz) for this processor?

Solution:


1.67
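The clock period of a pipeline is set by its slowest stage, here MEM at 600ps. A minimal sketch of the conversion (the dictionary name is illustrative):

```python
stages_ps = {"IF": 500, "ID": 300, "EX": 400, "MEM": 600, "WB": 100}

period_ps = max(stages_ps.values())  # slowest stage sets the clock
freq_ghz = 1000 / period_ps          # 1 / (period in ns)
print(round(freq_ghz, 2))  # 1.67
```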

Question 15: Suppose you can divide the IF stage into two stages, each with latency 300ps, but with a cost increase of 5%. Similarly, you can divide the MEM stage into two stages, each with 350ps latency, but with a cost increase of 5%. If you are only allowed to choose one of these options, what would the resulting maximum clock frequency be (in GHz)?

Solution:


2
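Both options cost the same, so the better one is whichever yields the shorter slowest stage. Splitting IF leaves MEM at 600ps, while splitting MEM leaves IF at 500ps as the slowest stage. A sketch of the comparison, with illustrative names:

```python
options = {
    "split IF":  [300, 300, 300, 400, 600, 100],  # IF1 IF2 ID EX MEM WB
    "split MEM": [500, 300, 400, 350, 350, 100],  # IF ID EX MEM1 MEM2 WB
}

freqs = {name: 1000 / max(stages) for name, stages in options.items()}
best = max(freqs, key=freqs.get)
print(best, freqs[best])  # split MEM 2.0
```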

Question 16: If you are allowed to choose both options from Question 15, decide, based on best cost/performance, whether to choose one or both options. What is the maximum clock frequency (in GHz) based on your choice?

Solution:


2.5
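With both stages split, EX at 400ps becomes the slowest stage. The sketch below compares cost per unit of performance for each choice. It assumes the two 5% cost increases simply add up to 10%, and it measures performance as clock frequency relative to the original pipeline; both are assumptions made for this illustration.

```python
def max_freq_ghz(stages_ps):
    return 1000 / max(stages_ps)

base = [500, 300, 400, 600, 100]
options = {  # name: (stage latencies in ps, relative cost)
    "split IF only":    ([300, 300, 300, 400, 600, 100], 1.05),
    "split MEM only":   ([500, 300, 400, 350, 350, 100], 1.05),
    "split IF and MEM": ([300, 300, 300, 400, 350, 350, 100], 1.10),
}

base_freq = max_freq_ghz(base)
# Lower cost/performance ratio is better.
ratios = {name: cost / (max_freq_ghz(stages) / base_freq)
          for name, (stages, cost) in options.items()}
choice = min(ratios, key=ratios.get)
print(choice, max_freq_ghz(options[choice][0]))  # split IF and MEM 2.5
```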

Question 20: In Section 4.5 of the textbook, on page 272, it is suggested that the hardware for branch processing (register comparison, target address calculation and PC update) all be included in the second stage of the pipeline (the ID stage). What is the speedup of this hardware change compared to the pipelined datapath using the ALU and adder in the EX stage and completing the branch in the MEM stage? Assume that branches account for 17% of instructions, as suggested on page 270, that no branch prediction is used, that the average CPI for all other instructions is 1.2, and that there is no effect on clock period.

Solution:


1.25
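One way to reach this figure is to compare average CPI before and after the change. The sketch below assumes that, without prediction, a branch completed in the MEM stage costs 3 stall cycles and one completed in the ID stage costs 1, on top of a base of 1 cycle per branch. These stall counts are this sketch's reading of the pipeline, not numbers stated in the question.

```python
branch_frac = 0.17
other_cpi = 1.2

def average_cpi(branch_stall_cycles):
    branch_cpi = 1 + branch_stall_cycles
    return (1 - branch_frac) * other_cpi + branch_frac * branch_cpi

cpi_mem = average_cpi(3)  # branch completes in MEM
cpi_id = average_cpi(1)   # branch completes in ID

# Same instruction count and clock period, so speedup is the CPI ratio.
speedup = cpi_mem / cpi_id
print(round(speedup, 2))  # 1.25
```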
