Random Questioning: Homework on Textbook Sections 2.9 to 2.14, 2.17, 2.19, 3.1 to 3.7, 3.9

Q1: Write a C function equivalent to the following RISC-V assembly language code.

clear: addi x5, x0, 0
       addi x7, x0, '-'
       jal  x0, L2
L1:    sb   x7, 0(x6)
       addi x5, x5, 1
L2:    add  x6, x10, x5
       lbu  x28, 0(x6)
       bne  x28, x0, L1
       jalr x0, 0(x1)

Solution:

void clear(char s[]) {
    int i;
    i = 0;
    while (s[i] != '\0') {
        s[i] = '-';
        i++;
    }
}

Q2: What doubleword, in hexadecimal, is placed in register x8 by the following instructions?

    lui x8, 0xA9344
    ori x8, x8, 0x01C
Solution: 0xFFFFFFFFA934401C
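
The upper 32 bits are all ones because, on a 64-bit RISC-V machine, lui sign-extends its 32-bit result and bit 31 of 0xA9344000 is set. The following is a minimal C sketch of that arithmetic; the function names rv64_lui and rv64_ori are made up for illustration and only model the register result, not a full simulator.

```c
#include <stdint.h>

/* Model of lui on RV64: the 20-bit immediate goes into bits 31..12,
   the low 12 bits are zero, and the 32-bit value is sign-extended. */
uint64_t rv64_lui(uint32_t imm20) {
    uint32_t low = (imm20 & 0xFFFFFu) << 12;
    uint64_t value = low;
    if (low & 0x80000000u) {
        value |= 0xFFFFFFFF00000000ULL; /* copy bit 31 into bits 63..32 */
    }
    return value;
}

/* Model of ori: OR the register with a sign-extended 12-bit immediate. */
uint64_t rv64_ori(uint64_t rs1, int32_t imm12) {
    int64_t simm = (int64_t)((imm12 & 0xFFF) ^ 0x800) - 0x800;
    return rs1 | (uint64_t)simm;
}
```

Calling rv64_ori(rv64_lui(0xA9344), 0x01C) reproduces the two instructions in the question.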

Q3: What word, in hexadecimal, encodes the branch instruction in the following code sequence?

    beq   x9, x24, L1
    slli x8, x7, 2
    add   x8, x8, x15
    ld    x8, 0(x8)
    sd    x8, 0(x23)
L1: addi x7, x7, 1

Solution: 0x01848A63
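
The branch target L1 is five instructions past the beq, so the byte offset is 20. A rough C sketch of how the SB-type fields are packed is shown below; encode_beq is a hypothetical helper name, and it assumes the standard beq opcode 0x63 with funct3 = 000.

```c
#include <stdint.h>

/* Pack a beq instruction: imm[12] | imm[10:5] | rs2 | rs1 | funct3 |
   imm[4:1] | imm[11] | opcode. The offset is in bytes and must be even. */
uint32_t encode_beq(unsigned rs1, unsigned rs2, int32_t offset) {
    uint32_t imm = (uint32_t)offset;
    uint32_t word = 0;
    word |= ((imm >> 12) & 0x1u) << 31;  /* imm[12]   */
    word |= ((imm >> 5) & 0x3Fu) << 25;  /* imm[10:5] */
    word |= (rs2 & 0x1Fu) << 20;         /* rs2       */
    word |= (rs1 & 0x1Fu) << 15;         /* rs1       */
    /* funct3 = 000 for beq, so bits 14..12 stay zero */
    word |= ((imm >> 1) & 0xFu) << 8;    /* imm[4:1]  */
    word |= ((imm >> 11) & 0x1u) << 7;   /* imm[11]   */
    word |= 0x63u;                       /* opcode    */
    return word;
}
```

With rs1 = 9, rs2 = 24 and offset = 20, the fields combine as 0x01800000 + 0x00048000 + 0x00000A00 + 0x63.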

Q4: Suppose the following code sequence is executed on two processors in parallel without synchronization:

ld   x6, 0(x10) // load x
add x6, x6, x6 // double x
sd   x6, 0(x10) // store x

If the variable x is initially 8, what are the possible final values for x? Assume that memory only does one load or store at a time, and that a pending load or store waits until the memory is not busy.

Solution: 16 or 32
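
The two outcomes can be checked by brute force. The sketch below, with made-up names such as possible_final_values, tries every interleaving of the two three-instruction sequences under the stated assumption that memory performs one load or store at a time, and collects the distinct final values of x.

```c
#include <stddef.h>

#define MAX_RESULTS 8

typedef struct {
    long values[MAX_RESULTS];
    size_t count;
} ResultSet;

/* Add v to the set if it is not already there. */
static void record(ResultSet *set, long v) {
    for (size_t i = 0; i < set->count; i++) {
        if (set->values[i] == v) return;
    }
    if (set->count < MAX_RESULTS) set->values[set->count++] = v;
}

/* Run instruction number pc (0 = ld, 1 = add, 2 = sd) on one processor. */
static void step(int pc, long *reg, long *mem) {
    switch (pc) {
    case 0: *reg = *mem; break;        /* ld  x6, 0(x10) */
    case 1: *reg = *reg + *reg; break; /* add x6, x6, x6 */
    case 2: *mem = *reg; break;        /* sd  x6, 0(x10) */
    }
}

/* Recursively try both choices of which processor executes next. */
static void explore(long mem, long r0, long r1, int pc0, int pc1,
                    ResultSet *out) {
    if (pc0 == 3 && pc1 == 3) { record(out, mem); return; }
    if (pc0 < 3) {
        long m = mem, r = r0;
        step(pc0, &r, &m);
        explore(m, r, r1, pc0 + 1, pc1, out);
    }
    if (pc1 < 3) {
        long m = mem, r = r1;
        step(pc1, &r, &m);
        explore(m, r0, r, pc0, pc1 + 1, out);
    }
}

ResultSet possible_final_values(long initial) {
    ResultSet out = { {0}, 0 };
    explore(initial, 0, 0, 0, 0, &out);
    return out;
}
```

The value 32 arises when one processor finishes its store before the other loads; 16 arises when both load the original 8 before either stores.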

Q5: The swap procedure on page 135 of the textbook is called by the sort procedure in Figure 2.25 on page 139. By how much would the dynamic instruction count change per iteration of the inner loop if the swap procedure were inlined?

Solution: 4 fewer instructions

Q6: Measurements of programs running on a processor show the following relative frequencies of execution and CPI values:

Arithmetic instructions: 50%, 1 cycle

Load instructions: 15%, 3 cycles

Store instructions: 10%, 2 cycles

Branch instructions: 25%, 2 cycles

The clock frequency of the processor is 2 GHz.
Suppose we augment a processor’s instruction set by adding a load indexed instruction that forms the effective address by adding two register values. The new instruction allows a sequence such as:
add rtmp, rs1, rs2
ld rd, 0(rtmp)
to be replaced by a single instruction
ldx rd, (rs1+rs2) // Load from address rs1+rs2 to rd
Of the loads in the original processor, 20% are preceded by an add instruction such that the pair can be replaced by a single ldx instruction in the new processor. The CPI of the ldx instruction is 3 cycles, but including it lowers the clock frequency of the processor.
What is the minimum clock frequency in GHz required for the new processor to ensure its performance is at least that of the original processor?

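Solution: approximately 1.964 GHz. The original processor needs 1.65 cycles per instruction on average; replacing each add/ld pair with ldx removes one 1-cycle add for 3% of the original instructions, leaving 1.62 cycles per original instruction, so the clock must be at least 2 × 1.62 / 1.65 = 1.9636... GHz.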
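
The calculation behind this answer can be sketched in C as follows. The function name min_new_clock_ghz is hypothetical, and the sketch assumes that the only change in the instruction mix is that 20% of the loads lose their preceding 1-cycle add, with ldx costing the same 3 cycles as ld.

```c
/* Minimum clock (in GHz) for the new processor to match the original.
   Cycle counts are expressed per original instruction. */
double min_new_clock_ghz(double old_clock_ghz) {
    const double arith = 0.50, load = 0.15, store = 0.10, branch = 0.25;

    /* Average cycles per instruction on the original processor: 1.65 */
    double old_cycles = arith * 1.0 + load * 3.0 + store * 2.0 + branch * 2.0;

    /* 20% of loads (3% of all instructions) drop a 1-cycle add. */
    double removed_adds = 0.20 * load;
    double new_cycles = old_cycles - removed_adds * 1.0;  /* 1.62 */

    /* Equal execution time: new_cycles / f_new = old_cycles / f_old. */
    return old_clock_ghz * new_cycles / old_cycles;
}
```

For a 2 GHz original clock this evaluates to 2 × 1.62 / 1.65, which is roughly 1.9636 GHz.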

Q7: Consider a 32-bit multiplier organized in a similar way to Figure 3.7 in the textbook. Suppose the time required for each adder is 4ns. How long (in ns) does the multiplier take to multiply two 32-bit operands?

Solution: 20
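
The answer of 20 ns is consistent with reading Figure 3.7 as a tree of adders, where each level halves the number of partial sums, so 32 operands need 5 levels. Under that assumption, a small C sketch (tree_multiplier_delay_ns is a made-up name) is:

```c
/* Delay of an adder-tree multiplier, assuming each level of the tree
   adds pairs of partial products in parallel. */
unsigned tree_multiplier_delay_ns(unsigned operand_bits,
                                  unsigned adder_delay_ns) {
    unsigned levels = 0;
    unsigned rows = operand_bits;  /* one partial product per multiplier bit */
    while (rows > 1) {
        rows = (rows + 1) / 2;     /* each level halves the number of rows */
        levels++;
    }
    return levels * adder_delay_ns;
}
```

For 32-bit operands the rows shrink as 32, 16, 8, 4, 2, 1, which is 5 levels of 4 ns each.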

Q8: Use the RISC-V instructions described in the textbook (Sections 3.3 and 3.4 and the green card) to write assembly language code for the following C statement. Assume all variables are of type int, with a, b, c and d in x10, x11, x12 and x18, respectively.
d = (a * b) % c;

Solution:

mul x18, x10, x11
rem x18, x18, x12

Q9: Write the hexadecimal word for the IEEE 754 single-precision representation of the decimal +81.75.

Solution: 0x42A38000
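
To see where the word comes from: 81.75 is 1010001.11 in binary, or 1.01000111 × 2^6, so the sign is 0, the biased exponent is 6 + 127 = 133, and the fraction is 01000111 followed by zeros. The result can be checked with a short C sketch; float_bits is a hypothetical helper, and it assumes the platform's float is IEEE 754 single precision.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw 32-bit pattern of a float (assumes IEEE 754 binary32). */
uint32_t float_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);  /* copy bytes without converting */
    return bits;
}
```

Passing 81.75f to float_bits gives the word in the solution.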

Q10: What decimal number does the hexadecimal word 0xBFF00000 represent as an IEEE 754 single-precision floating-point value?

Solution: -1.875
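
The word splits into sign 1, biased exponent 127 (so the true exponent is 0), and fraction 0.875, giving -(1 + 0.875) × 2^0. A minimal C sketch of that decoding follows; the type and function names are made up for illustration, and float32_from_bits assumes the platform's float is IEEE 754 single precision.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    unsigned sign;      /* bit 31 */
    unsigned exponent;  /* bits 30..23, biased by 127 */
    uint32_t fraction;  /* bits 22..0 */
} Float32Fields;

/* Split a 32-bit word into the three IEEE 754 single-precision fields. */
Float32Fields split_float32(uint32_t word) {
    Float32Fields f;
    f.sign = (word >> 31) & 0x1u;
    f.exponent = (word >> 23) & 0xFFu;
    f.fraction = word & 0x7FFFFFu;
    return f;
}

/* Reinterpret the word as a float (assumes IEEE 754 binary32). */
float float32_from_bits(uint32_t word) {
    float value;
    memcpy(&value, &word, sizeof value);
    return value;
}
```

For 0xBFF00000 the fraction field is 0x700000, which is 7/8 of 2^23.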

Q11: Write RISC-V instructions for the following C statement, assuming the variables y and a are of type double and are in floating-point registers, x is an array of double with the base address in x9, and i is of type int in x18.
y = y + a * x[i];

Solution:

Assumption: a is in f9, y is in f8, and f0 and x5 are used as temporaries.
slli x5, x18, 3
add x5, x5, x9
fld f0, 0(x5)
fmul.d f0, f9, f0
fadd.d f8, f8, f0

Wednesday, 22 April 2020
