Wednesday, 22 April 2020

Quiz on Textbook Sections 4.1 to 4.4

Q2: Why are multiplexers required, as shown in Figure 4.2? 
Solution: We cannot wire data lines from multiple sources together. Instead, we require a logic circuit element to select between the sources.

Q3: Match the logic design terms to their corresponding definitions. 
Solution:
Combinational element: The outputs depend only on the current inputs
State element: The outputs depend on current inputs and internal stored values
Edge-triggered clocking: All state changes occur on a clock edge
Control signal: Directs operation of a functional unit or selection by a multiplexer
Data signal: Contains information that is operated on by a functional unit

Q4: The register file is a state element that consists of a set of registers that can be read and written by supplying a register number to be accessed. 
Solution: True

Q6: The single-cycle datapath conceptually described in Section 4.3 must have separate instruction and data memories, because 
 Solution: the processor operates in one cycle and cannot use a single-ported memory for two different accesses within that cycle.

Q8: How does the datapath in Figure 4.17 determine the outcome of a beq instruction? 
Solution: The ALU subtracts the operands and asserts the Zero output if the difference is zero. That, ANDed with the Branch decode signal, controls the multiplexer to select the next PC value.

Q10: The clock cycle time for the datapath described in Section 4.4 is determined by the longest chain of functional units used for any instruction. All instructions thus take that amount of time to execute. For this reason, single-cycle implementations are the main form of processor used today. 
Solution: False 

Homework on Textbook Sections 2.9 to 2.14, 2.17, 2.19, 3.1 to 3.7, 3.9

Q1: Write a C function equivalent to the following RISC-V assembly language code.

clear: addi  x5, x0, 0
       addi  x7, x0, '-'
       jal   x0, L2
L1:    sb    x7, 0(x6)
       addi  x5, x5, 1
L2:    add   x6, x10, x5
       lbu   x28, 0(x6)
       bne   x28, x0, L1
       jalr  x0, 0(x1)

Solution: void clear (char s[]){
int i;
i=0;
while (s[i] != '\0'){
s[i] = '-';
i++;
}
}

Q2: What doubleword, in hexadecimal, is placed in register x8 by the following instructions?

    lui  x8, 0xA9344
    ori  x8, x8, 0x01C

Solution: 0xFFFFFFFFA934401C

Q3: What word, in hexadecimal, encodes the branch instruction in the following code sequence?

    beq   x9, x24, L1
    slli  x8, x7, 2
    add   x8, x8, x15
    ld    x8, 0(x8)
    sd    x8, 0(x23)
L1: addi  x7, x7, 1


Solution: 0x01848A63

Q4: Suppose the following code sequence is executed on two processors in parallel without synchronization:

ld   x6, 0(x10)  // load x
add  x6, x6, x6  // double x
sd   x6, 0(x10)  // store x

If the variable x is initially 8, what are the possible final values for x? Assume that memory only does one load or store at a time, and that a pending load or store waits until the memory is not busy.


Solution:  16 or 32
 
Q5: The swap procedure on page 135 of the textbook is called by the sort procedure in Figure 2.25 on page 139. By how much would the dynamic instruction count change per iteration of the inner loop if the swap procedure were inlined? 

Solution: 4 fewer instructions
 
Q6: Measurements of programs running on a processor show the following relative frequencies of execution and CPI values:
  • Arithmetic instructions: 50%, 1 cycle
  • Load instructions: 15%, 3 cycles
  • Store instructions: 10%, 2 cycles
  • Branch instructions: 25%: 2 cycles
The clock frequency of the processor is 2GHz.
Suppose we augment a processor’s instruction set by adding a load indexed instruction that forms the effective address by adding two register values. The new instruction allows a sequence such as:
add  rtmp, rs1, rs2
ld   rd, 0(rtmp)
to be replaced by a single instruction
ldx  rd, (rs1+rs2)   // Load from address rs1+rs2 to rd
20% of the loads in the original processor are preceded by add instructions such that the pair can be replaced by a ldx instruction in the new processor. The CPI for the ldx instruction is 3 cycles, but its inclusion slows down the clock frequency of the processor.
What is the minimum clock frequency in GHz required for the new processor to ensure its performance is at least that of the original processor? 


Solution: 1.963

Q7: Consider a 32-bit multiplier organized in a similar way to Figure 3.7 in the textbook. Suppose the time required for each adder is 4ns. How long (in ns) does the multiplier take to multiply two 32-bit operands? 
Solution: 20

Q8: Use the RISC-V instructions described in the textbook (Sections 3.3 and 3.4 and the green card) to write assembly language code for the following C statement. Assume all variables are of type int, with a, b, c and d in x10, x11, x12 and x18, respectively.
d = (a * b) % c;


Solution: mul x18, x10, x11
rem x18, x18, x12

Q9: Write the hexadecimal word for the IEEE 754 single-precision representation of the decimal +81.75. 

Solution: 0x42A38000

Q10: What decimal number does the hexadecimal word 0xBFF00000 represent as an IEEE 754 single precision floating point value? 

Solution: -1.875

Q11: Write RISC-V instructions for the following C statement, assuming the variables y and a are of type double and are in floating-point registers, x is an array of double with the base address in x9, and i is of type int in x18.
y = y + a * x[i];


Solution: 
Assumption: a is in f9, y is in f8 and f0 is used as temporary storage
slli x5, x18, 3
add x5, x5, x9
fld f0, 0(x5)
fmul.d f0, f9, f0
fadd.d f8, f8, f0

Quiz on Textbook Sections 3.1 to 3.7, 3.9

Q1: For what combinations of operands can signed addition overflow? 
Solution: Both operands positive or both operands negative
 
Q2: What usually happens when an overflow occurs during addition to calculate an address in a program? 
Solution: Nothing - the overflow is ignored.
 
Q3: How many clock cycles would the sequential multiplier shown in Figure 3.5 of the textbook take to multiply 64-bit operands? 
Solution: 64

Q4: A parallel multiplier using tree-structured stack of adders is much faster than a sequential multiplier, but at the cost of significantly more hardware resources. 
Solution: True

Q5: Can we construct a fast parallel divider is a similar way to the way we make a fast parallel multiplier? 
Solution: No, because we need to use the sign of the difference calculated at each step in order to perform the next step.

Q6: How many bits are used for single precision floating-point values? 
Solution:  32 bits: 1 sign bit, 8 exponent bits, and 23 fraction bits

Q7: The revised IEEE 754-2008 standard added a 16-bit floating-point format with five exponent bits. What do you think is the likely range of numbers it could represent?
 Solution: ±1.0000 0000 00×2^−14 to ±1.1111 1111 11×2^15, ±0, ±∞, NaN

Q8: The hardware for floating-point operations is significantly more complex than that for integer operations.
Solution: True



Q9: From the statements below, select those that correctly describe floating point instructions in the RISC-V instruction set. 
Solution: Floating-point arithmetic instructions operate on different registers to integer instructions

Q10: The Intel SE2 instruction extensions provide subword parallelism by what means? 
Solution: By having 128-bit wide registers and a floating-point operations on short vectors of 4 single-precision or 2 double-precision elements.

Quiz on Textbook Sections 2.9 to 2.14, 2.17, 2.19

Q1: Which of the following statements about characters and strings in C and Java is true? 
Solution: A string in C takes about half the memory as the same string in Java.

Q2: If memory address 0x0000000000000020 contains the doubleword 0x8786858483828180, what value is loaded into register x9 by the following instruction?

lb  x9, 0x21(x0)

Recall that RISC-V uses little-endian addressing.

Solution: 0xffffffffffffff81
 
Q6: When do you use primitives like load reserved and store conditional? 
Solution: 
  1. When cooperating threads of a parallel program need to synchronize to get proper behavior for reading and writing shared data
  2. When cooperating processes on a uniprocessor need to synchronize for reading and writing shared data
Q8: Using pointers in C usually gives better performance than using arrays. 
Solution: False

Q9: How many bytes are used to encode x86 instructions? 
 Solution: From 1 to 15 bytes
 
Q10: It is generally best to write programs in a high level language, since they're more intelligible and maintainable, and a compiler can generate more efficient code than an assembly language programmer. 
Solution: True 

Homework on Textbook Sections 1.1 to 1.4, 1.6 to 1.9, 2.1 to 2.8

Q1: HTC’s VIVE VR headset, comprising two displays for stereo video, has the following specifications:

  •     Resolution: 1080 × 1200 per eye
  •     Refresh Rate: 90Hz
Assuming 24bit/pixel video, what data rate in Gbps (Gigabits/sec) is required on the HDMI cable for the device?

Solution: 5.5987

Q2: A processor, which has a clock frequency of 1.3GHz, take 10s to run a program of 6×10^9 instructions. What is the average number of cycles per instruction? 

Solution: 2.167

Q3: Suppose, for the program in Question 2, the instructions are composed as follows:


Instruction class
Instruction count
CPI
Arithmetic/logic
4×10^9
1
Load/store
1×10^9
6
Branch
1×10^9
3
We are trying to redesign the processor to increase performance by a factor of 1.25, but this would lead to an increase in the CPI for load/store instructions from 6 to 8. What clock frequency in GHz would be needed for the redesigned processor?
Solution: 1.875

Q4: Instead of trying to increase the performance of the single processor, we can consider using multiple processor cores in a computer. Suppose, for the program in Question 2, parallelizing the program to use p processor cores divides the number of arithmetic/logic instructions by 0.7p, the number of load/store instructions by 0.8p, and the number of branches by 0.9p. What is the minimum number of processor cores required to improve performance by a factor of 3? 
 Solution: 4

Q5: What is the actual speedup achieved with the number of processor cores you identified in Question 4? 
 Solution: 3.14

Q6: What RISC-V instruction is encoded by the hex word 0x0051E933? 
Solution: or x18, x3, x5

Q7: What hex word encodes the RISC-V instruction ld x9, -24(x10)? 
Solution: 0xFE853483

Q8: If x9 initially contains the value 0xC445028461001003, what value (in hex) is placed in x18 by the following instruction?

srai x18, x9, 6?

Solution: 0xFF11140A11840040




Q9: Which of the following RISC-V instruction sequences extracts the 6-bit field from bits 4 to 9 of x10 and places it in the least-significant 6 bits of x7? 
 Solution: andi x7, x10, 0x3F0
srli x7, x7, 4
 
Q10: Write RISC-V instructions for the following C statements, assuming a is in x9 and b is in x18:

if (a == b)
  a = a + 1;
else
  a = a - 1;
 

Solution: bne x9, x18, L1
addi x9, x9, 1
beq x0, x0, L2
L1: addi x9, x9, -1
L2:

Q11: What C statements are encoded by the following, assuming x10 contains the signed int variable m?

      addi x5, x0, 20
      bgeu x10, x5, skip
      jal  x1, my_func
skip:
 

Solution: if (m >= 0 && m < 20) my_func(m);

Q12: Write RISC-V instructions for the following leaf function:

int min(int a, b) {
  return a < b ? a : b;
}

Solution: min: bge x10, x11, L1
           add x10, x0, x10
           jalr x0, 0(x1)
L1: add x10, x0, x11
        jalr x0, 0(x1)

Q13: Write RISC-V instructions for the following recursive function, without eliminating the recursion (i.e., without replacing it by a loop):

int sum(int n) {
  if (n == 0)
    return 0;
  else
    return n + sum(n – 1);
}

Solution: sum: addi sp, sp, -16
            sd x1, 8(sp)
            sd x8, 0(sp)
            beq x10, x0, ret
            add x8, x0, x10
            addi x10, x10, -1
             jal x1, sum
             add x10, x8, x10
ret: ld x8, 0(sp)
         ld x1, 8(sp)
         addi sp, sp, 16
         jalr x0, 0(x1)




Quiz on Textbook Sections 2.1 to 2.8

Q1: What C statement is represented by the following RISC-V assembly code?

add  x5, x8, x9
add  x6, x20, x11
sub  x8, x5, x6
Solution:
a = (a + b) - (c + d);

Q2: Which of the following RISC-V instruction implements the C assignment

n = A[3];

assuming n is in x9, the base address of A is in x10, and A is an array of uint64_t?

Solution: ld  x9, 24(x10)

Q3: The statement that RISC-V is little-endian means that the least-significant byte of a doubleword in memory is at the lowest address and the most-significant byte the doubleword is at the highest address. 
Solution: True

Q4: What range of values can be used as the immediate operand in a RISC-V addi instruction? 
Solution:  −2048 to +2047
 
Q5: What RISC-V instruction is encode by the hexadecimal word 04533023? 
Solution:  sd  x5, 0x040(x6)

Q6: If the register x8 contains the value 0x804600B91200C00F, what value is placed in x5 by the instruction

srli x5, x8, 4

Solution: 0x0804600B91200C00


Q7: What C statement is implemented by the following RISC-V assembly code, assuming a is in x9 and b is in x18?

    blt  x18, x9, L1
    addi x9, x9, 1
L1:

Solution: if (a <= b) a = a + 1;


Q8: Which of the RISC-V code sequences correctly implements the C statements

i = 0;
while (A[i] != 0) i++;

assuming i is in x9 and the address of A is in x18

Solution:

    addi x9, x0, 0
L1: slli x5, x9, 3
    add  x5, x5, x18
    ld   x5, 0(x5)
    beq  x5, x0, L2
    addi x9, x9, 1
    j    L1
L2:

Q10: The RISC-V calling convention uses registers x10 to x17 for parameter passing. If a procedure has more than eight parameters, where are the additional parameters passed? 
 Solution:
They are pushed onto the stack by the caller.

Quiz on Textbook Sections 1.1 to 1.4, 1.6 to 1.9

Q1: Match each description with the class of computer. 
(i) General purpose, run a variety of software, subject to cost/performance tradeoff
Solution: Personal computers
(ii) Network based, high capacity, high performance, high reliability, range from small to building sized
Solution: Server computers
(iii) High-end scientific and engineering calculations, highest capability but represent a small fraction of the overall computer market
Solution: Supercomputers
(iv) Hidden as components of systems, stringent power/performance/cost constraints
Solution: Embedded computers

Q2: Which kind of computer can best be described as:
  • Battery operated 
  • Connects to the Internet 
  • Costs a few hundred dollars
  • Has touch screen
Solution: Personal mobile device




Q3: Which of the following are input devices? 
LCD display
Keyboard
Loudspeaker
Touchscreen
Pushbutton
Radio transmitter

Solution:
  1. Keyboard
  2. Touchscreen
  3. Pushbutton
Q4: Which of the following are output devices 
Loudspeaker
Temperature sensor
LED indicator light
Pushbutton
Mouse
LCD display 

Solution:
  1. Loudspeaker
  2. LED indicator light
  3. LCD display  
Q5: Match the following descriptions to the types of memory: 
(i) The storage area in which programs are kept when they are running and that contains the data needed by the running programs.
Solution: Main memory
(ii) Memory built as an integrated circuit; it provides random access to any location. Access times are 50 nanoseconds and cost per gigabyte in 2012 was $5 to $10.
Solution: Dynamic random access memory (DRAM)
(iii) A small, fast memory that acts as a buffer for a slower, larger memory.
Solution: Cache memory
(iv) Memory built as an integrated circuit, but faster and less dense than DRAM.
Solution: Static random access memory (SRAM)
(v) A form of nonvolatile secondary memory composed of rotating platters coated with a magnetic recording material. Because they are rotating mechanical devices, access times are about 5 to 20 milliseconds and cost per gigabyte in 2012 was $0.05 to $0.10.
Solution: Magnetic disk memory
(vi) A nonvolatile semi-conductor memory. It is cheaper and slower than DRAM but more expensive per bit and faster than magnetic disks. Access times are about 5 to 50 microseconds and cost per gigabyte in 2012 was $0.75 to $1.00.
Solution: Flash memory

Q6: Which of the following best defines the term "instruction set architecture"? 
Solution: An abstract interface between the hardware and the lowest-level software that encompasses all the information necessary to write a machine language program that will run correctly, including instructions, registers, memory access, I/O, and so on.

Q8: Computer C’s performance is 4 times as fast as the performance of computer B, which runs a given application in 28 seconds. How many seconds will computer C take to run that application? 
Solution: 7

Q9: A given application written in Java runs 15 seconds on a desktop processor. A new Java compiler is released that requires only 0.6 as many instructions as the old compiler. Unfortunately, it increases the CPI by 1.1. How fast can we expect the application to run using this new compiler? Pick the right answer from the three choices below: 
Solution: 15 × 0.6 × 1.1 = 9.9 seconds

 Q10: If we increase the clock frequency of a microprocessor, what will happen to the power consumption? 
Solution: Power will increase 

Wednesday, 15 April 2020

C++ Question - Even Odd Number

Q: Write a program to check whether the given number is even or odd (using ? : ternary operator )

Solution:

#include <iostream>
using namespace std;

int main(int argc, char *argv[])
{
 int a;
 cout<<"Enter the Number : ";
 cin>>a;
 (a%2==0)?cout<<"Number is even":cout<<"Number is odd";
 return 0;
}
 
Screenshot: 


C++ Program - Check Number is Positive or Negative

Q: Write a program to check whether the given number is positive or negative (using ? : ternary operator )
Solution:

#include <iostream>
using namespace std;

int main(int argc, char *argv[])
{
 int a;
 cout<<"Enter any non-zero Number : ";
 cin>>a;
 (a>0)?cout<<"Number is positive":cout<<"Number is negative";
 return 0;
}

Screenshot:
 

C++ Program - Area of Circle

Q: Write a program to calculate area of circle.
Solution:

#include <iostream>
using namespace std;

int main(int argc, char *argv[])
{
 float r,area;
 cout<< "\nEnter radius of circle : ";
 cin>>r;
 area = 3.14*r*r;
 cout<<"Area of circle : "<<area;
 return 0;
}

Screenshot: 

C++ Program - Swap Variables

Q: Write a program to swap the values of two variables. 
Solution:

#include <iostream>
using namespace std;

int main(int argc, char *argv[])
{
 int a,b,temp;
 cout<<"\nEnter two numbers : ";
 cin>>a>>b;
 temp=a; 
 a=b;
 b=temp;
 cout<<"\nAfter swapping numbers are : ";
 cout<<a<<" "<<b;
 return 0;
}

Screenshot:

 

C++ Program - ASCII Character

Q: Write a program which accepts a character and display its ASCII value. 
Solution:

#include <iostream>
using namespace std;

int main(int argc, char *argv[])
{
 char ch;
 cout<<"\nEnter any character : ";
 cin>>ch;
 cout<<"ASCII equivalent is : "<<static_cast<int>(ch);
 return 0;
}
  Screenshot:

 

C++ Program - Calculate Simple Interest

Q: Write a program which accept principle, rate and time from user and print the simple interest.

Solution:

#include <iostream>
using namespace std;

int main(int argc, char *argv[])
{
 int p,r,t,i;
 cout<<"Enter Principle : ";
 cin>>p;
 cout<<"Enter Rate : ";
 cin>>r;
 cout<<"Enter Time : ";
 cin>>t;
 i=(p*r*t)/100;
 cout<<"Simple interest is : "<<i;
 return 0;
}
  Screenshot:
 

C++ Question - Farenheit to Celcius

Q: Write a program which accept temperature in Fahrenheit and print it in centigrade.
Solution:

#include <iostream>
using namespace std;

int main(int argc, char *argv[])
{
 float F,C;
 cout<< "\nEnter temperature in Farenheit : ";
 cin>>F;
 C=5*(F-32)/9;
 cout<<"Temperature in celcius is : "<<C;
 return 0;
}

Screenshot:

C++ Program - Sum of Two Numbers

#include <iostream>
using namespace std;

int main(int argc, char *argv[])
{
 int a,b,c;
 cout<< "\nEnter first number : ";
 cin>>a;
 cout<<"\nEnter second number : ";
 cin>>b;
 c=a+b;
 cout<<"\nThe Sum is : "<<c;
 return 0;
}

Screenshot: 

Escape Sequence Example in C++

#include <iostream>
using namespace std;

int main(int argc, char *argv[])
{
 cout<<"subject " <<"\tmarks"<<"\nmathematic\t"
 <<90<<"\ncomputer\t"<<77<<"\nchemistry\t"<<69;
 return 0;
}

Screenshot: 

C++ Hello World Program

Q: Write a program to print HELLO WORLD on screen.
Solution:

#include<iostream>
using namespace std;

int main(int argc, char *argv[])
{
 cout<<"Hello world";
 return 0;
}

Screenshot: