1) Instruction Design
Take the SRC (Simple RISC Computer) instruction sequence provided below and determine what the sequence does (explain the operation in words). Then propose a new SRC instruction that accomplishes the same thing. You must provide abstract RTL and concrete RTL for the new instruction based on the 1-bus architecture from the textbook. If required, you may modify the architecture.
shr r0, r1, 15
andi r0, r0, 1
addi r2, r1, 0
lar r31, done
brzr r31, r0
la r0, 0xFFFF
shl r0, r0, 16
or r2, r1, r0
done: …
Note: r0 is used only as a temporary register in this code sequence and should not be part of your new instruction. Your new instruction should be of the form new_instr rn, rm and should not affect any general-purpose registers other than rn and rm (in the above code sequence, rm would be r1 and rn would be r2).
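One way to investigate the sequence before writing RTL is to trace it in software. The sketch below is a hand-written Python model of the eight instructions, not an SRC tool. It assumes 32-bit registers, a logical (zero-filling) shr, that la with no base register loads the constant 0xFFFF as written, and that brzr branches when r0 is zero. The name run_sequence is hypothetical.

```python
MASK32 = 0xFFFFFFFF  # SRC registers are modeled as 32-bit values


def run_sequence(r1: int) -> int:
    """Trace the instruction sequence for a given r1 and return the final r2."""
    r1 &= MASK32
    r0 = r1 >> 15                    # shr  r0, r1, 15  (logical shift right)
    r0 &= 1                          # andi r0, r0, 1
    r2 = (r1 + 0) & MASK32           # addi r2, r1, 0
    # lar r31, done / brzr r31, r0: jump to done when r0 == 0
    if r0 != 0:
        r0 = 0xFFFF                  # la   r0, 0xFFFF
        r0 = (r0 << 16) & MASK32     # shl  r0, r0, 16
        r2 = r1 | r0                 # or   r2, r1, r0
    return r2                        # done:


if __name__ == "__main__":
    # Try a few inputs and compare r1 with r2 to see what the sequence does.
    for value in (0x00001234, 0x00008001, 0x0000FFFF):
        print(f"r1 = {value:#010x} -> r2 = {run_sequence(value):#010x}")
```

Running it on inputs that differ only in bit 15 of r1 is a quick way to see which case takes the branch.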
2) Computer Architecture (the FRACTION computer)
The use of floating point numbers (or fixed point for that matter) inherently introduces error into computations. To remove all error, all computations should be performed as fractions. Therefore, the FRACTION is proposed as an “exact” computer architecture.
The core concept of the FRACTION is a new data type, the fraction. A fraction is a 32-bit piece of data, composed of a 16-bit numerator in the upper 16 bits and a 16-bit denominator in the lower 16 bits.
Numerator (16 bits) | Denominator (16 bits) |
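The layout above can be sketched as a pair of pack/unpack helpers. This is only an illustration: the function names are hypothetical, and both fields are treated as unsigned 16-bit values, which the assignment does not specify.

```python
def pack_fraction(numerator: int, denominator: int) -> int:
    """Place the numerator in bits 31..16 and the denominator in bits 15..0."""
    return ((numerator & 0xFFFF) << 16) | (denominator & 0xFFFF)


def unpack_fraction(word: int) -> tuple[int, int]:
    """Split a 32-bit fraction word into (numerator, denominator)."""
    return (word >> 16) & 0xFFFF, word & 0xFFFF


if __name__ == "__main__":
    word = pack_fraction(3, 4)
    print(f"{word:#010x}", unpack_fraction(word))
```

If your design needs negative fractions, you would have to decide how the sign is represented; this sketch leaves that open.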
The proposed architecture then performs standard fraction arithmetic (+, -, ×, ÷). The standard ALU on our existing architecture can handle all standard integer-based arithmetic, but several additional functions are required to support fraction arithmetic. To manage this extra capability, we modify the 1-bus SRC architecture to include a new functional unit called the Fraction Unit (FrU) in addition to the standard ALU. The FrU has several proposed functions to assist in fraction arithmetic:
LCM | Least Common Multiple | Computes the least common multiple of 2 fractions |
LCMS | Least Common Multiple Scale | Given 2 fractions, returns a fraction whose numerator is the multiplier applied to the 1st fraction’s numerator and denominator to bring its denominator to the LCM, and whose denominator is the corresponding multiplier (scale) applied to the 2nd fraction |
INV | Inverse | Inverts a fraction |
RED | Reduce | Converts a fraction to its simplest form |
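The four FrU functions can be modeled in software to check your RTL against. The sketch below makes several assumptions that the assignment leaves open: fractions are (numerator, denominator) tuples with non-zero denominators, LCM means the least common multiple of the two denominators, and 16-bit overflow is ignored. All function names are hypothetical.

```python
from math import gcd

Fraction16 = tuple[int, int]  # (numerator, denominator), denominator != 0


def fru_lcm(a: Fraction16, b: Fraction16) -> int:
    """LCM: least common multiple of the two denominators (assumed meaning)."""
    return a[1] * b[1] // gcd(a[1], b[1])


def fru_lcms(a: Fraction16, b: Fraction16) -> Fraction16:
    """LCMS: (scale for the 1st fraction, scale for the 2nd fraction)."""
    common = fru_lcm(a, b)
    return common // a[1], common // b[1]


def fru_inv(a: Fraction16) -> Fraction16:
    """INV: swap numerator and denominator."""
    return a[1], a[0]


def fru_red(a: Fraction16) -> Fraction16:
    """RED: divide numerator and denominator by their greatest common divisor."""
    divisor = gcd(a[0], a[1])
    return a[0] // divisor, a[1] // divisor


if __name__ == "__main__":
    a, b = (1, 4), (1, 6)
    scale_a, scale_b = fru_lcms(a, b)
    total = (a[0] * scale_a + b[0] * scale_b, fru_lcm(a, b))
    print(fru_red(total))
```

For example, with 1/4 and 1/6 the common denominator is 12 and the scales are 3 and 2, so the sum is (1×3 + 1×2)/12 = 5/12.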
The architecture also provides special register operations for latching different portions of a fraction into different portions of a register:
LatchUL | Latch the upper 16 bits from the bus into the lower 16 bits of the register |
LatchLL | Latch the lower 16 bits from the bus into the lower 16 bits of the register |
LatchLU | Latch the lower 16 bits from the bus into the upper 16 bits of the register |
LatchUU | Latch the upper 16 bits from the bus into the upper 16 bits of the register |
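The latch operations can be sketched as a single helper that selects a half of the bus and writes it into a half of the register. Two assumptions are made here: the half of the register that is not written keeps its old value, and LatchUU copies the upper 16 bits of the bus into the upper 16 bits of the register, following the naming pattern of the other three operations. The function name latch is hypothetical.

```python
def latch(register: int, bus: int, op: str) -> int:
    """Return the new 32-bit register value after one latch operation.

    The first letter after 'Latch' selects the bus half (U/L) and the second
    selects the register half (U/L) that receives it.
    """
    source, dest = op[-2], op[-1]
    if not op.startswith("Latch") or source not in "UL" or dest not in "UL":
        raise ValueError(f"unknown latch operation: {op}")
    half = (bus >> 16) & 0xFFFF if source == "U" else bus & 0xFFFF
    if dest == "U":
        return (half << 16) | (register & 0x0000FFFF)  # lower half preserved
    return (register & 0xFFFF0000) | half              # upper half preserved


if __name__ == "__main__":
    bus = 0x00030004                   # the fraction 3/4 on the bus
    reg = latch(0, bus, "LatchUL")     # numerator goes to the lower half
    reg = latch(reg, bus, "LatchLU")   # denominator goes to the upper half
    print(f"{reg:#010x}")
```

The usage shows that two latch steps are enough to swap the halves of a fraction, which is one possible way to build INV without dedicated FrU hardware.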
Your task is to expand on this concept. You need to extend the 1-bus SRC architecture to include the FrU and any other temporary registers you deem necessary. You then need to develop abstract and concrete RTL for the 4 fraction operations. In this process, you may increase or decrease the capabilities of the FrU and the registers if necessary.
3) Vector operations
You are tasked to design the execution sequence (you may ignore the instruction fetch sequence) of a vector add instruction (VADD) defined below. The instruction reads an element from two arrays, adds them, and stores the result in a third array. The instruction has the format:
VADD Rc, Ra, Rb
where registers Ra and Rb hold the addresses of the elements from the two source arrays in memory and Rc holds the address of the element in the destination array where the result is to be saved. In addition, the addresses in all three registers must be incremented in preparation for adding the next set of array elements. Pictorially, the operation is defined by:
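In software terms, the behavior described above can be modeled as follows. This is a sketch under stated assumptions, not the required RTL: memory is a dictionary keyed by byte address, each array element is one 32-bit word so every address advances by 4, and sums wrap at 32 bits. The names vadd, WORD_BYTES, regs, and mem are hypothetical.

```python
WORD_BYTES = 4        # assumed element size: one 32-bit word, byte-addressed
MASK32 = 0xFFFFFFFF


def vadd(regs: dict[int, int], mem: dict[int, int], rc: int, ra: int, rb: int) -> None:
    """Model VADD Rc, Ra, Rb: M[R[rc]] <- M[R[ra]] + M[R[rb]], then bump all three."""
    total = (mem[regs[ra]] + mem[regs[rb]]) & MASK32
    mem[regs[rc]] = total
    # Advance every address register to the next array element.
    for index in (ra, rb, rc):
        regs[index] = (regs[index] + WORD_BYTES) & MASK32


if __name__ == "__main__":
    regs = {1: 100, 2: 200, 3: 300}    # Ra = r1, Rb = r2, Rc = r3
    mem = {100: 5, 200: 7}
    vadd(regs, mem, rc=3, ra=1, rb=2)
    print(mem[300], regs)
```

A model like this gives you a reference result to compare against when you count the memory accesses and register updates in your concrete RTL.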
For each of the architectures below, provide abstract RTL and concrete RTL for the execution of the VADD instruction.
Note about grading: A functional but completely unoptimized concrete RTL will earn only half credit. Full credit will be given only for fully optimized RTL for each of the architectures. It is a good idea, however, to arrive at a preliminary (unoptimized) solution first and then optimize the statements so that they take as few execution cycles as possible.
Abstract RTL for VADD Rc, Ra, Rb:
Concrete RTL for Architecture 1 |
Concrete RTL for Architecture 2 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |