We need to add the following circuits:
The input of ForwReg1 is tied to the output of the ALU.
The input of ForwReg2 is tied to the output of ForwReg1.
So: ForwReg1 will contain the result of the most recent ALU instruction, and ForwReg2 will contain the result of the instruction before that one.
The input of tag register 1 is the destination field of the instruction (this field contains the number of the register that will receive the value).
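The wiring above can be sketched as a small Python model. This is only an illustration, not the actual circuit: the class name `ForwardingState`, the method `clock`, and the use of `None` for "no valid tag yet" are made up for this sketch. It also assumes that tag register 2 takes its input from tag register 1, mirroring the ForwReg1 to ForwReg2 connection; the text only states the input of tag register 1.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ForwardingState:
    """Hypothetical model of the two forwarding registers and their tags."""
    forw_reg1: int = 0          # result of the most recent ALU instruction
    tag1: Optional[int] = None  # destination register number of that result
    forw_reg2: int = 0          # result of the instruction before that one
    tag2: Optional[int] = None  # its destination register number (assumed wiring)

    def clock(self, alu_output: int, dest: int) -> None:
        """One clock tick: the older pair shifts down, the new pair is latched."""
        self.forw_reg2, self.tag2 = self.forw_reg1, self.tag1
        self.forw_reg1, self.tag1 = alu_output, dest


state = ForwardingState()
state.clock(20, 1)  # an instruction writes 20 to R1
state.clock(21, 4)  # the next instruction writes 21 to R4
```

After the two ticks, ForwReg1/tag1 describe the newer result (21 for R4) and ForwReg2/tag2 describe the older one (20 for R1).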
The first multiplexor (yellow one) will select from among: PC1, A, ForwReg1 and ForwReg2.
The selection logic of this multiplexor is as follows:
if ( Instruction == Branch )
    select PC1 as operand
else if ( tag1 == Src1 )
    select ForwReg1 as operand
else if ( tag2 == Src1 )
    select ForwReg2 as operand
else
    select A as operand
The reason for this should be clear:
If the register where we get the operand is the register updated by the last instruction (that is, when tag1 == Src1), we use the value stored in ForwReg1 as the operand.
If the register where we get the operand is not the register updated by the last instruction, but was updated by the instruction before the last instruction (that is, when tag2 == Src1), we use the value stored in ForwReg2 as the operand.
If neither case holds, we can use the value fetched from the register without any problem.
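To make the priority order concrete, here is a minimal Python sketch of the first multiplexor's selection logic. The function name and parameter names are invented for illustration; only the order of the tests comes from the pseudocode above.

```python
def select_operand1(is_branch, src1, tag1, tag2, pc1, a, forw_reg1, forw_reg2):
    """Return the value the first multiplexor passes to the ALU."""
    if is_branch:
        return pc1        # branches use PC1
    if tag1 == src1:
        return forw_reg1  # register was updated by the last instruction
    if tag2 == src1:
        return forw_reg2  # register was updated by the instruction before that
    return a              # value fetched from the register is up to date
```

Note that tag1 is tested before tag2: if both of the previous two instructions wrote the same register, the newer value in ForwReg1 is the one that gets used.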
The second multiplexor (gold colored one) will select from among: IR1 (a constant), B, ForwReg1 and ForwReg2.
The selection logic of this multiplexor is as follows:
if ( Instruction Imm bit is set to 1 )
    select IR1 as operand
else if ( tag1 == Src2 )
    select ForwReg1 as operand
else if ( tag2 == Src2 )
    select ForwReg2 as operand
else
    select B as operand
The reason for this is similar to the first case:
If the register where we get the operand is the register updated by the last instruction (that is, when tag1 == Src2), we use the value stored in ForwReg1 as the operand.
If the register where we get the operand is not the register updated by the last instruction, but was updated by the instruction before the last instruction (that is, when tag2 == Src2), we use the value stored in ForwReg2 as the operand.
If neither case holds, we can use the value fetched from the register without any problem.
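The second multiplexor can be sketched the same way. Again, the function and parameter names are hypothetical; the point of the sketch is that the immediate test comes first, so a constant operand is never replaced by a forwarded value even if the Src2 bits happen to match a tag.

```python
def select_operand2(imm_bit, src2, tag1, tag2, ir1, b, forw_reg1, forw_reg2):
    """Return the value the second multiplexor passes to the ALU."""
    if imm_bit == 1:
        return ir1        # the operand is the constant held in IR1
    if tag1 == src2:
        return forw_reg1  # register was updated by the last instruction
    if tag2 == src2:
        return forw_reg2  # register was updated by the instruction before that
    return b              # value fetched from the register is up to date
```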
Let us first look at an example and convince ourselves that the solution works. Later, I will show you how the two multiplexors in the EX stage are constructed (it's pretty straightforward).
ADD R2, R3, R1        (R2=11, R3=9, R4=1, R5=8, R6=0, R7=2)
ADD R4, R1, R4
ADD R5, R1, R5
ADD R6, R1, R6
ADD R7, R1, R7
...
The difference now is that the result (20) will also be written into Forwarding Register 1, along with the register tag 001 (indicating register R1).
Also, at the start of the CPU cycle, the ID stage selects R4 and R1 to be copied into the A and B registers.
Notice that an OLD value of R1 will still be fetched into B. (That is not a problem because we will find a way to obtain the more recent value from the forwarding registers - see next CPU cycle).
Notice that the Src2 field in "ADD R4, R1, R4" contains the bits 001 to indicate R1 !!!
For the first operand, the value from the A register (R4) is selected.
For the second operand, the value of Forwarding Register 1 is selected, because tag1 (001) == Src2 (001).
So the ALU will add R4 to the NEW value 20 for R1 (which has not arrived in R1 yet !!!)
Notice that the Src2 field in "ADD R5, R1, R5" contains the bits 001 to indicate R1 !!!
For the first operand, the value from the A register (R5) is selected.
For the second operand, the value of Forwarding Register 2 is selected, because tag2 (001) == Src2 (001).
So the ALU will add R5 to the NEW value 20 for R1 (which has still not arrived in R1 yet !!!)
NOTE: That is not a problem because we have previously determined that the 3rd instruction following "ADD R2,R3,R1" is able to obtain the correct value straight from R1.
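The whole example can be checked with a small Python trace. This is a rough sketch under several assumptions, not a model of the real datapath: instructions are taken to have the form ADD src1, src2, dest (as in the example); the old value of R1 is not given, so 0 is used as a placeholder; `None` stands for "no valid tag yet"; and a result is assumed to be in the register file by the time it leaves ForwReg2, which matches the statement that the 3rd following instruction can read R1 directly.

```python
# Program from the example, as (src1, src2, dest) register numbers.
program = [
    (2, 3, 1),  # ADD R2, R3, R1
    (4, 1, 4),  # ADD R4, R1, R4
    (5, 1, 5),  # ADD R5, R1, R5
    (6, 1, 6),  # ADD R6, R1, R6
    (7, 1, 7),  # ADD R7, R1, R7
]

# Register values from the example; R1 = 0 is a placeholder for its old value.
regs = {1: 0, 2: 11, 3: 9, 4: 1, 5: 8, 6: 0, 7: 2}

forw_reg1 = forw_reg2 = 0
tag1 = tag2 = None  # no valid tags before the first instruction (assumption)

results = []  # ALU result of each instruction
sources = []  # where each instruction's second operand came from


def pick(src, fetched):
    """Forwarding choice for one operand (branch/immediate cases omitted)."""
    if tag1 == src:
        return forw_reg1, "ForwReg1"
    if tag2 == src:
        return forw_reg2, "ForwReg2"
    return fetched, "register"


for src1, src2, dest in program:
    a, b = regs[src1], regs[src2]  # values fetched by ID (possibly stale)
    op1, _ = pick(src1, a)
    op2, where = pick(src2, b)
    result = op1 + op2
    results.append(result)
    sources.append(where)
    # Assumed timing: the value leaving ForwReg2 has reached the register file.
    if tag2 is not None:
        regs[tag2] = forw_reg2
    forw_reg2, tag2 = forw_reg1, tag1
    forw_reg1, tag1 = result, dest

print(results)  # [20, 21, 28, 20, 22]
print(sources)  # ['register', 'ForwReg1', 'ForwReg2', 'register', 'register']
```

The second and third instructions get R1's new value 20 from ForwReg1 and ForwReg2 respectively, and from the fourth instruction on the value comes straight from the register.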