Initial register values: R2=11, R3=9, R4=1, R5=8, R6=0, R7=2
Suppose Memory[20] = 4000
LD [R2+R3], R1
ADD R4, R1, R4
ADD R5, R1, R5
ADD R6, R1, R6
ADD R7, R1, R7
...
Notice that the value computed in the EX stage (R2 + R3 = 11 + 9 = 20) is an ADDRESS, and the value itself (Memory[20] = 4000) must still be fetched from this memory location into register R1.
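The difference between the address and the loaded value can be sketched in a few lines of Python. This is only an illustration: the dictionaries `regs` and `memory` and the function `effective_address` are hypothetical stand-ins for the register file, the memory and the EX-stage adder. The value 123 for R1 is the old value quoted later in this walkthrough.

```python
# Hypothetical model of the register file and memory from the example.
regs = {"R1": 123, "R2": 11, "R3": 9}
memory = {20: 4000}

def effective_address(base, index):
    """What the EX stage computes for LD [base+index], dest: an address only."""
    return regs[base] + regs[index]

addr = effective_address("R2", "R3")   # EX stage: 11 + 9
loaded = memory[addr]                  # MEM stage: the actual data for R1

print(addr)    # 20   (an address, not the value of R1)
print(loaded)  # 4000 (the value that will eventually be written to R1)
```

Until the MEM stage has finished, `regs["R1"]` still holds 123, which is why any instruction that reads R1 in the meantime sees a stale value.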
Also, at the start of the CPU cycle, the ID stage selects R4 and R1 to be copied into the A and B registers.
The LOAD instruction does NOT produce a valid result for its destination register in the EX stage (the ALU output is an address, not data). So we enter an INVALID tag into the Tag Register of Forwarding Register 1 to prevent the multiplexor from selecting this value.
An invalid tag is easy to formulate: if the CPU has 8 registers, just enter a value that is not a register number (9 or higher) into the tag register. Alternatively, we can add one more bit to the tag register field to indicate whether the tag is "valid".
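A minimal sketch of how an invalid tag keeps the multiplexor from picking up the address, assuming the registers are numbered 0..7. The names `ForwardingRegister`, `select_operand`, `NUM_REGS` and `INVALID_TAG`, and the value 55 in the comparison case, are made up for this illustration; they are not taken from the CPU design in these notes.

```python
from dataclasses import dataclass

NUM_REGS = 8       # assumption: registers are numbered 0..7
INVALID_TAG = 9    # any value that is not a register number works

@dataclass
class ForwardingRegister:
    tag: int     # register number the value is destined for, or INVALID_TAG
    value: int   # ALU output latched at the end of the EX stage

def is_valid_tag(tag):
    """A tag is valid only if it names a real register."""
    return 0 <= tag < NUM_REGS

def select_operand(src_reg, latched_value, fwd):
    """Multiplexor: take the forwarded value only when the tag matches."""
    if is_valid_tag(fwd.tag) and fwd.tag == src_reg:
        return fwd.value
    return latched_value

# LD [R2+R3], R1 leaves the ADDRESS 20 in the forwarding register,
# so its tag is marked invalid instead of being set to 1 (for R1).
fwd_load = ForwardingRegister(tag=INVALID_TAG, value=20)
print(select_operand(1, 123, fwd_load))  # 123: the address is not forwarded

# For comparison: an ALU instruction writing R1 would carry a valid tag.
fwd_alu = ForwardingRegister(tag=1, value=55)
print(select_operand(1, 123, fwd_alu))   # 55: the forwarded result is used
```

Note that the invalid tag only stops the address from being mistaken for R1. The operand that comes out (123) is still the old value of R1, so the load hazard itself is not yet resolved.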
Also, at the end of the CPU cycle, A is updated to R4 (= 1) and B is updated to the "current value" of R1 (= 123). This "current" value is wrong, because a more current value is on its way from the memory.
Notice that the CPU, at this moment, has no way of knowing what that "more current value of R1" is, because it must still fetch that value from the memory.
Also, at the end of the CPU cycle, the instruction (LD [R2+R3], R1) is moved into IR(MEM), ADD R4, R1, R4 is moved into IR(EX), and the instruction ADD R5, R1, R5 is fetched into IR(ID).
Also, at the start of the CPU cycle, the ID stage selects R5 and the same OLD value of R1 to be copied into the A and B registers.
Notice that the LOAD instruction in the MEM stage will cause the IF stage to STALL.
Also, at the end of the CPU cycle, A is updated to R5 and B is updated to the old value of R1.
Also, at the end of the CPU cycle, the instruction (LD [R2+R3], R1) is moved into IR(WB), ADD R4, R1, R4 is moved into IR(MEM), the instruction ADD R5, R1, R5 is moved into IR(EX), and the instruction NOP is inserted into IR(ID).
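The movement of the instructions through the IR registers over these two cycles can be replayed with a small Python sketch. It assumes a simplified model in which a LOAD sitting in the MEM stage is the only cause of an IF stall. The function `advance`, the dictionary layout and the `fetch_queue` list are hypothetical names introduced for this illustration.

```python
NOP = "NOP"

def advance(pipeline, fetch_queue):
    """Return the contents of the IR registers at the end of one CPU cycle.

    pipeline maps a stage name to the instruction in that stage's IR.
    Assumption: a LOAD in the MEM stage stalls IF, so a NOP enters IR(ID).
    """
    stall_if = (pipeline["MEM"] or "").startswith("LD")
    nxt = {
        "WB": pipeline["MEM"],
        "MEM": pipeline["EX"],
        "EX": pipeline["ID"],
    }
    if stall_if or not fetch_queue:
        nxt["ID"] = NOP                   # nothing fetched this cycle
    else:
        nxt["ID"] = fetch_queue.pop(0)    # next instruction enters IR(ID)
    return nxt

# Instructions not yet fetched when the walkthrough starts.
fetch_queue = ["ADD R5, R1, R5", "ADD R6, R1, R6", "ADD R7, R1, R7"]

# LD is in EX, ADD R4 is in ID; earlier instructions are not shown (None).
start = {"ID": "ADD R4, R1, R4", "EX": "LD [R2+R3], R1", "MEM": None, "WB": None}

cycle1 = advance(start, fetch_queue)    # LD moves into IR(MEM)
cycle2 = advance(cycle1, fetch_queue)   # LD in MEM stalls IF, NOP enters IR(ID)

for stage in ("ID", "EX", "MEM", "WB"):
    print(f"IR({stage}) = {cycle2[stage]}")
```

After the second call, `fetch_queue` still holds ADD R6 and ADD R7: the stalled fetch delays them by one cycle rather than dropping them.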