Control Flow in Rocket Core
Today we look at the control flow instructions in Rocket Core, mainly the jumps. I am not familiar with the CSR logic and some other parts, so we only cover the ordinary jump instructions. This part of the code differs between versions, so please check against the version you are using.
Jump or In Order?
Normally, the decode stage uses the output of ibuf as the result of the fetch stage. The ibuf itself reads instructions from the icache (instruction cache). See Overview of Rocket Pipeline. When there is no jump, ibuf fetches the next instruction in order from imem.
When a jump comes (or in any other situation where the pipeline does not want the next sequential instruction), a kill is sent to ibuf. This clears the instructions in the buffer, and ibuf then reads the new response from imem. The pc of the new instruction (the jump target) is carried in the imem request.
io.imem.req.bits.pc :=
Mux(wb_xcpt || csr.io.eret, csr.io.evec, // exception or [m|s]ret
Mux(replay_wb, wb_reg_pc, // replay
mem_npc)) // flush or branch misprediction
This covers all the possible ways the control flow can be altered. We are not talking about exceptions, [m|s]ret, or replay here. Normal jumps take the last line, i.e., mem_npc. Remember that call and ret are special cases of jumps.
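To make the priority order of the nested Mux easier to see, here is a minimal plain-Scala sketch of the same selection. It is not Chisel and not the Rocket source: the Signals case class and the reqPc function are hypothetical names that I introduce only for illustration, and ordinary Long values stand in for the hardware signals.

```scala
object NextPc {
  // Hypothetical software stand-in for the signals used in the Mux above.
  final case class Signals(
    wbXcpt: Boolean,   // models wb_xcpt
    eret: Boolean,     // models csr.io.eret
    replayWb: Boolean, // models replay_wb
    evec: Long,        // models csr.io.evec
    wbRegPc: Long,     // models wb_reg_pc
    memNpc: Long       // models mem_npc
  )

  // Same priority as the nested Mux: exception/eret first, then replay,
  // and finally mem_npc for flushes and ordinary jumps.
  def reqPc(s: Signals): Long =
    if (s.wbXcpt || s.eret) s.evec
    else if (s.replayWb) s.wbRegPc
    else s.memNpc
}
```

With this model, an exception and a replay raised at the same time resolve to evec, because the outer Mux is checked first; only when neither condition holds does the request pc fall through to mem_npc.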
Jump? When and Where?
The keys to analyzing when and where a jump happens are take_pc and io.imem.req.bits.pc. You can glance at the Overview to get a general idea.
take_pc
As you can see in the Overview, take_pc has two effects: it enables imem.req, and it kills ibuf. That makes sense: if there is a jump, we need to fetch the new instructions and discard whatever is in the instruction buffer.
What asserts take_pc? In the current version, take_pc = take_pc_mem_wb. Historically there was also a take_pc_id, which handled jal instructions in the decode stage. That is now handled in the frontend, which is faster. So take_pc now means take_pc_mem or take_pc_wb.
If you trace take_pc_mem, you will find that it handles 1) a taken branch, 2) jal, and 3) jalr. The details differ depending on whether a BTB is used, so we will not dive into the code.
From take_pc_wb := replay_wb || wb_xcpt || csr.io.eret || wb_reg_flush_pipe we can see that take_pc_wb comes from 1) replay, 2) exceptions, 3) eret, and 4) a pipeline flush. Each of them is complicated, so we will discuss them in the future.
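The relations described in this section can be sketched in plain Scala as follows. This is only a software model under my own assumptions: WbEvents, Effects, and the function names are hypothetical, take_pc_mem is passed in as a ready-made Boolean because its derivation depends on the BTB configuration, and the two effects simply mirror the description above (enable imem.req, kill ibuf).

```scala
object TakePc {
  // Hypothetical bundle of the write-back stage events named in the text.
  final case class WbEvents(
    replayWb: Boolean,  // models replay_wb
    wbXcpt: Boolean,    // models wb_xcpt
    eret: Boolean,      // models csr.io.eret
    flushPipe: Boolean  // models wb_reg_flush_pipe
  )

  // Hypothetical view of the two effects of take_pc described above.
  final case class Effects(imemReqEnable: Boolean, ibufKill: Boolean)

  // Models take_pc_wb := replay_wb || wb_xcpt || csr.io.eret || wb_reg_flush_pipe
  def takePcWb(e: WbEvents): Boolean =
    e.replayWb || e.wbXcpt || e.eret || e.flushPipe

  // Models take_pc as take_pc_mem or take_pc_wb.
  def takePc(takePcMem: Boolean, e: WbEvents): Boolean =
    takePcMem || takePcWb(e)

  // Both effects are driven by the same take_pc value.
  def effects(takePcMem: Boolean, e: WbEvents): Effects = {
    val t = takePc(takePcMem, e)
    Effects(imemReqEnable = t, ibufKill = t)
  }
}
```

The point of the model is that any single source, whether from the memory stage or the write-back stage, is enough to both redirect the fetch and flush the buffer.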
imem.req.bits.pc
As mentioned before, for jumps imem.req.bits.pc comes from mem_npc. mem_npc is chosen from two cases:
- If it's a jalr, mem_npc equals mem_reg_wdata, which comes from the ALU.
- If it's a branch/jal, mem_npc equals mem_br_target, which is mem_reg_pc plus an offset.
This is reasonable: jalr takes its target address from a register and computes it in the ALU, while jal and branches use the current pc plus an offset.
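The two cases can be written down as a small plain-Scala sketch. It is a simplified model rather than the actual Rocket code: the function names and the isJalr flag are hypothetical, the offset is passed in directly instead of being decoded from the instruction, and any further adjustment the hardware applies to the address is left out.

```scala
object MemNpc {
  // Models mem_br_target: the pc in the memory stage plus a signed offset.
  def memBrTarget(memRegPc: Long, offset: Long): Long =
    memRegPc + offset

  // Models the choice of mem_npc:
  //   jalr       -> mem_reg_wdata (the target computed by the ALU)
  //   branch/jal -> mem_br_target (pc plus offset)
  def memNpc(isJalr: Boolean, memRegWdata: Long, memRegPc: Long, offset: Long): Long =
    if (isJalr) memRegWdata else memBrTarget(memRegPc, offset)
}
```

For example, under this model MemNpc.memNpc(false, 0x8000L, 0x1000L, -8L) gives 0xff8 (a backward branch relative to the pc), while the same call with isJalr set to true ignores the pc and the offset and returns 0x8000.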
Others
I may discuss replay and exceptions in the future, but CSR is not in my plan.