Note: Static branch prediction in RISC-V

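Continuing with “手把手教你設計CPU RISC-V處理器篇”: the chapter on the instruction fetch unit (IFU) covers how conditional branch instructions are handled. For static branch prediction, the spec has backward branches always predicted taken. This needs cooperation from the program itself, so it helps to keep the rule in mind when writing code in order to limit the performance loss. A for loop already fits this model: a loop is normally written on the assumption that it runs a number of times, and the compiler usually places the test that decides whether to jump back at the end of the loop body.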

The spec I checked is the RISC-V spec v2.2; page 17 has the following passage:

“Software should be optimized such that the sequential code path is the most common path, with less-frequently taken code paths placed out of line. Software should also assume that backward branches will be predicted taken and forward branches as not taken, at least the first time they are encountered.”
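
To see why a loop fits this rule, here is a minimal Python sketch of the static policy quoted above (backward branches predicted taken, forward branches predicted not taken). The function names and the addresses 0x10 and 0x20 are made up for illustration; they come from neither the spec nor the E203 source.

```python
def predict_static(branch_pc: int, target_pc: int) -> bool:
    """Static rule: predict taken only when the branch jumps backward."""
    return target_pc < branch_pc


def count_mispredictions(iterations: int,
                         branch_pc: int = 0x20,
                         target_pc: int = 0x10) -> int:
    """Model a loop whose closing branch is taken on every pass but the last.

    branch_pc and target_pc are hypothetical addresses; with the defaults the
    branch sits at the end of the loop body and jumps back to its start.
    """
    mispredictions = 0
    for i in range(iterations):
        actually_taken = i < iterations - 1  # the final pass falls through
        if predict_static(branch_pc, target_pc) != actually_taken:
            mispredictions += 1
    return mispredictions


if __name__ == "__main__":
    # Backward branch at the end of the loop: only the loop exit is mispredicted.
    print(count_mispredictions(100))
    # Same taken pattern on a forward branch: every taken pass is mispredicted.
    print(count_mispredictions(100, branch_pc=0x10, target_pc=0x20))
```

With the test at the bottom of the loop, a 100-iteration loop costs one misprediction under this rule; the same taken pattern on a forward branch costs 99.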


Note: Reading “手把手教你設計CPU RISC-V處理器篇”

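RISC-V is an open processor instruction set that has drawn a lot of attention in recent years. Many implementations now exist, and in the foreseeable future it could become a general-purpose core much like the 8051; at the very least it can comfortably replace designs in the Cortex-M0 to M3 class. The software ecosystem is also maturing step by step (GCC, for example, already supports it), so the barrier to building an application SoC on RISC-V has become relatively low, and it has already seen a fair amount of commercial success.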

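For these reasons, and because I have written a little Verilog myself, I had long wanted to find a book and a board to practice with. A book published in mainland China, “手把手教你設計CPU RISC-V處理器篇”, turned out to be a good fit for a beginner like me. I wrote some Verilog years ago but never used it at scale, so I have stayed at novice level, and I hope this book will raise my level a little.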

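I have read the first few chapters so far. The opening chapters cover the history of processor evolution and the current state of the industry; chapter 5 starts on some basic Verilog design guidelines: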

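  1. Generate registers from D-FF templates
  2. Use assign statements instead of if-else and case, to avoid unexpected mismatches between simulation and the final synthesis result
  3. Use D-FFs with reset only on the control signal flow; this slightly reduces synthesized area, allows a higher clock rate, and lowers power
  4. Use clock and reset signals only on D-FFs, to avoid glitches from combinational logic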
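
The note does not spell out why item 2 helps, so the following is my reading: in Verilog simulation an if statement treats an unknown (x) condition as false and silently takes the else branch, while a conditional expression in an assign lets the unknown propagate to the output. The Python sketch below models only that 1-bit behaviour; the function names are hypothetical and z is ignored.

```python
X = "x"  # unknown value, as in Verilog's 4-state logic (z is left out here)


def mux_if_else(sel: str, a: str, b: str) -> str:
    """Simulation behaviour of `if (sel) out = a; else out = b;`.

    An unknown sel counts as false, so the else branch is taken and the
    unknown is hidden.
    """
    return a if sel == "1" else b


def mux_assign(sel: str, a: str, b: str) -> str:
    """Simulation behaviour of `assign out = sel ? a : b;` for 1-bit values.

    An unknown sel yields x unless both inputs agree on a known value.
    """
    if sel == "1":
        return a
    if sel == "0":
        return b
    return a if a == b and a != X else X


if __name__ == "__main__":
    print(mux_if_else(X, "1", "0"))  # the unknown select is masked
    print(mux_assign(X, "1", "0"))   # the unknown select shows up at the output
```

With the assign form, an uninitialized select shows up as x in the waveform instead of looking like a valid 0.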

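Each later chapter covers one part of the E203 core: it presents the overall architecture first, then breaks down the design considerations of each module so the reader understands the reasoning behind the design. For deeper detail you have to read the source code and look things up yourself before it becomes your own knowledge.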

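One takeaway is that RISC-V distills many years of computer architecture design and experimentation, together with advances in compiler engineering. It rides on that history and boils it down to a design that keeps the essentials and drops the rest.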

  • AGU – Address Generation Unit
  • AMO – Atomic Memory Operation
  • BIU – Bus Interface Unit
  • CLINT – Core Local Interrupts Controller
  • CSR – Control and Status Register
  • DTCM – Data Tightly Coupled Memory
  • DTM – Debug Transport Module
  • EAI – Extension Accelerator Interface
  • ICB – Internal Chip Bus
  • ICG – Integrated Clock Gating
  • IFU – Instruction Fetch Unit
  • ITCM – Instruction Tightly Coupled Memory
  • LSU  – Load Store Unit
  • MEPC – Machine Exception Program Counter
  • OITF – Outstanding Instruction Track FIFO
  • PLIC – Platform Level Interrupts Controller
  • RAS – Return Address Stack
  • WAR – Write After Read

 

 

Note: “sc_vlog” under the ModelSim examples

This example drives a Verilog module from SystemC through ringbuf.h.

ringbuf.h declares a class derived from sc_foreign_module so that an external module, here ringbuf.v, can be instantiated.

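run.do first compiles all the Verilog files, then runs scgenmod -bool ringbuf > ringbuf.h to generate the foreign module wrapper by hand. Here ringbuf is the name of the compiled Verilog module passed to scgenmod, and > redirects the generated declaration into ringbuf.h; inside the header that name serves as the hdl_name that ties the wrapper to the module in ringbuf.v. After that, test_ringbuf is compiled and the simulation is run.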

It serves as an example of writing a test bench in SystemC.

Note: ModelSim SystemC example “sc_basic” (gold)

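Below are the commands in order (changing into the directory is omitted) and their output. SystemC is currently used mainly to verify that an algorithm executes correctly. It uses C++ syntax with hardware elements added, and its multiple tasks (processes) can represent the parallel execution of hardware.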

vlib work
sccom -g basic.cpp
#
# Model Technology ModelSim SE sccom 10.0a compiler 2011.02 Feb 20 2011
#
# Exported modules:
#     top
sccom -link
#
# Model Technology ModelSim SE sccom 10.0a compiler 2011.02 Feb 20 2011
vsim -voptargs=+acc work.top
# vsim -voptargs=+acc work.top
# ** Note: (vsim-3812) Design is being optimized…
# Loading C:\modeltech_10.0a\examples\systemc\sc_basic\gold\work\_sc\win32_gcc-4.2.1\systemc.so
# Loading C:\modeltech_10.0a\examples\systemc\sc_basic\gold\work.top
run
# 1 main_action_method called
# 1 main_action_thread called
# 3 main_action_cthread called

Note: ModelSim PLI example “traverse_design”

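Besides installing ModelSim, this example needs a compiler that matches your environment. I tested on 32-bit XP, which requires the “gcc-4.2.1-mingw32vc9” package; unpack it (traverse_design.do gives some hints) and place it in the installation directory. Also, traverse_design.do exits ModelSim at the end, so comment out its last command.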

Below is the run; it mainly dumps pin/wire information.

do traverse_design.do
# Model Technology ModelSim SE vlog 10.0a Compiler 2011.02 Feb 20 2011
# -- Compiling UDP multiplexer
# -- Compiling UDP sudp
# -- Compiling module dff
# -- Compiling module top
#
# Top level modules:
#     top
# gcc -g -c -m32 -Wall -ansi -pedantic -I. -IC:/modeltech_10.0a/include  pli_test.c
# gcc -shared -lm -m32 -Wl,-Bsymbolic -Wl,-export-dynamic -o  pli_test.dll pli_test.o  -LC:/modeltech_10.0a/win32 -lmtipli
# vsim -c -pli ./pli_test.dll top
# ** Note: (vsim-3812) Design is being optimized…
# ** Note: (vsim-3865) Due to PLI being present, full design access is being specified.
# Loading ./pli_test.dll
# Loading work.top(fast)
# Loading work.dff(fast)
#
# ===========================
# Results of Design Traversal
# ===========================
#
# Module name is top.
#   Type           is accModule (20).
#   Full Type      is accTopModule (224).
#   Cell Instance  is FALSE.
#   Def Name       is top.
#   Delay Mode     is 0 (accDelayModeNone).
#   File           is top.v.
#   Line No        is 0.
#   Full Name      is top.
#   Time Precision is -9 (1 ns).
#   Time Unit      is -9 (1 ns).
#   Top Module     is TRUE.
#
#   Port name is ain.
#     Type          is accPort (35).
#     Full Type     is accVectorPort (256).
#     Direction     is accInout.
#     Port Index    is 0.
#     Line No       is 14.
#     File          is top.v.
#     Port Size     is Vector of 4 bits.
#     Parent        is top of type accTopModule.
#
#   Port Bit #1: ———-
#
#   Port name is ain[3].
#     Type          is accPortBit (214).
#     Full Type     is accPortBit (214).
#     Direction     is accInout.
#     Line No       is 14.
#     File          is top.v.
#     Port Size     is Scalar.
#     Parent        is top of type accTopModule.

Note: PPM (Power Plane Management) CPLD for Atom + SCH

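So far only the S4/S5 to S0 path has been through functional simulation, and the source clock is a simple 32 kHz stand-in rather than the real 32.768 kHz. The result and the code are below. The code is essentially a hand conversion of Intel's Altera AHDL into Verilog.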

[image: simulation result]

/*
 * PPM(Power Plane Management for XXX Project)
 * Function description :
 *      Power Sequence control
 *      ref. Power Plane Management and LPC to SPI bridging Solution, Intel 391878.pdf
 *
 * Created by : Kun-Yi Chen
 * Date : Nov. 12 2008
 * Language : Verilog, RTL
 * Revision : 0.1
 */

module PPM
(
    // ------------------------------------------------------------------------
    // inputs
    SMC_ONOFF_N,        // System Management Controller On/Off,
                        // debounced copy of the power button.
    PM_RSMRST_PWRGD,    // Resume Reset Power Good
    SUS_CLK,            // Suspend Clock, 32.768 kHz
    PM_ALL_SYS_PWRGD,   // All System Power Good
    PM_IMVP_PWRGD,      // IMVP Power Good, Power for CPU is stable
    PM_SYSRST_N,        // System Reset button,
                        // debounced copy of the reset button
    SMC_RST_N,          // Indicates the 3.3v standby power is stable
    //
    // SCH to CPLD
    SLPMODE,            // Sleep mode: if (SLPMODE) enter S3, else enter S4/S5
    SLPRDY_N,           // Sleep Ready
    RSTRDY,             // Reset Ready
    // ------------------------------------------------------------------------
    // outputs
    PM_SLP_S4_N,        // Sleep S4
    PM_SCH_RSMRST_N,    // Resume Well Reset
    PM_SLP_S3_N,        // Sleep S3
    IMVP_VR_ON,         // IMVP Voltage Regulator On
    SCH_PWROK,          // Power OK, SCH core power is stable
    PM_RSTWARN,         // Reset Warning
    RST_N,              // System Reset
    //
    EC_A20M_N,          // EC A20 match
    EC_INIT_N           // EC initial
);
// port declarations
input SMC_ONOFF_N;
input PM_RSMRST_PWRGD;
input SUS_CLK;
input PM_ALL_SYS_PWRGD;
input PM_IMVP_PWRGD;
input SLPMODE;
input SLPRDY_N;
input RSTRDY;
input PM_SYSRST_N;
input SMC_RST_N;

output PM_SLP_S4_N;
output PM_SCH_RSMRST_N;
output PM_SLP_S3_N;
output IMVP_VR_ON;
output SCH_PWROK;
output PM_RSTWARN;
output RST_N;
//
// signal declarations
//
reg PM_SLP_S4_N;
reg PM_SCH_RSMRST_N;
reg PM_SLP_S3_N;
reg IMVP_VR_ON;
reg SCH_PWROK;
reg PM_RSTWARN;
reg RST_N;

input EC_A20M_N;
input EC_INIT_N;
    
reg [7:0] CNT8B;
reg [11:0] CNT12B;

reg RST_CNT8B_N;
reg RST_CNT12B_N;

parameter [3:0] 
    S4 = 0,
    S3toS0_sec1 = 1, S3toS0_sec2 = 2, S3toS0_sec3 = 3,
    S3toS0_clr1 = 4, S3toS0_clr2 = 5,
    S0 = 6, S0toS3 =7, S3 = 8,
    warmreset = 9;

`define DEL 1


reg [3:0] ST_PM; // power management state machine

always @( posedge SUS_CLK or negedge SMC_RST_N )
begin
    if (~SMC_RST_N) begin // asynchronous reset
        ST_PM <= #`DEL S4;
    end
    else begin
      case (ST_PM)
        S4: begin
            //PM_RSMRST_PWRGD <= #`DEL 1'b0;
            PM_SLP_S3_N <= #`DEL 0;
            
            IMVP_VR_ON <= #`DEL 0;
            SCH_PWROK <= #`DEL 0;
            PM_RSTWARN <= #`DEL 1;
            RST_N <= #`DEL 0;

            RST_CNT8B_N <= #`DEL 0;
            RST_CNT12B_N <= #`DEL 0;
            //CNT8B <= #`DEL 8'b0;
            //CNT12B <= #`DEL 12'b0;

            if (~SMC_ONOFF_N) begin 
              // when the power button is pressed
              PM_SLP_S4_N <= #`DEL 1;
              if (PM_RSMRST_PWRGD) begin
                PM_SCH_RSMRST_N <= #`DEL 1;
                ST_PM <= #`DEL S3toS0_sec1;
              end
            end
            else begin
              PM_SLP_S4_N <= #`DEL 0;
            end
        end

        S3toS0_sec1: begin
            PM_SLP_S4_N <= #`DEL 1;
            PM_SCH_RSMRST_N <= #`DEL 1;
            PM_RSTWARN <= #`DEL 1;

            RST_CNT12B_N <= #`DEL 1;
            //CNT12B <= #`DEL CNT12B + 1;

            if (PM_SCH_RSMRST_N & (CNT12B >= 1300)) begin // wait ~40 ms (1300 / 32.768 kHz)
                PM_SLP_S3_N <= #`DEL 1;
            end
            else begin
                PM_SLP_S3_N <= #`DEL 0;
            end

            if (PM_ALL_SYS_PWRGD) begin // wait for SCH Power to become stable
                ST_PM <= #`DEL S3toS0_clr1;
            end
        end

        S3toS0_clr1: begin // the purpose of this state is to clear the counters
            RST_CNT12B_N <= #`DEL 0;
            RST_CNT8B_N <= #`DEL 0;
            //CNT12B <= #`DEL 12'b0;
            //CNT8B <= #`DEL 8'b0;

            PM_SLP_S4_N <= #`DEL 1;
            PM_SCH_RSMRST_N <= #`DEL 1;
            PM_SLP_S3_N <= #`DEL 1;
            PM_RSTWARN <= #`DEL 1;
            ST_PM <= #`DEL S3toS0_sec2;
        end

        S3toS0_sec2: begin
            PM_SLP_S4_N <= #`DEL 1;
            PM_SCH_RSMRST_N <= #`DEL 1;
            PM_SLP_S3_N <= #`DEL 1;
            PM_RSTWARN <= #`DEL 1;

            if (CNT8B >= 104) begin
                IMVP_VR_ON <= #`DEL 1;
            end
            else begin
                RST_CNT8B_N <= #`DEL 1;
                //CNT8B <= #`DEL CNT8B + 1;
            end

            if (PM_IMVP_PWRGD) begin
                RST_CNT12B_N <= #`DEL 1;
                //CNT12B <= #`DEL CNT12B + 1;
            end

            if (CNT12B >= 786) begin
                SCH_PWROK <= #`DEL 1;
                ST_PM <= #`DEL S3toS0_clr2;
            end
            else begin
                SCH_PWROK <= #`DEL 0;
            end
        end

        S3toS0_clr2: begin
            PM_SLP_S4_N <= #`DEL 1;
            PM_SCH_RSMRST_N <= #`DEL 1;
            PM_SLP_S3_N <= #`DEL 1;
            PM_RSTWARN <= #`DEL 1;
            IMVP_VR_ON <= #`DEL 1;
            SCH_PWROK <= #`DEL 1;
            RST_CNT12B_N <= #`DEL 0;
            RST_CNT8B_N <= #`DEL 0;
            //CNT12B <= #`DEL 12'b0;
            //CNT8B <= #`DEL 8'b0;
            ST_PM <= #`DEL S3toS0_sec3;
        end

        S3toS0_sec3: begin
            PM_SLP_S4_N <= #`DEL 1;
            PM_SCH_RSMRST_N <= #`DEL 1;
            PM_SLP_S3_N <= #`DEL 1;
            IMVP_VR_ON <= #`DEL 1;
            SCH_PWROK <= #`DEL 1;

            RST_CNT12B_N <= #`DEL 1;
            //CNT12B <= #`DEL CNT12B + 1;
            if (CNT12B >= 2) begin
                PM_RSTWARN <= #`DEL 0;
            end
            else begin
                PM_RSTWARN <= #`DEL 1;
            end

            if (CNT12B >= 3333) begin
                RST_N <= #`DEL 1;
                ST_PM <= #`DEL S0;
            end
            else begin
                RST_N <= #`DEL 0;
            end
        end

        S0: begin
            PM_SLP_S4_N <= #`DEL 1;
            PM_SCH_RSMRST_N <= #`DEL 1;
            PM_SLP_S3_N <= #`DEL 1;
            IMVP_VR_ON <= #`DEL 1;
            SCH_PWROK <= #`DEL 1;
            RST_N <= #`DEL 1;

            RST_CNT8B_N <= #`DEL 0;
            RST_CNT12B_N <= #`DEL 0;
            //CNT8B <= #`DEL 8'b0;
            //CNT12B <= #`DEL 12'b0;

            if (~SLPRDY_N & RSTRDY) begin
                ST_PM <= #`DEL S0toS3;
            end

            if (PM_SYSRST_N) begin
                PM_RSTWARN <= #`DEL 0;
            end
            else begin
                PM_RSTWARN <= #`DEL 1;
            end

            if (SLPRDY_N & SLPMODE & ~RSTRDY) begin
                ST_PM <= #`DEL warmreset;
            end
        end

        S0toS3: begin
            PM_SLP_S4_N <= #`DEL 1;
            PM_SCH_RSMRST_N <= #`DEL 1;

            RST_CNT8B_N <= #`DEL 1;
            //CNT8B <= #`DEL CNT8B + 1;

            if (CNT8B >= 2) begin
                PM_RSTWARN <= #`DEL 1;
            end
            else begin
                PM_RSTWARN <= #`DEL 0;
            end
            
            if (~RSTRDY) begin
                RST_CNT12B_N <= #`DEL 1;
                //CNT12B <= #`DEL CNT12B + 1;
            end

            if (CNT12B >= 3) begin
                RST_N <= #`DEL 0;
            end
            else begin
                RST_N <= #`DEL 1;
            end

            if (CNT12B >= 4) begin
                SCH_PWROK <= #`DEL 0;
            end
            else begin
                SCH_PWROK <= #`DEL 1;
            end

            if (CNT12B >= 5) begin
                IMVP_VR_ON <= #`DEL 0;
            end
            else begin
                IMVP_VR_ON <= #`DEL 1;
            end

            if (CNT12B >= 6) begin
                PM_SLP_S3_N <= #`DEL 0;
            end
            else begin
                PM_SLP_S3_N <= #`DEL 1;
            end

            if (~PM_ALL_SYS_PWRGD) begin
                ST_PM <= #`DEL S3;
            end
        end

        S3: begin
            //PM_RSMRST_PWRGD <= #`DEL 1;
            PM_SLP_S4_N <= #`DEL 1;
            PM_SCH_RSMRST_N <= #`DEL 1;
            PM_SLP_S3_N <= #`DEL 0;
            IMVP_VR_ON <= #`DEL 0;
            SCH_PWROK <= #`DEL 0;
            RST_N <= #`DEL 0;

            if (~SLPMODE) begin
                ST_PM <= #`DEL S4;
            end
            else begin
                if (~SMC_ONOFF_N | SLPRDY_N) begin // wakeup
                    ST_PM <= #`DEL S3toS0_sec1;
                end
            end
        end

        warmreset: begin
            PM_SLP_S4_N <= #`DEL 1;
            PM_SCH_RSMRST_N <= #`DEL 1;
            PM_SLP_S3_N <= #`DEL 1;
            IMVP_VR_ON <= #`DEL 1;
            SCH_PWROK <= #`DEL 1;

            RST_CNT12B_N <= #`DEL 1;

            if (CNT12B > 32) begin
                PM_RSTWARN <= #`DEL 0;
            end
            else begin
                PM_RSTWARN <= #`DEL 1;
            end
            
            if ((CNT12B >= 1) && (CNT12B <= 39)) begin
                RST_N <= #`DEL 0;
            end
            else begin
                RST_N <= #`DEL 1;
            end

            if (CNT12B > 39) begin
                RST_CNT12B_N <= #`DEL 0;
                if (PM_SYSRST_N) begin
                    ST_PM <= #`DEL S0; // non-blocking, consistent with the other state transitions
                end
            end
        end

        default: begin
          ST_PM <= #`DEL S4;
        end
      endcase
    end
end

always @(posedge SUS_CLK) begin
    if (RST_CNT8B_N) begin
        CNT8B <= #`DEL CNT8B + 1;
    end
    else begin
        CNT8B <= #`DEL 8'b0;
    end

    if (RST_CNT12B_N) begin
        CNT12B <= #`DEL CNT12B + 1;
    end
    else begin
        CNT12B <= #`DEL 12'b0;
    end
end

endmodule
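
The counter thresholds in the state machine are easier to check against the power sequencing spec once converted to time. The Python sketch below does that arithmetic for both the real 32.768 kHz suspend clock and the 32 kHz stand-in used in the simulation. The helper name and the labels in the table are my own; the counts are the ones compared against CNT12B and CNT8B in the listing above.

```python
SUS_CLK_HZ = 32_768  # real suspend clock
SIM_CLK_HZ = 32_000  # simplified clock used in the functional simulation


def cycles_to_ms(cycles: int, clk_hz: int = SUS_CLK_HZ) -> float:
    """Convert a counter threshold (in clock cycles) to milliseconds."""
    return cycles * 1000.0 / clk_hz


# Thresholds taken from the state machine; the labels are informal.
THRESHOLDS = {
    "S3toS0_sec1: PM_SLP_S3_N high": 1300,
    "S3toS0_sec2: IMVP_VR_ON high": 104,
    "S3toS0_sec2: SCH_PWROK high": 786,
    "S3toS0_sec3: RST_N high": 3333,
}

if __name__ == "__main__":
    for label, cycles in THRESHOLDS.items():
        real = cycles_to_ms(cycles)
        sim = cycles_to_ms(cycles, SIM_CLK_HZ)
        print(f"{label}: {real:.2f} ms real, {sim:.2f} ms simulated")
```

For the 1300-cycle wait this gives about 39.67 ms on the real clock and 40.625 ms with the 32 kHz stand-in, so the simulated delays run roughly 2.4% longer than on hardware.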