Interconnect Delay Aware RTL Verilog Bus Architecture Generation for an SoC Kyeong Ryu, Alexandru Talpasanu, Vincent Mooney and Jeffrey Davis School of Electrical and Computer Engineering Georgia Institute of Technology August 2004
Outline • Introduction • Interconnect Delay Estimation • Interconnect Aware Module Generation • BusSynth Overview • Application Example • Conclusion
Introduction • A methodology to generate a custom bus architecture using accurate estimations of interconnect delay – Easy and quick design of an SoC bus system – Fast design space exploration across performance-influencing factors – Development of a bus synthesis tool (BusSynth) – Register-transfer level HDL output based on user options and interconnect delay [Figure: User Options → Bus Synthesis Tool (BusSynth)]
Related Work • Shin et al. (’04), “Fast Exploration of Parameterized Bus Architecture for Communication-Centric SoC Design” [5] – A single type of bus topology • Thepayasuwan et al. (’04), “Layout Conscious Bus Architecture Synthesis for Deep Submicron Systems on Chip” [6] – A single type of bus topology • BusSynth – A variety of bus types, including multiple and heterogeneous types – Interconnect delay aware bus generation
Bus Synthesis (BusSynth) Overview [Figure: bus generation tool. Input: user options, floorplan design, interconnect delay estimation, and libraries. Tool: BusSynth. Output: synthesizable Verilog HDL code]
Interconnect Length Estimation [Figure: (a) Estimated floorplan and (b) interconnect length estimation, showing four MPC755 processing elements (PE1 to PE4), SRAM, the bus arbiter, the CPU Bus Interfaces (CBI), the Memory Bus Interface (MBI), and the bus interconnect; TSMC 0.25 µm design rules]
Interconnect Model Parameters • Ra = MOSIS sheet resistance • Ca = MOSIS fringe capacitance • Cb = MOSIS area capacitance • Coupling capacitance effects explained in technical report [11] [Figure: cross-section of metal layers M1, M2, and M3 over the substrate, annotated with Ra, Ca, Cb, and R1; equivalent distributed RC model with n segments, each with series resistance R1/n and shunt capacitance C1/n]
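The distributed RC model on this slide can be explored numerically before running a circuit simulator. The sketch below is not the HSPICE-based flow the slides describe; it swaps in the first-order Elmore approximation for an n-segment RC ladder. The function names, the parameter units, and the choice to count fringe capacitance on both wire edges are assumptions made for illustration, and any numeric inputs should come from the actual process data rather than from this example.

```python
def wire_rc(length_um, width_um, r_sheet, c_area, c_fringe):
    """Total wire resistance R1 (ohm) and capacitance C1 (F).

    Assumed units: r_sheet in ohm/square, c_area in F/um^2,
    c_fringe in F/um. Fringe capacitance is counted on both
    edges of the wire, which is an assumption of this sketch.
    """
    r1 = r_sheet * length_um / width_um
    c1 = c_area * length_um * width_um + 2 * c_fringe * length_um
    return r1, c1


def elmore_delay(r1, c1, n):
    """Elmore delay (s) at the far end of an n-segment RC ladder.

    Node i carries capacitance C1/n and sees the upstream
    resistance i * R1/n, so the delay is the sum of those products.
    """
    return sum((i * r1 / n) * (c1 / n) for i in range(1, n + 1))
```

With a single segment the estimate equals R1·C1; as n grows it approaches R1·C1/2, which is why a lumped model tends to overestimate the delay of a long wire.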
Accurate Interconnect Delay Estimation [Figure: estimation flow with the blocks Floorplan, Bus Interconnect Length Calculation, HSPICE Code Generation Tool, MOSIS Process Parameters [MOSIS website], HSPICE simulator, and Interconnect Delay Calculation for Each Bus Segment]
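To make the code generation step in this flow concrete, here is a minimal sketch of how a netlist for one bus segment might be emitted as an n-segment RC ladder. It is not the authors' HSPICE code generation tool: the function name, node names, pulse source values, and the 2.5 V supply are all hypothetical, and only generic SPICE elements (R, C, a PULSE voltage source, `.tran`, `.end`) are used. Delay measurement directives are left out on purpose.

```python
def ladder_netlist(r1, c1, n, vdd=2.5):
    """Return SPICE text for an n-segment RC ladder (illustrative only).

    r1 and c1 are the total wire resistance (ohm) and capacitance (F);
    each segment gets r1/n in series and c1/n to ground.
    """
    lines = ["* n-segment RC ladder for one bus segment (illustrative)"]
    # Hypothetical stimulus: a single rising pulse at the driver side.
    lines.append(f"Vin in 0 PULSE(0 {vdd} 0 10p 10p 5n 10n)")
    prev = "in"
    for i in range(1, n + 1):
        node = "out" if i == n else f"n{i}"
        lines.append(f"R{i} {prev} {node} {r1 / n:.6g}")
        lines.append(f"C{i} {node} 0 {c1 / n:.6g}")
        prev = node
    lines.append(".tran 1p 10n")
    lines.append(".end")
    return "\n".join(lines)
```

Calling `ladder_netlist(100.0, 2e-12, 4)` yields four resistor and four capacitor lines between the nodes `in` and `out`, which a simulator could then analyze for the segment delay.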
A Bus System Example: General Global Bus Architecture (GGBA) Note: BAN: Bus Access Node, PE: Processing Element, CBI: CPU Bus Interface, MBI: Memory Bus Interface
Memory Bus Interface (MBI) Module Generation 1 • One effect of inserting interconnect delay in an SoC: the memory access cycle is affected • The memory controller adapts the number of delay clocks to the interconnect delay [Figure: MBI (holding delay information) between the PowerPCs and the SRAM. Signals shown: address, data, aack_bars, ta_bars, sram address, sram_data, and the control signals cs_bar, we_bar, re_bar]
Memory Bus Interface (MBI) Module Generation 2 (a) Estimated total delay of paths between each PE and a shared memory (b) Number of clock delays in data paths
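The step from panel (a) to panel (b) can be sketched as a simple conversion from path delay to clock cycles. The slides do not state the exact rule, so this example assumes the number of clock delays is the smallest whole number of clock periods that covers the estimated path delay. The function name and the use of integer picoseconds are choices made for this sketch.

```python
def clocks_to_insert(delay_ps, period_ps):
    """Smallest whole number of clock periods covering delay_ps.

    Integer ceiling division avoids floating-point rounding at
    exact multiples of the clock period. The rounding rule itself
    is an assumption, not taken from the slides.
    """
    if delay_ps < 0 or period_ps <= 0:
        raise ValueError("delay must be >= 0 and period must be > 0")
    return -(-delay_ps // period_ps)


def clocks_per_pe(delay_ps_by_pe, period_ps):
    """Map each PE name to its number of clock delays."""
    return {pe: clocks_to_insert(d, period_ps)
            for pe, d in delay_ps_by_pe.items()}
```

Under this rule a path that just exceeds one clock period costs two clocks, so small differences in estimated delay between PEs can translate into whole-cycle differences in memory access time.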
MBI and Bus System Generation • Memory Bus Interface (MBI) module generation (a) Sequence of MBI Generation: input of interconnect delays → calculation of the number of clocks to be inserted → extraction of the MBI module from the Module Library → update of memory access delay parameters in the MBI module (b) Bus System Generation*: user option input to BusSynth → for each bus subsystem and for each BAN, module generation (Module Library) → Bus Access Node (BAN) generation → bus subsystem generation → if # of subsystems > 1, bus system generation (Wire Library) → synthesizable Verilog HDL code Reference*: K. Ryu and V. Mooney, “Automated Bus Generation for Multiprocessor SoC Design,” Design, Automation and Test in Europe (DATE'03), pp. 282-287, March 2003.
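The four steps of the MBI generation sequence in (a) can be sketched as a small template-filling routine. This is only an illustration of the idea, not BusSynth itself: the library contents, the Verilog template text, the `DELAY_CLKS_<PE>` parameter names, and the ceiling rule for converting delay to clocks are all hypothetical.

```python
from string import Template

# Hypothetical module library holding one MBI template.
MODULE_LIBRARY = {
    "MBI_SRAM": Template(
        "module MBI_SRAM(/* ports skipped */);\n"
        "$params\n"
        "  // body skipped\n"
        "endmodule\n"
    )
}


def generate_mbi(delay_ps_by_pe, period_ps, module_name="MBI_SRAM"):
    """Return MBI Verilog text with per-PE delay parameters filled in."""
    # Steps 1 and 2: take the interconnect delays and compute the
    # number of clocks per PE (assumed rule: ceiling of delay / period).
    clks = {pe: -(-d // period_ps) for pe, d in delay_ps_by_pe.items()}
    # Step 3: extract the MBI module template from the library.
    template = MODULE_LIBRARY[module_name]
    # Step 4: update the memory access delay parameters in the module.
    params = "\n".join(
        f"  parameter DELAY_CLKS_{pe} = {n};" for pe, n in sorted(clks.items())
    )
    return template.substitute(params=params)
```

Keeping one parameter per PE lets the generated module apply a different access delay to each requester instead of a single worst-case value.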
A Bus System Generation Example
User Input List
1. Bus System: # of Bus Subsystems = 1
2. Bus Subsystem: # of BANs = 5
3. Bus Properties:
 - Bus Subsystem: address bus width = 32 and data bus width = 64
4. BAN Properties: For Bus Subsystem 1
 - BAN1: CPU Type = MPC755, non-CPU Type = None and #s of global and local memories = 0
 - BAN2: CPU Type = MPC755, non-CPU Type = None and #s of global and local memories = 0
 - BAN3: CPU Type = MPC755, non-CPU Type = None and #s of global and local memories = 0
 - BAN4: CPU Type = MPC755, non-CPU Type = None and #s of global and local memories = 0
 - BAN5: CPU Type = None, non-CPU Type = None, # of global memories = 1, and # of local memories = 0
5. Memory Properties:
 - BAN5: Type = SRAM, address bus width = 21 and data bus width = 64
Generated Verilog HDL code (excerpt)
    // Skipped
        .up_dataout(dataout_up_2[FIFO_D_WIDTH-1:0]),
        .up_gen_int(gen_int_up_2),
        .up_isr0_ctlhi(isr0_ctlhi_up_2),
        .up_isr0_ctllo(isr0_ctllo_up_2),
        .dn_datain(datain_up_3[FIFO_D_WIDTH-1:0]),
        .reb_dn(reb_up_3),
        .web_dn(web_up_3),
        .fifo_area_dn(fifo_area_up_3)
    );
    endmodule

    module BusSystem(sysrstb, sysclk);
    input sysrstb;
    input sysclk;
    // Skipped
    SubSys_GGBA SubSystem(
        .sysrstb(sysrstb),
        .sysclk(sysclk)
        // Skipped
    );
    endmodule
[Figure: generation flow (user option input, BAN generation for BAN1 to BAN5, bus subsystem generation, bus system generation, synthesizable Verilog HDL code) next to the generated system: BAN1 to BAN4 each with an MPC755 and a CBI_MPC755, BAN5 with the SRAM and MBI_SRAM, and an Arbiter]
Application Example • Orthogonal Frequency Division Multiplexing (OFDM) Transmitter, a wireless algorithm • Function assignment and their processing
Experimental Setup [Figure: bus generation tool (input: user options, floorplan design, interconnect delay estimation, libraries; BusSynth; output: synthesizable Verilog HDL code) feeding a simulation environment (Seamless CVE, VCS, XRAY, and user C code compiled with GCC) and a synthesis environment (Design Compiler)] Note: VCS and Design Compiler from Synopsys, Seamless CVE and Xray from Mentor Graphics, and GCC from GNU
Three Configurations of GGBA for Performance Comparison – GGBA I - (NO WIRE MODEL) GGBA I is a GGBA system with no regard to interconnect delay on the bus – GGBA II - (ACCURATE WIRE MODEL) GGBA II is a GGBA system that works with different estimated interconnect delays on the shared bus – GGBA III - (WORST-CASE WIRE MODEL) GGBA III is a GGBA system that operates with a maximum estimated delay on all connections between PEs and a shared memory
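The difference between the three configurations can be sketched in terms of the clock delays each one applies per PE. The example assumes that "no regard to interconnect delay" corresponds to zero inserted clocks, that the accurate model uses each PE's own estimated count, and that the worst-case model applies the maximum count to every PE. The function name and input format are hypothetical.

```python
def ggba_configs(clks_by_pe):
    """Build per-PE clock delays for the three GGBA configurations.

    clks_by_pe maps a PE name to its estimated number of clock
    delays under the accurate wire model.
    """
    worst = max(clks_by_pe.values())
    return {
        # No wire model: interconnect delay is ignored (assumed zero clocks).
        "GGBA I": {pe: 0 for pe in clks_by_pe},
        # Accurate wire model: each PE keeps its own estimate.
        "GGBA II": dict(clks_by_pe),
        # Worst-case wire model: every PE pays the maximum delay.
        "GGBA III": {pe: worst for pe in clks_by_pe},
    }
```

Any PE whose estimated delay is below the maximum pays extra cycles on every shared-memory access in GGBA III, which is the overhead the accurate model in GGBA II avoids.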
Baseline Comparison Results
Conclusion • Interconnect delay is a major concern as feature size is scaled down • Interconnect delay estimation from floorplan • Memory Bus Interface (MBI) module and Bus System generation • Performance improvement due to interconnect delay aware design • In an OFDM transmitter example, 35.3% reduction in execution time against GGBA III
Any Questions?