Distributed Operation Layer: Efficient and Predictable KPN-Based Design Flow
  Iuliana Bacivarov, Wolfgang Haid, Kai Huang, and Lothar Thiele
  ETH Zürich, Switzerland
Efficiency vs. Predictability?
  Efficiency is: speed-up, scalability, small memory, portability, small effort
  Predictability is: analyzability, guarantees, fast estimates, good estimates, early in design
  Distributed Operation Layer (DOL): efficient and predictable system-level MPSoC design flow
  CASA, ESWEEK – DOL: Efficient and Predictable Design Flow, Iuliana Bacivarov
Distributed Operation Layer
  Reduce “accidental complexity” in design by raising the level of abstraction and automation
    System specification: abstract MoC (KPN) vs. BSP
    Performance analysis: system-level (formal) analysis vs. complete system simulation
    Design space exploration: automated system-level exploration vs. trial-and-error
    (Software) synthesis: automated synthesis on various MPSoCs (possible due to formal MoC)
Outline
  Introduction
  Distributed operation layer design flow
    Specification
    Synthesis
    Design space exploration
    Performance analysis
  Some experimental results
  Conclusions
DOL Software System-Level Design Flow
  Goals
    Efficiency
    Predictability
  Challenges
    Scalable specification
    Automated synthesis
    System-level design space exploration
    Analytic performance evaluation
  Strengths
    Abstraction
    Automation
  [Figure: design flow diagram with boxes for application specification (XML & C), mapping specification (XML), architecture specification (XML), functional simulation generation, system synthesis (HdS generation), analysis model generation, simulation on workstation, simulation on virtual platform, evaluation on workstation, and design space exploration; further labels: calibration data, back-annotation, test & debug, performance data]
System Specification
  Roles
    Express data and functional parallelism in application
    Specify mapping of application on target architecture
  Challenges
    Scalability
    Platform-independence
  Formal MoC: basis for efficient and predictable design
  [Figure: DOL design flow diagram, repeated]
Programming Model
  Model of computation: Kahn process network
  Coordination: XML with performance annotations
  Functionality: C/C++ with a DOL-specific programming API
Programming Model – Scalability
  Single process:
    <process name="src">
      <port type="output" name="out"/>
      <source type="c" location="src.c"/>
    </process>
  Iterated process:
    <iterator variable="i" range="N">
      <process name="src">
        <append function="i"/>
        <port type="output" name="out"/>
        <source type="c" location="src.c"/>
      </process>
    </iterator>
  Scalability: “iterators” for large, multi-tile descriptions
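The iterator construct on this slide can be pictured as a template that is unrolled N times. The Python sketch below is only an illustration of that idea, not the DOL tool itself: it assumes that `<append function="i"/>` appends the iterator value to the process name, and the `src_0`, `src_1`, ... naming scheme, the `expand_iterator` helper, and the `constants` argument are all hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

# The iterated process description from the slide.
ITERATED = """
<iterator variable="i" range="N">
  <process name="src">
    <append function="i"/>
    <port type="output" name="out"/>
    <source type="c" location="src.c"/>
  </process>
</iterator>
"""


def expand_iterator(xml_text, constants):
    """Unroll one <iterator> element into a flat list of <process> elements.

    Assumption: <append function="i"/> means "append the value of iterator
    variable i to the process name". The "_<value>" suffix is hypothetical.
    """
    iterator = ET.fromstring(xml_text)
    variable = iterator.get("variable")
    count = constants[iterator.get("range")]  # resolve the symbolic range "N"
    expanded = []
    for value in range(count):
        for template in iterator.findall("process"):
            process = copy.deepcopy(template)
            # findall returns a list, so removing children while looping is safe.
            for append in process.findall("append"):
                if append.get("function") == variable:
                    process.set("name", f"{process.get('name')}_{value}")
                process.remove(append)
            expanded.append(process)
    return expanded


processes = expand_iterator(ITERATED, {"N": 3})
names = [p.get("name") for p in processes]
print(names)  # ['src_0', 'src_1', 'src_2']
```

Changing the value bound to `N` changes the number of generated processes without touching the description, which is the scalability argument the slide makes.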
Abstract Platform
  Modeling elements
    Structure: processors, peripherals, memories, buses, etc.
    Interconnect: explicit read and write communication paths
    Performance data: e.g. latency and bandwidth of HW communication
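To illustrate how latency and bandwidth annotations on a communication path might be used, the Python sketch below computes a first-order transfer-time estimate. The cost model (sum of hop latencies plus payload divided by the bottleneck bandwidth) is an assumption chosen for illustration and is not taken from DOL; the `Hop` type, the hop names, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """One hardware resource on a communication path (hypothetical model)."""
    name: str
    latency_cycles: float
    bandwidth_bytes_per_cycle: float


def transfer_cycles(path, payload_bytes):
    """First-order estimate: total latency plus payload over the slowest hop."""
    if not path:
        raise ValueError("a communication path needs at least one hop")
    total_latency = sum(hop.latency_cycles for hop in path)
    bottleneck = min(hop.bandwidth_bytes_per_cycle for hop in path)
    return total_latency + payload_bytes / bottleneck


# Hypothetical write path: a local bus followed by a NoC link.
write_path = [Hop("local_bus", 10, 4.0), Hop("noc_link", 6, 2.0)]
estimate = transfer_cycles(write_path, 1024)
print(estimate)  # 528.0 cycles: 16 cycles of latency + 1024 / 2.0
```

An estimate of this kind is cheap to evaluate, which is what makes annotated platform models attractive for fast, early analysis.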
Abstract Platform – Scalability
  Specification: XML, including “iterators” capability
Mapping Specification
  Mapping
    Binding
      Processes to processors
      SW channels to HW paths
    Scheduling
      Constraints
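The binding part of a mapping can be pictured as a table from processes to processors. The Python sketch below shows one way such a table could be checked and grouped per processor. It is an illustration under assumed names (`src`, `filter`, `sink`, `arm_0`, `arm_1`) and does not reproduce the DOL mapping XML schema.

```python
from collections import defaultdict


def check_binding(binding, processes, processors):
    """Return (unbound processes, processes bound to unknown processors)."""
    unbound = [p for p in processes if p not in binding]
    unknown = [p for p, target in binding.items() if target not in processors]
    return unbound, unknown


def group_by_processor(binding):
    """Invert the binding: which processes share each processor."""
    groups = defaultdict(list)
    for process, processor in binding.items():
        groups[processor].append(process)
    return dict(groups)


# Hypothetical application, architecture, and binding.
processes = ["src", "filter", "sink"]
processors = ["arm_0", "arm_1"]
binding = {"src": "arm_0", "filter": "arm_1", "sink": "arm_0"}

problems = check_binding(binding, processes, processors)
groups = group_by_processor(binding)
print(problems)  # ([], [])
print(groups)    # {'arm_0': ['src', 'sink'], 'arm_1': ['filter']}
```

Processes that end up on the same processor are the ones a scheduling policy then has to arbitrate between.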
System Synthesis
  Role
    Close the gap between system-level specification and implementation
  Challenges
    Achieve desired performance
    Handle deadlocks, starvation, and data races
    Preserve KPN semantics
  Automatic software synthesis: essential for efficient design
  [Figure: DOL design flow diagram, repeated]
DOL Synthesis
  Synthesis
    Functional synthesis: SystemC untimed, native execution
    Software synthesis: HdS generation for MPARM, Atmel DIOPSIS, CELL
  Strategy
    Source-to-source code generators from DOL KPN to implementation
    Automatic generation of “glue code”: processes and channels implementation, bootstrapping, and scheduling
  [Figure: DOL design flow diagram, repeated]
Functional Synthesis
  Automatic synthesis of DOL KPN in functional SystemC
  Synthesis
    DOL processes and FIFOs: SystemC threads and channels
    SystemC main file: bootstrapping and scheduling
  Features
    Execution: native, un-timed
    Debugging: standard tools, e.g., gdb
    Performance data extraction: monitor READ/WRITE/FIRE
  [Figure: DOL scheduler calling P1.fire(), P2.fire(), P3.fire(); each process is an sc thread whose sc ports connect through write() and read() to sc channels]
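The thread-and-channel structure described on this slide can be mimicked with standard threads and blocking FIFOs. The Python sketch below is an analogy rather than generated DOL or SystemC code: the three processes, the `END` token, and the FIFO capacity are hypothetical. It uses bounded FIFOs with blocking writes, which differs from the unbounded channels of the pure Kahn model.

```python
import queue
import threading

END = object()  # hypothetical end-of-stream token, not part of the DOL API


def producer(out_ch, count):
    """Source process: writes count tokens, then the end-of-stream token."""
    for i in range(count):
        out_ch.put(i)          # blocks while the bounded FIFO is full
    out_ch.put(END)


def square(in_ch, out_ch):
    """Filter process: blocking read, compute, write."""
    while True:
        token = in_ch.get()    # blocking read, as KPN semantics require
        if token is END:
            out_ch.put(END)
            return
        out_ch.put(token * token)


def consumer(in_ch, result):
    """Sink process: collects tokens until the end-of-stream token arrives."""
    while True:
        token = in_ch.get()
        if token is END:
            return
        result.append(token)


def run_network(count, capacity=2):
    """Bootstrap: create channels, start one thread per process, wait for all."""
    c1 = queue.Queue(maxsize=capacity)
    c2 = queue.Queue(maxsize=capacity)
    result = []
    threads = [
        threading.Thread(target=producer, args=(c1, count)),
        threading.Thread(target=square, args=(c1, c2)),
        threading.Thread(target=consumer, args=(c2, result)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return result


output = run_network(5)
print(output)  # [0, 1, 4, 9, 16]
```

Because every read blocks until data is available, the output does not depend on how the threads are interleaved. That determinism is what allows a functional model to be debugged natively with ordinary tools.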
DOL Software Synthesis
  MPARM: multi-ARM tiles connected by NoC
  Atmel Diopsis 940: tile of ARM9 + DSP connected by an AMBA bus; several tiles connected via NoC
  Cell BE: PowerPC and 8 SPEs connected via ring bus
  [Figure: block diagrams of the three platforms: tiles with ARM core, SP, DRAM ctrl, NI, and x-bar connected through NoC switches; Cell BE with eight SPEs (SPU, LS, MFC), the element interconnect bus (EIB), MIC, PPE (PPU, L1 cache, L2 cache), and main storage]
  Legend
    LS: Local Store
    MFC: Memory Flow Controller
    MIC: Memory Interface Controller
    PPE: Power Processor Element
    PPU: Power Processor Unit
    SPE: Synergistic Processor Elements
    SPU: Synergistic Processor Unit
Design Space Exploration
  Role
    Find Pareto-optimal mappings of an application on target architecture
  Challenges
    Multiple contradictory objectives
    Exhaustive search not feasible
    Instruction-accurate simulation too slow for design space exploration
  System-level automated design space exploration: the key element of an efficient design
  [Figure: DOL design flow diagram, repeated]
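Pareto-optimality over contradictory objectives can be made concrete with a small dominance check. The Python sketch below filters a handful of candidate mappings by two objectives to be minimized. The mapping names and load values are invented for illustration, and the exhaustive filter only demonstrates the dominance relation; it is not a search algorithm for large design spaces.

```python
def dominates(a, b):
    """True if objective vector a is no worse than b everywhere and better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates (minimization)."""
    return {
        name: objectives
        for name, objectives in candidates.items()
        if not any(dominates(other, objectives) for other in candidates.values())
    }


# Hypothetical mappings: (max. processor load, max. bus load), both minimized.
candidates = {
    "m1": (4, 12),
    "m2": (6, 8),
    "m3": (7, 9),    # dominated by m2
    "m4": (10, 3),
    "m5": (10, 5),   # dominated by m4
}

front = pareto_front(candidates)
print(sorted(front))  # ['m1', 'm2', 'm4']
```

The remaining mappings trade processor load against bus load, so none of them is strictly better than another. Choosing among them is left to the designer.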
Mapping Optimization Framework
  MPA (Modular Performance Analysis): http://www.mpa.ethz.ch
  SPEA2 (Strength Pareto Evolutionary Algorithm)
  Control & GUI: EXPO, a tool to explore the design space for network processor architectures: https://www.tik.ee.ethz.ch/expo
  Interface: PISA, a Platform and language independent Interface for Search Algorithms: https://www.tik.ee.ethz.ch/pisa
EXPO-PISA Illustration
  [Figure: plot of max. bus load (vertical axis, 0 to 20) against max. processor load (horizontal axis, 0 to 18)]