
Outline of Tutorial: SoC Architectures for Hardware Designers

Outline of Tutorial: SoC Architectures for Hardware Designers. Topics: technology opportunities and limits; what is a System-on-a-Chip (SoC). 3rd International Seminar on Application-Specific Multi-Processor SoC, 7 - 11 July 2003, Hotel Alpina,


  1. SoC Architectures for Hardware Designers
     3rd International Seminar on Application-Specific Multi-Processor SoC, 7 - 11 July 2003, Hotel Alpina, Chamonix, France
     Trevor Mudge, Bredt Professor of Engineering, The University of Michigan, Ann Arbor
     http://www.eecs.umich.edu/~tnm

     Outline of Tutorial
     • Technology opportunities and limits
     • What is a System-on-a-Chip – SoC

     Silicon is the Engine: Andy Grove’s Address at Dec. 2002 IEEE International Electron Devices Meeting (figure; source: Intel)

     Where is technology heading? (figure; source: Intel)

  2. Where is technology heading? (figure; source: Intel)

     Flavors of Integrated Circuits
     • Digital
       – signals are quantized to 2 levels
       – permits “infinite” precision
       – microprocessors etc.
     • DRAM
       – dynamic random access memory
       – variant of above specialized for high density
     • Analog
       – value of voltage models quantity exactly
       – low precision
       – only use when digital is not feasible
       – radio receivers and transmitters
     • Difficult to mix any two in one die

     Limits
     • Power
     • Mask Cost
     • Complexity
     • Return on Investment

     Limits: Moore’s Law
     • Moore’s Law – the number of transistors on a given chip can be doubled every two years
     • Moore’s Law – principle of progress in electronics and computing since Moore first formulated the famous dictum in 1965 – for the same amount of time, people have predicted it would hit a wall.
     • Future Generations of Si Technology
       – double density = reduce line width by 0.7x
       – 130nm → 90nm → 60nm → 45nm → 30nm
       – 2 or 3 years between generations
       – ~10 ± 2 Years
       – after 2015 – paradigm shift to a non-Si technology – be careful about betting on that
     • Moore’s law no limits for next 10 years
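The 0.7x rule on the Moore's Law slide can be checked numerically. The sketch below is only an illustration: the function names `scale_nodes` and `density_gain` are made up for this example, and it assumes that transistor density scales with the inverse square of the line width. Under that assumption a 0.7x linear shrink gives roughly 2x density, and repeated 0.7x steps from 130nm land close to the 90, 60, 45 and 30nm figures listed.

```python
def scale_nodes(start_nm, generations, shrink=0.7):
    """Return line widths for successive generations.

    Each generation is `shrink` times the previous line width
    (0.7x per the slide).
    """
    widths = [start_nm]
    for _ in range(generations):
        widths.append(widths[-1] * shrink)
    return widths


def density_gain(shrink=0.7):
    """Density gain per generation, assuming density ~ 1 / width**2."""
    return 1.0 / (shrink * shrink)


if __name__ == "__main__":
    for width in scale_nodes(130, 4):
        print(f"{width:.1f} nm")
    print(f"density gain per generation: {density_gain():.2f}x")
```

The computed widths are not exactly the marketed node names, which are rounded; the point is only that "double density" and "0.7x line width" are two statements of the same step.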

  3. Limits: Power (figure)

     Power: The Current Battleground
     • It’s not just transistor density that has grown exponentially …. (figure; source: Intel)

     Total Power of CPUs in PCs
     • 1992 – 90M CPUs @ 1.8W = 180MW
     • today – 500M CPUs @ 18W = 10,000MW
       – Four Hoover dams

     Low power has other implications …
     • Low power has been the technology that defines mainstream computing technology
       – Vacuum tubes → silicon
       – TTL → CMOS – microprocessors
     • 1950’s “supercomputers” created the technology
     • 1980’s supercomputers are the beneficiaries of microprocessor technology
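The totals on the "Total Power of CPUs in PCs" slide are simple products, and working them through shows that the slide's figures are rounded. The sketch below multiplies CPU count by per-CPU power and then compares the result with a Hoover Dam. The dam capacity of about 2,080 MW is an assumption brought in for this illustration, not a number from the slides, and `total_power_mw` is a hypothetical helper name.

```python
# Assumption (not from the slides): approximate generating capacity of
# Hoover Dam in megawatts, used only to sanity-check "four Hoover dams".
HOOVER_DAM_MW = 2080


def total_power_mw(cpu_count, watts_per_cpu):
    """Aggregate power in megawatts for `cpu_count` CPUs."""
    return cpu_count * watts_per_cpu / 1e6


if __name__ == "__main__":
    p_1992 = total_power_mw(90e6, 1.8)    # the slide rounds this to 180MW
    p_today = total_power_mw(500e6, 18)   # the slide rounds this to 10,000MW
    print(f"1992:  {p_1992:.0f} MW")
    print(f"today: {p_today:.0f} MW")
    print(f"Hoover dams needed: {p_today / HOOVER_DAM_MW:.1f}")
```

The exact products are about 162 MW and 9,000 MW, so the slide's values are rounded upward by roughly 10%, and the "four Hoover dams" comparison holds under the assumed dam capacity.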

  4. Limits: Mask Cost
     Figure: Installed and expected fab capacity from a leading semiconductor manufacturer (unit: K pcs, 8" equivalent wafer; scale 0 to 8000; years 2002 to 2006; processes 0.5um+, 0.35um, 0.25um, 0.18um, 0.15um, 0.13um, N90, N65)
     • Today greatest volume in 0.25, followed by 0.18 and 0.15
     • Next year perhaps 0.13 processes
     • Older processes do not just disappear…

     What hasn’t followed Moore’s Law
     • Batteries have only improved their power capacity by about 5% every two years

     Limits: Mask Cost
     • Closer to leading edge → higher cost masks
     • Volume is necessary
       – often means more programmable to achieve volume
     • If application specificity limits volume → older process
       – why run-time reconfigurable hardware may not be a good idea

     Limits: Complexity
     • Problems include
       – design time and effort
       – validation and test
     • Hardware
       – SoC of previously defined parts
     • Software
       – bigger challenge
       – 10x hardware costs
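To put the battery figure next to Moore's Law, the two growth rates can be compounded over the same span. The sketch below is a rough illustration only: `growth` is a hypothetical helper, and it assumes both rates compound once every two years (2x for transistor count, 1.05x for battery capacity, as stated on the slides).

```python
def growth(factor_per_period, years, period_years=2):
    """Compound growth over `years`, given a factor per `period_years`."""
    return factor_per_period ** (years / period_years)


if __name__ == "__main__":
    years = 10
    transistors = growth(2.0, years)   # doubling every two years
    batteries = growth(1.05, years)    # about 5% every two years
    print(f"after {years} years: transistors x{transistors:.0f}, "
          f"battery capacity x{batteries:.2f}")
```

Over ten years the transistor count grows 32-fold while battery capacity grows by under 30%, which is why power, rather than density, becomes the constraint for battery-operated platforms.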

  5. Limits: Return on Investment
     • Return on investment of fabs
       – Mid 60’s < $1M
       – Mid 70’s $3M
       – Early 90’s $1B
       – ’02 $3B
       – 2010 $??B
     • Different business models
       – separate design and fab

     Fabless IP Providers
     • Business model is based upon the development and sale and/or licensing of pre-defined, fully-characterized, semiconductor functional cores
     • In 2002, increased by 8.4% from 2001's $698.4 million
     • Forecast to reach $1,503.3 million by end 2007

     What is An SOC?
     • It's that part of a platform that can be cost-effectively integrated onto one chip
     • Why not the whole thing?
     • Because: Analog and DRAM

     What is A Platform?
     • A programmable collection of digital components targeted to a class of applications
     • Platforms are usually complete enough to load and boot an OS

  6. Four Examples
     • Texas Instruments OMAP 1510
     • STM Nomadik
     • Intel PXA800F
     • PDA/Communicator – University of Michigan
     • Common features

     How Does a Platform Get Defined?
     • Someone has an idea, sells it to a large tier-one OEM
     • If the OEM thinks it's a good idea they ask their platform providers (e.g., ST and TI) to include that functionality in their platforms
     • That someone with an idea of course could be: ARM with Jazelle, Nokia (i.e. an OEM), or ST with a coprocessor idea
     • Typically the tier-one OEM limits ST or TI from selling the platform to anyone else in the same form
     • The resulting ASSP (application specific standard part) that gets defined is slightly modified
     • Another view:
       – tier-one OEMs get all the bits they really want in a platform
       – tier-two OEMs are usually satisfied with something that almost does that job and is cheap

     TI: OMAP (figure)

     STM: Nomadik (figure)

  7. Intel: PXA800F (figure)

     PXA800F (figure)

     Commonality: Heterogeneous Multiprocessors
     • Control processor
     • “Data plane” processor
     • Analogous to the control and data of a program – not a pure separation either
     • Data plane → digital signal processor
     • Other components are usually small but essential ingredients if OS is to be booted or to interface to the external world

     PDA/Communicator (block diagram, iPAQ-like; block labels: C6200 DSP, SA-1110, I-cache, D-cache, IMMU, DMMU, Integer Pipeline, FPA, Space Manager, PIC, RAM, RTC, Flash, DMA, SER0, PCMCIA, I/O Mgr, GPIO, console, Platform Config; legend: implemented / in development)

  8. Major Components
     • Interconnect
       – current architectural paradigm uses buses – AMBA
     • Control processors
       – standard general purpose processors
       – 1-2 generations behind state-of-the-art architecture
     • Data plane processors
       – standard DSPs

     Why Standard Part Processors
     • Software (10x hardware)
     • Tool chain – more software

     Interconnect: Buses
     • What is a bus?
     • A definition of a set of signals for broadcasting signals
     • Strengths
       – inexpensive support for many-to-many connections provided they don’t overlap in time – multidrop
     • Weakness
       – bandwidth limitation
       – high drive needs
     • Future alternatives
       – point-to-point communication
         • essential for streaming data
     • Network on a chip
       – leverage existing communications technology
       – need to simplify

     Open Standard Bus: AMBA
     • Advanced Microcontroller Bus Architecture
     • On-chip bus proposed by ARM
     • Very simple protocol
     • High bandwidth bus
       – AHB – Advanced High-performance Bus
     • Low bandwidth bus
       – APB – Advanced Peripheral Bus
     • Next generation high performance bus
       – AXI protocol
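The bus trade-off on the "Interconnect: Buses" slide (many-to-many connections are cheap as long as transfers do not overlap in time, at the cost of bandwidth) can be sketched with a toy model. This is not AMBA arbitration: the round-robin policy, the one-transfer-per-cycle timing, and the names `simulate_shared_bus` and `point_to_point_cycles` are all assumptions made for this illustration.

```python
from collections import deque


def simulate_shared_bus(requests):
    """Serialize transfers from several masters onto one shared bus.

    `requests` maps a master name to a list of (target, payload) transfers.
    One transfer is granted per cycle, round-robin among masters that still
    have work. Returns a log of (cycle, master, target, payload).
    """
    queues = {master: deque(items) for master, items in requests.items()}
    order = list(queues)
    log = []
    cycle = 0
    next_idx = 0
    while any(queues.values()):
        # Grant the bus to the next master, in rotation, with a pending transfer.
        for offset in range(len(order)):
            i = (next_idx + offset) % len(order)
            master = order[i]
            if queues[master]:
                target, payload = queues[master].popleft()
                log.append((cycle, master, target, payload))
                next_idx = (i + 1) % len(order)
                break
        cycle += 1
    return log


def point_to_point_cycles(requests):
    """Cycles needed if every master had its own dedicated link."""
    return max((len(items) for items in requests.values()), default=0)


if __name__ == "__main__":
    demo = {
        "cpu": [("mem", 1), ("mem", 2)],
        "dsp": [("mem", 3)],
        "dma": [("uart", 4), ("uart", 5)],
    }
    for entry in simulate_shared_bus(demo):
        print(entry)
    print("dedicated links would need", point_to_point_cycles(demo), "cycles")
```

In the demo, five transfers take five cycles on the shared bus even though no master has more than two. That gap is the bandwidth limitation the slide lists as the weakness, and it is what point-to-point links and networks on a chip aim to remove.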

  9. On-Chip Bus (OCB)
     • Interconnect components inside a single chip

     AMBA AHB Features
     • Burst transfers
     • Split Transactions
     • Single cycle bus master handover
     • Single clock edge operation
     • Non-tristate implementation
     • Wide data bus configurations supported – 64/128 bits

     AMBA APB Features
     • Low power
     • Latched address and control
     • Simple interface
     • Suitable for many peripherals
     • No wait state allowed
     • No burst transfers
     • No arbitration (bridge the only master)
     • No pipelined transfer
     • No response signal

     AMBA AXI Features
     • Separate Address / Control and data phases
     • Supports Unaligned data transfers
     • Burst-based Transactions
     • Separate read / write channels for DMA
     • Ability to issue Multiple outstanding Addresses
     • Out-of-order Transaction Completion
     • Easy Addition of Register Stages
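Two of the AXI features listed above, multiple outstanding addresses and out-of-order transaction completion, can be illustrated with a toy timing model. This is a sketch under stated assumptions, not the AXI signal protocol: one address is issued per cycle, each transaction carries an ID and a fixed latency, and `run_transactions` is a hypothetical name.

```python
def run_transactions(requests, in_order):
    """Return (completion_cycle, txn_id) pairs in response order.

    `requests` is a list of (txn_id, address, latency) tuples. Addresses are
    issued one per cycle starting at cycle 0, so several transactions are
    outstanding at once. With in_order=False, responses return as soon as
    they are ready and are matched to requests by ID. With in_order=True,
    a response may not pass an earlier one.
    """
    ready = [
        (issue_cycle + latency, txn_id)
        for issue_cycle, (txn_id, _address, latency) in enumerate(requests)
    ]
    if not in_order:
        return sorted(ready)  # fastest response first

    responses = []
    last_done = 0
    for done, txn_id in ready:
        done = max(done, last_done)  # wait behind any slower earlier response
        responses.append((done, txn_id))
        last_done = done
    return responses


if __name__ == "__main__":
    # Transaction 0 targets a slow device; 1 and 2 target fast ones.
    demo = [(0, 0x1000, 10), (1, 0x2000, 2), (2, 0x3000, 3)]
    print("out of order:", run_transactions(demo, in_order=False))
    print("in order:    ", run_transactions(demo, in_order=True))
```

In this model the fast transactions complete at cycles 3 and 5 when responses may be reordered, but are held until cycle 10 behind the slow one when order is enforced.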

  10. Processors
     • Control-type
       – ARM processors
       – Initially thought of as a low power solution
     • Data plane
       – Texas Instruments TMS320C6200
       – Early DSP vendor – libraries & solutions

     Architectural Approaches to Parallelism
     • Process level parallelism
       – Homogeneous
         • Tessellations of processors
         • MMP
         • SMPs
       – Heterogeneous
         • SOC
         • Control processor and application specific processors

     Architectural Approaches to Parallelism
     • Instruction level parallelism
       – Pipelining and multiple instruction issue
     • Superscalar processors
       – Hardware detects dependencies
       – Responsible for scheduling instructions
     • VLIW processors
       – No hardware overhead
       – Parallelism detected in software

     Pros and Cons
     • Superscalar
       – Pros: run-time parallelism detected
       – Cons: complex and consumes area and energy
     • VLIW
       – Pros: simple hardware
       – Cons: software is much more complex
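"Parallelism detected in software" for VLIW means that a compiler, not the hardware, groups independent instructions into wide issue bundles. The sketch below is a minimal greedy scheduler written for this illustration. It assumes unit latency, tracks only read-after-write dependencies, assumes each destination register is written once, and uses the hypothetical name `schedule_vliw`. A superscalar processor makes a comparable decision in hardware at run time.

```python
def schedule_vliw(instrs, width):
    """Pack instructions into bundles of at most `width` operations.

    `instrs` is a list of (name, dest, sources) in program order. An
    instruction may issue only after the bundle that produces each of its
    source registers. Returns a list of bundles, each a list of names.
    """
    ready_cycle = {}  # register -> first cycle in which its value can be read
    bundles = []
    for name, dest, sources in instrs:
        # Earliest cycle allowed by data dependencies.
        cycle = max((ready_cycle.get(src, 0) for src in sources), default=0)
        # Skip bundles that have no free issue slot.
        while cycle < len(bundles) and len(bundles[cycle]) >= width:
            cycle += 1
        while len(bundles) <= cycle:
            bundles.append([])
        bundles[cycle].append(name)
        ready_cycle[dest] = cycle + 1
    return bundles


if __name__ == "__main__":
    program = [
        ("i1", "r1", ["a", "b"]),
        ("i2", "r2", ["c", "d"]),
        ("i3", "r3", ["r1", "r2"]),
        ("i4", "r4", ["e"]),
        ("i5", "r5", ["r3", "r4"]),
    ]
    for cycle, bundle in enumerate(schedule_vliw(program, width=2)):
        print(cycle, bundle)
```

With a width of 2 the five instructions fit in three bundles, and with a width of 1 they need five. Even this toy version has to track dependencies and slot occupancy, which hints at why the slide lists "software is much more complex" as the VLIW cost.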
