Blackfin Processor Architecture

Blackfin Processor Architecture
Instructor: Prof. Andy Wu
ACCESS IC LAB, Graduate Institute of Electronics Engineering, NTU

Outline: Introduction; Blackfin Processor; Blackfin Processor Product Highlights

Berkeley incorporated a Reduced Instruction Set Computer (RISC) architecture with the following key features:
A fixed (32-bit) instruction size with few formats. CISC processors typically had variable-length instruction sets with many formats.
A load-store architecture, where instructions that process data operate only on registers and are separate from instructions that access memory. CISC processors typically allowed values in memory to be used as operands in data processing instructions.
A large register bank of thirty-two 32-bit registers, all of which could be used for any purpose, to allow the load-store architecture to operate efficiently. CISC register sets were getting larger, but none was this large, and most had different registers for different purposes.

Hard-wired instruction decode logic. CISC processors used large microcode ROMs to decode their instructions.
Pipelined execution. CISC processors allowed little, if any, overlap between consecutive instructions (though they do now).
Single-cycle execution. CISC processors typically took many clock cycles to complete a single instruction.

Single memory space for program and data
Shared global bus

Separate program and data memory spaces
Usually refers to separate program and data buses

The program bus can be used for coefficient loading for the MAC

Outline: Introduction; Blackfin Processor; Blackfin Processor Product Highlights

Made by Analog Devices, Inc.
A new breed of embedded media processor designed specifically for today's embedded audio, video and communication applications.
Combines a 32-bit RISC-like instruction set with dual 16-bit multiply-accumulate (MAC) signal processing functionality.
Performs equally well in signal processing and control processing applications, in many cases eliminating the need for separate heterogeneous processors.

Two 16-bit MACs, two 40-bit ALUs, four 8-bit video ALUs
Support for 8/16/32-bit integer and 16/32-bit fractional data types
Concurrent fetch of one instruction and two unique data elements
Two loop counters that allow for nested zero-overhead looping
A modified Harvard architecture in combination with a hierarchical memory

Arbitrary bit and bit-field manipulation, insertion and extraction
Two data address generator (DAG) units with circular and bit-reversed addressing
The data address generators contain two 32-bit address ALUs and an address register file
The address register file consists of six 32-bit general-purpose pointer registers and four 32-bit circular buffer addressing registers
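The two addressing modes named above can be sketched in a few lines. This is a generic illustration, not a model of the DAG hardware: the function names and the base/length/modifier parameters are assumptions made for the example, and the bit-reversal shown is the plain index permutation used by FFTs rather than the way the processor computes it.

```python
def circular_update(index, modifier, base, length):
    """Advance an address by `modifier`, wrapping inside [base, base + length).

    Hypothetical helper: assumes |modifier| <= length, so one wrap is enough.
    """
    index += modifier
    if index >= base + length:
        index -= length
    elif index < base:
        index += length
    return index


def bit_reverse(value, bits):
    """Reverse the lowest `bits` bits of `value` (FFT-style index order)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


# Walk an 8-byte buffer at 0x1000 in steps of 4, wrapping at the end.
addr = circular_update(0x1006, 4, base=0x1000, length=8)

# Access order for an 8-point bit-reversed sequence.
order = [bit_reverse(i, 3) for i in range(8)]
```

With circular addressing the software never tests for the end of the buffer itself, which is what makes delay lines and FIR filters cheap to implement.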

Unified 4 GB memory space
Mixed 16/32-bit instruction encoding for best code density
Memory protection for support of OS operation

Three modes of operation
User mode: has restricted access to a subset of system resources, thus providing a protected software environment. User mode is considered the domain of application programs.
Supervisor mode and Emulation mode: have unrestricted access to the core resources. They are usually reserved for the kernel code of an operating system.

Blackfin Architecture Support (Single Cycle)
The following parallel operations can be processed in one clock cycle:
Execution of a single instruction operating on both MACs or ALUs
Execution of 2 x 32-bit data moves (2 reads, or 1 read and 1 write)
Execution of two pointer updates
Execution of hardware loop updates
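To make the "both MACs in one instruction" point concrete, here is a minimal sketch of a dot product split across two accumulators, so that each loop step performs two multiply-accumulates. It is an illustration only: the function name is hypothetical, plain integer multiplies are assumed, and accumulator width, saturation and fractional mode are not modeled.

```python
def dual_mac_dot(xs, ys):
    """Dot product using two accumulators, two products per loop step."""
    a0 = 0  # stands in for accumulator A0
    a1 = 0  # stands in for accumulator A1
    for i in range(0, len(xs) - 1, 2):
        a0 += xs[i] * ys[i]          # first MAC
        a1 += xs[i + 1] * ys[i + 1]  # second MAC, same step
    if len(xs) % 2:
        # Odd length: one leftover element goes through a single MAC.
        a0 += xs[-1] * ys[-1]
    return a0 + a1


result = dual_mac_dot([1, 2, 3, 4], [5, 6, 7, 8])
```

On the processor the loop counter and pointer updates listed above would happen in the same cycle as the two multiply-accumulates, so the loop body costs no extra instructions.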

Blackfin Processor Compute Unit

BF533 Memory Access
Under the right conditions, 4 memory accesses at the same time: 64-bit instruction fetch, 2 x 32-bit data loads, 32-bit data store
PLUS up to 2 ALU (32-bit) and 2 MAC (16-bit) operations at the same time
PLUS background DMA activity

Compute Unit Architecture

Register File
Data register syntax: R0, R1, etc. refer to 32-bit registers. R0.L refers to the low 16 bits of the 32-bit R0 register. R0.H refers to the high 16 bits of the R0 register. The register file can be viewed as 8 x 32-bit or 16 x 16-bit registers.
Accumulator syntax (2 x 40-bit accumulators): A0.L => low 16 bits; A0.H => next 16 bits; A0.W => least significant 32-bit word; A0.X => most significant 8-bit extension.
SHARC: 16 32-bit data registers, integer and float. There is a pair of SHARC accumulator registers too.
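The register views can be read as simple bit-field selections. The sketch below only illustrates which bits each suffix names; the helper names are made up for the example and values are treated as unsigned bit patterns.

```python
def low_half(reg32):
    """Bits 15..0 of a 32-bit register value (the R0.L view)."""
    return reg32 & 0xFFFF


def high_half(reg32):
    """Bits 31..16 of a 32-bit register value (the R0.H view)."""
    return (reg32 >> 16) & 0xFFFF


def split_accumulator(acc40):
    """Split a 40-bit accumulator value into its L, H, W and X views."""
    acc40 &= (1 << 40) - 1
    return {
        "L": acc40 & 0xFFFF,          # low 16 bits
        "H": (acc40 >> 16) & 0xFFFF,  # next 16 bits
        "W": acc40 & 0xFFFFFFFF,      # least significant 32-bit word
        "X": (acc40 >> 32) & 0xFF,    # most significant 8-bit extension
    }


fields = split_accumulator(0xAB12345678)
```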

Instruction comparison: Blackfin, SHARC, 68K
32-bit add
Blackfin: R0 = R1 + R2;
SHARC: R0 = R1 + R2;
68K: MOVE.L R2, R0; ADD.L R1, R0
16-bit add
Blackfin: R0.L = R1.L + R2.H;
68K: MOVE.W R2, R0; ADD.W R1, R0
Dual 16-bit add/subtract
Blackfin: R0 = R1 +|- R2; Means R0.L = R1.L - R2.L in parallel with R0.H = R1.H + R2.H
SHARC (closest): R0 = R1 + R2, R4 = R1 - R2;
68K: MOVE.L R2, R0; ASR.L #16, R0; MOVE.L R1, R3; ASR.L #16, R3; ADD.W R3, R0; ASL.L #16, R0; MOVE.W R2, R0; ADD.W R1, R0
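The dual 16-bit form R0 = R1 +|- R2 can be modeled as two independent half-word operations on one 32-bit value. This is a minimal sketch with a hypothetical function name; it assumes plain wrap-around arithmetic and does not model saturation or status flags.

```python
def add_sub_16(r1, r2):
    """Model R0 = R1 +|- R2: add the high halves, subtract the low halves."""
    r1 &= 0xFFFFFFFF
    r2 &= 0xFFFFFFFF
    hi = ((r1 >> 16) + (r2 >> 16)) & 0xFFFF        # R0.H = R1.H + R2.H
    lo = ((r1 & 0xFFFF) - (r2 & 0xFFFF)) & 0xFFFF  # R0.L = R1.L - R2.L
    return (hi << 16) | lo


r0 = add_sub_16(0x00030005, 0x00010002)
```

The masks keep a borrow or carry in one half from leaking into the other, which is the property that lets a single 32-bit register carry two independent 16-bit results.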

The A and B registers must stay on the same side of the | for both instructions
For dual and quad 16-bit operations, the (CO) option causes the destination registers to cross

Multiplies are signed fractional by default
The signed fractional multiply result is automatically left-shifted by 1 bit
Signed fractional multiply != signed integer multiply
Rounding is available on fractional number multiplies and, as a special option, on integer number multiplies
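The difference between the two multiply modes is easiest to see with numbers. The sketch below assumes 16-bit operands in the usual 1.15 signed fractional format, so that the shifted product is in 1.31 format; the helper names are hypothetical, and the saturation that a real multiplier applies to the 0x8000 * 0x8000 case is not modeled.

```python
def to_signed16(x):
    """Interpret a 16-bit pattern as a two's-complement integer."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def mul_signed_int(a, b):
    """Signed integer multiply: the plain product, no shift."""
    return to_signed16(a) * to_signed16(b)


def mul_signed_frac(a, b):
    """Signed fractional multiply: the product shifted left by 1 bit."""
    return (to_signed16(a) * to_signed16(b)) << 1


# 0x4000 is 0.5 in 1.15 format; 0.5 * 0.5 = 0.25, which is 0x20000000 in 1.31.
frac = mul_signed_frac(0x4000, 0x4000)
integer = mul_signed_int(0x4000, 0x4000)
```

The extra shift removes the redundant second sign bit of the product, so the upper 16 bits of the result are again a 1.15 value.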

Rounding, two cases: rounding adds 0x8000 to the 32-bit multiplier result or to the accumulator value before extracting a 16-bit value to the destination register
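A short numeric sketch of the add-then-extract step described above. It models only what the slide states (add 0x8000, keep the upper 16 bits of the 32-bit value); the function names are hypothetical, values are treated as unsigned bit patterns, and overflow handling and any other rounding modes are left out.

```python
def truncate_to_16(value32):
    """Keep the upper 16 bits of a 32-bit value without rounding."""
    return (value32 >> 16) & 0xFFFF


def round_to_16(value32):
    """Add 0x8000 to a 32-bit value, then keep the upper 16 bits."""
    return ((value32 + 0x8000) >> 16) & 0xFFFF


rounded = round_to_16(0x12348000)      # lower half is exactly one half LSB
truncated = truncate_to_16(0x12348000)
```

Adding 0x8000 is the same as adding half of the least significant bit that survives the extraction, so the result is rounded rather than simply cut off.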

When extracting a 16-bit fractional value from an accumulator, the high 16 bits are taken
Where in the destination register the value goes depends on which accumulator is being extracted from

When extracting a 16-bit integer value from an accumulator, the low 16 bits are taken
Where in the destination register the 16-bit value goes depends on which accumulator is being extracted from
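The two extraction rules can be put side by side in a small sketch. The function names are hypothetical, and the placement rule used in place_half (A0 results land in the low half of the destination, A1 results in the high half) is an assumption based on the usual Blackfin pairing of Rn.L with A0 and Rn.H with A1, which the slides do not spell out. Saturation is not modeled.

```python
def extract_fractional(acc):
    """Fractional extraction: take the high 16 bits of the 32-bit word."""
    return (acc >> 16) & 0xFFFF


def extract_integer(acc):
    """Integer extraction: take the low 16 bits of the 32-bit word."""
    return acc & 0xFFFF


def place_half(dest32, half16, accumulator):
    """Write a 16-bit result into one half of a 32-bit destination.

    Assumption: "A0" targets the low half and "A1" the high half.
    """
    if accumulator == "A0":
        return (dest32 & 0xFFFF0000) | (half16 & 0xFFFF)
    if accumulator == "A1":
        return (dest32 & 0x0000FFFF) | ((half16 & 0xFFFF) << 16)
    raise ValueError("accumulator must be 'A0' or 'A1'")


frac_half = extract_fractional(0x12345678)
int_half = extract_integer(0x12345678)
dest = place_half(place_half(0, frac_half, "A1"), int_half, "A0")
```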

In general there are 16-bit and 32-bit versions of the arithmetic instructions
Most of the 32-bit instructions can be executed in parallel with 2 x 16-bit memory/index operations
Exceptions are DIVS, DIVQ and MULTIPLY with 32-bit operands
|| means parallel
Examples:
A1 = R2.L * R1.L, A0 = R2.H * R1.H || R2.H = W[I2++] || [I3++] = R3;
R2 = R2 +|+ R4, R4 = R2 -|- R4 || I0 += M0 || R1 = [I0];

Blackfin Processor Memory Architecture

A single, unified 4 GB address space using 32-bit addresses
The L1 memory system is the highest-performance memory available to the core and is faster than the L2 memory system
The L2 memory system is off-chip and has longer access latencies

Blackfin Processor Peripherals

Parallel Peripheral Interface (PPI)
Serial Ports (SPORTs)
Serial Peripheral Interface (SPI)
General-purpose timers
Universal Asynchronous Receiver Transmitter (UART)
Real-Time Clock (RTC)
Watchdog timer
General-purpose I/O (programmable flags)
