Comparison of Processor Architectures for LTE Channel Estimation - PowerPoint PPT Presentation

1 Comparison of Processor Architectures for LTE Channel Estimation Authors: Omer Anjum, Teemu Pitkanen, Jari Nurmi. Tampere University of Technology, Finland. Email: first name.last name@tut.fi 18.10.2011

2 • Case Study: Channel Estimation for LTE with 20 MHz system bandwidth • Objective: Comparison of different processor architectures for the case study • Architectures under consideration: • COFFEE RISC • Ninesilica NoC with 9 COFFEE RISC cores • TMS320C6416 DSP by Texas Instruments • Xentium (run-time reconfigurable core by RECORE Systems) • Transport Triggered Architecture (TTA)

3 LTE Frame Structure

4 Channel Estimation Algorithm in Brief • A good estimate of the channel is necessary to demodulate the symbols correctly • A hexagonal-grid reference symbol pattern is used in our case • The first logical step in channel estimation is H_p = Y_p / X_p o H_p, Y_p and X_p are the channel estimate at the pilot symbol, the received pilot symbol and the original pilot symbol, respectively • The next step is to interpolate the channel estimate at all other symbol positions using the estimates calculated at the pilot positions • The interpolation technique used in our case is cubic interpolation • The corresponding equation for cubic interpolation for the k-th subcarrier is

5 where, Here is an assumption for every k-th subcarrier as follows: where D is the adjacent pilot symbol spacing for a subcarrier and m is the largest integer smaller than k/D
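The two steps described above (pilot estimate, then interpolation) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the slide's interpolation formula is not preserved here, so a cubic Lagrange polynomial through four neighbouring pilots is used as a stand-in. The function names, the assumption that pilot i sits at subcarrier i * D, and the use of floor(k/D) for m are all assumptions made for this sketch.

```python
def ls_pilot_estimates(received_pilots, known_pilots):
    """Channel estimate at each pilot position: H_p = Y_p / X_p."""
    return [y / x for y, x in zip(received_pilots, known_pilots)]


def cubic_interpolate(pilot_estimates, spacing, k):
    """Estimate the channel at subcarrier k from four neighbouring pilots.

    Assumption: pilot i sits at subcarrier i * spacing (spacing plays the
    role of D). A cubic Lagrange polynomial stands in for the slide's
    interpolation formula, which is not available in this transcript.
    """
    n = len(pilot_estimates)
    if n < 4:
        raise ValueError("cubic interpolation needs at least four pilots")
    m = k // spacing                   # pilot index at or below k (floor, assumed)
    start = min(max(m - 1, 0), n - 4)  # clamp the 4-pilot window to the band
    window = range(start, start + 4)
    t = k / spacing                    # position measured in pilot spacings
    estimate = 0j
    for i in window:
        weight = 1.0
        for j in window:
            if j != i:
                weight *= (t - j) / (i - j)
        estimate += weight * pilot_estimates[i]
    return estimate
```

Because the weights depend only on k and D, a fixed-point implementation could precompute them per subcarrier offset, which leaves the per-pilot division as the costly operation.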

6 Implementation made on different processor architectures

7 COFFEE RISC • General purpose embedded processor developed at Tampere University of Technology

8 • This core was developed to work in a conventional embedded system for telecommunication and multimedia applications, or as a general-purpose (GP) node in a NoC • Completing our task took almost 1,657,900 cycles • Running on a Stratix-IV at 181 MHz, it consumed 1.12 mJ • Adding hardware logic for the division operation could reduce the cycle count to 322,000
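As a rough cross-check of the COFFEE figures, the cycle count, clock and energy can be turned into an execution time and an implied average power. This is only a sketch: it assumes the 181 MHz clock applies to the whole run and that power is constant, and the helper names are hypothetical.

```python
def execution_time_ms(cycles, clock_hz):
    """Run time in milliseconds for a given cycle count and clock."""
    return cycles / clock_hz * 1e3


def average_power_mw(energy_mj, time_ms):
    """Implied average power: mJ / ms is watts, scaled here to milliwatts."""
    return energy_mj / time_ms * 1e3


coffee_time_ms = execution_time_ms(1_657_900, 181e6)      # about 9.2 ms
coffee_power_mw = average_power_mw(1.12, coffee_time_ms)  # roughly 122 mW
divider_speedup = 1_657_900 / 322_000                     # a little over 5x
```

The last line shows that the projected hardware divider would cut the cycle count by a factor of just over five.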

9 Homogeneous MPSoC • MPSoC based on nine COFFEE cores has been developed at Tampere University of Technology

10 • The central node behaves as the master • The master node distributes the data in equal chunks • The data is processed • The results are returned to the master • The speed-up gained compared to a single COFFEE is about 6x • The number of cycles taken to complete the task is almost 271,577 • Running on a Stratix-IV at 181 MHz, it consumed 1.033 mJ
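The master/worker flow listed above can be sketched in software as follows. This is an illustration only: threads stand in for the NoC nodes, the helper names are hypothetical, and the default of eight workers assumes the master does not process a chunk itself, which the slides do not state.

```python
from concurrent.futures import ThreadPoolExecutor


def split_equal(data, parts):
    """Split data into contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        chunks.append(data[start:start + size])
        start += size
    return chunks


def master(data, worker_fn, num_workers=8):
    """Distribute chunks, let each worker process one, gather results in order."""
    chunks = split_equal(data, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_results = list(pool.map(worker_fn, chunks))
    return [item for chunk in partial_results for item in chunk]
```

Distribution and collection are serial steps at the master, which is one plausible reason why nine cores yield a speed-up of about 6x rather than 9x.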

11 Xentium by RECORE Systems • Xentium is a fixed-point VLIW DSP optimized for digital baseband processing tasks • The datapath consists of 10 functional units that can operate in parallel • Data memory is organized in parallel memory banks to allow simultaneous access • Xentium in 90 nm at 200 MHz consumes 175 µW/MHz • It takes almost 495,725 cycles to complete the task and should consume approximately 0.086 mJ
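The Xentium energy estimate follows from the power figure and the cycle count: 1 µW/MHz equals 1 pJ per cycle, so the energy is independent of the clock frequency. A small sketch of that arithmetic, with a hypothetical helper name:

```python
def energy_mj(power_uw_per_mhz, cycles):
    """Energy in millijoules; 1 uW/MHz corresponds to 1 pJ per cycle."""
    picojoules = power_uw_per_mhz * cycles
    return picojoules * 1e-9  # pJ to mJ


xentium_energy_mj = energy_mj(175, 495_725)
```

The result is about 0.087 mJ, in line with the approximately 0.086 mJ quoted on the slide.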

12 TI’s TMS320C6416 DSP • TI’s fixed-point VLIW DSP processor • It accommodates two independent data paths • Each data path has four functional units (one multiplier and three ALUs) and 32 general-purpose 32-bit registers • Cross-communication link between the data paths • The task took 403,692 cycles in total • Running in 130 nm CMOS at 500 MHz, it should consume approximately 0.161 mJ to complete the task

13 TTA (Transport Triggered Architecture) • No particular instruction set architecture is defined for TTA • Based on a single instruction called “MOVE” • An FU is triggered as soon as the data arrives • A typical architecture consists of several buses, functional units, register files and load-store units • It closely resembles a VLIW architecture • Scaling up a TTA is much less complex because the functional units and the interconnection network are independent of each other
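The idea that a program consists only of moves, with a functional unit firing when data reaches its trigger port, can be shown with a toy model. This is a conceptual sketch, not TCE code: the port names, the `AddUnit` class and the `run_moves` helper are all hypothetical, and real TTAs schedule several moves per cycle across multiple buses.

```python
class AddUnit:
    """Toy functional unit: writing the trigger port starts the addition."""

    def __init__(self):
        self.operand = 0
        self.result = None

    def write(self, port, value):
        if port == "operand":
            self.operand = value                 # just latch the value
        elif port == "trigger":
            self.result = self.operand + value   # the move itself fires the FU
        else:
            raise ValueError(f"unknown input port: {port}")

    def read(self, port):
        if port != "result":
            raise ValueError(f"unknown output port: {port}")
        return self.result


def run_moves(moves, registers, units):
    """Execute (source, destination) moves one at a time."""
    for source, destination in moves:
        if "." in source:
            unit, port = source.split(".")
            value = units[unit].read(port)
        else:
            value = registers[source]
        if "." in destination:
            unit, port = destination.split(".")
            units[unit].write(port, value)
        else:
            registers[destination] = value
    return registers


program = [("r0", "add.operand"), ("r1", "add.trigger"), ("add.result", "r2")]
state = run_moves(program, {"r0": 3, "r1": 4, "r2": 0}, {"add": AddUnit()})
```

After the three moves, `state["r2"]` holds the sum of `r0` and `r1`; no add instruction was ever issued, only data transports.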

14 • The TTA co-design environment (TCE) allows the TTA architecture to be built and tested gradually according to the application needs • The programmer can tune the trade-off between flexibility and performance by choosing the required functional units, their granularity level, other supporting units and the interconnection among the units • The highly modular structure makes it easy to scale • The channel estimation task took almost 449,736 cycles • Adding a functional unit for square root reduced the cycle count to 144,814 • The targeted TTA in 180 nm at 200 MHz consumes 0.091 mJ to complete the task

15 Summary of Results [Figure: two bar charts. Energy (mJ)/Task for TTA @200MHz (180 nm), TMS320C6416 @500MHz (130 nm), Xentium @200MHz (90 nm), Ninesilica @180MHz (Stratix-IV) and COFFEE @180MHz (Stratix-IV). Cycle Count (millions) for TTA (Cust. FU), TTA, TMS320C6416, Xentium, Ninesilica and Single COFFEE.]
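The numbers behind the summary can be collected from the preceding slides and ranked. The values below are copied from those slides; the dictionary layout and helper name are illustrative. The slides do not say which TTA configuration the 0.091 mJ figure refers to, so energy and cycle counts are kept in separate tables.

```python
# Cycle counts to complete the channel estimation task (from the slides).
CYCLES = {
    "Single COFFEE": 1_657_900,
    "Ninesilica": 271_577,
    "Xentium": 495_725,
    "TMS320C6416": 403_692,
    "TTA": 449_736,
    "TTA (Cust. FU)": 144_814,
}

# Energy per task in mJ (from the slides).
ENERGY_MJ = {
    "COFFEE": 1.12,
    "Ninesilica": 1.033,
    "Xentium": 0.086,
    "TMS320C6416": 0.161,
    "TTA": 0.091,
}


def ranked(table):
    """Return the names ordered from the lowest to the highest value."""
    return sorted(table, key=table.get)


for name in ranked(ENERGY_MJ):
    print(f"{name:<14} {ENERGY_MJ[name]:.3f} mJ")
```

Cycle counts alone do not give run time, since the clocks range from about 180 MHz to 500 MHz across the architectures.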

16 Thank You!
