Judith Providence
Computer Architecture CS 654
Outline
- Background/Motivation
- Multi-processors
- Larrabee Architecture
- Performance studies
- Evaluation
- Conclusion
Motivation: Trends Towards Many-core Processors
- Power growth in HPC
- Diminishing performance gains in uniprocessors
- Limits on instruction-level parallelism remain even with idealized:
  - Register renaming
  - Branch prediction
  - Jump prediction
  - Memory address alias analysis
  - Perfect caches
Larrabee: GPU or CPU?
Larrabee combines CPU and GPU traits:
- Each core supports 4 threads
- Efficient inter-block communication
- Ring network for full inter-processor communication
- Each Larrabee core is a complete x86 core that supports virtual memory and page swapping
- Fully coherent caches at all levels
A typical GPU, by contrast:
- Communicates over the PCI bus
- Only a minimum amount of memory available
- Only single-precision floating-point performance
Larrabee: CPU
- Larrabee is an in-order, many-core x86 CPU
- Intel's president stated in 2005: "We are dedicating all of our future product development to multi-core designs."
- Multi-core processors vs. many-core processors
- GPU-like capabilities
Motivation for an in-order CPU
- Comparison between a modern out-of-order CPU, the Intel Core 2 Duo processor, and an in-order test CPU design based on the Pentium processor with a 16-wide VPU
Multi-processors
- Inter-processor communication: inter-processor ring network
- Computation: SIMD vector processing unit with a mask register
- Shared memory: coherent cached memory hierarchy, MIMD model
- Synchronization mechanisms: semaphores, software locks (a minimal sketch follows below)
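The sketch below illustrates the shared-memory MIMD style the slide describes: multiple threads updating shared state under a software lock. It is a generic pthreads example, not Larrabee-specific code; the thread count and counter are illustrative assumptions.

```c
/* Minimal sketch (not from the slides): shared-memory synchronization with a
 * software lock, in the MIMD style the slide describes. */
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4          /* hypothetical: one thread per hardware thread on a core */

static long shared_counter = 0;                            /* lives in coherent shared memory */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;   /* software lock */

static void *worker(void *arg)
{
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);            /* acquire before touching shared state */
        shared_counter++;
        pthread_mutex_unlock(&lock);          /* release so other cores can proceed */
    }
    return NULL;
}

int main(void)
{
    pthread_t t[NTHREADS];
    for (int i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);
    printf("counter = %ld\n", shared_counter);  /* expected: NTHREADS * 100000 */
    return 0;
}
```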
Larrabee Architecture
Core Design of Larrabee
- Larrabee CPU core and associated system blocks: the CPU is derived from the Pentium processor's in-order design, plus 64-bit instructions, multi-threading, and a wide VPU.
- Each core has fast access to its 256 KB local subset of a coherent 2nd-level cache (a sizing sketch follows this list).
- L1 cache sizes are 32 KB for the Icache and 32 KB for the Dcache.
- Ring network accesses pass through the L2 cache for coherency.
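The cache sizes above bound how much working set a per-core kernel can keep resident. The sketch below is simple arithmetic under the assumption of single-precision data; the square-tile framing is an illustrative choice, not something stated on the slide.

```c
/* Back-of-the-envelope sketch (assumptions noted above): working set that fits
 * in one core's 256 KB L2 subset and 32 KB L1 Dcache. */
#include <math.h>
#include <stdio.h>

int main(void)
{
    const size_t l2_subset = 256 * 1024;          /* bytes per core, from the slide */
    const size_t l1_dcache = 32 * 1024;           /* bytes, from the slide */

    size_t floats_in_l2 = l2_subset / sizeof(float);    /* 65536 single-precision values */
    size_t tile = (size_t)sqrt((double)floats_in_l2);   /* ~256: side of the largest square tile */

    printf("floats per L2 subset: %zu\n", floats_in_l2);
    printf("square float tile per core: %zu x %zu\n", tile, tile);
    printf("floats per L1 Dcache: %zu\n", l1_dcache / sizeof(float));
    return 0;
}
```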
Inter-processor Ring Network
- Bi-directional
- Routing decisions are made before messages are placed into the network
- Checks for data sharing
- Provides a path for the L2 cache to access memory
- Allows fixed-function logic agents to be accessed by the CPU cores
- Scales to more than 16 cores
Wide Vector Processing Unit
- SIMD with 16 lanes
- Executes integer and floating-point instructions
- Scatter/gather supports a maximum of 16 elements (see the sketch below)
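To make the 16-lane, mask-driven model concrete, the sketch below spells out the semantics of a masked gather as a scalar loop. It is not Larrabee's actual instruction set; VLEN and the function names are illustrative assumptions.

```c
/* Illustrative semantics of a 16-lane masked gather, written as a scalar loop. */
#include <stdint.h>
#include <stdio.h>

#define VLEN 16                                  /* 16 SIMD lanes, as on the slide */

/* Gather base[idx[lane]] into dst[lane] for every lane whose mask bit is set. */
static void masked_gather(float dst[VLEN], const float *base,
                          const int32_t idx[VLEN], uint16_t mask)
{
    for (int lane = 0; lane < VLEN; lane++) {
        if (mask & (1u << lane))                 /* per-lane predication via the mask register */
            dst[lane] = base[idx[lane]];
        /* unmasked lanes keep their previous contents */
    }
}

int main(void)
{
    float table[64];
    for (int i = 0; i < 64; i++) table[i] = (float)i;

    int32_t idx[VLEN] = { 3, 7, 11, 0, 5, 9, 2, 63, 1, 4, 6, 8, 10, 12, 14, 15 };
    float dst[VLEN] = { 0 };

    masked_gather(dst, table, idx, 0xFFFF);      /* all 16 lanes enabled */
    for (int i = 0; i < VLEN; i++) printf("%.0f ", dst[i]);
    printf("\n");
    return 0;
}
```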
Fixed Function Logic Unit
- Used for graphics tasks
- Larrabee uses software in place of a fixed-function unit for some graphics tasks
- Cores pass commands to the texture unit through the L2 cache
- Texture filtering would take 12x to 40x longer in software
Advanced Applications
- Larrabee supports irregular data structures
- Efficient scatter/gather support for irregular data structures
- The SIMD vector processing unit can be programmed directly or targeted by Intel's auto-vectorization compiler technology (see the sketch below)
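The sketch below contrasts the two cases the slide alludes to: a regular loop an auto-vectorizing compiler can map onto the wide VPU directly, and an index-driven loop where scatter/gather support is what makes vectorizing irregular data structures possible. The function names and shapes are illustrative assumptions, not code from the talk.

```c
/* Illustrative loops for an auto-vectorizing compiler targeting a 16-wide VPU. */
#include <stddef.h>

/* Regular, unit-stride loop: straightforward to auto-vectorize. */
void saxpy(size_t n, float a, const float *x, float *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* Irregular, index-driven loop: vectorizing this requires gather (to read
 * x[idx[i]]) and scatter (to write y[idx[i]]). */
void sparse_update(size_t n, float a, const float *x, float *y, const int *idx)
{
    for (size_t i = 0; i < n; i++)
        y[idx[i]] += a * x[idx[i]];
}
```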
Performance Study
- Spectral methods: data is in the frequency domain; high-performance kernel: 3D FFT
- Dense linear algebra: data consists of dense matrices or vectors; high-performance kernel: BLAS-3 (a kernel sketch follows below)
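For reference, the sketch below shows the kind of BLAS-3 kernel the dense linear algebra workload refers to: a single-precision matrix multiply C += A * B. This is the textbook triple loop, not the tuned kernel used in the study; production BLAS-3 codes block for the caches and the 16-wide VPU.

```c
/* Naive BLAS-3-style kernel (illustrative, not the study's implementation). */
void sgemm_naive(int M, int N, int K,
                 const float *A,   /* M x K, row-major */
                 const float *B,   /* K x N, row-major */
                 float *C)         /* M x N, row-major */
{
    for (int i = 0; i < M; i++) {
        for (int j = 0; j < N; j++) {
            float acc = C[i * N + j];
            for (int k = 0; k < K; k++)
                acc += A[i * K + k] * B[k * N + j];
            C[i * N + j] = acc;
        }
    }
}
```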
High Performance Computing Kernels
- Simulation results are based on Stanford's PhysBAM: http://physbam.stanford.edu/~fedkiw
- Amdahl's Law: maximum speedup = 1 / (1 - fraction enhanced) (worked example below)
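The worked example below plugs assumed numbers into Amdahl's Law. The general form is speedup = 1 / ((1 - f) + f/s), where f is the fraction of the work that is enhanced (parallelized) and s is the speedup of that fraction; as s grows, speedup approaches the slide's limit of 1 / (1 - f). The 95% figure is an assumption for illustration, not data from the study.

```c
/* Worked Amdahl's Law example with assumed numbers. */
#include <stdio.h>

static double amdahl(double f, double s)
{
    return 1.0 / ((1.0 - f) + f / s);   /* general form of Amdahl's Law */
}

int main(void)
{
    double f = 0.95;                    /* assume 95% of the work parallelizes */
    int cores[] = { 2, 8, 32, 1024 };

    for (int i = 0; i < 4; i++)
        printf("%4d cores: speedup %.2f\n", cores[i], amdahl(f, cores[i]));

    printf("limit (infinite cores): %.2f\n", 1.0 / (1.0 - f));  /* = 20 for f = 0.95 */
    return 0;
}
```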
Evaluation of Larrabee for Parallel Applications
Cons:
- Memory contention
- Lack of error-correcting code (ECC) in the graphics double data rate (GDDR) memory
- Shortage of double-precision floating-point capability
Pros:
- Load balancing is accomplished by moving processes
- Supports irregular data structures
Conclusion: Relevance of Larrabee for the Future
- Amdahl's Law: limitations in parallelism make it difficult to achieve good speedup
- Moore's Law (1965) states that the number of transistors on a chip will double about every two years
- A Moore's Law for software is also needed
- Solution: the establishment of academic communities