CS422 Computer Architecture Spring 2004 Lecture 04, 06 Jan 2004

CS422 Computer Architecture Spring 2004 Lecture 04, 06 Jan 2004 Bhaskaran Raman Department of CSE IIT Kanpur http://web.cse.iitk.ac.in/~cs422/index.html

Announcements ● Course web-page is up http://web.cse.iitk.ac.in/~cs422/index.html ● Lecture scribe notes: – HTML please – lec-notesXY-1.html or lec-notesXY-2.html – Images in directory “images/” ● lecXY-1-anything.ext or lecXY-2-anything.ext – Please email to one of the TAs ● Extra classes?

Topics so far... ● Quantifying computer performance ● Amdahl's law ● Performance equation, CPI ● Effect of cache misses on CPI ● This week: – Instruction Set Architecture (ISA) – Pipelining: concept and issues

Instruction Set ● Instruction set is the interface between hardware and software ● Interface design – Central part of any system design – Allows abstraction/independence – Challenges: ● Should be easy to use by the layer above ● Should allow efficient implementation by the layer below ● [Figure: three layers, top to bottom: Software, Interface (Instruction set), Hardware]

Instruction Set Architecture (ISA) ● Main focus of early designs (1970s, 1980s) ● Mutual dependence between ISA design and: – Machine organization ● Example: caches – Higher level languages and compilers (what instructions do they want?) – Operating systems ● Example: atomic instructions, paging...

The Design Space
[Figure: Operand(s) feed an Instruction, which produces a Result operand; the numbered questions below annotate the figure]
1. What operations? e.g. add, sub, and
2. How many explicit operands? e.g. 0, 1, 2, 3
3. Non-memory operands from where? e.g. stack, register
4. Memory-operand access modes, e.g. direct, indexed
5. Type and size of operand, e.g. word, decimal
Other design choices: determining branch conditions, instruction encoding

Classes of ISAs

Stack     Accumulator   Register-memory   Register-register   Memory-memory
Push A    Load A        Load R1, A        Load R1, A          Add C, A, B
Push B    Add B         Add R1, B         Load R2, B
Add       Store C       Store C, R1       Add R3, R1, R2
Pop C                                     Store C, R3

● Those which use registers are also called General-Purpose Register (GPR) architectures ● Register-register also called load-store
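
The instruction sequences in the table can be traced with a toy model. The sketch below is illustrative only: the dictionary-based memory and the function names `run_stack`, `run_accumulator`, and `run_load_store` are assumptions made for this example, not part of the lecture. Each function follows one column of the table and leaves the sum of A and B in C.

```python
# Toy models of three ISA classes; "mem" is a dict standing in for memory.
# All names here are hypothetical and exist only for this illustration.

def run_stack(mem):
    stack = []
    stack.append(mem["A"])                    # Push A
    stack.append(mem["B"])                    # Push B
    stack.append(stack.pop() + stack.pop())   # Add: pop two operands, push the sum
    mem["C"] = stack.pop()                    # Pop C
    return mem["C"]

def run_accumulator(mem):
    acc = mem["A"]                            # Load A
    acc += mem["B"]                           # Add B
    mem["C"] = acc                            # Store C
    return mem["C"]

def run_load_store(mem):
    regs = {}
    regs["R1"] = mem["A"]                     # Load R1, A
    regs["R2"] = mem["B"]                     # Load R2, B
    regs["R3"] = regs["R1"] + regs["R2"]      # Add R3, R1, R2
    mem["C"] = regs["R3"]                     # Store C, R3
    return mem["C"]
```

Note how only the load-store version names every operand explicitly; the stack and accumulator versions rely on implicit operands.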

GPR Advantages ● Registers faster than memory ● Code density improves ● Easier for compiler to use – Hold variables – Expression evaluation – Passing arguments

Spectrum of GPR Choices ● Choices based on – How many memory operands allowed – How many total operands

Number of memory addresses   Maximum number of operands allowed   Examples
0                            3                                    SPARC, MIPS, PowerPC
1                            2                                    80x86, Motorola
2                            2                                    VAX
3                            3                                    VAX

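Memory Addressing ● Little-endian versus Big-endian ● Aligned versus non-aligned access of memory units > 1 byte – Misaligned ==> more memory cycles for access ● [Figure: byte order within a word for addresses running from 0x00...0 to 0xff...f; Big Endian places the MSB at the lowest address, Little Endian places the LSB at the lowest address]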
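
A short sketch of both ideas, using Python's standard `struct` module to lay out a 32-bit value in each byte order. The example value and the helper `is_aligned` are assumptions chosen for illustration.

```python
import struct

value = 0x0A0B0C0D  # example 32-bit word; 0x0A is the MSB, 0x0D is the LSB

big = struct.pack(">I", value)     # big-endian: MSB stored at the lowest address
little = struct.pack("<I", value)  # little-endian: LSB stored at the lowest address

def is_aligned(address, size):
    """True if an access of `size` bytes at `address` is naturally aligned."""
    return address % size == 0
```

Here `big` is the byte sequence 0A 0B 0C 0D and `little` is 0D 0C 0B 0A. A 4-byte access at address 6 is misaligned under this definition, which is the case the slide warns may cost extra memory cycles.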

Addressing Modes

Addressing mode                       Example               Meaning
Immediate                             Add R4, #3            R4 <-- R4 + 3
Register                              Add R4, R3            R4 <-- R4 + R3
Direct or absolute                    Add R1, (1001)        R1 <-- R1 + M[1001]
Register deferred or indirect         Add R4, (R1)          R4 <-- R4 + M[R1]
Displacement                          Add R4, 100(R1)       R4 <-- R4 + M[100+R1]
Indexed                               Add R3, (R1+R2)       R3 <-- R3 + M[R1+R2]
Auto-increment                        Add R1, (R2)+         R1 <-- R1 + M[R2]; R2 <-- R2 + d
Auto-decrement                        Add R1, –(R2)         R2 <-- R2 – d; R1 <-- R1 + M[R2]
Scaled                                Add R1, 100(R2)[R3]   R1 <-- R1 + M[100+R2+R3*d]
Memory indirect or memory deferred    Add R1, @(R3)         R1 <-- R1 + M[M[R3]]
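
The "Meaning" column can be read as a recipe for fetching the source operand. The sketch below mirrors it one mode per branch. The function name `fetch_operand`, its parameter names, the dictionary-based register file and memory, and the default operand size `d=4` are all assumptions for illustration.

```python
def fetch_operand(mode, regs, mem, reg=None, index=None, const=0, d=4):
    """Return the source operand for one addressing mode.

    regs and mem are dicts (hypothetical register file and memory).
    Auto-increment and auto-decrement also update regs[reg] by d,
    where d is the operand size in bytes (assumed 4 by default).
    """
    if mode == "immediate":
        return const                                   # #3
    if mode == "register":
        return regs[reg]                               # R3
    if mode == "direct":
        return mem[const]                              # M[1001]
    if mode == "register_deferred":
        return mem[regs[reg]]                          # M[R1]
    if mode == "displacement":
        return mem[const + regs[reg]]                  # M[100+R1]
    if mode == "indexed":
        return mem[regs[reg] + regs[index]]            # M[R1+R2]
    if mode == "auto_increment":
        value = mem[regs[reg]]                         # M[R2], then R2 <-- R2 + d
        regs[reg] += d
        return value
    if mode == "auto_decrement":
        regs[reg] -= d                                 # R2 <-- R2 - d, then M[R2]
        return mem[regs[reg]]
    if mode == "scaled":
        return mem[const + regs[reg] + regs[index] * d]  # M[100+R2+R3*d]
    if mode == "memory_indirect":
        return mem[mem[regs[reg]]]                     # M[M[R3]]
    raise ValueError(f"unknown addressing mode: {mode}")
```

For example, with `regs = {"R1": 100}` and `mem = {200: 7}`, the call `fetch_operand("displacement", regs, mem, reg="R1", const=100)` reads `mem[200]`.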

Usage of Addressing Modes ● [Figure: bar chart of frequency of addressing mode (0% to 55%) for TeX, Spice, and Gcc; modes shown: Register deferred, Memory indirect, Immediate, Displacement, Scaled]

How many Bits for Displacement? ● [Figure: percentage of cases (0% to 27.5%) versus number of bits needed for displacement value (0 to 15); series: Integer average, Floating-point average]

How many Bits for Immediate? ● [Figure: percentage of cases (0% to 50%) versus number of bits needed for immediate (0 to 35); series: TeX, spice, gcc]

Type and Size of Operands ● [Figure: bar chart of frequency of reference (0% to 80%) by operand size: Double word, Word, Half word, Byte; series: Integer average, Floating point average]

Summary so far ● GPR is better than stack/accumulator ● Immediate and displacement are the most used memory addressing modes ● Number of bits for displacement: 12-16 bits ● Number of bits for immediate: 8-16 bits ● Next: what operations in instruction set?
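
To check whether a particular displacement or immediate fits in field widths like those summarized above, one can compute the smallest two's-complement width that holds it. This is a minimal sketch that assumes signed two's-complement fields; the function names `bits_needed_signed` and `fits` are made up for this example, and the charts in the lecture may count bits under a different convention.

```python
def bits_needed_signed(value):
    """Smallest two's-complement width, in bits, that can represent value."""
    if value >= 0:
        return value.bit_length() + 1        # magnitude bits plus a sign bit
    return (-value - 1).bit_length() + 1     # e.g. -128 needs exactly 8 bits

def fits(value, bits):
    """True if value is representable in a signed field of the given width."""
    low = -(1 << (bits - 1))
    high = (1 << (bits - 1)) - 1
    return low <= value <= high
```

For instance, a displacement of 2047 needs 12 bits under this convention, while 2048 needs 13.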

Deciding the Set of Operations

80x86 instruction     Integer average
Load                  22.00%
Conditional branch    20.00%
Compare               16.00%
Store                 12.00%
Add                   8.00%
AND                   6.00%
Sub                   5.00%
Move reg-reg          4.00%
Call                  1.00%
Return                1.00%
Total                 95.00%

● Simple instructions are used most!

Instructions for Control Flow ● [Figure: bar chart of frequency of control flow instructions (0% to 100%) for Call/return, Jump, and Conditional branch; series: Integer average, Floating-point average]

Design Issues for Control Flow Instructions ● PC-relative addressing – Useful since most jumps/branches are nearby – Gives position independence (dynamic linking) ● Register indirect jumps – Useful for many programming language features – Case statements, virtual functions, dynamic libraries ● How many bits for PC displacement? – 8-10 bits are enough
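
The position-independence point can be illustrated numerically: if a program is loaded at a different base address, the branch and its target move together, so the PC-relative displacement is unchanged. The sketch below assumes the displacement is measured in bytes from the branch instruction's own address (real ISAs differ on this convention); the function names and addresses are hypothetical.

```python
def pc_relative_displacement(pc, target):
    """Displacement from the branch at address pc to target, in bytes (assumed convention)."""
    return target - pc

def fits_signed(value, bits):
    """True if value is representable in a signed two's-complement field of `bits` bits."""
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1

# Same code loaded at two different base addresses (illustrative values).
disp_at_base_0 = pc_relative_displacement(0x1000, 0x1040)
disp_relocated = pc_relative_displacement(0x1000 + 0x5000, 0x1040 + 0x5000)
```

Both displacements are equal (64 bytes), and 64 fits in an 8-bit signed field, consistent with the observation that most branches are nearby.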

What is the Nature of Compares? ● [Figure: bar chart of frequency of type of compare (0% to 100%) for "<, >=", ">, <=", and "==, !="; series: Integer average, Floating-point average] ● 50% of integer comparisons are with ZERO!

Compare and Branch: Single Instruction or Two? ● Condition Code: set by ALU – Advantage: simple, may be free – Disadvantage: extra state across instructions ● Condition register: test any register with result of comparison – Advantage: simple – Disadvantage: uses up a register ● Compare and branch: – Advantage: fewer instructions – Disadvantage: too much work in an instruction

Managing Register State during Call/Return ● Caller save, or callee save? – Combination of the two is possible ● Beware of global variables in registers!

Instruction Encoding Issues ● Need to encode: operation, and addressing mode of each operand – Opcode is used for encoding operation – Simple set of addressing modes ==> can encode addressing mode also in opcode – Else, need address specifier per operand! ● Challenges in encoding: – Many registers and addressing modes – But, also minimize average instruction size – Encoding should be easy to handle in implementation (e.g. multiple of bytes)

Styles of Encoding

Fixed (e.g. DLX, MIPS, PowerPC):   Opcode | Address-1 | Address-2 | Address-3
Variable (e.g. VAX):               Opcode, #operands | Addr. Spec-1 | Address-1 | Addr. Spec-2 | Address-2 | ...

● Fixed: – (+) ease of decoding – (--) more instructions ● Variable: – (+) fewer instructions – (--) variance in amount of work per instruction ● Hybrid approach: reduce variability in size, but provide multiple encoding lengths – Example: Intel 80x86
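
As a concrete instance of fixed-length encoding, the sketch below packs and unpacks the MIPS R-type format, whose 32-bit word is split into opcode (6 bits), rs (5), rt (5), rd (5), shamt (5), and funct (6). The field layout is standard MIPS; the function names `encode_r_type` and `decode_r_type` are assumptions for illustration, and no range checking is done on the fields.

```python
def encode_r_type(opcode, rs, rt, rd, shamt, funct):
    """Pack the six R-type fields into one 32-bit instruction word."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

def decode_r_type(word):
    """Unpack a 32-bit word into R-type fields using fixed shifts and masks."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
    }

# add $3, $1, $2: opcode 0, funct 0x20, destination rd = 3
word = encode_r_type(0, 1, 2, 3, 0, 0x20)
```

Because every field sits at a fixed bit position, decoding is a handful of shifts and masks, which is the "ease of decoding" advantage listed for fixed encodings. A variable-length format would first have to read the opcode and address specifiers to learn where the next field begins.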

The Role of the Compiler ● Compilers are central to ISA design ● [Figure: compiler phases in order: Front-end, High-level optimizations, Global optimizer, Code generator; annotated with "Language independence" and "Machine dependence"]

ISA Design to Help the Compiler ● Regularity: operations, data-types, and addressing modes should be orthogonal; no special registers/operands for some instructions ● Provide simple primitives: do not optimize for a particular compiler of a particular language ● Clear trade-offs among alternatives: how to allocate registers, when to unroll a loop...

What lies ahead... ● The DLX architecture ● DLX: simple data-path ● DLX: pipelined data-path ● Pipelining hazards, and how to handle them
