
Lecture 2: Processor Design, Single-Processor Performance



  1. Lecture 2: Processor Design, Single-Processor Performance G63.2011.002/G22.2945.001 · September 14, 2010 Intro Basics Assembly Memory Pipelines

  2. Outline: Intro · The Basic Subsystems · Machine Language · The Memory Hierarchy · Pipelines

  3. Admin Bits
     • Lec. 1 slides posted
     • New here? Welcome! Please send in survey info (see lec. 1 slides) via email
     • PASI
     • Please subscribe to mailing list
     • Near end of class: 5-min, 3-question ‘concept check’


  5. Introduction. Goal for today: High Performance Computing. Discuss the actual computer end of this, and its influence on performance.

  6. What’s in a computer?

  7. What’s in a computer? Processor: Intel Q6600 Core2 Quad, 2.4 GHz

  8. What’s in a computer? Processor: Intel Q6600 Core2 Quad, 2.4 GHz. Die: (2 ×) 143 mm², 2 × 2 cores, 582,000,000 transistors, ∼100 W


  10. What’s in a computer? Memory


  12. A Basic Processor (loosely based on Intel 8086) [Block diagram: Memory Interface with Address Bus and Data Bus, Address ALU, Register File, Flags, Insn. fetch, PC, Data ALU and Control Unit, connected by an Internal Bus.]

  13. A Basic Processor (loosely based on Intel 8086) [Same block diagram as slide 12.] Bonus Question: What’s a bus?

  14. How all of this fits together
      Everything synchronizes to the Clock.
      Control Unit (“CU”): The brains of the operation. Everything connects to it.
      Bus entries/exits are gated and (potentially) buffered.
      CU controls gates, tells other units about ‘what’ and ‘how’:
      • What operation?
      • Which register?
      • Which addressing mode?
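The control unit's role can be sketched as a fetch, decode, execute loop. The C code below is a minimal illustration only: the three-opcode instruction set, the 3-byte encoding, and the names `ToyCpu` and `toy_run` are all hypothetical and are not the 8086's encoding. Each loop iteration stands in for one clock-synchronized step in which the control unit picks the operation and the register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy instruction set (an assumption, not a real encoding). */
enum { OP_HALT = 0, OP_LOADI = 1, OP_ADD = 2 };

typedef struct {
    size_t  pc;         /* program counter: index of the next instruction byte */
    int32_t reg[4];     /* tiny register file */
    int     zero_flag;  /* one condition flag, set when a result is zero */
} ToyCpu;

/* Each instruction is 3 bytes: opcode, destination register, operand.
 * Runs until OP_HALT (or an unknown opcode, or the end of memory) and
 * returns the number of instructions fetched. */
static int toy_run(ToyCpu *cpu, const uint8_t *mem, size_t mem_size)
{
    int ticks = 0;
    assert(cpu != NULL && mem != NULL);
    while (cpu->pc + 2 < mem_size) {
        /* Fetch: read the instruction at the program counter. */
        uint8_t op  = mem[cpu->pc];
        uint8_t dst = mem[cpu->pc + 1] & 3;
        uint8_t arg = mem[cpu->pc + 2];
        cpu->pc += 3;
        ticks++;

        /* Decode and execute: "what operation?" and "which register?". */
        if (op == OP_LOADI)
            cpu->reg[dst] = arg;                 /* immediate operand */
        else if (op == OP_ADD)
            cpu->reg[dst] += cpu->reg[arg & 3];  /* register operand */
        else
            break;                               /* OP_HALT or unknown opcode */

        cpu->zero_flag = (cpu->reg[dst] == 0);
    }
    return ticks;
}
```

Running it on the byte sequence `{OP_LOADI,0,5, OP_LOADI,1,17, OP_ADD,0,1, OP_HALT,0,0}` with a zero-initialized `ToyCpu` leaves 22 in `reg[0]` after four fetches.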

  15. What is... an ALU?
      Arithmetic Logic Unit
      One or two operands A, B; result R
      Operation selector (Op):
      • (Integer) Addition, Subtraction
      • (Logical) And, Or, Not
      • (Bitwise) Shifts (equivalent to multiplication or division by a power of two)
      • (Integer) Multiplication, Division
      Specialized ALUs:
      • Floating Point Unit (FPU)
      • Address ALU
      Operates on binary representations of numbers. Negative numbers represented by two’s complement.
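As a rough sketch of the operation selector, the C function below models a 32-bit ALU with operands A and B and result R. The names `AluOp` and `alu` are made up for this illustration, and the 32-bit width is an assumption. Subtraction is written as addition of the two's complement to show how the two ideas on the slide connect.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical operation selector values (illustrative names). */
typedef enum { ALU_ADD, ALU_SUB, ALU_AND, ALU_OR, ALU_NOT, ALU_SHL, ALU_SHR } AluOp;

/* Computes R from operands A and B according to the selected operation.
 * Unsigned arithmetic wraps modulo 2^32, which is exactly how two's
 * complement hardware behaves. */
static uint32_t alu(AluOp op, uint32_t a, uint32_t b)
{
    switch (op) {
    case ALU_ADD: return a + b;
    case ALU_SUB: return a + (~b + 1u);   /* a - b: add the two's complement of b */
    case ALU_AND: return a & b;
    case ALU_OR:  return a | b;
    case ALU_NOT: return ~a;              /* one operand only; b is ignored */
    case ALU_SHL: return a << (b & 31u);  /* multiply by 2^b (modulo 2^32) */
    case ALU_SHR: return a >> (b & 31u);  /* unsigned divide by 2^b */
    }
    assert(0 && "unknown ALU operation");
    return 0;
}
```

For example, `alu(ALU_SUB, 5, 17)` yields the bit pattern `0xFFFFFFF4`, which is -12 when read as a two's complement number, and `alu(ALU_SHL, 5, 3)` yields 40.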

  16. What is... a Register File?
      Registers are On-Chip Memory (diagram: %r0 through %r7)
      • Directly usable as operands in Machine Language
      • Often “general-purpose”
      • Sometimes special-purpose: Floating point, Indexing, Accumulator
      • Small: x86-64: 16 × 64 bit GPRs
      • Very fast (near-zero latency)

  17. How does computer memory work? One (reading) memory transaction (simplified): [Diagram: Processor and Memory connected by data lines D0..15, address lines A0..15, a R/W̄ line, and CLK.]


  23. How does computer memory work? One (reading) memory transaction (simplified): [Diagram: Processor and Memory connected by data lines D0..15, address lines A0..15, a R/W̄ line, and CLK.] Observation: Access (and addressing) happens in bus-width-size “chunks”.

  24. What is... a Memory Interface?
      Memory Interface gets and stores binary words in off-chip memory.
      Smallest granularity: Bus width
      Tells outside memory
      • “where” through address bus
      • “what” through data bus
      Computer main memory is “Dynamic RAM” (DRAM): Slow, but small and cheap (per bit).
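The "bus-width chunks" observation can be made concrete with a small model. The sketch below assumes a 16-bit data bus (matching D0..15), little-endian byte order, and memory addressed in whole words on the bus; `ToyBus`, `bus_read_word`, `read_byte` and `read_u16` are hypothetical names for this illustration. It counts transactions to show that even a single byte costs a full word transfer, and that a misaligned 16-bit value costs two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint16_t *words;        /* memory contents, one entry per bus-width word */
    size_t          n_words;
    unsigned        transactions; /* read transactions issued so far */
} ToyBus;

/* One read transaction: a word index goes out on the address lines,
 * 16 bits come back on the data lines. */
static uint16_t bus_read_word(ToyBus *bus, size_t word_index)
{
    assert(word_index < bus->n_words);
    bus->transactions++;
    return bus->words[word_index];
}

/* Reading a single byte still transfers the whole word containing it. */
static uint8_t read_byte(ToyBus *bus, size_t byte_addr)
{
    uint16_t w = bus_read_word(bus, byte_addr / 2);
    return (byte_addr % 2) ? (uint8_t)(w >> 8) : (uint8_t)(w & 0xFF);
}

/* Reads a 16-bit little-endian value at any byte address.
 * Aligned: one transaction. Misaligned: two, since it straddles two words. */
static uint16_t read_u16(ToyBus *bus, size_t byte_addr)
{
    if (byte_addr % 2 == 0)
        return bus_read_word(bus, byte_addr / 2);
    uint16_t lo = read_byte(bus, byte_addr);
    uint16_t hi = read_byte(bus, byte_addr + 1);
    return (uint16_t)(lo | (hi << 8));
}
```

With memory words `{0x2211, 0x4433}`, the aligned read `read_u16(&bus, 2)` returns `0x4433` in one transaction, while the misaligned read `read_u16(&bus, 1)` returns `0x3322` and needs two.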


  26. A Very Simple Program
      int a = 5;
      int b = 17;
      int z = a * b;

       4: c7 45 f4 05 00 00 00    movl $0x5,-0xc(%rbp)
       b: c7 45 f8 11 00 00 00    movl $0x11,-0x8(%rbp)
      12: 8b 45 f4                mov -0xc(%rbp),%eax
      15: 0f af 45 f8             imul -0x8(%rbp),%eax
      19: 89 45 fc                mov %eax,-0x4(%rbp)
      1c: 8b 45 fc                mov -0x4(%rbp),%eax
      Things to know:
      • Addressing modes (Immediate, Register, Base plus Offset)
      • 0xHexadecimal
      • “AT&T Form”: (we’ll use this) <opcode><size> <source>, <dest>
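To make the three addressing modes tangible, the sketch below replays the six instructions in C, using an explicit byte array as a stand-in for the stack frame. The function name `simple_program`, the 16-byte frame size, and the use of `memcpy` for the 4-byte loads and stores are assumptions of this illustration; a real compiler simply uses the machine stack. The pointer `rbp` plays the base register, the constants are immediates, and `eax` is the register operand.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Emulates the listing's six instructions, one statement per instruction. */
static int32_t simple_program(void)
{
    uint8_t  frame[16] = {0};
    uint8_t *rbp = frame + 16;  /* base pointer: locals live at negative offsets */
    int32_t  eax, imm, tmp;

    assert(0x11 == 17);         /* hexadecimal 0x11 is decimal 17 */

    imm = 0x5;  memcpy(rbp - 0xc, &imm, 4);  /* movl $0x5,-0xc(%rbp):  a = 5   */
    imm = 0x11; memcpy(rbp - 0x8, &imm, 4);  /* movl $0x11,-0x8(%rbp): b = 17  */
    memcpy(&eax, rbp - 0xc, 4);              /* mov -0xc(%rbp),%eax            */
    memcpy(&tmp, rbp - 0x8, 4); eax *= tmp;  /* imul -0x8(%rbp),%eax           */
    memcpy(rbp - 0x4, &eax, 4);              /* mov %eax,-0x4(%rbp):   z = a*b */
    memcpy(&eax, rbp - 0x4, 4);              /* mov -0x4(%rbp),%eax            */
    return eax;
}
```

The function returns 85, the value of `z` that the listing leaves in `%eax`.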

  27. Another Look [Processor block diagram from slide 12.]

  28. Another Look [The AT&T-form listing from slide 26, shown next to the processor block diagram from slide 12.]

  29. A Very Simple Program: Intel Form
       4: c7 45 f4 05 00 00 00    mov DWORD PTR [rbp-0xc],0x5
       b: c7 45 f8 11 00 00 00    mov DWORD PTR [rbp-0x8],0x11
      12: 8b 45 f4                mov eax,DWORD PTR [rbp-0xc]
      15: 0f af 45 f8             imul eax,DWORD PTR [rbp-0x8]
      19: 89 45 fc                mov DWORD PTR [rbp-0x4],eax
      1c: 8b 45 fc                mov eax,DWORD PTR [rbp-0x4]
      • “Intel Form”: (you might see this on the net) <opcode> <sized dest>, <sized source>
      • Goal: Reading comprehension.
      • Don’t understand an opcode? Google “<opcode> intel instruction”.

  30. Machine Language Loops
      int main()
      {
        int y = 0, i;
        for (i = 0; y < 10; ++i)
          y += i;
        return y;
      }

       0: 55                      push %rbp
       1: 48 89 e5                mov %rsp,%rbp
       4: c7 45 f8 00 00 00 00    movl $0x0,-0x8(%rbp)
       b: c7 45 fc 00 00 00 00    movl $0x0,-0x4(%rbp)
      12: eb 0a                   jmp 1e <main+0x1e>
      14: 8b 45 fc                mov -0x4(%rbp),%eax
      17: 01 45 f8                add %eax,-0x8(%rbp)
      1a: 83 45 fc 01             addl $0x1,-0x4(%rbp)
      1e: 83 7d f8 09             cmpl $0x9,-0x8(%rbp)
      22: 7e f0                   jle 14 <main+0x14>
      24: 8b 45 f8                mov -0x8(%rbp),%eax
      27: c9                      leaveq
      28: c3                      retq
      Things to know:
      • Condition Codes (Flags): Zero, Sign, Carry, etc.
      • Call Stack: Stack frame, stack pointer, base pointer
      • ABI: Calling conventions

  31. Machine Language Loops [Same source, listing and “Things to know” as slide 30.]
      Want to make those yourself? Write myprogram.c.
      $ cc -c myprogram.c
      $ objdump --disassemble myprogram.o
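One way to read the loop listing is to rewrite the C source with `goto` so that it follows the same control flow as the machine code: jump to the condition first, then jump back to the body while the comparison holds. The sketch below does that. The function names and labels are made up for this illustration, and the comments map each statement to the instruction it is assumed to correspond to.

```c
#include <assert.h>

/* The loop from the slide, as written in the source. */
static int loop_source(void)
{
    int y = 0, i;
    for (i = 0; y < 10; ++i)
        y += i;
    return y;
}

/* The same loop, restructured to mirror the compiled control flow. */
static int loop_as_compiled(void)
{
    int y = 0;               /* movl $0x0,-0x8(%rbp) */
    int i = 0;               /* movl $0x0,-0x4(%rbp) */
    goto check;              /* jmp 1e */
body:
    y += i;                  /* mov -0x4(%rbp),%eax ; add %eax,-0x8(%rbp) */
    i += 1;                  /* addl $0x1,-0x4(%rbp) */
check:
    if (y <= 9) goto body;   /* cmpl $0x9,-0x8(%rbp) ; jle 14 */
    return y;                /* mov -0x8(%rbp),%eax */
}

/* Sanity check: both versions compute the same value. */
static void check_loops_agree(void)
{
    assert(loop_as_compiled() == loop_source());
}
```

Note how the source condition `y < 10` shows up as a comparison against 9 followed by "jump if less or equal". The `cmpl` instruction sets the condition codes, and `jle` reads them.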


  33. We know how a computer works! All of this can be built in about 4000 transistors (e.g. MOS 6502 in Apple II, Commodore 64, Atari 2600). So what exactly is Intel doing with the other 581,996,000 transistors? Answer: Make things go faster!
