Emulation Outline Emulation Interpretation basic, threaded, - PowerPoint PPT Presentation

Emulation – Outline
• Emulation
• Interpretation
  – basic, threaded, direct threaded
  – other issues
• Binary translation
  – code discovery, code location
  – other issues
• Control Transfer Optimizations
EECS 768 Virtual Machines

Key VM Technologies
• Emulation
  – a binary in one ISA is executed on a processor supporting a different ISA
• Dynamic Optimization
  – binary is improved for higher performance
  – may be done as part of emulation
  – may optimize same ISA (no emulation needed)
[Figure: two stacks, one labeled Emulation (X86 apps, Windows, Alpha ISA) and one labeled Optimization (HP apps, HP UX, HP PA ISA)]

Emulation vs. Simulation
• Emulation
  – method for enabling a (sub)system to present the same interface and characteristics as another
  – ways of implementing emulation
    • interpretation: relatively inefficient, instruction-at-a-time
    • binary translation: block-at-a-time, optimized for repeated execution
  – e.g., the execution of programs compiled for instruction set A on a machine that executes instruction set B
• Simulation
  – method for modeling a (sub)system's operation
  – objective is to study the process, not just to imitate the function
  – typically emulation is part of the simulation process

Definitions
• Guest
  – environment being supported by the underlying platform
• Host
  – underlying platform that provides the guest environment
[Figure: Guest, supported by Host]

Definitions (2)
• Source ISA or binary
  – original instruction set or binary
  – the ISA to be emulated
• Target ISA or binary
  – ISA of the host processor
  – underlying ISA
• Source/Target refer to ISAs
• Guest/Host refer to platforms
[Figure: Source, emulated by Target]

Emulation
• Required for implementing many VMs
• Process of implementing the interface and functionality of one (sub)system on a (sub)system having a different interface and functionality
  – terminal emulators, e.g., xterm and PuTTY emulating a VT100
• Instruction set emulation
  – binaries in the source instruction set can be executed on a machine implementing the target instruction set
  – e.g., IA-32 Execution Layer

Interpretation vs. Translation
• Interpretation
  – simple and easy to implement, portable
  – low performance
  – threaded interpretation
• Binary translation
  – complex implementation
  – high initial translation cost, small execution cost
  – selective compilation
• We focus on user-level instruction set emulation of program binaries.

Interpreter State
• An interpreter needs to maintain the complete architected state of the machine implementing the source ISA
  – registers
  – memory
    • code
    • data
    • stack
[Figure: interpreter state. Registers held by the interpreter: Program Counter, Condition Codes, Reg 0, Reg 1, ..., Reg n-1. Source memory image: Code, Data, Stack. Alongside: Interpreter Code]

Decode – Dispatch Interpreter
• Decode and dispatch interpreter
  – step through the source program one instruction at a time
  – decode the current instruction
  – dispatch to the corresponding interpreter routine
  – very high interpretation cost

    while (!halt && !interrupt) {
        inst = code[PC];                    /* fetch the source instruction */
        opcode = extract(inst, 31, 6);      /* decode the primary opcode */
        switch (opcode) {                   /* dispatch to the instruction function */
            case LoadWordAndZero: LoadWordAndZero(inst); break;
            case ALU:             ALU(inst);             break;
            case Branch:          Branch(inst);          break;
            . . .
        }
    }

[Figure label: instruction function list]
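The loop above relies on an extract helper that the slides do not define. A minimal sketch follows, assuming extract(inst, start, len) returns the len-bit field whose most significant bit is bit start, with bit 0 as the least significant bit. That reading is an assumption, chosen because it makes the calls on these slides (31,6 / 25,5 / 20,5 / 15,16 / 10,10) tile a 32-bit word without overlap.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed semantics (not stated on the slides): return the `len`-bit field
 * of `inst` whose most significant bit is bit `start`; bit 0 is the LSB. */
static uint32_t extract(uint32_t inst, unsigned start, unsigned len)
{
    assert(len >= 1 && len <= 32 && start <= 31 && start + 1 >= len);
    uint32_t shifted = inst >> (start + 1 - len);   /* move field down to bit 0 */
    uint32_t mask = (len == 32) ? 0xFFFFFFFFu : ((1u << len) - 1u);
    return shifted & mask;                          /* drop the bits above it */
}
```

For example, the word 0x80220008 holds 32 in its top six bits, 1 and 2 in the next two five-bit fields, and 8 in its low sixteen bits, so the four calls used by the load routine return 32, 1, 2 and 8.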

Decode – Dispatch Interpreter (2)
• Instruction function: Load

    LoadWordAndZero(inst) {
        RT = extract(inst, 25, 5);              /* target register */
        RA = extract(inst, 20, 5);              /* base register */
        displacement = extract(inst, 15, 16);
        if (RA == 0) source = 0;                /* RA == 0 means "no base" */
        else source = regs[RA];
        address = source + displacement;
        regs[RT] = (data[address] << 32) >> 32; /* clear the upper 32 bits */
        PC = PC + 4;
    }

Decode – Dispatch Interpreter (3)
• Instruction function: ALU

    ALU(inst) {
        RT = extract(inst, 25, 5);
        RA = extract(inst, 20, 5);
        RB = extract(inst, 15, 5);
        source1 = regs[RA];
        source2 = regs[RB];
        extended_opcode = extract(inst, 10, 10);
        switch (extended_opcode) {              /* second-level dispatch */
            case Add:         Add(inst);         break;
            case AddCarrying: AddCarrying(inst); break;
            case AddExtended: AddExtended(inst); break;
            . . .
        }
        PC = PC + 4;
    }
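Putting the three decode-dispatch slides together, a small runnable version might look like the sketch below. Everything not on the slides is an assumption made for illustration: the opcode values (OP_HALT, OP_LOAD, OP_ALU, XO_ADD), the Cpu struct, the enc_load and enc_add assembler helpers, and the choice to index data by word. There is no bounds checking on code or data.

```c
#include <assert.h>
#include <stdint.h>

enum { OP_HALT = 0, OP_LOAD = 1, OP_ALU = 2 };  /* hypothetical primary opcodes */
enum { XO_ADD = 1 };                            /* hypothetical extended opcode */

typedef struct {
    uint64_t regs[32];
    uint64_t pc;       /* byte address of the current source instruction */
    int halt;
} Cpu;

/* Field of `len` bits whose top bit is bit `start` (assumed semantics). */
static uint32_t extract(uint32_t inst, unsigned start, unsigned len)
{
    return (inst >> (start + 1 - len)) & ((1u << len) - 1u);
}

static void load_word_and_zero(Cpu *cpu, const uint64_t *data, uint32_t inst)
{
    unsigned rt = extract(inst, 25, 5);
    unsigned ra = extract(inst, 20, 5);
    uint64_t disp = extract(inst, 15, 16);
    uint64_t source = (ra == 0) ? 0 : cpu->regs[ra];
    cpu->regs[rt] = (data[source + disp] << 32) >> 32;  /* keep the low 32 bits */
    cpu->pc += 4;
}

static void alu(Cpu *cpu, uint32_t inst)
{
    unsigned rt = extract(inst, 25, 5);
    unsigned ra = extract(inst, 20, 5);
    unsigned rb = extract(inst, 15, 5);
    switch (extract(inst, 10, 10)) {            /* second-level dispatch */
    case XO_ADD: cpu->regs[rt] = cpu->regs[ra] + cpu->regs[rb]; break;
    default:     cpu->halt = 1; break;          /* unknown extended opcode */
    }
    cpu->pc += 4;
}

/* The central decode-dispatch loop: fetch, decode, dispatch, repeat. */
static void run(Cpu *cpu, const uint32_t *code, const uint64_t *data)
{
    while (!cpu->halt) {
        assert(cpu->pc % 4 == 0);
        uint32_t inst = code[cpu->pc / 4];
        switch (extract(inst, 31, 6)) {
        case OP_LOAD: load_word_and_zero(cpu, data, inst); break;
        case OP_ALU:  alu(cpu, inst); break;
        default:      cpu->halt = 1; break;     /* OP_HALT or unknown opcode */
        }
    }
}

/* Assemble words in the assumed field layout. */
static uint32_t enc_load(unsigned rt, unsigned ra, unsigned disp)
{
    return ((uint32_t)OP_LOAD << 26) | (rt << 21) | (ra << 16) | (disp & 0xFFFFu);
}

static uint32_t enc_add(unsigned rt, unsigned ra, unsigned rb)
{
    return ((uint32_t)OP_ALU << 26) | (rt << 21) | (ra << 16) | (rb << 11)
         | ((uint32_t)XO_ADD << 1);
}
```

Running a four-word program (two loads, an add, then a zero word acting as halt) passes through the switch once per source instruction, which is the per-instruction overhead the following slides set out to reduce.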

Decode – Dispatch Efficiency
• Decode-Dispatch Loop
  – mostly serial code
  – case statement (hard-to-predict indirect jump)
  – call to function routine
  – return
• Executing an add instruction
  – approximately 20 target instructions
  – several loads/stores and shift/mask steps
• Hand-coding can lead to better performance
  – example: DEC/Compaq FX!32

Indirect Threaded Interpretation
• The high number of branches in decode-dispatch interpretation reduces performance
  – overhead of 5 branches per instruction
• Threaded interpretation improves efficiency by reducing branch overhead
  – append the dispatch code to each interpretation routine
  – removes 3 branches
  – threads together function routines

Indirect Threaded Interpretation (2)

    LoadWordAndZero:
        RT = extract(inst, 25, 5);
        RA = extract(inst, 20, 5);
        displacement = extract(inst, 15, 16);
        if (RA == 0) source = 0;
        else source = regs[RA];
        address = source + displacement;
        regs[RT] = (data[address] << 32) >> 32;
        PC = PC + 4;
        /* dispatch code appended to the routine */
        if (halt || interrupt) goto exit;
        inst = code[PC];
        opcode = extract(inst, 31, 6);
        extended_opcode = extract(inst, 10, 10);
        routine = dispatch[opcode][extended_opcode];
        goto *routine;

Indirect Threaded Interpretation (3)

    Add:
        RT = extract(inst, 25, 5);
        RA = extract(inst, 20, 5);
        RB = extract(inst, 15, 5);
        source1 = regs[RA];
        source2 = regs[RB];
        sum = source1 + source2;
        regs[RT] = sum;
        PC = PC + 4;
        /* dispatch code appended to the routine */
        if (halt || interrupt) goto exit;
        inst = code[PC];
        opcode = extract(inst, 31, 6);
        extended_opcode = extract(inst, 10, 10);
        routine = dispatch[opcode][extended_opcode];
        goto *routine;

Indirect Threaded Interpretation (4)
• Dispatch occurs indirectly through a table
  – interpretation routines can be modified and relocated independently
• Advantages
  – binary intermediate code still portable
  – improves efficiency over basic interpretation
• Disadvantages
  – code replication increases interpreter size
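The "goto *routine" form on the preceding slides is not standard C. One way to try it out is the GNU C "labels as values" extension (supported by GCC and Clang), as in the sketch below. The opcode values, the FIELD and DISPATCH macros, and the one-dimensional table are assumptions made to keep the example short; the slides index the table by both opcode and extended opcode.

```c
#include <assert.h>
#include <stdint.h>

enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2, NUM_OPS = 3 };  /* hypothetical */

/* Field of `len` bits whose top bit is bit `start` (assumed semantics). */
#define FIELD(inst, start, len) \
    (((inst) >> ((start) + 1 - (len))) & ((1u << (len)) - 1u))

/* Dispatch code replicated at the end of every routine:
 * fetch, decode the opcode, look up the routine in the table, jump. */
#define DISPATCH()                              \
    do {                                        \
        inst = code[pc / 4];                    \
        opcode = FIELD(inst, 31, 6);            \
        if (opcode >= NUM_OPS) goto op_halt;    \
        goto *dispatch[opcode];                 \
    } while (0)

/* Runs until a halt (or unknown) opcode; returns the final source PC. */
static uint64_t run_threaded(const uint32_t *code, const uint64_t *data,
                             uint64_t *regs)
{
    /* GNU C extension: &&label is the address of a label. */
    static void *const dispatch[NUM_OPS] = { &&op_halt, &&op_load, &&op_add };
    uint64_t pc = 0;
    uint32_t inst = 0, opcode = 0;

    assert(code != 0 && data != 0 && regs != 0);
    DISPATCH();                                 /* enter the first routine */

op_load: {
        unsigned rt = FIELD(inst, 25, 5), ra = FIELD(inst, 20, 5);
        uint64_t source = (ra == 0) ? 0 : regs[ra];
        regs[rt] = (data[source + FIELD(inst, 15, 16)] << 32) >> 32;
        pc += 4;
        DISPATCH();                             /* no return to a central loop */
    }
op_add: {
        unsigned rt = FIELD(inst, 25, 5), ra = FIELD(inst, 20, 5);
        unsigned rb = FIELD(inst, 15, 5);
        regs[rt] = regs[ra] + regs[rb];
        pc += 4;
        DISPATCH();
    }
op_halt:
    return pc;
}
```

Because the table is the only place that names the routines, the source binary itself stays untouched, which is the portability advantage listed above; the cost is one table load per interpreted instruction and a copy of the dispatch code in every routine.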

Indirect Threaded Interpretation (5)
[Figure: control flow in decode-dispatch interpretation (source code read through "data" accesses, dispatch loop, interpreter routines) compared with threaded interpretation (source code, interpreter routines)]

Predecoding
• Parse each instruction into a pre-defined structure to facilitate interpretation
  – separate opcode, operands, etc.
  – reduces shifts / masks significantly
  – more useful for CISC ISAs

    Source instruction    Predecoded form (op | operands)
    lwz r1, 8(r2)         07 | 1 2 08    (load word and zero)
    add r3, r3, r1        08 | 3 1 03    (add)
    stw r3, 0(r4)         37 | 3 4 00    (store word)

Predecoding (2)

    struct instruction {
        unsigned long op;
        unsigned char dest, src1, src2;
    } code[CODE_SIZE];

    LoadWordAndZero:
        RT = code[TPC].dest;            /* fields were split out by the predecoder */
        RA = code[TPC].src1;
        displacement = code[TPC].src2;
        if (RA == 0) source = 0;
        else source = regs[RA];
        address = source + displacement;
        regs[RT] = (data[address] << 32) >> 32;
        SPC = SPC + 4;                  /* source PC: advances by instruction bytes */
        TPC = TPC + 1;                  /* index into the predecoded code */
        if (halt || interrupt) goto exit;
        opcode = code[TPC].op;
        routine = dispatch[opcode];
        goto *routine;
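The slides show the predecoded structure and a routine that consumes it, but not the pass that fills it in. A minimal sketch of such a pass follows. The opcode values and the field layout are assumptions carried over from the earlier pseudocode, and src2 is widened to 16 bits here so that a full displacement fits, whereas the struct on the slide uses unsigned char.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2 };  /* hypothetical opcodes */

struct instruction {
    unsigned long op;
    unsigned char dest, src1;
    uint16_t src2;       /* register number, or a 16-bit displacement */
};

/* Field of `len` bits whose top bit is bit `start` (assumed semantics). */
static uint32_t extract(uint32_t inst, unsigned start, unsigned len)
{
    return (inst >> (start + 1 - len)) & ((1u << len) - 1u);
}

/* Translate n raw source words into the intermediate form, one entry each.
 * All shifting and masking happens here, once, instead of on every execution. */
static void predecode(const uint32_t *raw, size_t n, struct instruction *out)
{
    assert(raw != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        uint32_t inst = raw[i];
        out[i].op   = extract(inst, 31, 6);
        out[i].dest = (unsigned char)extract(inst, 25, 5);
        out[i].src1 = (unsigned char)extract(inst, 20, 5);
        out[i].src2 = (out[i].op == OP_LOAD)
                    ? (uint16_t)extract(inst, 15, 16)   /* displacement */
                    : (uint16_t)extract(inst, 15, 5);   /* second source register */
    }
}
```

After this pass the interpreter routines read ready-made fields such as code[TPC].dest, which is where the reduction in shift and mask work comes from.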

Direct Threaded Interpretation
• Allow even higher efficiency by
  – removing the memory access to the centralized table
  – requires predecoding
  – dependent on locations of interpreter routines
    • loses portability

    Routine address | operands
    001048d0 | 1 2 08    (load word and zero)
    00104800 | 3 1 03    (add)
    00104910 | 3 4 00    (store word)

Direct Threaded Interpretation (2)
• Predecode the source binary into an intermediate structure
• Replace the opcode in the intermediate form with the address of the interpreter routine
• Remove the memory lookup of the dispatch table
• Limits portability since exact locations of the interpreter routines are needed

Direct Threaded Interpretation (3)

    LoadWordAndZero:
        RT = code[TPC].dest;
        RA = code[TPC].src1;
        displacement = code[TPC].src2;
        if (RA == 0) source = 0;
        else source = regs[RA];
        address = source + displacement;
        regs[RT] = (data[address] << 32) >> 32;
        SPC = SPC + 4;
        TPC = TPC + 1;
        if (halt || interrupt) goto exit;
        routine = code[TPC].op;         /* op now holds the routine address */
        goto *routine;
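A runnable approximation of direct threading, again leaning on the GNU C "labels as values" extension (GCC and Clang), is sketched below. The struct names, opcode values, MAX_CODE limit and NEXT macro are assumptions for illustration, and the source PC (SPC) is left out for brevity. The opcode-to-address table is consulted only while building the intermediate code, never during execution.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2, NUM_OPS = 3 };  /* hypothetical */
enum { MAX_CODE = 64 };                                      /* hypothetical limit */

struct predecoded { unsigned op;   unsigned char dest, src1; uint16_t src2; };
struct threaded   { void *routine; unsigned char dest, src1; uint16_t src2; };

/* Dispatch is a single indirect jump through the intermediate code. */
#define NEXT() goto *code[tpc].routine

/* Runs a branch-free program that ends in OP_HALT; returns the final TPC. */
static size_t run_direct(const struct predecoded *pre, size_t n,
                         const uint64_t *data, uint64_t *regs)
{
    static void *const table[NUM_OPS] = { &&op_halt, &&op_load, &&op_add };
    struct threaded code[MAX_CODE];
    size_t tpc = 0;

    assert(n > 0 && n <= MAX_CODE && pre[n - 1].op == OP_HALT);

    /* Predecode pass: replace each opcode with the address of its routine. */
    for (size_t i = 0; i < n; i++) {
        assert(pre[i].op < NUM_OPS);
        code[i].routine = table[pre[i].op];
        code[i].dest = pre[i].dest;
        code[i].src1 = pre[i].src1;
        code[i].src2 = pre[i].src2;
    }

    NEXT();                                     /* enter the first routine */

op_load: {
        uint64_t source = (code[tpc].src1 == 0) ? 0 : regs[code[tpc].src1];
        regs[code[tpc].dest] = (data[source + code[tpc].src2] << 32) >> 32;
        tpc++;
        NEXT();                                 /* no table lookup here */
    }
op_add:
    regs[code[tpc].dest] = regs[code[tpc].src1] + regs[code[tpc].src2];
    tpc++;
    NEXT();
op_halt:
    return tpc;
}
```

The stored addresses are only meaningful inside this one build of the interpreter, which illustrates the portability limitation noted on the slides: the intermediate code cannot be saved and reused by an interpreter whose routines sit at different addresses.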

Direct Threaded Interpretation (4)
[Figure: the source code passes through the pre-decoder to produce intermediate code, which is executed by the interpreter routines]

Interpreter Control Flow
• Decode for CISC ISA
• Individual routines for each instruction
[Figure: General Decode (fill-in instruction structure), then Dispatch, then one specialized routine per instruction (Inst. 1, Inst. 2, ..., Inst. n)]

Interpreter Control Flow (2)
• For CISC ISAs
  – multiple byte opcode
  – make common cases fast
[Figure: Dispatch on first byte, leading to specialized routines for simple instructions (Simple Inst. 1 ... Simple Inst. m), specialized routines for complex instructions (Complex Inst. m+1 ... Complex Inst. n), and a Prefix routine that sets flags; Shared Routines sit below the specialized routines]
