Virtio 1 - why do it? And - are we there yet? 2015. Michael S. Tsirkin, Red Hat. Uses material from https://lwn.net/Kernel/LDD3/, Gcompris, tuxpaint. Distributed under the Creative Commons license. 1
Lots of work ... [Chart: commits per year, Aug 6 to Aug 6 in each year, 2006 to 2015; y-axis 0 to 300] 2
Virtio 1: update ● Documented assumptions ● More Robust ● More Extendable 3
Conformance statements
Virtio 0.9:
- DRIVER_OK status bit is set.
- The device can now be used.
Virtio 1.0:
The driver MUST NOT notify the device before setting DRIVER_OK.
Call orderings (left column, right column):
drv->probe(dev);               drv->probe(dev);
add_status(dev, DRIVER_OK);    netif_carrier_on(dev);
netif_carrier_on(dev);         add_status(dev, DRIVER_OK);
4
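The Virtio 1.0 statement above can be illustrated with a small simulation. This is a minimal sketch, not driver code: `struct fake_dev`, `dev_notify` and `driver_start` are hypothetical names introduced here, and only the DRIVER_OK bit value (4) is taken from the Virtio 1.0 specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_STATUS_DRIVER_OK 4u   /* status bit value from the Virtio 1.0 spec */

/* Hypothetical device model: tracks only status and notifications. */
struct fake_dev {
    uint8_t  status;
    unsigned notifications;   /* notifications accepted by the device */
    bool     violation;       /* set if notified before DRIVER_OK */
};

/* Device side: a notification that arrives before DRIVER_OK is flagged. */
static void dev_notify(struct fake_dev *d)
{
    if (!(d->status & VIRTIO_STATUS_DRIVER_OK)) {
        d->violation = true;
        return;
    }
    d->notifications++;
}

/* Driver side: set DRIVER_OK first, and only then notify the device. */
static void driver_start(struct fake_dev *d)
{
    d->status = (uint8_t)(d->status | VIRTIO_STATUS_DRIVER_OK);
    dev_notify(d);
    assert(!d->violation);    /* this ordering never trips the check */
}
```

Calling `dev_notify` on a zero-initialized `struct fake_dev` sets `violation`, which is the ordering the 1.0 statement rules out.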
Virtio 0.9: inflate [Diagram: field bits 0..31. Device value FF FF 00 00, +1 gives 00 00 01 00; DRIVER sees FF FF 01 00] 5
Virtio 1.0: inflate [Diagram: field bits 0..31. Device value FF FF 00 00, +1 gives 00 00 01 00; DRIVER sees 00 00 01 00 or FF FF 00 00] 6
Generation counter [Diagram: generation 0, then 1; field bits 0..63. Device value FFFFFFFF 00000000, +1 gives 00000000 00000001; DRIVER sees FFFFFFFF 00000001, then 00000000 00000001] 7
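The generation counter idea can be sketched as a retry loop: read the counter, read the field, read the counter again, and retry if it changed. This is a minimal sketch under stated assumptions: `struct fake_cfg` and `read_cfg64` are hypothetical names, the 64-bit field is assumed to be exposed as two 32-bit halves (low half first), and `config_generation` follows the field name used in the Virtio 1.0 specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device's configuration space. */
struct fake_cfg {
    uint8_t  config_generation;  /* changed by the device on every config update */
    uint32_t lo, hi;             /* one 64-bit field as two 32-bit halves */
};

/* Read both halves, retrying until the generation is stable across the read. */
static uint64_t read_cfg64(const volatile struct fake_cfg *cfg)
{
    uint8_t before, after;
    uint32_t lo, hi;

    assert(cfg != NULL);
    do {
        before = cfg->config_generation;
        lo = cfg->lo;
        hi = cfg->hi;
        after = cfg->config_generation;
    } while (before != after);   /* device updated the field mid-read: retry */

    return ((uint64_t)hi << 32) | lo;
}
```

With the values from the slide, a stable read after the update (low half 00000000, high half 00000001) yields 0x0000000100000000 rather than a mix of old and new halves.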
Memory map 0.9 COMMON DEVICE SPECIFIC FEATURES QUEUE STATUS ISR COMMON CAPABILITY LIST 1.0 IO BAR FEATURES QUEUE STATUS ISR ... MEMORY BAR DEVICE SPECIFIC VIRTIO CAPABILITY #1 VIRTIO CAPABILITY #2 8
Virtio 0.9: Port IO vs Memory [Table: columns Port IO, MM IO; rows x86 decode: address, x86 decode: data, Fast on x86, 32/64 bit, Page tables, Required by PCI Express] 9
Fast MMIO: avoid need to decode data [Diagram. 0.9: ADDRESS = NOTIFY, DATA = VQ NUMBER (bits 0..15). 1.0: ADDRESS = NOTIFY and VQ NUMBER, DATA = IGNORED] 10
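Putting the queue into the address means each virtqueue gets its own notify location. A minimal sketch of the address computation, assuming the layout of the Virtio 1.0 PCI notification capability (`offset`, `notify_off_multiplier`, and the per-queue `queue_notify_off`); `struct notify_cap` and `queue_notify_offset` are hypothetical names for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields named after the Virtio 1.0 PCI notification capability. */
struct notify_cap {
    uint32_t offset;                 /* start of the notify region within the BAR */
    uint32_t notify_off_multiplier;  /* stride between per-queue notify locations */
};

/* Offset within the BAR that the driver writes to in order to notify one queue. */
static uint32_t queue_notify_offset(const struct notify_cap *cap,
                                    uint16_t queue_notify_off)
{
    assert(cap != NULL);
    return cap->offset + (uint32_t)queue_notify_off * cap->notify_off_multiplier;
}
```

With a multiplier of 0, every queue shares one location and the device has to look at the written data again; a non-zero multiplier lets the hypervisor identify the queue from the address alone.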
Virtio 1: Access times on KVM x86: Cycles per access (lower is better) [Chart: CPU cycles for MMIO, Fast MMIO and Port IO; y-axis 0 to 4000] 11
Virtio 1: Port IO vs Memory [Table: columns Port IO, MM IO; rows x86 decode: address, Fast on x86, 32/64 bit, Page tables, Required by PCI Express] 12
Memory Region Aliases CAPABILITY LIST IO BAR Queue Notify VirtQueue MEMORY BAR Queue Notify VIRTIO CONFIG Queue Notify CAPABILITY VIRTIO CAPABILITY 13
soft mac Ethernet MAC 52 54 00 12 34 56 DRIVER 0.9 1.0 52 54 00 12 34 56 VirtQueue DRIVER 14
Virtio feature negotiation 0..............1...........2............. DEVICE FEATURES 0 1 1 -|- DRIVER DRIVER FEATURES 0 0 1 -|- Defaults must be maintained forever! 15
Virtio 1: Error handling ● DRIVER: set features ● DRIVER: set FEATURES_OK bit ● DEVICE: check features ● DEVICE: clear FEATURES_OK on error ● DRIVER: check FEATURES_OK bit ● DRIVER: fail gracefully if not set ! 16
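The handshake in the list above can be sketched as a small simulation. This is a minimal sketch, not the kernel's implementation: `struct fake_dev`, `dev_write_status` and `driver_negotiate` are hypothetical names, and the device's check (reject any feature bit it did not offer) is an assumed stand-in for whatever validation a real device performs. The FEATURES_OK (8) and FAILED (128) bit values come from the Virtio 1.0 specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VIRTIO_STATUS_FEATURES_OK 8u
#define VIRTIO_STATUS_FAILED      128u

/* Hypothetical device model. */
struct fake_dev {
    uint64_t device_features;  /* bits offered by the device */
    uint64_t driver_features;  /* bits written back by the driver */
    uint8_t  status;
};

/* DEVICE: check features; clear FEATURES_OK on error (assumed check). */
static void dev_write_status(struct fake_dev *d, uint8_t status)
{
    bool unsupported = (d->driver_features & ~d->device_features) != 0;

    if ((status & VIRTIO_STATUS_FEATURES_OK) && unsupported)
        status = (uint8_t)(status & ~VIRTIO_STATUS_FEATURES_OK);
    d->status = status;
}

/* DRIVER: set features, set FEATURES_OK, read it back, fail gracefully. */
static bool driver_negotiate(struct fake_dev *d, uint64_t wanted)
{
    assert(d != NULL);
    d->driver_features = wanted;
    dev_write_status(d, (uint8_t)(d->status | VIRTIO_STATUS_FEATURES_OK));
    if (!(d->status & VIRTIO_STATUS_FEATURES_OK)) {
        dev_write_status(d, (uint8_t)(d->status | VIRTIO_STATUS_FAILED));
        return false;          /* device rejected the feature set */
    }
    return true;
}
```

The key point is the read-back: the driver learns about a rejected feature set from the status register instead of discovering it later through a misbehaving device.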
Error handling: Virtio 0.9 ● Can't recover from device errors ● Not very useful? ● Just stop guest. 17
Vhost-user GUEST virtio-net VM RAM DMA VHOST USER CLIENT SETUP Client crash or restart need not cause guest crash! 18
DEVICE_NEEDS_RESET Read STATUS; Write Reconfigure device. STATUS=0 Write Detect: Will reset device STATUS=DRIVER_OK NEEDS_RESET set Restart operation. DRIVER 19
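The recovery sequence on this slide (read STATUS, detect NEEDS_RESET, write STATUS=0, reconfigure, write DRIVER_OK, restart) can be sketched as follows. This is a minimal sketch with hypothetical names (`struct fake_dev`, `driver_recover`); reconfiguration is reduced to re-setting status bits, where a real driver would renegotiate features and rebuild its queues. The status bit values are those of the Virtio 1.0 specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VIRTIO_STATUS_ACKNOWLEDGE        1u
#define VIRTIO_STATUS_DRIVER             2u
#define VIRTIO_STATUS_DRIVER_OK          4u
#define VIRTIO_STATUS_FEATURES_OK        8u
#define VIRTIO_STATUS_DEVICE_NEEDS_RESET 64u

/* Hypothetical device model. */
struct fake_dev {
    uint8_t  status;
    unsigned resets;   /* how many times the driver reset the device */
};

/* Returns true if a reset and re-initialization was performed. */
static bool driver_recover(struct fake_dev *d)
{
    assert(d != NULL);

    /* Read STATUS; detect whether NEEDS_RESET is set. */
    if (!(d->status & VIRTIO_STATUS_DEVICE_NEEDS_RESET))
        return false;

    /* Write STATUS=0: this resets the device. */
    d->status = 0;
    d->resets++;

    /* Reconfigure device (feature negotiation and queue setup omitted). */
    d->status = (uint8_t)(d->status | VIRTIO_STATUS_ACKNOWLEDGE
                          | VIRTIO_STATUS_DRIVER | VIRTIO_STATUS_FEATURES_OK);

    /* Write DRIVER_OK and restart operation. */
    d->status = (uint8_t)(d->status | VIRTIO_STATUS_DRIVER_OK);
    return true;
}
```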
Compatibility Legacy Device Transitional Legacy Driver Device & Driver Legacy Modern Modern Legacy Legacy Modern Legacy Modern Modern Legacy Legacy Modern DRIVER DRIVER DRIVER DRIVER DRIVER 20
Are we there yet? GUEST GUEST BIOS DMA VHOST USER VHOST 21
What to expect? ● Current: Virtio-v1.0-cs03 ● Next bugfix: Virtio-v1.0-cs04 – Virtio-blk: writeback / writethrough control – More update guidance ● Next feature: Virtio-v1.1-cs01 – Virtio-input – Virtio-gpu – Virtio-vsock 22
TX: Interrupt avoidance uplink 23
TX: Interrupt coalescing uplink 24
Pass-through for nested virt Virtio Net (on host) ● Memory mapped: use page tables ● IOMMU: translate and protect guest memory 25
Virtio as PCI Express device ● Uses memory mapped IO support ● Multi-root for NUMA ● Native hotplug ● Advanced Error Reporting 26
Summary ● Why do it? – Improved robustness for virtual devices ● Are we there yet? – Yes! – And there's more to come. 27
Thank you! 28
Virtio 0.9: Port IO versus memory on KVM x86: cycles per access (lower is better) [Chart: CPU cycles for MMIO and Port IO; y-axis 0 to 4000] 29
OASIS Virtio TC PCI Virtio 1.0 MMIO (ARM) CCW (PPC) 30
Virtio 1.0 ● Virtio PCI: – Replace Port IO with Memory mapped IO – PCI Express (hotplug, AER, multi-root, SRIOV) – Infinite features ● Reduced memory requirements ● Fixed endianness ● Compatibility 31
Port IO: outl notify EF VQ# OUT (%DX) %EAX VM Exit REASON QUALIFICATION STATE 32
Memory mapped IO: writel 3E 89 MOV (%EDI) %RSI PTE VALID? VM Exit REASON GUEST ADDRESS RIP 33
Fast MMIO notify VQ# MOV (%EDI) %RSI PTE VALID? VM Exit REASON GUEST ADDRESS 34
Multiple interfaces CAPABILITY LIST IO BAR MEMORY BAR VIRTIO CAPABILITY #1 VIRTIO CAPABILITY #2 35
Memory requirements 0.9 VQ desc avail used 1.0 VQ desc avail used 36
features [Diagram. 0.9: DEVICE FEATURES and DRIVER FEATURES, bits 0..31 each. 1.0: SEL selects feature window 1, 2, 3, 4, ...; DRIVER writes its features, then STATUS = FEATURES_OK] 37
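Reading features through a select register can be sketched as below. This is a minimal sketch assuming the Virtio 1.0 common configuration layout, where `device_feature_select` picks a 32-bit window and `device_feature` returns it; `struct fake_common_cfg`, `dev_read_device_feature` and `driver_read_features` are hypothetical names, and the model device only implements two windows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the feature part of the common configuration. */
struct fake_common_cfg {
    uint32_t device_feature_select;  /* which 32-bit window to expose */
    uint64_t all_features;           /* device-internal state, not a register */
};

/* Device side: return the selected window; unimplemented windows read as 0. */
static uint32_t dev_read_device_feature(const struct fake_common_cfg *c)
{
    if (c->device_feature_select >= 2)
        return 0;
    return (uint32_t)(c->all_features >> (32 * c->device_feature_select));
}

/* Driver side: assemble feature bits 0..63 from windows 0 and 1. */
static uint64_t driver_read_features(struct fake_common_cfg *c)
{
    uint64_t features;

    assert(c != NULL);
    c->device_feature_select = 0;
    features = dev_read_device_feature(c);
    c->device_feature_select = 1;
    features |= (uint64_t)dev_read_device_feature(c) << 32;
    return features;
}
```

Because the window index is a register rather than a fixed layout, more windows can be added later without changing the register map, which is what removes the 32-bit limit of the 0.9 layout.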
Endianness intel PPC Virtio 0.9 Virtio BE Virtio LE Device LE Device BE Virtio 1.0 Virtio LE Device Device Device 38
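With a fixed little-endian layout, a driver can decode fields the same way on any host. A minimal sketch of byte-order-independent helpers; `le32_decode` and `le32_encode` are hypothetical names for illustration, not functions from an existing library:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a little-endian 32-bit field; gives the same result on LE and BE hosts. */
static uint32_t le32_decode(const uint8_t b[4])
{
    assert(b != NULL);
    return (uint32_t)b[0]
         | (uint32_t)b[1] << 8
         | (uint32_t)b[2] << 16
         | (uint32_t)b[3] << 24;
}

/* Encode a host value into little-endian byte order. */
static void le32_encode(uint32_t v, uint8_t b[4])
{
    assert(b != NULL);
    b[0] = (uint8_t)(v & 0xFF);
    b[1] = (uint8_t)((v >> 8) & 0xFF);
    b[2] = (uint8_t)((v >> 16) & 0xFF);
    b[3] = (uint8_t)((v >> 24) & 0xFF);
}
```

Working on bytes rather than casting a pointer to `uint32_t *` avoids any dependence on the host byte order, so neither side has to guess which convention the other uses.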
compatibility 0.9 1.0 Driver Driver Compatibility Device Device 39
Packet layout Virtio 0.9 INDIRECT header next Virtio 1.0 header 40
Packet layout: transactions per sec (higher is better) [Chart: transactions/sec for virtio 0.9 and virtio 1.0; y-axis 0 to 3500] 41
More: virtio 1.0 versus 0.9.5 ● Virtio 9p ● Virtio blk: WCE ● Virtio-net Multiqueue ● Virtio-net dynamic offloads ● Already upstream (based on spec draft) 42
vhost updates ● Vhost scsi ● Vhost-net zero copy transmit ● No need for driver changes 43
Kvm networking ● Openvswitch – if time allows ● Ethernet bridge 44
Bridge FDB uplink London Heathrow Paris CDG London Paris Heathrow CDG London Paris 45
Flood: DOS potential uplink London Heathrow Paris CDG Bangkok Heathrow CDG London Paris 46
Disable flood uplink London Heathrow Paris CDG London Bangkok Paris Heathrow CDG London Paris 47
softmac ● Ifconfig eth0 hw ether 00:12:23:45:67:89 00:12:23:45:67:89 NEW virtio-net MAC 48
Using softmac/non promiscuous uplink London Heathrow Paris CDG Paris Heathrow CDG NEW Paris 49
Work in progress ● ELVIS (vhost blk/vhost net) ● Virgl ● Vhost-net performance 50
RX latency NIC HOST VHOST VM 51
Fast rx NIC HOST Current? VHOST VM RAM? 52
Fast rx: transactions per sec (higher is better) [Chart: transactions/sec for thread and irq; y-axis 0 to 7000] Hit 331668 Miss 79 53
Vhost-net threading tap RX VHOST VM TX NIC VHOST VM 54
Vhost-net thread pool tap VM WQ VHOST NIC VHOST VM 55
threading: UDP RR transactions/sec (higher is better) [Chart: thread and wq; x-axis 256 to 16384; y-axis 0 to 16000] 56
threading: TCP STREAM transactions/sec (higher is better) [Chart: thread and wq; x-axis 256 to 16384; y-axis 0 to 14000] 57
summary ● Performance ● Manageability ● Security 58
Questions? 59
OVS: flow match PACKET FLOW VM 22 192.68.0.1 22 12865 192.68.0.1 12865 kernel userspace OVS-VSWITCHD 60
OVS: wildcard match PACKET FLOW 22 192.68.0.1 * VM 12865 kernel userspace OVS-VSWITCHD 61
Wildcard: netperf CRR (higher is better) [Chart: bi-connections/sec for match and wildcard; y-axis 0 to 2500] 62