FlashBack: Immersive Virtual Reality on Mobile Devices via Rendering Memoization
Kevin Boos, David Chu, Eduardo Cuervo
MobiSys 2016
• high graphic complexity: photo-realistic, wide FOV
• low latency: < 25 ms
• high framerate: 60+ FPS
mobile devices cannot meet these demands
Current VR Landscape
[figure: radar chart comparing Tethered HMDs, Mobile HMDs, and FlashBack along five axes: Responsiveness, Graphical Quality, Mobility, Energy Efficiency, Affordability]
FlashBack objectives
• mobility: self-contained on mobile device
• graphic quality: desktop-level image quality
• responsiveness: low end-to-end latency
• energy efficiency: long battery life, low thermal output
• affordability: no specialized hardware
Design
• Mobile GPUs are constrained
• Mobile storage is abundant
key idea: pre-render all possible views for fast real-time replay
FlashBack design & challenges
• Pre-rendering [offline]
  • infinite input space
• Live playback [runtime]
  • huge cache, fast retrieval
  • inexact query matching
• Dynamic objects
Pre-rendering: map pose to frame
• pose (key) → frame (value)
• challenge: infinite input space
Pre-rendering
• input = 3D position + 3D orientation
Pre-rendering: megaframes
[figure: megaframe layout with a left-eye half and a right-eye half, each holding six faces: right, left, top, bottom, front, rear]
• position (key) → megaframe (value)
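Because a megaframe holds six faces per eye, one entry can serve any head orientation at that position. The sketch below shows only the face-selection step for a view direction. The axis convention (+x right, +y up, +z front) and the name `cube_face` are assumptions for illustration, not taken from the slides.

```python
def cube_face(direction):
    """Return the cube face that a view direction points at.

    Assumed convention (hypothetical): +x is right, +y is top, +z is front.
    """
    x, y, z = direction
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax == 0 and ay == 0 and az == 0:
        raise ValueError("direction must be non-zero")
    # The face is decided by the axis with the largest magnitude.
    if ax >= ay and ax >= az:
        return "right" if x > 0 else "left"
    if ay >= az:
        return "top" if y > 0 else "bottom"
    return "front" if z > 0 else "rear"
```

A full warp would also map the direction to a pixel inside the chosen face; that step is omitted here.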
Further reducing the input space
• Automate iteration over possible inputs
• Prune unreachable player positions
• Configurable quantization granularity
Full-coverage frame cache is available at runtime
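The three reductions can be sketched as a grid enumeration with a pruning predicate. This is a hypothetical illustration: `quantized_positions`, `quantize`, `build_cache`, the `reachable` predicate, and the `render` callback are assumed names, and the actual pre-rendering tool is not shown in the slides.

```python
from itertools import product


def quantized_positions(bounds, step, reachable):
    """Yield grid positions inside `bounds` that pass the `reachable` test.

    bounds: one (lo, hi) pair per axis; step: quantization granularity.
    """
    axes = []
    for lo, hi in bounds:
        count = int(round((hi - lo) / step)) + 1
        axes.append([lo + i * step for i in range(count)])
    for pos in product(*axes):
        if reachable(pos):  # prune positions the player can never occupy
            yield pos


def quantize(pos, bounds, step):
    """Snap a continuous position to the nearest grid point inside bounds."""
    snapped = []
    for p, (lo, hi) in zip(pos, bounds):
        last = int(round((hi - lo) / step))
        i = min(max(round((p - lo) / step), 0), last)
        snapped.append(lo + i * step)
    return tuple(snapped)


def build_cache(bounds, step, reachable, render):
    """Offline pass: memoize one rendered megaframe per reachable position."""
    return {pos: render(pos) for pos in quantized_positions(bounds, step, reachable)}
```

A smaller `step` gives denser coverage at the cost of a larger cache. A snapped position that was pruned would miss in this dictionary, which is why runtime lookup needs inexact matching.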
Live playback overview
[figure: HMD pose → frame cache (GPU, RAM, Flash/SSD) → megaframe(s) → cube warp → final frame → HMD]
Building the cache
• L3: secondary storage (Flash/SSD) – 9.2 ms
• L2: system RAM – 8.7 ms
• L1: GPU VRAM – 0.35 ms
Challenge: fast retrieval
• raw, decoded megaframes from huge cache
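A minimal sketch of a three-level lookup that promotes a megaframe toward the GPU on access. The class name `TieredFrameCache`, its parameters, and the least-recently-used eviction are assumptions made for illustration; the slides do not state the eviction policy, and plain dictionaries stand in for VRAM, RAM, and Flash/SSD.

```python
from collections import OrderedDict


class TieredFrameCache:
    """Toy model of the L1 (GPU) / L2 (RAM) / L3 (disk) hierarchy."""

    def __init__(self, disk, gpu_capacity, ram_capacity):
        self.disk = disk          # key -> frame; stands in for Flash/SSD (L3)
        self.ram = OrderedDict()  # L2, kept in least-recently-used order
        self.gpu = OrderedDict()  # L1, kept in least-recently-used order
        self.gpu_capacity = gpu_capacity
        self.ram_capacity = ram_capacity

    @staticmethod
    def _put(tier, capacity, key, frame):
        tier[key] = frame
        tier.move_to_end(key)
        while len(tier) > capacity:
            tier.popitem(last=False)  # assumed policy: evict the LRU entry

    def get(self, key):
        """Return (frame, tier the frame was found in)."""
        if key in self.gpu:
            self.gpu.move_to_end(key)
            return self.gpu[key], "gpu"
        if key in self.ram:
            frame, source = self.ram[key], "ram"
            self.ram.move_to_end(key)
        else:
            frame, source = self.disk[key], "disk"
            self._put(self.ram, self.ram_capacity, key, frame)
        # Promote to the GPU tier so the next access is the fast path.
        self._put(self.gpu, self.gpu_capacity, key, frame)
        return frame, source
```

With the latencies listed on the slide, the payoff comes almost entirely from hitting the GPU tier, since RAM and secondary storage are within a millisecond of each other.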
Spatially indexing the cache
• R-trees: index structure for fast n-D nearest-neighbor search
• Two cache indices:
  • Universal
  • GPU-only
Challenge: inexact query matching
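The sketch below illustrates inexact matching with two indices. It swaps in a linear scan for the R-tree, so it has the same query semantics but none of the speed. How the universal and GPU-only indices are combined is not described on the slide; the `slack` rule here, along with the names `BruteForceIndex` and `lookup`, is purely an assumption.

```python
import math


class BruteForceIndex:
    """Stand-in for an R-tree: nearest-neighbor query by linear scan."""

    def __init__(self):
        self.points = {}  # position tuple -> megaframe id

    def insert(self, pos, frame_id):
        self.points[pos] = frame_id

    def remove(self, pos):
        self.points.pop(pos, None)

    def nearest(self, query):
        """Return (position, frame_id, distance) or None if empty."""
        if not self.points:
            return None
        pos = min(self.points, key=lambda p: math.dist(p, query))
        return pos, self.points[pos], math.dist(pos, query)


def lookup(query, universal, gpu_only, slack=0.0):
    """Prefer a GPU-resident frame if it is within `slack` of the best match."""
    best = universal.nearest(query)
    if best is None:
        return None
    resident = gpu_only.nearest(query)
    if resident is not None and resident[2] <= best[2] + slack:
        return resident[1], "gpu"
    return best[1], "fetch"  # caller must load the frame from RAM or disk
```

A larger `slack` trades positional accuracy for fewer slow fetches.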
Pre-rendering dynamic objects
• Extension of static scene
• 7D input space:
  • 3D relative position
  • 3D object rotation
  • 1D animation sequence
• Supports arbitrary motion paths and animations
  • but most are periodic
Automated megaframe capture
Dynamic object cache indexing
• A 7D query is not meaningful
• Decompose into chain of queries = faster queries
• Can prune search space at each level
[figure: dynamic CacheKey lookup chain: top-level R-tree, position-indexed (CK.pos) → mid-level R-tree, orientation-indexed (CK.orientation) → last-level animation list, timestamp-indexed → retrieved megaframe]
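The chain of queries can be sketched as three nested nearest-match steps. Linear scans stand in for the two R-tree levels, orientation is compared as a plain 3-tuple (a simplification), and the names `nearest`, `nearest_timestamp`, `lookup_dynamic`, and the optional `period` wrap are assumptions for illustration.

```python
import math
from bisect import bisect_left


def nearest(entries, query):
    """entries: non-empty list of (key, value); linear-scan nearest match."""
    return min(entries, key=lambda e: math.dist(e[0], query))[1]


def nearest_timestamp(anim_list, t):
    """anim_list: non-empty list of (timestamp, megaframe_id), sorted by timestamp."""
    times = [ts for ts, _ in anim_list]
    i = bisect_left(times, t)
    candidates = anim_list[max(i - 1, 0):i + 1]  # neighbors on either side of t
    return min(candidates, key=lambda e: abs(e[0] - t))[1]


def lookup_dynamic(index, pos, orientation, t, period=None):
    """Resolve a 7D key as position -> orientation -> animation timestamp."""
    if period:
        t = t % period                            # periodic animations wrap
    by_orientation = nearest(index, pos)          # top level: position
    anim_list = nearest(by_orientation, orientation)  # mid level: orientation
    return nearest_timestamp(anim_list, t)        # last level: timestamp
```

Each level narrows the candidates before the next comparison runs, which is the pruning effect the slide refers to.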
Dynamic + static compositing
[figure: static and dynamic megaframes combined into a composite megaframe]
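One plausible way to composite per pixel is a depth test between the two megaframes. The slide does not say how compositing is done, so the availability of per-pixel depth, the smaller-is-closer convention, and the function name `composite` are all assumptions in this sketch.

```python
def composite(static_color, static_depth, dynamic_color, dynamic_depth):
    """Per-pixel merge of two images given as equal-sized 2D lists.

    Assumption: smaller depth means closer to the viewer, and pixels not
    covered by the dynamic object carry depth float("inf").
    """
    out = []
    for sc_row, sd_row, dc_row, dd_row in zip(
        static_color, static_depth, dynamic_color, dynamic_depth
    ):
        out.append([
            dc if dd < sd else sc  # keep whichever surface is in front
            for sc, sd, dc, dd in zip(sc_row, sd_row, dc_row, dd_row)
        ])
    return out
```

The cost grows with pixel count rather than with object geometry, since every pixel is visited once per composited object.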
Evaluation
Evaluation setup
• HP Pavilion Mini + Oculus Rift DK2
  • Small weak computer approximates mobile device
  • Underperforms Samsung Galaxy S6 Gear VR
[figure: normalized benchmark scores on Manhattan, T-Rex, ALU, and Texturing for HP Pavilion Mini, Samsung Galaxy S6, and Desktop]
FlashBack
Local Rendering
15x higher framerate, 8x lower latency
[figure: two bar charts with series Static, 1 Dynamic Object, 2 Dynamic Objects. Framerate (FPS) for Mobile Device, Strong Desktop, FlashBack (Decoding), FlashBack (GPU); End-to-End Latency (ms) for Mobile HMD, Strong Desktop, FlashBack (Decoding), FlashBack (GPU)]
97x more energy efficient
• longer battery life
• less thermal discomfort
[figure: Energy Per Displayed Frame (J) for Strong Desktop, Mobile Device, FlashBack (Decoding), FlashBack (GPU), with series Static, 1 Dynamic Object, 2 Dynamic Objects]
FlashBack maintains image quality
• Measure perceived visual quality via SSIM
• Compares rendered scene against a pristine image
[figure: SSIM scale from 0.5 (poor quality) to 1.0 (good quality): local rendering on mobile device 0.81, FlashBack 0.93]
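For readers unfamiliar with SSIM, the sketch below evaluates the standard SSIM formula once over a whole grayscale image. Practical SSIM averages this statistic over local windows, so this single-window version is a simplification and is not how the scores on the slide were computed; the function name `global_ssim` is an assumption.

```python
def global_ssim(x, y, dynamic_range=255.0):
    """Single-window SSIM of two equal-length, non-empty pixel sequences."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Standard stabilizing constants with K1 = 0.01 and K2 = 0.03.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

Identical images score 1.0, and the score falls as luminance, contrast, or structure diverge from the pristine reference.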
Limitations
• Dynamic object scalability is moderate
  • con: per-pixel megaframe compositing is slow
  • pro: object complexity is irrelevant
• Lighting models are limited
• Restricted by hardware decoder
Related work
• precaching (web search results, database queries): data types, behavior, and design choices are not applicable to VR domain.
• offloading
  • compute offload: requires good network connection.
  • wearable AR on Glass: latency/quality less demanding.
  • rendering: ignores local device storage.
  • prelim. work on HMDs: static video playback only.
• caching objects as rendered images
  • QuickTime VR: focused on desktop environments.
  • reuse past renderings: inaccuracies in object representation.
  • caching with impostors: very limited dynamic object support.
• warping cubemaps
  • VR address recalculation: requires specialized hardware added to high-end GPUs.
FlashBack in conclusion
• Avoids real-time rendering by pre-generating frames
  • flattens complex VR app behavior into data structures
• Supports static scene and dynamic animated objects
framerate ⬆ latency ⬇ energy ⬇
kevinaboos.web.rice.edu
Backup Slides
References
[3] D. Lymberopoulos, et al., “Pocketweb: Instant web browsing for mobile devices,” ASPLOS 2012.
[4] D. Barbara, et al., “Sleepers and workaholics: Caching strategies in mobile environments,” SIGMOD 1994.
[5] E. Cuervo, et al., “Maui: Making smartphones last longer with code offload,” MobiSys 2010.
[6] B. Chun, et al., “Clonecloud: Elastic execution between mobile device and cloud,” EuroSys 2011.
[7] M. Gordon, et al., “Comet: Code offload by migrating execution transparently,” OSDI 2012.
[8] K. Ha, et al., “Towards wearable cognitive assistance,” MobiSys 2014.
[9] E. Cuervo, et al., “Kahawai: High-quality mobile gaming using gpu offload,” MobiSys 2015.
[10] Y. Degtyarev, et al., “Demo: Irides, attaining quality, responsiveness, and mobility for VR HMDs,” MobiSys 2015.
[11] S. Chen, “Quicktime VR: An image-based approach to virtual environment navigation,” SIGGRAPH 1995.
[12] G. Schaufler, “Exploiting frame-to-frame coherence in a virtual reality system,” IEEE VR AIS 1996.
[13] G. Schaufler and W. Sturzlinger, “A Three Dimensional Image Cache for Virtual Reality,” CG Forum 1996.
[14] J. Shade, et al., “Hierarchical image caching for accelerated walkthroughs of complex environments,” SIGGRAPH 1996.
[15] M. Regan and R. Pose, “Priority rendering with a virtual reality address recalculation pipeline,” SIGGRAPH 1994.
[16] M. Regan and R. Pose, “An interactive graphics display architecture,” IEEE VR AIS 1993.
Cache lookup with R-trees
• create minimally-overlapping bounding boxes around 3D points
• characteristics fit our needs
  • good insertion and deletion
  • fast lookup is priority
  • better querying semantics
Guttman, A. "R-Trees: A Dynamic Index Structure for Spatial Searching". ACM SIGMOD '84.
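To make the bounding-box idea concrete, the sketch below computes a minimum bounding box around points and picks the box that grows least when a new point is added, which is the criterion Guttman's insertion uses when descending the tree. It is only that one step, not a full R-tree, and the helper names `mbr`, `volume`, `enlarge`, and `choose_box` are assumptions.

```python
def mbr(points):
    """Minimum bounding box of n-D points as (lower corner, upper corner)."""
    columns = list(zip(*points))
    return tuple(min(c) for c in columns), tuple(max(c) for c in columns)


def volume(box):
    lo, hi = box
    v = 1.0
    for a, b in zip(lo, hi):
        v *= b - a
    return v


def enlarge(box, point):
    """Smallest box that contains both `box` and `point`."""
    lo, hi = box
    return (
        tuple(min(a, p) for a, p in zip(lo, point)),
        tuple(max(b, p) for b, p in zip(hi, point)),
    )


def choose_box(boxes, point):
    """Index of the box needing least enlargement; ties go to the smaller box."""
    def cost(i):
        grown = volume(enlarge(boxes[i], point)) - volume(boxes[i])
        return grown, volume(boxes[i])
    return min(range(len(boxes)), key=cost)
```

Keeping enlargement small at every insertion is what keeps the boxes tight and their overlap low, so a nearest-neighbor query can skip most of the tree.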
VR system in a nutshell
• Head-Mounted Display (HMD)
  • Smartphone-class hardware
  • Internal sensors and external trackers
• Combine sensor readings → player pose
  • 3D position
  • 3D orientation
Microbenchmarks
[figure: two plots. Cache Retrieval Time (ms) from GPU, from RAM, and from Disk; Cache Query Time (µs) versus Cache Size (number of frames, 100 to 1e+06)]
Typical cache sizes (uncompressed)
car interior        115 MB
bedroom             730 MB
living room         2.8 GB
two-story house     8.7 GB
basketball arena    29 GB
Viking Village      54 GB
can be compressed using video codec for efficient deployment