ACCELERATING STEREO 360 STITCHING USING MULTI-GPUS Ken Turkowski

  1. ACCELERATING STEREO 360 STITCHING USING MULTI-GPUS Ken Turkowski & Trevor Smith, GTC 2017

  2. OVERVIEW
     • What is a stereo panorama? [Ken]
     • How do we stitch? [Ken]
     • How do we handle real-time videos? GPUs! [Trevor]
     • Demo in the VR Village, until noon today!
     • SDKs

  3. NON-INTERACTIVE IMAGERY

  4. INTERACTIVE IMAGERY

  5. WHY 360° INTERACTIVE STEREO VIDEO?
     • 360°
     • Pan and zoom interactively
     • Stereo
     • Real-time
     • Immersion in a real-world situation
     • More so than still photographs, directed videos, and simple panoramas

  6. SINGLE CAMERA
     • Light rays everywhere
     • 2D sampling of a light field
     • At one location
     • Towards a preferred direction

  7. PANORAMA
     • Sampled at one point, like a photograph
     • More directions: 360° horizontally x ±90° vertically

  8. HOW TO MAKE A PANORAMA?
     • Rotating slit camera with a fisheye lens
     • Single camera rotated to different directions
     • Multiple camera rig

  9. PANORAMA FORMATS
     • Cube Map
     • Equirectangular
     • Double 180° Fisheye
     • Single 360° Fisheye
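
For the equirectangular format listed above, the mapping from a viewing direction to a pixel is linear in both angles. The following is a minimal sketch, not code from the talk; the `Pixel` struct, the function name, and the angle conventions (yaw 0 at the image center, positive pitch up) are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cmath>

struct Pixel { double x; double y; };

// Map a viewing direction to equirectangular pixel coordinates.
// Assumed conventions: yaw in [-pi, pi) with 0 at the image center,
// pitch in [-pi/2, pi/2] with positive values pointing up.
Pixel direction_to_equirect(double yaw, double pitch, int width, int height) {
    assert(width > 0 && height > 0);
    const double kPi = 3.14159265358979323846;
    const double u = (yaw + kPi) / (2.0 * kPi);  // 0..1 across 360 degrees
    const double v = (kPi / 2.0 - pitch) / kPi;  // 0 at the zenith, 1 at the nadir
    return Pixel{u * width, v * height};
}
```

Under these conventions, the forward direction (yaw 0, pitch 0) lands at the center of the image, and the full 360° x ±90° range fills the rectangle exactly once.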

  10. PANORAMA PROS AND CONS
     • Gives a good sense of space
     • Compact
     • Interactive
     • Facilitates individual exploration
     • Frozen in time? → stitch videos
     • NVIDIA has a mono stitching SDK!
     • No sense of scale or depth
     • Can we do better? How about stereo?

  11. IDEAL STEREO PANORAMA
     • Light field for each eye
     • Eyes separated by the IPD
     • For each direction (360° around), capture a spray of rays for the left eye and a spray of rays for the right eye
     • 2D spray x 1D direction = 3D data set
     • A lot of data!

  12. OMNIDIRECTIONAL STEREO
     • Omnidirectional Stereo (ODS) [Ishiguro '90, Peleg '01]
     • 2D subset of the ideal 3D stereo rays
     • 1D fan of rays, not a 2D spray
     • Imagine 2 linear sensors + rotating motor
     • Whereas mono rays converge radially, stereo rays converge tangentially
     • [Figure: mono vs. stereo ray geometry]

  13. MONO & STEREO LIGHT FIELDS
     • Mono: radial sampling
     • Stereo: tangential sampling
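
The tangential sampling above can be written down directly: each ODS ray starts on a circle of diameter IPD and points along the tangent to that circle. The sketch below is illustrative only; the type names, the top-down 2D coordinate frame (z up, angles counterclockwise), and the left/right sign convention are assumptions, not details from the talk.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { double x; double y; };
struct Ray { Vec2 origin; Vec2 dir; };

enum class Eye { Left, Right };

// ODS ray for one panorama column, seen from above.
// The ray direction is the viewing direction; the origin sits on the
// IPD circle at the point where that direction is tangent to the circle.
Ray ods_ray(double azimuth, double ipd, Eye eye) {
    assert(ipd >= 0.0);
    const double r = 0.5 * ipd;                            // radius of the IPD circle
    const Vec2 dir{std::cos(azimuth), std::sin(azimuth)};  // viewing direction
    const Vec2 left{-dir.y, dir.x};                        // dir rotated 90 degrees counterclockwise
    const double s = (eye == Eye::Left) ? r : -r;
    return Ray{Vec2{s * left.x, s * left.y}, dir};
}
```

With an IPD of zero both eyes collapse to the center and the rays become the radial rays of a mono panorama.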

  14. ADVANTAGES OF STEREO PANORAMA
     • Can drive an HMD
     • Compelling
     • Immersive
     • Sense of depth & scale
     • Compact (~2 mono panos)
     • A lot of bang per buck!

  15. STEREO PANORAMA STITCHING
     • Camera size precludes capturing the rays that we want
     • Cameras not on interpupillary circle
     • Very few of the rays that we want
     • Need to interpolate on the rig circle
     • Project to the IPD circle

  16. INTERPOLATION BETWEEN CAMERAS
     • (1) Compute pixel motion between adjacent cameras
       Lambertian, photometrically consistent, rig calibrated, epipolar → disparity
     • (2) Interpolate/reproject pixel motion to virtual camera
     • But there are problems with real-world images
     • Need to filter and sometimes fake it
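
Step (2) above can be illustrated on a single scanline. This is a simplified sketch under stated assumptions, not the stitcher's actual algorithm: the function name is hypothetical, disparity is taken to be a per-pixel horizontal shift from one camera toward its neighbor, `t` in [0, 1] is the position of the virtual camera between them, and collisions are resolved naively (the last write wins, with no depth ordering).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Forward-warp one scanline to a virtual camera a fraction t of the way
// toward the neighboring camera. Unwritten output pixels stay at -1 and
// mark holes that a post-process would have to fill.
std::vector<int> forward_warp(const std::vector<int>& src,
                              const std::vector<double>& disparity,
                              double t) {
    assert(src.size() == disparity.size());
    assert(t >= 0.0 && t <= 1.0);
    std::vector<int> out(src.size(), -1);
    for (std::size_t x = 0; x < src.size(); ++x) {
        const long target = std::lround(static_cast<double>(x) + t * disparity[x]);
        if (target >= 0 && target < static_cast<long>(out.size())) {
            out[static_cast<std::size_t>(target)] = src[x];
        }
    }
    return out;
}
```

Even this toy version leaves gaps where neighboring pixels move by different amounts, which is one reason the disparity map needs filtering and hole filling before it is usable.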

  17. INTERPOLATION CHALLENGES
     • Noise
     • Periodic textures
     • Occlusion
     • Textureless regions
     • Specular surfaces

  18. PIXEL MOTION POST-PROCESSING
     • Detect occlusion boundaries
     • Reduce pixel motion noise between occlusion boundaries
     • Fill holes (textureless regions) with plausible motion
     • Enforce temporal coherence

  19. 360 STEREO STITCHING PIPELINE (real-time, GPU accelerated)
     • Calibrate Camera
     • Compute Stereo Overlap
     • Project to Sphere (equirectangular)
     • Generate Disparity Map
     • Post Process Disparity Map
     • Interpolate, Reproject & Blend

  20. STEREO PANORAMA PIPELINE
     Input
     • Capture
     • SDI
     • USB 3.0
     • TCP/IP
     • MP4 (hardware accelerated decode)
     • Compressed and uncompressed input
     Stitch
     • Real time, CUDA accelerated
     • High quality
     • Scalable across multiple GPUs
     • Scalable across multiple rigs
     • Queuing & synchronization
     Output
     • Uncompressed outputs (RGBA)
     • Render to output devices (HMD)
     • MP4 (hardware accelerated encode)
     • Live stream

  21. STITCHING IN REAL-TIME

  22. CHALLENGE: How can we decode in real time?
     • Need to decode 8 separate 4K streams at 30 fps (8 x 30 = 240 frames per second in aggregate!)
     • After getting frames to the GPU, will we have any time left to stitch?

  23. PIPELINING DECODE AND STITCH
     • Using NVDEC dedicated hardware decoder for better throughput
     • [Figure: Decode, Stitch, and Output stages for frames N, N+1, and N+2, staggered so that they overlap in time]
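
The benefit of the staggered schedule above can be estimated with a simple timing model. This is a back-of-the-envelope sketch, not a measurement from the talk: it assumes the three stages run on independent resources and overlap perfectly, and the function names and the stage times used in the usage note are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Total time when every frame finishes all three stages before the next starts.
double serial_ms(double decode, double stitch, double output, int frames) {
    assert(frames >= 0);
    return frames * (decode + stitch + output);
}

// Total time when the stages overlap: the first frame pays the full latency,
// and every later frame costs only as much as the slowest stage.
double pipelined_ms(double decode, double stitch, double output, int frames) {
    assert(frames >= 0);
    if (frames == 0) return 0.0;
    const double slowest = std::max({decode, stitch, output});
    return (decode + stitch + output) + (frames - 1) * slowest;
}
```

For example, with hypothetical stage times of 10 ms decode, 20 ms stitch, and 5 ms output, 100 frames take 3500 ms serially but 2015 ms pipelined, so throughput is bounded by the stitch stage alone.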

  24. CHALLENGE: Dealing with memory copy latency
     • Must copy input/output between GPUs and CPU
     • Synchronous memory copy injects bubbles in compute workload

  25. STREAMS AND ASYNC MEMCPY
     • Using CUDA streams to overlap compute and copy

  26. CHALLENGE: Synchronizing CUDA streams without blocking
     • Synchronizing with host can leave bubbles in compute work
     • Can do better when we just need to sync/join streams with each other

  27. FORK AND JOIN WITH STREAMS
     • Avoid host synchronization with stream events
       <fork kernels off in stream1>
       <fork kernels off in stream2>
       cudaEventRecord(event, stream2)
       cudaStreamWaitEvent(stream1, event, 0)
       <launch kernels in stream1 that need to wait on both streams>
     • [Figure: kernels in stream1 and stream2 joined by an event]

  28. CHALLENGE: Achieving maximum quality in real-time
     • More cameras, higher output resolution, faster refresh rate → higher quality
     • Can only do so much with a single GPU
     • End-to-end pipeline is the same for each stereo pair (task parallelism!)
     • Project → Disparity Map → Post-Process → Interpolate & Blend

  29. MULTI-GPU SCALING
     • Distribute stereo pairs among available devices
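
One simple way to distribute stereo pairs is a round-robin assignment. The slides do not say which policy the stitcher uses, so the sketch below is only an assumption for illustration; the function name and the integer pair indices are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Assign stereo pair indices 0..num_pairs-1 to GPUs in round-robin order.
// Returns, for each GPU, the list of pair indices it should process.
std::vector<std::vector<int>> assign_pairs(int num_pairs, int num_gpus) {
    assert(num_pairs >= 0 && num_gpus > 0);
    std::vector<std::vector<int>> per_gpu(num_gpus);
    for (int p = 0; p < num_pairs; ++p) {
        per_gpu[p % num_gpus].push_back(p);
    }
    return per_gpu;
}
```

Because each pair runs the same pipeline, the per-GPU load differs by at most one pair. A real scheduler might instead keep adjacent pairs on the same device to reduce cross-GPU copies of shared camera images.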

  30. PERFORMANCE: DECODE
     Stream format | SINGLE P6000 | TWO P6000  | FOUR P6000
     4K 30fps      | 6 streams    | 12 streams | 24 streams
     1080p 60fps   | 12 streams   | 24 streams | 48 streams
     1080p 30fps   | 24 streams   | 48 streams | 96 streams
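
The decode figures above are consistent with a fixed pixel budget per GPU. The sketch below checks that arithmetic; it assumes "4K" means 3840 x 2160 and "1080p" means 1920 x 1080 (the slides do not give exact dimensions), and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Decoded pixels per second for a set of identical streams.
std::int64_t pixels_per_second(int width, int height, int fps, int streams) {
    assert(width > 0 && height > 0 && fps > 0 && streams >= 0);
    return static_cast<std::int64_t>(width) * height * fps * streams;
}

// Number of GPUs needed to decode a given stream count (rounds up).
int gpus_needed(int streams, int streams_per_gpu) {
    assert(streams >= 0 && streams_per_gpu > 0);
    return (streams + streams_per_gpu - 1) / streams_per_gpu;
}
```

Under these assumptions all three single-GPU rows come to the same 1,492,992,000 pixels per second, and the 8-stream 4K rig from the decode challenge needs two GPUs for decode alone.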

  31. PERFORMANCE: STITCHING (8x 4K input streams)
     Output resolution | SINGLE P6000 | TWO P6000 | FOUR P6000
     5K x 5K           | 14 FPS       | 26 FPS    | 37 FPS
     4K x 4K           | 22 FPS       | 39 FPS    | 51 FPS
     2.8K x 2.8K       | 38 FPS       | 60 FPS    | 62 FPS
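
The stitching numbers above can be summarized as speedup and parallel efficiency relative to a single GPU. The helper names below are hypothetical; the inputs in the usage note are the FPS values from the table.

```cpp
#include <cassert>

// Speedup of a multi-GPU configuration over the single-GPU frame rate.
double speedup(double fps_single, double fps_multi) {
    assert(fps_single > 0.0);
    return fps_multi / fps_single;
}

// Parallel efficiency: the fraction of ideal linear scaling actually achieved.
double efficiency(double fps_single, double fps_multi, int gpus) {
    assert(gpus > 0);
    return speedup(fps_single, fps_multi) / gpus;
}
```

At 5K x 5K, two GPUs give 26 / 14 ≈ 1.86x (about 93% efficiency) and four give 37 / 14 ≈ 2.64x (about 66%). At 2.8K x 2.8K, four GPUs reach only 62 / 38 ≈ 1.63x (about 41%), which suggests that at the smallest output size something other than per-pair compute limits the frame rate.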

  32. WORKS WITH MULTIPLE RIGS

  33. VRWORKS 360 VIDEO SDK

  34. STEREO SDK COMING SOON
     • Mono SDK beta out now!
     • Optimized stereo pipeline
     • Real-time, low-latency
     • Ambisonic audio support
     • I/O Formats: MP4, H264, CUDA memory
     • GPU-accelerated camera calibration
     • Custom calibration of fisheye lenses

  35. SAMPLE APPLICATION: Working with calibration API
       // Set initial guesses for properties
       HelperCalibSetProperties(hCalibration, cam_props);
       // Feed in images for auto-calibration
       HelperCalibSetImages(hCalibration, input_frames);
       // Calibrate intrinsics, extrinsics, and distortion characteristics
       NVCALIB_Calibrate(hCalibration);

  36. SAMPLE APPLICATION: Working with stitching API
       // Apply calibrated parameters to stitcher
       NVSS_Stitcher_SetCalibration(hStitcher, hCalibration);
       // Decode streams into device memory
       // Run stitching pipeline
       NVSS_Stitcher_Stitch(hStitcher, &cam_images, &pano_image);
       // Encode output for streaming or interop to OpenGL

  37. VRWORKS 360 VIDEO SDK NOW AVAILABLE!
     Features:
     • Real-Time & Offline Stitching
     • Up to 32 x 4K Camera Rigs and different fisheye lenses
     • GPU-Accelerated Decode, Stitching, and Encode
     • Inputs: MP4 files, RGBA files, RGBA CUDA arrays
     • Outputs: MP4 files, RGBA files, or OpenGL textures
     • 3x2 cube map and equirectangular 360 projection
     • Audio stitching in off-line mode
     Mono SDK Available in Beta Now! Stereo SDK Available Soon

