Distribution Systems for 3D Teleimmersive and Video 360 Content: Similarities and Differences Klara Nahrstedt Department of Computer Science University of Illinois at Urbana-Champaign klara@illinois.edu ACM Multimedia Systems, June 12, 2018, Amsterdam, Netherlands
Overview • Motivation • 3D Teleimmersive Video Representation • Video 360 Representation • Similarities and Differences in Content Representation • Distribution of 3DTI Video • Distribution of Video 360 • Similarities and Differences in Content Distribution • Conclusion
3D Teleimmersive (3DTI) Systems Source: http://tele-immersion.citris-uc.org; http://monet.cs.illinois.edu/projects/cyphy-multi-modal-teleimmersion-for-tele-physiotherapy/teleimmersion-gallery/
High-End Tele-Presence Environments Cisco Tele-presence HP Coliseum HP Halo UNC
Multi-Camera Live Broadcast Systems http://www.dailymail.co.uk/sciencetech/article-2336893/New-TV-cameras-bring-Matrix-style-bullet-time-trickery-live-sports-coverage.html
Multi-Camera Broadcast Systems https://www.spiideo.com/sports/ https://thegadgetflow.com/portfolio/slingstudio-multi-camera-broadcaster/ https://www.cinfo.es/our-products/synthetrick/multicam https://www.myslingstudio.com/
360-Degree Video 360 Degrees Cameras – CoolPile.com: http://coolpile.com/tag/360-degrees-cameras
3D Teleimmersive Video Representation
3D Teleimmersive Stereo Video and Free Viewpoint Video Capture
3DTI Viewing Singapore, 2014 Photo courtesy of Prof. Ruzena Bajcsy.
3D Stereo Video Representation Wu, Ahsan, Kurillo, Agarwal, Nahrstedt, Bajcsy, “Color-plus-Depth Level-of-Detail in 3D Teleimmersive Video: A Psychophysical Approach”, ACM Multimedia 2011
Free-Viewpoint 3D Video Representation camera-1 Camera-2 Camera-3 Camera-8 Example of 3D representation captured by different cameras
View Model [Figure: camera direction $O_i$, user view direction $O_u$, and the angle θ between them] Source: http://zing.ncsl.nist.gov/~gseidman/vrml/
3DTI Data Model
• 3D frame for camera i at time t: $f_{i,t}$
• Each pixel in the frame carries color+depth data and can be independently rendered
• Stream for camera i: $S_i = \{ f_{i,t_1}, f_{i,t_2}, \dots \}$
• Macro-frame: $F_t = \{ f_{1,t}, f_{2,t}, \dots, f_{n,t} \}$
[Figure: grid of frames with cameras 1 … n as columns (streams $S_1$ … $S_n$) and times $t_1$, $t_2$, … as rows (macro-frames $F_{t_1}$, $F_{t_2}$, …)]
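The data model above can be sketched in a few lines of Python. This is only an illustration: the class name `Frame3D`, the helper names, and the per-pixel tuple layout are assumptions made here, not part of the original system.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Frame3D:
    """One 3D frame f_{i,t}: color+depth samples from camera i at time t."""
    camera_id: int
    timestamp: int
    # Hypothetical pixel layout (r, g, b, depth), chosen only for illustration.
    pixels: Tuple[Tuple[int, int, int, float], ...] = ()


def build_streams(frames: List[Frame3D]) -> Dict[int, List[Frame3D]]:
    """Group frames into per-camera streams S_i, ordered by time."""
    streams: Dict[int, List[Frame3D]] = {}
    for f in sorted(frames, key=lambda f: f.timestamp):
        streams.setdefault(f.camera_id, []).append(f)
    return streams


def build_macro_frames(frames: List[Frame3D]) -> Dict[int, List[Frame3D]]:
    """Group frames into macro-frames F_t: every camera's frame at time t."""
    macro: Dict[int, List[Frame3D]] = {}
    for f in sorted(frames, key=lambda f: f.camera_id):
        macro.setdefault(f.timestamp, []).append(f)
    return macro


# Three cameras, two capture instants.
frames = [Frame3D(camera_id=i, timestamp=t) for t in (1, 2) for i in (1, 2, 3)]
streams = build_streams(frames)
macro_frames = build_macro_frames(frames)
```

Here `streams[i]` plays the role of $S_i$ and `macro_frames[t]` the role of $F_t$; the same frames are simply indexed along the two axes of the grid.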
360-Degree Video Representation
Generation of 360-Degree Video (figure labels: 360-degree video, user’s viewport) • Capturing multiple 2D videos together with their metadata • Stitching the videos together and editing them into a spherical video • Encoding the spherical video, taking into account projection, interactivity, storage and delivery formats (this affects the decoding and rendering processes)
Video 360 Viewing and Navigation Controller Example of HMD (Head-Mounted Displays) – Oculus Rift, Samsung Gear VR, HTC Vive, https://en.wikipedia.org/wiki/Head-mounted_display
360-Degree Video Data Model
• Field-of-View or Viewport – display region on the Head-Mounted Display
  • Fraction of the omnidirectional view of the scene
  • The viewport is defined by a device-specific viewing angle (typically 120 degrees) that delimits the scene horizontally around the head-direction center, called the viewport center
• Viewport Resolution – 4K (3840x2160) pixels
• Resolution of full 360-degree video – at least 12K (11520x6480)
• Video frame rate – on the order of the HMD refresh rate (100 Hz, i.e. 100 fps)
• Motion-to-Photon Latency requirement
  • Less than 20 ms for VR – much smaller than Internet request-reply delay
  • Need viewport prediction
• Bitrate – Video 360 vs HEVC (8K video at 60 fps is approx. 100 Mbps)
• Tiling – spatial division of the spherical video into independent tiles
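The numbers on this slide fit together with a little arithmetic. The sketch below only restates the slide's figures; the variable names are chosen here for illustration.

```python
# Figures taken from the slide.
VIEWPORT = (3840, 2160)      # 4K viewport, pixels
FULL = (11520, 6480)         # 12K full 360-degree frame, pixels
FPS = 100                    # HMD refresh rate, frames per second
MOTION_TO_PHOTON_MS = 20     # latency budget, milliseconds
VIEWING_ANGLE_DEG = 120      # typical horizontal viewing angle

viewport_pixels = VIEWPORT[0] * VIEWPORT[1]
full_pixels = FULL[0] * FULL[1]

# The full frame holds this many times the pixels of one viewport.
pixel_ratio = full_pixels / viewport_pixels

# Time available per frame at the HMD refresh rate.
frame_interval_ms = 1000 / FPS

# Number of frame intervals that fit into the motion-to-photon budget.
frames_in_budget = MOTION_TO_PHOTON_MS / frame_interval_ms

# Width of a 120-degree slice of the 360-degree frame, in pixels.
viewport_width_from_angle = FULL[0] * VIEWING_ANGLE_DEG // 360
```

A 120-degree slice of an 11520-pixel-wide frame is 3840 pixels wide, which matches the 4K viewport, and the whole latency budget covers only two frame intervals at 100 fps, which is why a request-reply round trip over the Internet cannot be on the critical path.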
Tiles and Spherical Maps
• Issues with spherical mapping to tiles
  • Viewport distortion
  • Spatial quality variance
• Sphere-to-plane mapping and the viewing probability of tiles are IMPORTANT considerations
• The overall spherical distortion of a segment is the sum of the distortion over all pixels the segment covers
Xie et al., “360ProbDASH: Improving QoE of 360 Video Streaming Using Tile-based HTTP Adaptive Streaming”, ACM MM 2017
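A minimal sketch of "distortion summed over the pixels a tile covers" follows. It assumes an equirectangular frame and weights each pixel's squared error by the cosine of its latitude, a common area weighting for that projection; this weighting and the function names are assumptions made here and are not necessarily the exact formulation used by Xie et al.

```python
import math


def row_weight(v: int, height: int) -> float:
    """Relative sphere area of a pixel in row v of an equirectangular frame."""
    latitude = (0.5 - (v + 0.5) / height) * math.pi
    return math.cos(latitude)


def tile_spherical_distortion(orig, recon, top, left, tile_h, tile_w) -> float:
    """Sum of area-weighted squared errors over all pixels the tile covers."""
    height = len(orig)
    total = 0.0
    for v in range(top, top + tile_h):
        w = row_weight(v, height)
        for u in range(left, left + tile_w):
            err = orig[v][u] - recon[v][u]
            total += w * err * err
    return total


# Toy 4x4 luminance frame with a uniform error of 1 at every pixel.
orig = [[0] * 4 for _ in range(4)]
recon = [[1] * 4 for _ in range(4)]

polar_tile = tile_spherical_distortion(orig, recon, top=0, left=0, tile_h=2, tile_w=2)
equator_tile = tile_spherical_distortion(orig, recon, top=1, left=0, tile_h=2, tile_w=2)
```

With the same per-pixel error, the tile touching the pole contributes less spherical distortion than the tile straddling the equator, which illustrates why the planar position of a tile matters when allocating quality.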
Video 360 Spherical-to-Plane Projections
• Video 360 is captured as spherical video
• Equirectangular projection – stretches the poles and reduces coding efficiency
• Pyramid projection – shows degradation on the sides
• Cubemap – maps a 90-degree FOV to each side of a cube and hence introduces less degradation
Corbillon, Simon, Devlic, Chakareski, “Viewport-Adaptive Navigable 360-Degree Video Delivery”, May 2017
Nasrabadi et al., “Adaptive 360-Degree Video Streaming using Scalable Video Coding”, ACM Multimedia 2017
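As an illustration of the equirectangular case, the sketch below maps a viewing direction (yaw, pitch) to planar pixel coordinates and computes the horizontal stretch at a given latitude. The function names and the angle conventions (yaw in [-π, π), pitch in [-π/2, π/2], image origin at the top-left) are assumptions made for this example.

```python
import math


def equirect_pixel(yaw: float, pitch: float, width: int, height: int):
    """Map a direction on the sphere to (u, v) in an equirectangular frame."""
    u = (yaw / (2 * math.pi) + 0.5) * width    # longitude spreads linearly over the width
    v = (0.5 - pitch / math.pi) * height       # latitude spreads linearly over the height
    return u, v


def horizontal_stretch(pitch: float) -> float:
    """How much wider a small patch at this latitude is drawn than at the equator."""
    return 1.0 / math.cos(pitch)


center = equirect_pixel(0.0, 0.0, 11520, 6480)
left_edge = equirect_pixel(-math.pi, 0.0, 11520, 6480)
north_pole = equirect_pixel(0.0, math.pi / 2, 11520, 6480)
stretch_at_60_deg = horizontal_stretch(math.radians(60))
```

At 60 degrees of latitude a patch is already drawn about twice as wide as at the equator, and the factor grows without bound toward the poles; this is the stretching that costs coding efficiency.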
Encoding and Delivery Formats
• Codecs
  • AVC/H.264, HEVC/H.265
  • VP8, VP9
• Delivery Formats
  • DASH/HLS (dynamic adaptive streaming over HTTP)
  • MPEG-DASH standard considers tiling
  • MPD (Media Presentation Description) – modified for Video 360
  • SRD (Spatial Relation Description) integrated into MPD
  • HEVC considers video tiles
• MPEG – Immersive media standard ISO/IEC 23090
  • Part 1: Use cases
  • Part 2: OMAF (Omnidirectional Media Application Format)
    • Description of equirectangular projection format
    • Metadata for interoperable rendering of 360-degree monoscopic and stereoscopic audio-visual data
    • Storage format (ISO base media file format/MP4)
    • Codecs: HEVC, MPEG-H 3D audio
  • Part 3: Immersive video
  • Part 4: Immersive Audio
Graf, Timmerer, Mueller, “Towards Bandwidth Efficient Adaptive Streaming of Omnidirectional Video over HTTP”, ACM MMSys 2017
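To make the SRD idea concrete, the sketch below generates SRD property strings for a regular grid of tiles. It assumes the `urn:mpeg:dash:srd:2014` scheme with the value order source_id, object_x, object_y, object_width, object_height, total_width, total_height, expressed in grid units; the helper names and the choice of `SupplementalProperty` as the carrier element are illustrative choices, not a complete or validated MPD.

```python
SRD_SCHEME = "urn:mpeg:dash:srd:2014"


def srd_values(cols: int, rows: int, source_id: int = 0):
    """One SRD value string per tile of a cols x rows grid, in row-major order."""
    values = []
    for y in range(rows):
        for x in range(cols):
            # Each tile is 1x1 in grid units; the whole frame is cols x rows.
            values.append(f"{source_id},{x},{y},1,1,{cols},{rows}")
    return values


def supplemental_property(value: str) -> str:
    """Wrap an SRD value in the XML element an AdaptationSet would carry."""
    return f'<SupplementalProperty schemeIdUri="{SRD_SCHEME}" value="{value}"/>'


tiles = srd_values(cols=2, rows=2)
bottom_right = supplemental_property(tiles[3])
```

A client can read these descriptors to learn where each tile sits in the full frame and then request only the tiles that overlap the predicted viewport at high quality.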
Similarities and Differences of Representations
Similarity
Parameter | 3DTI Video | 360-Degree Video
Multi-camera views | Yes (view) | Yes (viewport)
Joint coordinate system | Yes | Yes
Bitrate consideration | Yes | Yes
View change | Yes | Yes

Difference
Parameter | 3DTI Video | 360-Degree Video
Video format | Color-plus-depth | Color
Smallest item to adapt | 3DTI frame | Tile
Frame representation | Frame manipulation at pixel level (RGB, depth, polygons) | Frame manipulation at tile and Region-of-Interest level
Coding | Simple (zlib) | Complex (HEVC)
Resolution | 640x480 or 1080p | 4K to 16K
Resolution for diverse devices | No | Yes
Format for diverse navigation | No | Yes
Distribution Systems of 3DTI Video
Multi-Camera 3DTI Transmission System [Figure: Site-1 and Site-2, each with cameras, microphones, a renderer, an audio-visual display, and a gateway behind a switch; the gateways are connected over the Internet. Legend: C = camera, A = microphone, G = gateway, R = renderer]
Approach: Multi-stream Hierarchical Adaptation
Multi-stream Adaptation (Stream Selection)
• Camera orientation: $\vec{O}_i$
• User view orientation: $\vec{O}_u$
• $\cos\theta_i = \vec{O}_i \cdot \vec{O}_u$ (unit vectors), where $\theta_i$ is the angle between camera and user view
• Selection (SI) – View-Centric Stream Selection: $SI = \{ S_i : \cos\theta_i \ge T \}$, where T is a user specified parameter
Zhenyu Yang, Klara Nahrstedt, Bin Yu, Ruzena Bajcsy, “A Multi-stream Adaptation Framework for Bandwidth Management in 3D Teleimmersion”, ACM NOSSDAV 2006, May 2006, Newport, Rhode Island
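View-centric stream selection can be sketched as a threshold on the cosine between each camera's orientation and the user's view orientation. The code below is an illustration under that reading of the slide; the function names, the eight-camera ring, and the threshold value are assumptions made here.

```python
import math


def unit(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)


def contribution(camera_dir, view_dir) -> float:
    """cos(theta) between camera orientation and user view orientation."""
    a, b = unit(camera_dir), unit(view_dir)
    return sum(x * y for x, y in zip(a, b))


def select_streams(camera_dirs, view_dir, threshold):
    """Keep streams whose cos(theta) is at least the threshold T, best first."""
    scored = {cid: contribution(d, view_dir) for cid, d in camera_dirs.items()}
    chosen = [cid for cid, c in scored.items() if c >= threshold]
    return sorted(chosen, key=lambda cid: scored[cid], reverse=True)


# Hypothetical setup: eight cameras on a ring, 45 degrees apart, camera 1 at 0 degrees.
cameras = {
    i: (math.cos(math.radians(45 * (i - 1))), math.sin(math.radians(45 * (i - 1))), 0.0)
    for i in range(1, 9)
}
selected = select_streams(cameras, view_dir=(1.0, 0.0, 0.0), threshold=0.5)
```

With the user looking along camera 1's direction and T = 0.5, only camera 1 and its two 45-degree neighbours (cameras 2 and 8) pass; raising T narrows the selection and lowers the bandwidth demand.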
View-Centric Stream Differentiation [Figure: 3D cameras (3D capturing) → transmission → 3D rendering at the user view; streams contributing more to the user view are distinguished from less important streams]
Timing Performance Validation [Plots: macro-frame delay at the sender side; macro-frame completion interval at the receiver side (end-to-end delay, UIUC-UCB)]
Immersive View-Centric Multi-View Multi-Party 3DTI Z. Yang et al., “ViewCast: View Dissemination and Management for Multi-Party 3D Tele-immersive Environments”, ACM Multimedia 2007
Multi-Party Multi-View Telepresence
• Multi-stream contents
• High resource demand
• Multi-view environment
• Multi-stream dependency
• Real-time interactivity
[Figure: cameras c1 … c8 arranged around the scene with the camera view marked; views from camera-1, camera-2, camera-3 and camera-8. Example of 3D representation captured by 4 cameras]
Telepresence Session Control
• Decoupled control and data plane
• Hierarchical control
  • Global session controller
  • Local session controllers at G
• Coordinated global control plane
  • Monitor data plane
  • Configure data plane
• Data plane at TI participants
  • Session routing table (SRT)
  • Stream forwarding
SRT columns: Matching Field (ID) | Forwarding Action | Bitrate
[Figure: Global Session Controller connected to the gateways of Site-X, Site-Y and Site-Z. Legend: C = camera, A = microphone, G = gateway, R = renderer]
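The split between control and data plane can be illustrated with a toy session routing table: the controller configures entries, and the gateway only looks up a stream ID and forwards. The class and field names below are hypothetical and merely mirror the three SRT columns on the slide.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SrtEntry:
    """One SRT row: matching field (stream ID), forwarding action, bitrate."""
    stream_id: str
    next_hops: List[str]
    bitrate_kbps: int


class SessionRoutingTable:
    """Gateway-side table, written by the control plane, read by the data plane."""

    def __init__(self) -> None:
        self._entries: Dict[str, SrtEntry] = {}

    def configure(self, entry: SrtEntry) -> None:
        """Control plane: install or replace the entry for a stream."""
        self._entries[entry.stream_id] = entry

    def forward(self, stream_id: str) -> List[Tuple[str, int]]:
        """Data plane: (next hop, bitrate) pairs for a stream, empty if unmatched."""
        entry = self._entries.get(stream_id)
        if entry is None:
            return []
        return [(hop, entry.bitrate_kbps) for hop in entry.next_hops]


# Hypothetical example: Site-X's gateway forwards camera stream "X.c1" to two sites.
srt = SessionRoutingTable()
srt.configure(SrtEntry(stream_id="X.c1", next_hops=["Site-Y", "Site-Z"], bitrate_kbps=4000))
actions = srt.forward("X.c1")
```

Because forwarding is a plain table lookup, a view change only requires the session controller to rewrite the affected entries; the per-frame path at the gateway stays unchanged.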
ViewCast: Middleware (Overlay) Framework
• A three-layer multi-party/multi-stream management framework
• View-aware stream differentiation/selection
Layers (top to bottom): Tele-immersive Application; ViewCast (service middleware); Overlay network (network)
[Figure: 3D cameras (3D capturing) → transmission → 3D rendering at the user view, with streams contributing more to the user view distinguished from less important streams. A user/node’s view request is sent to the session controller; nodes U2, U3, U4 and views V1–V4 are shown with labels U2.w, U3.w, U4.w]
Why is view change a problem? [Figure: session controller with nodes U2, U3, U4, views V1–V4 and labels U2.w, U3.w, U4.w; one node is marked “victim”]
Streams/View GC = 100%, $I_i$ ($O_i$) = 24 average 3.2 better than MC – 3 performance but with 22% less rejection ratio
Immersive and Non-Immersive Multi-Party Multi-View (Live Broadcast) Systems Ahsan Arefin, Zixia Huang, Klara Nahrstedt, Pooja Agarwal, “4D TeleCast: Towards Large Scale Multi-site and Multi-view Dissemination of 3DTI Content”, IEEE ICDCS 2012, Macau, China.