Holograms are the Next Video
Philip A. Chou, 8i Labs, Inc.
ACM Multimedia Systems Conference, 13 June 2018
Princess Leia Star Wars Episode IV, 1977
The Holodeck Star Trek Next Generation, Episode 12, 1988
Black Panther, 2018
“No, you can’t wipe ’em off. They’re holograms.” – Tobias Beckett to Chewbacca. Solo, 2018
Gabor Holograms (Encode / Decode) https://en.wikipedia.org/wiki/Holography
• Dennis Gabor, “A new microscopic principle,” Nature, 1948.
• Etymology: holo + gram, from Ancient Greek hólos (whole) + grammḗ (letter, line, writing, message)
Gabor Holograms as (Angular Spectra of) Images from Multiple Viewpoints
[figure: axes labeled u, v and θ, φ]
Images from Multiple Viewpoints as Light Fields
[figure: two-plane parameterization with axes u, v and s, t]
Light Fields as Point Clouds
Agenda • Introduction • Holograms == Volumetric Media (Gabor Holograms, Light Fields, Point Clouds, …) • Applications • Historical remarks • Point Cloud Compression (PCC) • Light Field Compression using PCC • Streaming Holograms • Conclusion
Applications
Holograms: The Medium to Represent Natural Content in VR / AR / MR
• VR puts you in a Virtual World
• AR puts virtual objects in your world
Audio: Three Modes of Distribution On-Demand Live Broadcast Telecommunication
Video: Three Modes of Distribution On-Demand Live Broadcast Telecommunication
Holograms: Three Modes of Distribution Buzz Aldrin: Cycling Pathways to Mars On-Demand Live Broadcast Telecommunication
Historical Remarks
180+ years since the invention of images (as photographs) Daguerreotype, 1838
140+ years since the invention of audio (as telephony) Telephone, ca. 1875
90+ years since the invention of video (as television) Television, 1926
Dawn of Digital Video
Arun Netravali, Head, Visual Communications Research Dept.
IEEE Transactions on Computers, 1974
JPEG (January 1988)
Photo from JPEG (Macau, October 2017): Celebration of 25th Anniversary of JPEG Standard (1992)
Today: > 1 Trillion photos/year
MPEG PCC (Macau, October 2017) Hologram compression today is like video compression in 1988
Subjective Results: Category 2 (Dynamic)
[figure: reconstructions at four rate points per sequence: 3.5, 6.0, 9, 18 Mbps; 3.9, 6.0, 13, 27 Mbps; 3.5, 6.0, 11, 20 Mbps]
MPEG Point Cloud Compression (PCC)
1. Static (single-frame)
2. Dynamic (multi-frame)
3. Dynamic acquisition (e.g., from Mobile Mapping Systems)
“Video-based” approach
• Patch information
  • Dominant axis
  • (x, y, z) offset
  • (u, v) offset
  • Dimensions
• Occupancy map
• Geometry video
• Texture video
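To make the bullets above concrete, here is a minimal sketch of the core idea behind the video-based approach: project points along a dominant axis into a 2D patch, producing an occupancy map, a geometry (depth) image, and a texture image that can then be handed to a standard video codec. This is a single-patch simplification for illustration, not the TMC2 implementation; the function and parameter names are assumptions.

```python
import numpy as np

def project_patch(points, colors, axis=2, resolution=64):
    """Simplified single-patch projection along a dominant axis.

    points : (N, 3) integer voxel coordinates
    colors : (N, 3) per-point colors
    Returns occupancy map, geometry (depth) image, texture image.
    """
    # The two axes orthogonal to the dominant axis index the patch (u, v).
    u_axis, v_axis = [a for a in range(3) if a != axis]
    occupancy = np.zeros((resolution, resolution), dtype=np.uint8)
    geometry = np.zeros((resolution, resolution), dtype=np.uint16)  # depth along the dominant axis
    texture = np.zeros((resolution, resolution, 3), dtype=np.uint8)

    for p, c in zip(points, colors):
        u, v, d = p[u_axis], p[v_axis], p[axis]
        # Keep the nearest point per (u, v) cell (one layer only, for simplicity).
        if not occupancy[u, v] or d < geometry[u, v]:
            occupancy[u, v] = 1
            geometry[u, v] = d
            texture[u, v] = c
    return occupancy, geometry, texture

# Example: a handful of voxels projected along the z axis.
pts = np.array([[3, 4, 7], [3, 4, 5], [10, 20, 2]])
cols = np.array([[255, 0, 0], [0, 255, 0], [0, 0, 255]])
occ, geo, tex = project_patch(pts, cols)
print(occ.sum(), geo[3, 4], tex[10, 20])  # 2 occupied cells; depth 5 kept at (3, 4)
```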
“Native 3D” approach to coding geometry
[figure: octree subdivision with occupancy bytes such as 10010001 and 11001001]
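The occupancy bytes on this slide are what an octree traversal emits: one bit per child cube, set if that child contains any points. The sketch below serializes such bytes breadth-first; in a real codec they would then be entropy coded. The function name and the bit-ordering convention are illustrative assumptions.

```python
import numpy as np

def octree_occupancy_bytes(points, depth):
    """Serialize an octree breadth-first as one occupancy byte per occupied node.

    points : (N, 3) integer coordinates in [0, 2**depth).
    Returns a list of 8-bit occupancy codes (one bit per child octant).
    """
    codes = []
    nodes = [np.asarray(points)]          # points falling inside the current node
    for level in range(depth):
        shift = depth - 1 - level         # bit that selects the child octant at this level
        next_nodes = []
        for pts in nodes:
            # Child index 0..7 from the current bit of x, y, z.
            child = (((pts[:, 0] >> shift) & 1) << 2) \
                  | (((pts[:, 1] >> shift) & 1) << 1) \
                  | ((pts[:, 2] >> shift) & 1)
            code = 0
            for c in range(8):
                mask = child == c
                if mask.any():
                    code |= 1 << (7 - c)  # set the bit for an occupied child
                    next_nodes.append(pts[mask])
            codes.append(code)
        nodes = next_nodes
    return codes

pts = np.array([[0, 0, 0], [7, 7, 7], [7, 0, 0]])
print([format(c, '08b') for c in octree_occupancy_bytes(pts, depth=3)])
```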
“Native 3D” approach to coding attributes
[figure: per-voxel attributes, e.g., (Y, U, V) triples such as (221, 136, 255), (255, 153, 255), (255, 102, 255), (153, 153, 255)]
Point Cloud Attribute Compression using a Region Adaptive Hierarchical Transform (RAHT)
Ricardo L. de Queiroz and Philip A. Chou, “Compression of 3D Point Clouds Using a Region-Adaptive Hierarchical Transform,” IEEE Trans. Image Processing, Aug. 2016.
Maja Krivokuca, Maxim Koroteev, Philip A. Chou, Robert Higgs, and Charles Loop, “A Volumetric Approach to Point Cloud Compression,” in preparation.
Three Generations of Transforms for Point Cloud Attribute Compression 1. Graph Signal Processing (Graph Fourier Transform – GFT) 2. Sampled Spatial Stochastic Process (Gaussian Process Transform – GPT) 3. Volumetric Functions (Region Adaptive Hierarchical Transform – RAHT)
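As a concrete illustration of the third generation, here is a minimal sketch of the two-point butterfly at the heart of RAHT, following the form in de Queiroz and Chou (2016): sibling attributes are combined by an orthonormal transform whose weights depend on the point counts of the subtrees, yielding a low-pass coefficient that propagates up the octree and a high-pass coefficient that is quantized and entropy coded. Variable names are illustrative; the full codec applies this step recursively along each axis of the octree.

```python
import numpy as np

def raht_merge(g1, w1, g2, w2):
    """One RAHT butterfly (de Queiroz & Chou, 2016).

    g1, g2 : low-pass coefficients of the two sibling subtrees
             (at the finest level, simply the voxel attribute values)
    w1, w2 : number of points under each sibling
    Returns (low-pass coefficient, high-pass coefficient, merged weight).
    """
    s1, s2 = np.sqrt(w1), np.sqrt(w2)
    norm = np.sqrt(w1 + w2)
    low = (s1 * g1 + s2 * g2) / norm    # propagates up the octree
    high = (-s2 * g1 + s1 * g2) / norm  # quantized and entropy coded
    return low, high, w1 + w2

# Two occupied sibling voxels with luma 100 and 140 (weight 1 each):
low, high, w = raht_merge(100.0, 1, 140.0, 1)
print(round(low, 2), round(high, 2), w)
# low  = 169.71 = sqrt(2) * 120  (scaled mean, carried to the parent)
# high = 28.28                   (detail coefficient)
```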
Measure
• A measure μ: S ↦ ℝ⁺ maps each set S to a non-negative real number.
• The sets lie in a σ-algebra ℬ (a set of sets for which Sᵢ ∈ ℬ ⇒ Sᵢᶜ ∈ ℬ and ∪ᵢ Sᵢ ∈ ℬ).
• If S₁, S₂, … are disjoint, then μ(∪ᵢ Sᵢ) = ∑ᵢ μ(Sᵢ).
• Examples:
  • Lebesgue measure on ℝ maps each interval of length L to L.
  • The probability distribution of a r.v. X maps each set S to the probability that X ∈ S.
  • The counting measure w.r.t. points x₁, …, xₙ ∈ ℝ³ maps each S ⊂ ℝ³ to the number of points in S.
[figure: counting measure example, μ of a region containing x₁ and x₂ equals 2]
Measure defines Integration
∫ f(x) dμ(x) = lim_{Δ→0} ∑ₙ Δ · μ({x : f(x) ≥ nΔ}) = ∑ᵢ f(xᵢ) for the counting measure
[figure: f(x) sliced into layers of height Δ]
Integration defines Inner Product. Inner Product defines Norm, Orthogonality.
⟨f, g⟩ = ∫ f(x) g(x) dμ(x) = ∑ᵢ f(xᵢ) g(xᵢ)
‖f‖² = ⟨f, f⟩ = ∑ᵢ f(xᵢ)²
f ⊥ g iff 0 = ⟨f, g⟩ = ∑ᵢ f(xᵢ) g(xᵢ)
⇒ A measure defines a Hilbert space, and with it all the machinery required for function approximation
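The payoff of using the counting measure induced by the point cloud is that all of these inner products become finite sums over the points, so norms, orthogonality checks, and least-squares projections can be computed directly from the point locations. A small numerical sketch (the functions and coordinates are made up for illustration):

```python
import numpy as np

# Point cloud locations and the counting measure they induce.
x = np.array([[0.2, 0.1, 0.0],
              [0.7, 0.4, 0.3],
              [0.9, 0.8, 0.6]])

def inner(f, g):
    """<f, g> under the counting measure: a sum over the points."""
    return np.sum(f(x) * g(x))

f = lambda p: p[:, 0]            # first coordinate of each point
g = lambda p: np.ones(len(p))    # constant function

norm_f = np.sqrt(inner(f, f))
# Best constant approximation of f (projection onto span{g}):
c = inner(f, g) / inner(g, g)    # equals the mean of f over the points
print(norm_f, c)
```

Here the best constant approximation of f is just its mean over the points, which is the kind of level-by-level projection the hierarchical transforms on the following slides perform.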
Cardinal B-Splines of Order p
Integer shifts of the scaling functions span the space of functions that are
• piecewise polynomial of degree p − 1 over unit intervals
• continuously differentiable up to order p − 2
[figure: scaling functions]
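One way to see what these scaling functions look like is to recall that the cardinal B-spline of order p is the p-fold convolution of the unit box; the sketch below builds it numerically on a grid (the step size and normalization are choices for illustration):

```python
import numpy as np

def bspline_scaling(p, dt=0.001):
    """Cardinal B-spline of order p: p-fold convolution of the unit box B_1."""
    box = np.ones(int(round(1.0 / dt)))   # indicator of [0, 1) sampled at step dt
    b = box
    for _ in range(p - 1):
        b = np.convolve(b, box) * dt      # each convolution adds one unit of support
    return b                              # supported on [0, p]

for p in (1, 2, 3):
    b = bspline_scaling(p)
    print(f"order {p}: support ~{len(b) * 0.001:.2f}, integral ~{b.sum() * 0.001:.3f}")
```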
B-Spline Basis Functions (p = 1)
Nested subspaces V₀ ⊂ V₁ ⊂ V₂
V₀ ⊕ W₀ = V₁
V₁ ⊕ W₁ = V₂
B-Spline Wavelet Basis Functions (p = 1)
[figure: scaling and wavelet basis functions for V₀, W₀, W₁ under the Lebesgue measure and under the counting measure, with their normalization constants]
Multiresolution Approximation
[figure: approximations in V₀, V₁, V₂ using the Lebesgue measure vs. the counting measure]
B-Spline Approximation (p = 1)
Level 5 (917 coeffs), Level 6 (3821 coeffs), Level 7 (15604 coeffs), Level 8 (62073 coeffs), Level 9 (237965 coeffs)
B-Spline Approximation (p = 2)
Level 5 (1699 coeffs), Level 6 (7213 coeffs), Level 7 (30455 coeffs), Level 8 (125244 coeffs), Level 9 (497199 coeffs)
Compression Results Comparison to Zhang, Florencio, and Loop, “Point cloud attribute compression with graph transform,” ICIP 2014
Surface Light Field Compression using a Point Cloud Codec
Xiang Zhang, Philip A. Chou, Ming-Ting Sun, Maolong Yang, et al., “Surface Light Field Compression using a Point Cloud Codec,” submitted to IEEE JETCAS special issue on immersive video, and to appear at ICIP 2018.
“Light Field” == Plenoptic Function
• 7D: f(x, y, z, θ, φ, λ, t)
• 5D: f(x, y, z, θ, φ)
• 4D: f(x, y, θ, φ)
E. H. Adelson and J. R. Bergen, “The plenoptic function and the elements of early vision,” in Computational Models of Visual Processing, 1991.
Image-Based Light Field Representations
Multiview representation; lenslet representation
M. Levoy and P. Hanrahan, “Light field rendering,” SIGGRAPH 1996.
S. J. Gortler, R. Grzeszczuk, R. Szeliski, M. Cohen, “The Lumigraph,” SIGGRAPH 1996.
Surface Light Field (SLF)
• The SLF can be regarded as a function f(p, ω) representing the color of surface point p = (x, y, z) when viewed from direction ω = (θ, φ).
• The spherical image f(ω; p), or view map, at each p generalizes the lenslet representation.
• To compress f(p, ω) efficiently:
  • Represent the view map f(ω; p) at each p via some image basis
  • Compress the coefficients across surface points to reduce spatial redundancy
D. N. Wood, et al., “Surface light fields for 3D photography,” SIGGRAPH 2000.
W.-C. Chen, et al., “Light field mapping: efficient representation and hardware rendering of surface light fields,” SIGGRAPH 2002.
View Map Representation
Linear combination of basis functions: f(ω; p) = ∑ᵢ Gᵢ(ω) αᵢ(p), where the Gᵢ are B-spline wavelet basis functions over the view-map coordinates (θ, sin φ).
Per-point fit: α* = argmin_α ‖Gα − c‖² + λ‖α‖² + β‖α − ᾱ‖²
where c are the color observations of the point, the columns of G are the basis functions sampled at the observed directions, α are the coefficients, and ᾱ is a reference coefficient vector.
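Written out, the per-point fit has the closed form α = (GᵀG + (λ + β)I)⁻¹(Gᵀc + βᾱ). Below is a minimal sketch of that solve with random stand-in data; the actual view-map basis construction and the choice of ᾱ follow the paper and are not reproduced here.

```python
import numpy as np

def fit_view_map_coeffs(G, c, lam=0.1, beta=0.05, alpha_bar=None):
    """Regularized least-squares fit of view-map coefficients for one surface point.

    G         : (M, K) basis functions sampled at the M observed view directions
    c         : (M,) color observations (one channel) of the point
    lam, beta : regularization weights (illustrative values)
    alpha_bar : (K,) reference coefficient vector (zeros if not given)
    Solves  argmin_a ||G a - c||^2 + lam*||a||^2 + beta*||a - alpha_bar||^2.
    """
    M, K = G.shape
    if alpha_bar is None:
        alpha_bar = np.zeros(K)
    A = G.T @ G + (lam + beta) * np.eye(K)
    b = G.T @ c + beta * alpha_bar
    return np.linalg.solve(A, b)

# Stand-in data: 50 observed directions, 8 basis functions.
rng = np.random.default_rng(0)
G = rng.standard_normal((50, 8))
c = rng.uniform(0, 255, size=50)
alpha = fit_view_map_coeffs(G, c)
print(alpha.shape, np.linalg.norm(G @ alpha - c) < np.linalg.norm(c))
```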
Compress Coefficients of Representation
• Spatially, using a point cloud codec
• The coefficients are attributes of the points
• In this work, we used
  • Octree+RAHT PCC (MPEG PCC TMC1)
  • Video-based PCC (MPEG PCC TMC2)
• All the SLF coefficients are scaled to the range [0, 255] for the 8-bit video codec
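The last bullet hides a practical detail: SLF coefficients are signed floating-point values, so they must be mapped into [0, 255] before they can ride in an 8-bit video, with the mapping parameters sent as side information. A sketch of one such mapping (per-coefficient min/max scaling, which may differ from the quantization actually used in the paper):

```python
import numpy as np

def scale_to_uint8(coeffs):
    """Map per-point SLF coefficients to [0, 255] for an 8-bit video codec.

    coeffs : (N, K) array, one K-dim coefficient vector per point.
    Returns the quantized array plus the per-coefficient (min, max),
    which must be sent as side information for the inverse mapping.
    """
    cmin = coeffs.min(axis=0)
    cmax = coeffs.max(axis=0)
    span = np.where(cmax > cmin, cmax - cmin, 1.0)   # avoid divide-by-zero
    q = np.round((coeffs - cmin) / span * 255.0).astype(np.uint8)
    return q, cmin, cmax

def unscale(q, cmin, cmax):
    """Inverse mapping used at the decoder."""
    span = np.where(cmax > cmin, cmax - cmin, 1.0)
    return q.astype(np.float64) / 255.0 * span + cmin

coeffs = np.random.default_rng(1).normal(scale=50.0, size=(1000, 8))
q, cmin, cmax = scale_to_uint8(coeffs)
err = np.abs(unscale(q, cmin, cmax) - coeffs).max()
print(q.dtype, err < (cmax - cmin).max() / 255.0)   # quantization error within one step
```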
Datasets
Natural datasets: Elephant, Fish
Synthetic datasets: Can, Die
D. N. Wood, et al., “Surface light fields for 3D photography,” SIGGRAPH 2000
Die LF reconstruction N=1, 0.30 MB N=8, 0.62 MB N=32, 1.71 MB N=128, 3.90 MB
Fish LF reconstruction N=1, 0.24 MB N=8, 0.53 MB N=32, 1.57 MB N=128, 4.02 MB
RD Performance
[figure: rate-distortion curves for the Elephant and Fish datasets]
Streaming of Volumetric Media
Jounsup Park, Philip A. Chou, and Jenq-Neng Hwang, “Rate-Utility Optimized Streaming of Volumetric Media for Augmented Reality,” arXiv:1804.09864. Also submitted to IEEE JETCAS special issue on immersive video, and to appear at Globecom 2018.
Streaming begins when the delivery rate exceeds the media rate.
Hologram streaming today is like video streaming in 1997: QCIF (176×144) video streamed over 56 kbps.
Streaming 360° (Spherical) Video as Tiles https://bitmovin.com/bitmovin-receives-excellence-dash-award-tile-based-streaming-vr-360-video/
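The same tiling idea carries over to volumetric media: each object or block is offered at several bitrates, and the client chooses, within its bandwidth budget, the combination that maximizes predicted utility (for example, weighted by visibility and viewing distance). The greedy utility-per-bit sketch below illustrates that selection problem in general; it is not the specific optimization of Park et al., and all rates and utilities are made-up numbers.

```python
# Greedy rate-utility selection: start every object at its lowest rate, then
# repeatedly spend remaining bandwidth on the upgrade with the best
# utility gain per extra bit.

def greedy_allocate(objects, budget_kbps):
    """objects: {name: [(rate_kbps, utility), ...]} sorted by increasing rate."""
    choice = {name: 0 for name in objects}                   # start at the lowest rate
    spent = sum(objects[name][0][0] for name in objects)
    while True:
        best = None
        for name, levels in objects.items():
            i = choice[name]
            if i + 1 < len(levels):
                d_rate = levels[i + 1][0] - levels[i][0]
                d_util = levels[i + 1][1] - levels[i][1]
                if spent + d_rate <= budget_kbps:
                    gain = d_util / d_rate
                    if best is None or gain > best[0]:
                        best = (gain, name, d_rate)
        if best is None:
            return choice, spent
        _, name, d_rate = best
        choice[name] += 1
        spent += d_rate

objects = {
    "near_object": [(500, 2.0), (1500, 4.0), (4000, 5.0)],   # visible, close: high utility
    "far_object":  [(500, 0.5), (1500, 1.0), (4000, 1.2)],   # distant: low marginal utility
}
print(greedy_allocate(objects, budget_kbps=5000))
```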