Squeeze Play: The State of Ady0 Cmprshn Scott Selfon Senior Development Lead Xbox Advanced Technology Group | Microsoft

Agenda ● Why compress? ● The tools at present ● Measuring success ● A glimpse of the future

The Philosophy of Compression

The tools of the present ● Black box codecs ● Parameters that may or may not have well-understood meaning ● Results that may or may not be appropriate ● Compression targets ● Iteration slow enough to be discouraged ● Bulk quality settings

Compression formats, ca. 2012 ● Lossless codecs (<3:1): FLAC, Apple Lossless ● Lossy codecs ● “Reductions” (up to ∞:1): sample rate, bit depth, channel count, noise floor, culling ● Time domain: A-law/µ-law, ADPCM (~4:1) ● Perceptual (6-40+:1): MP3, Ogg Vorbis, XMA, etc. ● Hybrids (vary): AAL, WavPack, MP3 variants
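The “reductions” on this slide are plain arithmetic on the PCM data rate. A minimal sketch, assuming a hypothetical 10-second stereo clip (the clip length and the `pcm_bytes` helper are illustrative, not from the talk):

```python
def pcm_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Size in bytes of uncompressed linear PCM."""
    return sample_rate_hz * (bit_depth // 8) * channels * seconds

# Hypothetical clip: 10 s, 48 kHz, 16-bit, stereo.
original = pcm_bytes(48_000, 16, 2, 10)
# Halve the sample rate, halve the bit depth, fold down to mono.
reduced = pcm_bytes(24_000, 8, 1, 10)

ratio = original / reduced
print(f"{original} B -> {reduced} B ({ratio:.0f}:1)")
```

Each halving doubles the ratio, so the three reductions together give 8:1 before any codec is involved.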

PCM: Yes, still compression! ● Pulse Code Modulation ● Analog signal regularly sampled and stored digitally ● Bit depth: Storage representation of a sample ● Linear PCM = linear quantization ● Sampling rate: Frequency of analog signal capture or reproduction ● Nyquist frequency (SR/2)

PCM and Quantization ● Frequency quantization ● 44,100 Hz can represent sound frequencies up to 22,050 Hz ● Amplitude quantization ● 16 bits: $20 \log_{10}(2^{16}/2) \approx 90$ dB range ● 8 bits: $20 \log_{10}(2^{8}/2) \approx 42$ dB range
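The figures on this slide can be checked directly. A small sketch, assuming signed samples whose peak amplitude spans half of the 2^bits quantization levels; the function names are illustrative:

```python
import math

def dynamic_range_db(bits):
    # Signed samples: peak amplitude is 2**bits / 2 quantization steps.
    return 20 * math.log10(2 ** bits / 2)

def nyquist_hz(sample_rate_hz):
    # Highest frequency a given sampling rate can represent.
    return sample_rate_hz / 2

for bits in (8, 16):
    print(f"{bits} bits: {dynamic_range_db(bits):.1f} dB")
print(f"44.1 kHz Nyquist: {nyquist_hz(44_100):.0f} Hz")
```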

PCM A-Law/µ-Law (G.711) ● Pulse Code Modulation (1972, ITU 1988) ● Adds compander support ● A-Law (13 bit signed → 8 bit signed) ● µ-Law (14 bit signed → 8 bit signed) ● Encodes location of most significant non-zero bit, drops one or more LSBs ● Designed for telephony (8 kHz, 8 bit)
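To see why companding helps quiet signals, here is a sketch of the continuous µ-law curve with µ = 255. This is the textbook formula, not the segmented bit-level encoding that G.711 actually specifies, and the 8-bit `encode8`/`decode8` helpers are illustrative assumptions:

```python
import math

MU = 255  # companding constant used for 8-bit mu-law

def compress(x):
    """Map x in [-1, 1] to [-1, 1] on a logarithmic curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def expand(y):
    """Inverse of compress()."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)

def encode8(x):
    # Quantize the companded value to a signed code in -127..127.
    return round(compress(x) * 127)

def decode8(code):
    return expand(code / 127)

quiet = 0.01
companded_error = abs(decode8(encode8(quiet)) - quiet)
linear_error = abs(round(quiet * 127) / 127 - quiet)
print(companded_error < linear_error)
```

The companded path spends more of its codes near zero, so a quiet sample survives the round trip with less error than plain 8-bit linear quantization.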

ADPCM (G.726) ● Adaptive Differential Pulse Code Modulation (ITU 1970s, IMA 1990s) ● Stores difference between samples ● Quantized to a step size lookup table ● ~4:1 compression (16 bits → 4 bits) ● Cheap to decode on CPU, straightforward to HW accelerate
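The idea of storing a quantized difference with an adaptive step can be sketched in a few lines. This is a toy codec for illustration only: the step adaptation rule (grow by 1.5x after large codes, shrink by 0.9x otherwise) is an assumption that stands in for the lookup tables real G.726 or IMA ADPCM use.

```python
import math

def _apply(pred, step, code):
    """Shared by encoder and decoder so both track identical state."""
    mag = code & 0x7
    delta = (mag + 0.5) * step
    if code & 0x8:                      # sign bit
        delta = -delta
    pred = max(-32768.0, min(32767.0, pred + delta))
    # Toy adaptation rule: widen the step after large codes, narrow otherwise.
    step = step * 1.5 if mag >= 4 else step * 0.9
    return pred, max(1.0, min(8192.0, step))

def encode(samples):
    pred, step, codes = 0.0, 16.0, []
    for s in samples:
        diff = s - pred
        code = (0x8 if diff < 0 else 0) | min(int(abs(diff) / step), 7)
        codes.append(code)              # 4 bits per 16-bit input sample
        pred, step = _apply(pred, step, code)
    return codes

def decode(codes):
    pred, step, out = 0.0, 16.0, []
    for code in codes:
        pred, step = _apply(pred, step, code)
        out.append(int(round(pred)))
    return out

pcm = [int(round(10000 * math.sin(2 * math.pi * 440 * n / 44100)))
       for n in range(2000)]
codes = encode(pcm)
decoded = decode(codes)
worst = max(abs(a - b) for a, b in zip(pcm[200:], decoded[200:]))
print(f"ratio {len(pcm) * 16 / (len(codes) * 4):.0f}:1, worst error {worst}")
```

Because the decoder replays the same state updates as the encoder, no step sizes need to be transmitted, which is what keeps decoding cheap.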

ADPCM Artifacts ● Codec assumption: Signal slope doesn’t change suddenly ● Poor response to transients, quick attacks ● Settling time before silence output ● Challenged particularly at lower sampling rates (<32 kHz) ● Step size quantization errors [Figure: PCM source vs. ADPCM waveform]
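The transient and settling artifacts listed here can be reproduced with a toy tracker. The sketch below assumes a made-up adaptation rule (1.5x growth, 0.9x decay, step floor of 1) rather than any real ADPCM table, so the exact lag it reports is illustrative only:

```python
def adpcm_track(samples, step=16.0):
    """Return what a toy 4-bit ADPCM decoder would output for `samples`."""
    pred, out = 0.0, []
    for s in samples:
        diff = s - pred
        mag = min(int(abs(diff) / step), 7)          # 3-bit magnitude
        delta = (mag + 0.5) * step
        pred += delta if diff >= 0 else -delta
        # Toy adaptation: the step can only grow by 1.5x per sample.
        step = max(1.0, step * (1.5 if mag >= 4 else 0.9))
        out.append(pred)
    return out

# Silence, then an instant attack to 20000, then silence again.
signal = [0] * 50 + [20000] * 200 + [0] * 200
out = adpcm_track(signal)

# Samples needed before the output gets within 500 of the target.
attack_lag = next(n for n in range(50, 250) if abs(out[n] - 20000) < 500) - 50
release_lag = next(n for n in range(250, 450) if abs(out[n]) < 500) - 250
print(attack_lag, release_lag)
```

During silence the step shrinks to its floor, so a sudden attack has to wait for the step to grow back, and the return to silence shows the same lag followed by a decaying wobble around zero.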

Perceptual Compression ● MP3, WMA, XMA, AAC, Ogg Vorbis, ATRAC, AC-3… ● Psychoacoustic: based on human frequency sensitivities ● Frequency-domain compression ● Take advantage of limits of auditory perception

Perceptual Compression Strategies ● Frequency sensitivities ● Nominally 20 kHz, often realistically 16 kHz ● Most sensitive to speech range ● Absolute threshold of hearing ● Masking
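The absolute threshold of hearing is often modeled with Terhardt's closed-form approximation. The sketch below uses that formula as an illustration; the talk does not say which model any particular codec uses.

```python
import math

def threshold_in_quiet_db_spl(freq_hz):
    """Terhardt's approximation of the absolute threshold of hearing."""
    k = freq_hz / 1000.0
    return (3.64 * k ** -0.8
            - 6.5 * math.exp(-0.6 * (k - 3.3) ** 2)
            + 1e-3 * k ** 4)

for f in (100, 1000, 3300, 16000):
    print(f"{f:>5} Hz: {threshold_in_quiet_db_spl(f):6.1f} dB SPL")
```

The curve dips lowest around 3 to 4 kHz and climbs steeply toward 16 kHz, which is why content near the top of the band is the first thing a perceptual encoder can discard.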

Acoustic Masking ● Frequency Masking ● Time Masking ● Forward masking ● Backward masking [Figure: a narrow 1200 Hz noise band masks sounds at higher frequencies (Scharf 1975); frequency axis from 20 Hz to 16,000 Hz]

Perceptual Codec Artifacts ● Time → frequency domain artifacts ● Window size limits accuracy for transients: ringing or pre-echoes ● Loss of phase information: warbles, ‘underwater’ ● Channel collapse/recreation artifacts ● Spatial loss and cross-talk
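The window-size tradeoff behind pre-echo is easy to quantify. A rough sketch, with 256 and 2048 samples chosen as hypothetical short and long windows (real codecs pick their own sizes):

```python
SAMPLE_RATE_HZ = 44_100

def window_ms(window_samples, sample_rate_hz=SAMPLE_RATE_HZ):
    # Time span over which quantization noise can smear.
    return 1000.0 * window_samples / sample_rate_hz

def bin_width_hz(window_samples, sample_rate_hz=SAMPLE_RATE_HZ):
    # Spacing between frequency bins for a transform of this length.
    return sample_rate_hz / window_samples

for size in (256, 2048):
    print(f"{size:>4} samples: {window_ms(size):5.1f} ms, "
          f"{bin_width_hz(size):6.1f} Hz per bin")
```

A long window gives fine frequency resolution but lets noise spread across tens of milliseconds, including the quiet stretch before a transient.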

Game-Specific Perceptual Artifacts (Or, Games are from Mars, Codecs are from Venus) ● Pitch shifting ● Mixing / Synchronization ● Repetition and Reuse ● Looping

New Dog, Old Tricks ● Sample rate reduction ● Bit depth reduction ● Channel reduction ● Normalization …can all be less effective (or ineffective) with perceptual codecs

Choosing a Compression Format ● Support (device platform, middleware) ● Performance tradeoffs (CPU or hardware) ● Licensing (or lack thereof)

Evaluating Codec Capabilities ● Storage and bandwidth ● Decode latency ● Multichannel support (and leveraging) ● Looping accuracy ● Seamless seeking ● Perceptual quality

Measuring Success ● Critical listening and perceptual codecs

Squeeze Play: The Game Show ● Which wave is more compressed? ● A: PCM (46 KB) ● B: XMA q60 (8 KB, ~6:1) ● C: ADPCM (12.5 KB, ~3.6:1)

Which wave is more compressed? ● Input (44.1 kHz PCM): 1.85 MB ● A: Output (XMA, quality 1), 140 KB [13:1 compression] ● B: Output (xWMA, 48 kbps), 76 KB [24:1 compression]

Measuring Success ● Critical listening and perceptual codecs ● Visual evaluations

Which wave is more compressed? ● Input (32 kHz PCM): 298 KB ● A: Output (ADPCM), 82 KB [3.6:1 compression] ● B: Output (xWMA, 20 kbps), 16 KB [18.6:1 compression] ● C: Output (XMA, quality 1), 28 KB [10.6:1 compression]
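The ratios quoted on this slide are simply input size over output size. A quick check using the slide's own numbers:

```python
INPUT_KB = 298  # 32 kHz PCM source from the slide

outputs_kb = {"ADPCM": 82, "xWMA 20 kbps": 16, "XMA quality 1": 28}
ratios = {name: INPUT_KB / kb for name, kb in outputs_kb.items()}

# List the encodes from least to most compressed.
for name, r in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name}: {r:.1f}:1")
```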

Measuring Success ● Critical listening and perceptual codecs ● Visual evaluations ● Delta evaluations (Taylor, 2011)

Delta Evaluations
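A delta evaluation subtracts the decoded signal from the original so that only what the codec changed remains. A minimal sketch, assuming both signals are already time-aligned and using bit truncation as a stand-in for a real codec round trip:

```python
import math

def delta_signal(original, decoded):
    """Sample-by-sample difference; both inputs must be time-aligned."""
    return [o - d for o, d in zip(original, decoded)]

def rms_dbfs(samples, full_scale=32768.0):
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms / full_scale) if rms > 0 else float("-inf")

original = [int(round(10000 * math.sin(2 * math.pi * 440 * n / 44100)))
            for n in range(4410)]
# Stand-in for encode + decode: drop the 6 least significant bits.
decoded = [(s >> 6) << 6 for s in original]

delta = delta_signal(original, decoded)
print(f"original {rms_dbfs(original):.1f} dBFS, delta {rms_dbfs(delta):.1f} dBFS")
```

Listening to or plotting `delta` on its own shows where the codec's error sits, which a single overall level cannot.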

Measuring Success ● Critical listening and perceptual codecs ● Visual evaluations ● Delta evaluations (Taylor, 2011) ● Automated evaluation ● PESQ/POLQA (ITU-T Rec. P.863) ● PEAQ (ITU BS.1387-1) ● Noise to Mask Ratio (NMR)

NMR Evaluation ● Noise to Mask Ratio ● Windowed evaluation of Signal-to-Mask Ratio (SMR) minus Signal-to-Noise Ratio (SNR) [Figure: NMR at three XMA quality settings (Mathews 2012)]
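Taking the slide's definition literally (SMR minus SNR, both in dB), the signal term cancels and NMR reduces to noise power over mask power. A sketch with hypothetical per-band powers for one analysis window; a real evaluation would derive the mask threshold from a psychoacoustic model:

```python
import math

def db(power_ratio):
    return 10 * math.log10(power_ratio)

def nmr_db(signal_power, noise_power, mask_power):
    smr = db(signal_power / mask_power)    # signal-to-mask ratio
    snr = db(signal_power / noise_power)   # signal-to-noise ratio
    return smr - snr                       # equals db(noise / mask)

# Hypothetical bands: (signal, coding noise, masking threshold) as powers.
bands = [(1.0, 0.001, 0.01), (0.5, 0.02, 0.005), (0.1, 0.001, 0.001)]
values = [nmr_db(*band) for band in bands]
for value in values:
    verdict = "likely audible" if value > 0 else "at or below the mask"
    print(f"{value:+.1f} dB: {verdict}")
```

A negative NMR means the coding noise sits under the masking threshold for that band and window.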

The Compression of the Future? ● Self-correcting/adjusting compression ● Communicating more with less ● Linguistic sounds and speech synthesis ● MIDI music: the revenge? ● Parameterized procedural synthesis ● Case study: impacts

Impacts ● Resonant decay + transient ● Compress as modes + residual (>150:1) ● Lloyd, Raghuvanshi, Govindaraju (ACM, 2011) [Figure: spectrograms, frequency vs. time: Original = Modal (“Clean”) + Residual (“Noise”)]
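The “modes” half of this idea can be illustrated as a sum of exponentially decaying sinusoids. This is a generic modal synthesis sketch with made-up mode parameters, not the method of the cited paper, and it ignores the residual that would also have to be stored:

```python
import math

SAMPLE_RATE = 44_100

def synthesize(modes, seconds):
    """Sum decaying sinusoids, one per (freq_hz, amplitude, decay_per_s) mode."""
    n = int(SAMPLE_RATE * seconds)
    return [sum(a * math.exp(-d * t / SAMPLE_RATE)
                * math.sin(2 * math.pi * f * t / SAMPLE_RATE)
                for f, a, d in modes)
            for t in range(n)]

# Hypothetical modes for an impact sound.
modes = [(220.0, 0.5, 6.0), (587.0, 0.3, 9.0), (1310.0, 0.2, 14.0)]
clean = synthesize(modes, 1.0)

stored_values = 3 * len(modes)   # three numbers per mode
print(f"{len(clean)} samples from {stored_values} stored values")
```

The modal part alone needs only a handful of parameters per sound; the transient residual is what keeps the practical ratio lower.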

Conclusions ● Know thy artifacts ● And use appropriate techniques to counter ● What’s the playback context? ● More robust qualitative evaluation ● Avoid the ‘bulk’ knob ● Consider automating listening tests

Questions? scottsel@microsoft.com Xbox LIVE Gamertag: Timmmmmay
