CTP431- Music and Audio Computing Digital Audio Graduate School of - PowerPoint PPT Presentation

CTP431- Music and Audio Computing Digital Audio Graduate School of Culture Technology KAIST Juhan Nam 1

Digital Representations … 0 1 1 0 1 1 0 … Sound … 1 0 0 1 1 0 1 … Image … 0 0 1 1 0 1 1 … Text

Digital Representations § Sampling and Quantization – Sound (samples) – Image (pixels) § Trade-off – Resolution (quality) and data size

Digital Audio Chain …0 0 1 0 1 0 … 4

Sampling • Convert continuous-time signal to discrete-time signal by periodically picking up the instantaneous values – Represented as a sequence of numbers; pulse code modulation (PCM) – Sampling period ( T s ): the amount of time between samples – Sampling rate ( f s = 1/ T s ) Signal notation T s x ( t ) → x ( nT s ) 5

Sampling Theorem § What is an appropriate sampling rate? – Too high: increase data rate – Too low: become hard to reconstruct the original signal § Sampling Theorem – In order for a band-limited signal to be reconstructed fully, the sampling rate must be greater than twice the maximum frequency in the signal f s > 2 ⋅ f m f s – Half the sampling rate is called Nyquist frequency ( ) 2 6

Sampling in Frequency Domain § Sampling in time creates imaginary content of the original at every f s frequency Audible range Audible range -f m f m f m f s -f m f s -f s -f s +f m -f m f s +f m Nyquist Frequency § Why? 𝑦 𝑜 = sin 2𝜌𝑔 * 𝑜𝑈 - = sin 2𝜌𝑔 * 𝑜/𝑔 - 𝑦 𝑢 = sin 2𝜌𝑔 * 𝑢 = sin 2𝜌𝑔 * 𝑜/𝑔 - ± 2𝜌𝑙𝑜 = sin 2𝜌𝑜(𝑔 * ± 𝑙𝑔 - )/𝑔 - 7

Aliasing § If the sampling rate is less than twice the maximum frequency, the high- frequency content is folded over to lower frequency range 1 0.8 0.6 0.4 0.2 0 − 0.2 − 0.4 − 0.6 − 0.8 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 4 x 10 8

Aliasing in Frequency Domain § Sampling in time creates imaginary content of the original at every f s frequency Audible range Audible range -f m f m -f s -f s +f m f m f s -f m f s -f m f s +f m § The frequency that we hear is 𝑔 - − 𝑔 * In order to avoid aliasing f m < f s − f m 9

Aliasing in Frequency Domain § For general signals, high-frequency content is folded over to lower frequency range Audible range f s -f s -f m f s -f m f m f s +f m 10

Avoid Aliasing § Increase sampling rate f s > 2 ⋅ f m § Use lowpass filters before sampling -f s -f m f s -f m f s f m f s +f m Lowpass Filter f s -f s -f s /2 f s /2 11

Example of Aliasing 0 0 Magnitude (dB) Magnitude (dB) − 20 − 20 − 40 − 40 − 60 − 60 5 10 15 20 5 10 15 20 Frequency (kHz) Frequency (kHz) Trivial sawtooth wave spectrum Bandlimited sawtooth wave spectrum 4 x 10 2 1.5 Frequency (Hz) 1 0.5 Frequency sweep of the trivial sawtooth wave 0 12 1 1.5 2 2.5 3 3.5 4 4.5 Time (s)

Example of Aliasing Aliasing in Video https://www.youtube.com/watch?v=jHS9JGkEOmA 13

Sampling Rates § Determined by the bandwidth of signals or hearing limits – Consumer audio product: 44.1 kHz (CD) – Professional audio gears: 48/96/192 kHz – Speech communication: 8/16 kHz 14

Quantization § Discretizing the amplitude of real-valued signals – Round the amplitude to the nearest discrete steps – The discrete steps are determined by the number of bit bits • Audio CD: 16 bits (-2 15 ~ 2 15 -1) ß B bits (-2 B-1 ~ 2 B-1 -1) 15

Quantization Error § Quantization causes noise – Average power of quantization noise: obtained from the probability density function (PDF) of the error P ( e ) Root mean square (RMS) of noise 1 1/2 112 x 2 p ( e ) dx ∫ = − 1/2 -1/2 1/2 § Signal to Noise Ratio (SNR) RMS of full-scale sine wave – Based on average power 2 B − 1 / S rms 2 (With 16bits, SNR = 98.08dB) 20log 10 = 20log 10 = 6.02 B + 1.76 dB N rms 112 – Based on the max levels 2 B − 1 S max = 6.02 B dB (With 16bits, SNR = 96.32 dB) 20log 10 = 20log 10 12 N max 16

Dynamic Range § Dynamic range Again, RMS of full-scale sine wave – The ratio between the loudest and softest levels for both loudest and softest 2 B − 1 / S rms,max 2 (With 16bits, DR = 90.31 dB) 20log 10 = 20log 10 = 6.02 B − 6 S rms,min 1/ 2 § Human ear’s dynamic range – Depending on frequency band 17 Equal Loudness Curve

Clipping and Headroom § Clipping – Non-linear distortion that occurs when a signal is above the max level § Headroom – Margin between the peak level and the max level In digital audio, 0dB is regarded as the maximum level Clipping 0 dB Max level Head room B = 16 bits -90.31 dB Min level -98.08 dB Noise floor (By quantization) 18

CTP431- Music and Audio Computing Digital Audio Graduate School of - PowerPoint PPT Presentation

CTP431- Music and Audio Computing Digital Audio Graduate School of Culture Technology KAIST Juhan Nam 1 Digital Representations 0 1 1 0 1 1 0 Sound 1 0 0 1 1 0 1 Image 0 0 1 1 0 1 1 Text Digital

CTP431- Music and Audio Computing Digital Audio Effects Graduate School of Culture Technology

CTP431- Music and Audio Computing Fundamentals of Sound and Digital Audio Graduate School of

CTP431- Music and Audio Computing Audio Signal Processing (Part #1) Graduate School of Culture

CTP431- Music and Audio Computing Audio Signal Processing (Part #2) Graduate School of Culture

CTP431- Music and Audio Computing Music Information Retrieval Graduate School of Culture

Digital Audio Graduate School of Culture Technology, KAIST Juhan Nam Outlines Introduction

CTP431- Music and Audio Computing, Fall 2017 Introduction Graduate School of Culture Technology,

CTP431- Music and Audio Computing Musical Interface and Sequencer Graduate School of Culture

Digital Audio Effects Graduate School of Culture Technology, KAIST Juhan Nam Introduction

CTP431- Music and Audio Computing Spectral Analysis Graduate School of Culture Technology KAIST

CTP431- Music and Audio Computing Sound Synthesis Graduate School of Culture Technology KAIST

CTP431- Music and Audio Computing Musical Interface Graduate School of Culture Technology KAIST

CTP431- Music and Audio Computing Sound Synthesis Graduate School of Culture Technology KAIST

CTP431- Music and Audio Computing Spectral Analysis Graduate School of Culture Technology KAIST

Digital Audio Graduate School of Culture Technology (GSCT) Juhan Nam 1 Outlines

8. Audio databases About digital audio: Advent of digital audio CD in 1983. Order of

Sound File Formats Raw data has samples (interleaved w/stereo) Need way to parse raw

Chapter 8 Digital Media Computer Concepts 2013 8 Section A: Digital Sound Digital Audio

CS378 - Mobile Computing Audio Android Audio Use the MediaPlayer class Common Audio

Web Audio Tutorial 9/4, 2015 Your Goal Learn what digital audio is and how it works

Digital Audio 3 4 Sampling 44.1 kHz 48 kHz 96 kHz 192 kHz digitisation =

Welcome! There is no hold music when your audio connects, so thank you for your patience. The

The OMRAS2 project Bringing together semantic audio, music informatics and computational

Towards usability of Higher Order Ambisonics in Digital Audio Workstations Matthias Kronlachner