CTP431- Music and Audio Computing Digital Audio Graduate School of Culture Technology KAIST Juhan Nam 1
Digital Representations … 0 1 1 0 1 1 0 … Sound … 1 0 0 1 1 0 1 … Image … 0 0 1 1 0 1 1 … Text
Digital Representations § Sampling and Quantization – Sound (samples) – Image (pixels) § Trade-off – Resolution (quality) and data size
Digital Audio Chain …0 0 1 0 1 0 … 4
Sampling • Convert continuous-time signal to discrete-time signal by periodically picking up the instantaneous values – Represented as a sequence of numbers; pulse code modulation (PCM) – Sampling period ( T s ): the amount of time between samples – Sampling rate ( f s = 1/ T s ) Signal notation T s x ( t ) → x ( nT s ) 5
Sampling Theorem § What is an appropriate sampling rate? – Too high: increase data rate – Too low: become hard to reconstruct the original signal § Sampling Theorem – In order for a band-limited signal to be reconstructed fully, the sampling rate must be greater than twice the maximum frequency in the signal f s > 2 ⋅ f m f s – Half the sampling rate is called Nyquist frequency ( ) 2 6
Sampling in Frequency Domain § Sampling in time creates imaginary content of the original at every f s frequency Audible range Audible range -f m f m f m f s -f m f s -f s -f s +f m -f m f s +f m Nyquist Frequency § Why? 𝑦 𝑜 = sin 2𝜌𝑔 * 𝑜𝑈 - = sin 2𝜌𝑔 * 𝑜/𝑔 - 𝑦 𝑢 = sin 2𝜌𝑔 * 𝑢 = sin 2𝜌𝑔 * 𝑜/𝑔 - ± 2𝜌𝑙𝑜 = sin 2𝜌𝑜(𝑔 * ± 𝑙𝑔 - )/𝑔 - 7
Aliasing § If the sampling rate is less than twice the maximum frequency, the high- frequency content is folded over to lower frequency range 1 0.8 0.6 0.4 0.2 0 − 0.2 − 0.4 − 0.6 − 0.8 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 4 x 10 8
Aliasing in Frequency Domain § Sampling in time creates imaginary content of the original at every f s frequency Audible range Audible range -f m f m -f s -f s +f m f m f s -f m f s -f m f s +f m § The frequency that we hear is 𝑔 - − 𝑔 * In order to avoid aliasing f m < f s − f m 9
Aliasing in Frequency Domain § For general signals, high-frequency content is folded over to lower frequency range Audible range f s -f s -f m f s -f m f m f s +f m 10
Avoid Aliasing § Increase sampling rate f s > 2 ⋅ f m § Use lowpass filters before sampling -f s -f m f s -f m f s f m f s +f m Lowpass Filter f s -f s -f s /2 f s /2 11
Example of Aliasing 0 0 Magnitude (dB) Magnitude (dB) − 20 − 20 − 40 − 40 − 60 − 60 5 10 15 20 5 10 15 20 Frequency (kHz) Frequency (kHz) Trivial sawtooth wave spectrum Bandlimited sawtooth wave spectrum 4 x 10 2 1.5 Frequency (Hz) 1 0.5 Frequency sweep of the trivial sawtooth wave 0 12 1 1.5 2 2.5 3 3.5 4 4.5 Time (s)
Example of Aliasing Aliasing in Video https://www.youtube.com/watch?v=jHS9JGkEOmA 13
Sampling Rates § Determined by the bandwidth of signals or hearing limits – Consumer audio product: 44.1 kHz (CD) – Professional audio gears: 48/96/192 kHz – Speech communication: 8/16 kHz 14
Quantization § Discretizing the amplitude of real-valued signals – Round the amplitude to the nearest discrete steps – The discrete steps are determined by the number of bit bits • Audio CD: 16 bits (-2 15 ~ 2 15 -1) ß B bits (-2 B-1 ~ 2 B-1 -1) 15
Quantization Error § Quantization causes noise – Average power of quantization noise: obtained from the probability density function (PDF) of the error P ( e ) Root mean square (RMS) of noise 1 1/2 112 x 2 p ( e ) dx ∫ = − 1/2 -1/2 1/2 § Signal to Noise Ratio (SNR) RMS of full-scale sine wave – Based on average power 2 B − 1 / S rms 2 (With 16bits, SNR = 98.08dB) 20log 10 = 20log 10 = 6.02 B + 1.76 dB N rms 112 – Based on the max levels 2 B − 1 S max = 6.02 B dB (With 16bits, SNR = 96.32 dB) 20log 10 = 20log 10 12 N max 16
Dynamic Range § Dynamic range Again, RMS of full-scale sine wave – The ratio between the loudest and softest levels for both loudest and softest 2 B − 1 / S rms,max 2 (With 16bits, DR = 90.31 dB) 20log 10 = 20log 10 = 6.02 B − 6 S rms,min 1/ 2 § Human ear’s dynamic range – Depending on frequency band 17 Equal Loudness Curve
Clipping and Headroom § Clipping – Non-linear distortion that occurs when a signal is above the max level § Headroom – Margin between the peak level and the max level In digital audio, 0dB is regarded as the maximum level Clipping 0 dB Max level Head room B = 16 bits -90.31 dB Min level -98.08 dB Noise floor (By quantization) 18
Recommend
More recommend