
CTP431: Music and Audio Computing - Digital Audio



  1. CTP431: Music and Audio Computing. Digital Audio. Juhan Nam, Graduate School of Culture Technology, KAIST

  2. Digital Representations [Figure: sound, image, and text each represented as a bit sequence, e.g. … 0 1 1 0 1 1 0 …]

  3. Digital Representations § Sampling and Quantization – Sound (samples) – Image (pixels) § Trade-off – Resolution (quality) versus data size

  4. Digital Audio Chain [Figure: digital audio chain, with the signal carried as a bit stream … 0 0 1 0 1 0 …]

  5. Sampling • Convert a continuous-time signal to a discrete-time signal by periodically picking up its instantaneous values – Represented as a sequence of numbers: pulse code modulation (PCM) – Sampling period ($T_s$): the amount of time between samples – Sampling rate: $f_s = 1/T_s$ – Signal notation: $x(t) \to x(nT_s)$
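As a minimal sketch of the notation $x(t) \to x(nT_s)$, the following Python samples a sine wave at a chosen rate. The 1 kHz tone, the 8 kHz rate, and the function name `sample_sine` are illustrative assumptions, not values from the slides.

```python
import math

def sample_sine(freq_hz, fs_hz, num_samples):
    """Return x(n*Ts) for x(t) = sin(2*pi*freq_hz*t), with Ts = 1/fs_hz."""
    ts = 1.0 / fs_hz  # sampling period Ts
    return [math.sin(2.0 * math.pi * freq_hz * n * ts) for n in range(num_samples)]

# Assumed example values: a 1 kHz tone sampled at 8 kHz gives 8 samples per period.
samples = sample_sine(1000.0, 8000.0, 8)
print([round(s, 3) for s in samples])
```

The returned list is the PCM-style "sequence of numbers" before any quantization is applied.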

  6. Sampling Theorem § What is an appropriate sampling rate? – Too high: increases the data rate – Too low: the original signal becomes hard to reconstruct § Sampling Theorem – For a band-limited signal to be fully reconstructed, the sampling rate must be greater than twice the maximum frequency in the signal: $f_s > 2 f_m$ – Half the sampling rate is called the Nyquist frequency ($f_s/2$)
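The trade-off on this slide can be sketched in a few lines of Python. The helper names are hypothetical, and the 20 kHz maximum frequency, 16-bit depth, and 2 channels used in the example are assumptions chosen for illustration.

```python
def satisfies_sampling_theorem(fs_hz, fm_hz):
    """True if fs > 2 * fm (strict inequality, as stated in the theorem)."""
    return fs_hz > 2.0 * fm_hz

def nyquist_frequency(fs_hz):
    """Half the sampling rate."""
    return fs_hz / 2.0

def pcm_data_rate_bps(fs_hz, bits_per_sample, channels):
    """Uncompressed PCM data rate in bits per second."""
    return fs_hz * bits_per_sample * channels

# Assumed example: a 20 kHz band limit, 16-bit samples, 2 channels.
for fs in (40000, 44100, 96000):
    print(fs,
          satisfies_sampling_theorem(fs, 20000),
          nyquist_frequency(fs),
          pcm_data_rate_bps(fs, 16, 2))
```

A higher rate satisfies the theorem with more margin, and the data rate grows in proportion.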

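  7. Sampling in Frequency Domain § Sampling in time creates images (copies) of the original spectrum at every multiple of $f_s$ [Figure: spectrum of the sampled signal with the original band from $-f_m$ to $f_m$ and its images around $\pm f_s$; labels: audible range, $f_s - f_m$, $f_s + f_m$, Nyquist frequency] § Why? For $x(t) = \sin(2\pi f_m t)$, the samples are $x(n) = \sin(2\pi f_m n T_s) = \sin(2\pi f_m n / f_s) = \sin(2\pi f_m n / f_s \pm 2\pi k n) = \sin(2\pi n (f_m \pm k f_s) / f_s)$ for any integer $k$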
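The identity behind "Why?" can be checked numerically: a sine at $f_m$ and a sine at $f_m + k f_s$ yield the same samples. This is only a sketch; the values $f_m$ = 1 kHz and $f_s$ = 8 kHz and the helper names are assumptions.

```python
import math

def sampled_sine(freq_hz, fs_hz, n):
    """Sample number n of sin(2*pi*freq_hz*t) taken at rate fs_hz."""
    return math.sin(2.0 * math.pi * freq_hz * n / fs_hz)

def max_image_difference(fm_hz, fs_hz, k, num_samples=64):
    """Largest sample-wise difference between the tone at fm and its image at fm + k*fs."""
    image_hz = fm_hz + k * fs_hz
    return max(
        abs(sampled_sine(fm_hz, fs_hz, n) - sampled_sine(image_hz, fs_hz, n))
        for n in range(num_samples)
    )

# Assumed example values: fm = 1 kHz, fs = 8 kHz.
for k in (-2, -1, 1, 2):
    print(k, max_image_difference(1000.0, 8000.0, k))
```

The differences are at the level of floating-point rounding, so the sampled sequences cannot be told apart.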

  8. Aliasing § If the sampling rate is less than twice the maximum frequency, the high-frequency content is folded over to a lower frequency range [Figure: plot illustrating aliasing]

  9. Aliasing in Frequency Domain § Sampling in time creates images (copies) of the original spectrum at every multiple of $f_s$ [Figure: spectrum with the original band from $-f_m$ to $f_m$ and its images around $\pm f_s$; labels: audible range, $f_s - f_m$, $f_s + f_m$] § The frequency that we hear is $f_s - f_m$ § To avoid aliasing: $f_m < f_s - f_m$
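A minimal sketch of the fold-over, assuming a single sine tone: the hypothetical helper `alias_frequency` folds any frequency into the range from 0 to $f_s/2$, and the check below confirms that a 5 kHz tone sampled at 8 kHz (assumed values) produces the same samples as a 3 kHz tone, apart from a sign flip.

```python
import math

def alias_frequency(freq_hz, fs_hz):
    """Fold freq_hz into [0, fs/2]; for fs/2 < f < fs this is fs - f."""
    folded = freq_hz % fs_hz
    return fs_hz - folded if folded > fs_hz / 2.0 else folded

def sampled_sine(freq_hz, fs_hz, n):
    """Sample number n of sin(2*pi*freq_hz*t) taken at rate fs_hz."""
    return math.sin(2.0 * math.pi * freq_hz * n / fs_hz)

fs = 8000.0   # assumed sampling rate
fm = 5000.0   # assumed tone above the Nyquist frequency (4 kHz)
heard = alias_frequency(fm, fs)
print(heard)

# sin(2*pi*fm*n/fs) equals -sin(2*pi*(fs - fm)*n/fs) for every integer n.
worst = max(abs(sampled_sine(fm, fs, n) + sampled_sine(heard, fs, n)) for n in range(64))
print(worst)
```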

  10. Aliasing in Frequency Domain § For general signals, high-frequency content is folded over to a lower frequency range [Figure: spectrum of a general signal and its images around $\pm f_s$; labels: audible range, $f_m$, $f_s - f_m$, $f_s + f_m$]

  11. Avoid Aliasing § Increase the sampling rate: $f_s > 2 f_m$ § Use a lowpass filter before sampling [Figure: spectrum with images around $\pm f_s$ and a lowpass filter spanning $-f_s/2$ to $f_s/2$]

  12. Example of Aliasing [Figures: trivial sawtooth wave spectrum and bandlimited sawtooth wave spectrum (magnitude in dB versus frequency in kHz); frequency sweep of the trivial sawtooth wave (frequency in Hz versus time in s)]
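The two waveforms named in the figure can be sketched as follows. The slides do not say how the bandlimited sawtooth was generated; additive synthesis (summing only the harmonics below the Nyquist frequency) is assumed here as one common approach, and the function names and the 100 Hz / 40 kHz example values are illustrative.

```python
import math

def trivial_sawtooth(f0_hz, fs_hz, num_samples):
    """Naive sawtooth: a phase ramp mapped to [-1, 1). Its harmonics extend
    beyond the Nyquist frequency, so sampling it directly causes aliasing."""
    return [2.0 * ((f0_hz * n / fs_hz) % 1.0) - 1.0 for n in range(num_samples)]

def num_harmonics_below_nyquist(f0_hz, fs_hz):
    """Count harmonics k*f0 that lie strictly below fs/2."""
    k = int((fs_hz / 2.0) // f0_hz)
    if k * f0_hz >= fs_hz / 2.0:
        k -= 1
    return k

def bandlimited_sawtooth(f0_hz, fs_hz, num_samples):
    """Additive synthesis (assumed method): Fourier series of the same ramp,
    truncated so that no harmonic reaches the Nyquist frequency."""
    num_harmonics = num_harmonics_below_nyquist(f0_hz, fs_hz)
    out = []
    for n in range(num_samples):
        phase = 2.0 * math.pi * f0_hz * n / fs_hz
        total = sum(math.sin(k * phase) / k for k in range(1, num_harmonics + 1))
        out.append(-2.0 / math.pi * total)
    return out

# Assumed example values: 100 Hz fundamental at a 40 kHz sampling rate.
naive = trivial_sawtooth(100.0, 40000.0, 200)
clean = bandlimited_sawtooth(100.0, 40000.0, 200)
print(naive[100], clean[100])
```

Away from the discontinuity the two waveforms nearly coincide; they differ mainly near the jump, where the trivial version carries the high harmonics that alias.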

  13. Example of Aliasing § Aliasing in video: https://www.youtube.com/watch?v=jHS9JGkEOmA

  14. Sampling Rates § Determined by the bandwidth of signals or hearing limits – Consumer audio products: 44.1 kHz (CD) – Professional audio gear: 48/96/192 kHz – Speech communication: 8/16 kHz

  15. Quantization § Discretizing the amplitude of real-valued signals – Round the amplitude to the nearest discrete step – The discrete steps are determined by the number of bits • Audio CD: 16 bits ($-2^{15}$ to $2^{15}-1$); in general, $B$ bits ($-2^{B-1}$ to $2^{B-1}-1$)
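A minimal sketch of rounding to the nearest step, assuming the input amplitude is normalized to the range -1 to 1 and mapped onto the integer codes $-2^{B-1}$ to $2^{B-1}-1$. The scaling convention and the function name `quantize` are assumptions; the slides only give the code range.

```python
import math

def quantize(x, bits):
    """Map an amplitude in [-1, 1] to the nearest integer code for `bits` bits.
    Codes are limited to -2**(bits-1) .. 2**(bits-1) - 1."""
    scale = 2 ** (bits - 1)
    code = math.floor(x * scale + 0.5)        # round to the nearest step
    return max(-scale, min(scale - 1, code))  # keep the code inside the valid range

# 16-bit examples (audio CD bit depth).
for x in (-1.0, -0.25, 0.0, 0.3, 0.5, 1.0):
    print(x, quantize(x, 16))
```

Note that +1.0 cannot be represented exactly: it is limited to the largest code, $2^{15}-1$.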

  16. Quantization Error § Quantization causes noise – Average power of quantization noise: obtained from the probability density function (PDF) of the error, $p(e)$, which is uniform ($p(e) = 1$) over $-1/2 \le e \le 1/2$ in units of one quantization step – Root mean square (RMS) of noise: $N_{rms} = \sqrt{\int_{-1/2}^{1/2} e^2\, p(e)\, de} = \sqrt{1/12}$ § Signal to Noise Ratio (SNR) – Based on average power, with the RMS of a full-scale sine wave as the signal: $20\log_{10}(S_{rms}/N_{rms}) = 20\log_{10}\frac{2^{B-1}/\sqrt{2}}{\sqrt{1/12}} = 6.02B + 1.76$ dB (with 16 bits, SNR = 98.08 dB) – Based on the max levels: $20\log_{10}(S_{max}/N_{max}) = 20\log_{10}\frac{2^{B-1}}{1/2} = 6.02B$ dB (with 16 bits, SNR = 96.32 dB)
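The figures on this slide can be reproduced with a short sketch, assuming the error is uniformly distributed over one quantization step. The function names are hypothetical, and the numerical integration is only a sanity check of the $\sqrt{1/12}$ result.

```python
import math

def noise_rms_numeric(num_steps=10000):
    """Midpoint-rule estimate of sqrt(integral of e^2 * p(e) de) with p(e) = 1 on [-1/2, 1/2]."""
    de = 1.0 / num_steps
    power = sum(((-0.5 + (i + 0.5) * de) ** 2) * de for i in range(num_steps))
    return math.sqrt(power)

def snr_rms_db(bits):
    """SNR based on average power: full-scale sine RMS over noise RMS."""
    signal_rms = 2 ** (bits - 1) / math.sqrt(2.0)
    noise_rms = math.sqrt(1.0 / 12.0)
    return 20.0 * math.log10(signal_rms / noise_rms)

def snr_max_db(bits):
    """SNR based on max levels: peak signal 2**(bits-1) over peak error 1/2."""
    return 20.0 * math.log10(2 ** (bits - 1) / 0.5)

print(noise_rms_numeric(), math.sqrt(1.0 / 12.0))
print(round(snr_rms_db(16), 2), round(snr_max_db(16), 2))
```

The unrounded constants are about 6.0206 and 1.7609, so the computed values can differ from the slide's rounded figures in the second decimal place.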

  17. Dynamic Range § Dynamic range – The ratio between the loudest and softest levels, using the RMS of a sine wave for both: $20\log_{10}(S_{rms,max}/S_{rms,min}) = 20\log_{10}\frac{2^{B-1}/\sqrt{2}}{1/\sqrt{2}} \approx 6.02B - 6$ dB (with 16 bits, DR = 90.31 dB) § Human ear's dynamic range – Depends on the frequency band [Figure: equal-loudness curve]
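A minimal sketch of the dynamic range formula, assuming the loudest sine has amplitude $2^{B-1}$ and the softest has amplitude 1 (one step). The function name is hypothetical.

```python
import math

def dynamic_range_db(bits):
    """Ratio of the RMS of the loudest sine (amplitude 2**(bits-1))
    to the RMS of the softest sine (amplitude 1), in dB."""
    loudest_rms = 2 ** (bits - 1) / math.sqrt(2.0)
    softest_rms = 1.0 / math.sqrt(2.0)
    return 20.0 * math.log10(loudest_rms / softest_rms)

for bits in (8, 16, 24):
    print(bits, round(dynamic_range_db(bits), 2))
```

The $\sqrt{2}$ factors cancel, so the result is $20\log_{10} 2^{B-1}$, roughly 6 dB per bit beyond the first.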

  18. Clipping and Headroom § Clipping – Non-linear distortion that occurs when a signal exceeds the max level § Headroom – Margin between the peak level and the max level § In digital audio, 0 dB is regarded as the maximum level [Figure: level diagram for $B = 16$ bits: clipping above the max level (0 dB), headroom below it, min level at -90.31 dB, and the quantization noise floor at -98.08 dB]
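Both terms can be illustrated with a short sketch, assuming amplitudes are normalized so that the max level is 1.0 (0 dB). Hard clipping is used here as the simplest model of what happens above the max level; the function names are hypothetical.

```python
import math

def hard_clip(x, max_level=1.0):
    """Limit a sample to the range [-max_level, max_level]."""
    return max(-max_level, min(max_level, x))

def level_db(amplitude, max_level=1.0):
    """Level relative to the max level, in dB (0 dB = max level). Amplitude must be nonzero."""
    return 20.0 * math.log10(abs(amplitude) / max_level)

def headroom_db(peak_amplitude, max_level=1.0):
    """Margin between the peak level and the max level, in dB."""
    return -level_db(peak_amplitude, max_level)

# Assumed example: a signal peaking at half of full scale has about 6 dB of headroom.
print(round(headroom_db(0.5), 2))
# Samples above the max level are flattened, which distorts the waveform.
print([hard_clip(x) for x in (0.3, 0.9, 1.5, -2.0)])
```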
