
CTP431: Music and Audio Computing, Fundamentals of Sound and Digital Audio

Graduate School of Culture Technology, KAIST. Juhan Nam.


  1. CTP431: Music and Audio Computing
     Fundamentals of Sound and Digital Audio
     Graduate School of Culture Technology, KAIST
     Juhan Nam

  2. Outline
     § What is Sound?
     § Sound Properties
       – Loudness
       – Pitch
       – Timbre
     § Digital Representation of Sound
       – Sampling
       – Quantization

  3. What Is Sound?
     § Vibration of air that you can hear
       – Compression and rarefaction of air pressure
     § Production (physical): vibration on materials (e.g. string, pipe, membrane)
     § Propagation (physical): traveling via the air
     § Perception (psychological): sensation of the air vibration through ears

  4. Physical Sound
     § Governed by "Newton's law" and "Wave" properties
     § Sound production and propagation in musical instruments
       1. Drive force on a sound object
       2. Vibration by restoration force
       3. Propagation
       4. Reflection
       5. Superposition
       6. Standing Wave (modes): generate a tone
       7. Radiation from the object
       8. Propagation through air
     § Demos
       – http://www.acs.psu.edu/drussell/demos.html
       – https://www.youtube.com/watch?v=_X72on6CSL0

  5. Psychological Sound
     § Governed by ears (physiological sense) and brain (cognitive sense): the human auditory system
     § Ears
       – A series of highly sensitive transducers (figure labels: Air, Mechanical, Fluid, Electric; Cook, 1999)
       – Transform sound into subband signals
     § Brain
       – Segregate and organize the auditory stimulus
       – Recognize loudness, pitch and timbre
     § Auditory Transduction Video: http://www.youtube.com/watch?v=PeTriGTENoc

  6. Sound Properties
     § Physical → Psychological
       – Amplitude → Loudness
       – Frequency → Pitch
       – Waveshape, Time Envelope (ADSR), Spectral Envelope (Modes), … → Timbre

  7. Loudness
     § Perceptual correlate of sound intensity
     § Sound Pressure Level (SPL)
       – Objective measure of sound intensity
       – Log scale: $20\log_{10}(P/P_0)$ dB, where $P_0 = 20\,\mu\text{Pa}$ is the threshold of human hearing
       – Loudness increases with SPL, but the two are not exactly proportional
     § Equal-Loudness Curve (also called Fletcher-Munson Curve)
       – Most sensitive to 2-5 kHz tones
       – Threshold of hearing
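
The SPL formula on this slide can be tried out with a minimal Python sketch. The function name `spl_db` and the sample pressures are illustrative assumptions, not part of the slides; the reference pressure of 20 µPa is the one given above.

```python
import math

P0 = 20e-6  # reference pressure in pascals (20 µPa, threshold of hearing)


def spl_db(pressure_pa: float) -> float:
    """Sound pressure level in dB for a sound pressure given in pascals."""
    return 20.0 * math.log10(pressure_pa / P0)


if __name__ == "__main__":
    # Hypothetical example pressures: the reference itself, 10x, and 1 Pa.
    for p in (20e-6, 200e-6, 1.0):
        print(f"{p:g} Pa -> {spl_db(p):.1f} dB SPL")
```

Each tenfold increase in pressure adds 20 dB, which is why the log scale compresses the very wide range of audible pressures.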

  8. Pitch
     § Perceptual correlate of fundamental frequency (F0)
     § Pitch Scale
       – Human ears are sensitive to frequency changes in a log scale
         • Ex) Piano note scale
     § Frequency Range of Hearing
       – 20 Hz to 20 kHz
     [Figure: Chromatic Scale of Piano notes (Linear Frequency); frequency (Hz) vs. time (seconds)]
     [Figure: Chromatic Scale of Piano notes (Log Frequency); MIDI note number vs. time (seconds)]
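
The relation between the linear-frequency and MIDI-note-number views of the piano scale can be sketched as follows. This assumes 12-tone equal temperament with A4 = MIDI note 69 = 440 Hz, a common convention that the slides do not state; the function names are illustrative.

```python
import math

A4_MIDI = 69  # assumed convention: MIDI note 69 is A4


def midi_to_hz(note: float, a4_hz: float = 440.0) -> float:
    """Frequency of a MIDI note number in equal temperament."""
    return a4_hz * 2.0 ** ((note - A4_MIDI) / 12.0)


def hz_to_midi(freq_hz: float, a4_hz: float = 440.0) -> float:
    """Possibly fractional MIDI note number for a frequency in Hz."""
    return A4_MIDI + 12.0 * math.log2(freq_hz / a4_hz)


if __name__ == "__main__":
    # The 88 piano keys span MIDI notes 21 (A0) to 108 (C8).
    for note in (21, 60, 69, 108):
        print(note, round(midi_to_hz(note), 2))
```

Equal steps in note number correspond to equal frequency ratios, so the chromatic scale is a straight line on the log-frequency plot and an exponential curve on the linear one.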

  9. Timbre
     § Related to identifying a particular sound object
       – Musical instruments, human voices, …
     § Determined by multiple physical attributes
       – Time envelope (ADSR)
       – Spectral envelope
       – Changes of spectral envelope and fundamental frequency
       – Harmonicity: ratio between tonal and noise-like characteristics
       – The onset of a sound differing notably from the sustained vibration
     [Figures: ADSR; Changes of spectral envelope]
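
As an illustration of the time envelope (ADSR: attack, decay, sustain, release), here is a minimal sketch of a piecewise-linear envelope. The linear segments, the parameter names, and the assumption that the note is released after the decay stage has finished are all choices made for this example, not taken from the slides.

```python
def adsr(t: float, attack: float, decay: float, sustain_level: float,
         note_off: float, release: float) -> float:
    """Piecewise-linear ADSR gain (0..1) at time t in seconds.

    Assumes note_off >= attack + decay, i.e. the key is released
    only after the decay stage has completed.
    """
    if t < 0.0:
        return 0.0
    if t < attack:                      # attack: ramp 0 -> 1
        return t / attack
    if t < attack + decay:              # decay: ramp 1 -> sustain_level
        return 1.0 - (1.0 - sustain_level) * (t - attack) / decay
    if t < note_off:                    # sustain: hold level
        return sustain_level
    if t < note_off + release:          # release: ramp sustain_level -> 0
        return sustain_level * (1.0 - (t - note_off) / release)
    return 0.0


if __name__ == "__main__":
    params = dict(attack=0.01, decay=0.1, sustain_level=0.5,
                  note_off=0.5, release=0.2)
    for t in (0.0, 0.005, 0.06, 0.3, 0.6, 1.0):
        print(t, round(adsr(t, **params), 3))
```

Multiplying a waveform by such a gain curve changes the perceived timbre even when the spectrum of the underlying waveform stays the same.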

  10. Timbre
     § Determined by multiple parameters
       – Perspective of sound synthesis
     Source: http://www.matrixsynth.com/2011/05/kid-with-buchla.html

  11. Digital Audio Chain
     [Figure: digital audio chain, with the bit stream … 0 0 1 0 1 0 …]

  12. Microphones / Speakers
     § Microphones
       – Air vibration to electrical signal
       – Dynamic / condenser microphones
       – The signal is very weak: use of pre-amp
     § Speakers
       – Electrical signal to air vibration
       – Generate some distortion (by diaphragm)
       – Crossover networks: woofer / tweeter

  13. Sampling
     • Convert continuous-time signal to discrete-time signal by periodically picking up the instantaneous values
       – Represented as a sequence of numbers; pulse code modulation (PCM)
       – Sampling period ($T_s$): the amount of time between samples
       – Sampling rate ($f_s = 1/T_s$)
     • Signal notation: $x(t) \rightarrow x(nT_s)$
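
The notation $x(t) \rightarrow x(nT_s)$ can be made concrete with a short sketch that evaluates a continuous-time function at multiples of the sampling period. The helper `sample`, the 1 kHz test tone, and the 8 kHz rate are assumptions chosen for illustration.

```python
import math
from typing import Callable, List


def sample(x: Callable[[float], float], fs: float, num_samples: int) -> List[float]:
    """Return x(n * Ts) for n = 0 .. num_samples - 1, with Ts = 1 / fs."""
    ts = 1.0 / fs  # sampling period in seconds
    return [x(n * ts) for n in range(num_samples)]


def sine_1khz(t: float) -> float:
    """Continuous-time 1 kHz sine, used here as a hypothetical input signal."""
    return math.sin(2.0 * math.pi * 1000.0 * t)


# One full period of the 1 kHz tone takes exactly 8 samples at fs = 8 kHz.
samples = sample(sine_1khz, fs=8000.0, num_samples=8)

if __name__ == "__main__":
    print([round(s, 4) for s in samples])
```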

  14. Sampling Theorem
     § What is an appropriate sampling rate?
       – Too high: increases the data rate
       – Too low: becomes hard to reconstruct the original signal
     § Sampling Theorem
       – In order for a band-limited signal to be reconstructed fully, the sampling rate must be greater than twice the maximum frequency in the signal: $f_s > 2 \cdot f_m$
       – Half the sampling rate is called the Nyquist frequency ($f_s / 2$)

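  15. Sampling in Frequency Domain
     § Sampling in time creates images (replicas) of the original spectrum at every multiple of $f_s$
       [Figure: original spectrum from $-f_m$ to $f_m$; sampled spectrum with images around $\pm f_s$, e.g. from $f_s - f_m$ to $f_s + f_m$]
       – To avoid overlap: $f_m < f_s - f_m$
     § Why? Take $f_2 = f_1 \pm m f_s$ with integer $m$, and sample at $t = n / f_s$:
       $x_1(t) = A \sin(\omega_1 t) = A \sin(2\pi f_1 n / f_s)$
       $x_2(t) = A \sin(\omega_2 t) = A \sin(2\pi f_2 n / f_s) = A \sin(2\pi (f_1 \pm m f_s) n / f_s) = A \sin(2\pi f_1 n / f_s \pm 2\pi m n) = A \sin(2\pi f_1 n / f_s) = x_1(t)$
       – The term $\pm 2\pi m n$ drops out because $mn$ is an integer, so the two sinusoids produce identical samples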
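
The identity derived on this slide can be checked numerically. The sketch below samples two sinusoids whose frequencies differ by a multiple of the sampling rate and compares the results; the specific values (8 kHz rate, 1 kHz tone, m = 2) are illustrative assumptions.

```python
import math
from typing import List


def sampled_sine(freq_hz: float, fs: float, num_samples: int,
                 amplitude: float = 1.0) -> List[float]:
    """Samples of A * sin(2*pi*f*n/fs) for n = 0 .. num_samples - 1."""
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * n / fs)
            for n in range(num_samples)]


fs = 8000.0          # assumed sampling rate
f1 = 1000.0          # assumed base frequency
m = 2                # any integer works
f2 = f1 + m * fs     # 17 kHz, far above the Nyquist frequency

x1 = sampled_sine(f1, fs, 16)
x2 = sampled_sine(f2, fs, 16)

# The two sequences agree up to floating-point rounding.
max_diff = max(abs(a - b) for a, b in zip(x1, x2))

if __name__ == "__main__":
    print(f"largest sample difference: {max_diff:.2e}")
```

Once sampled, the 17 kHz tone cannot be told apart from the 1 kHz tone, which is why the input must be band-limited before sampling.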

  16. Aliasing
     § If the sampling rate is less than twice the maximum frequency, the high-frequency content is folded over to the lower frequency range
     [Figure: waveform plot illustrating aliasing]

  17. Aliasing in Frequency Domain
     § The high-frequency content is folded over to the lower frequency range from the replicated images
       [Figure: overlapping spectral images around $\pm f_s$, with band edges at $f_s - f_m$ and $f_s + f_m$]
     § A low-pass filter is applied before sampling to avoid the aliasing noise
       [Figure: spectrum limited to the range $-f_s/2$ to $f_s/2$, with images at $\pm f_s$]
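
The fold-over described here can be sketched as a small function that maps any input frequency of a real sinusoid to the frequency it appears at after sampling. The function name and the example frequencies are assumptions for illustration; the sketch does not model the low-pass filter itself.

```python
def alias_frequency(freq_hz: float, fs: float) -> float:
    """Apparent frequency in [0, fs/2] of a real sinusoid sampled at fs."""
    folded = freq_hz % fs            # images repeat every fs
    if folded > fs / 2.0:            # upper half mirrors back below Nyquist
        return fs - folded
    return folded


if __name__ == "__main__":
    fs = 44100.0
    for f in (1000.0, 25000.0, 30000.0, 44600.0):
        print(f"{f:8.0f} Hz -> {alias_frequency(f, fs):8.0f} Hz")
```

Frequencies below $f_s/2$ pass through unchanged, while anything above is reflected into the audible band, so it has to be removed before sampling rather than after.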

  18. Example of Aliasing
     [Figure: Bandlimited sawtooth wave spectrum; magnitude (dB) vs. frequency (kHz)]
     [Figure: Trivial sawtooth wave spectrum; magnitude (dB) vs. frequency (kHz)]
     [Figure: Frequency sweep of the trivial sawtooth wave; frequency (Hz) vs. time (s)]
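
The difference between the two sawtooth spectra can be reproduced with a minimal sketch. It assumes that "trivial" means a wrapped phase ramp and that the band-limited version is built by additive synthesis, summing only the harmonics below the Nyquist frequency; the slides do not say how their plots were generated, and the function names are hypothetical.

```python
import math


def trivial_saw(n: int, f0: float, fs: float) -> float:
    """Naive sawtooth: a phase ramp wrapped to [-1, 1); not band-limited."""
    phase = (f0 * n / fs) % 1.0
    return 2.0 * phase - 1.0


def num_harmonics(f0: float, fs: float) -> int:
    """Number of harmonics k with k * f0 strictly below the Nyquist frequency."""
    k = 0
    while (k + 1) * f0 < fs / 2.0:
        k += 1
    return k


def bandlimited_saw(n: int, f0: float, fs: float) -> float:
    """Fourier series of the same ramp, truncated below fs / 2."""
    total = 0.0
    for k in range(1, num_harmonics(f0, fs) + 1):
        total += math.sin(2.0 * math.pi * k * f0 * n / fs) / k
    return -(2.0 / math.pi) * total


if __name__ == "__main__":
    f0, fs = 1000.0, 48000.0
    print("harmonics kept:", num_harmonics(f0, fs))
    for n in (6, 12, 24, 36):
        print(n, round(trivial_saw(n, f0, fs), 4), round(bandlimited_saw(n, f0, fs), 4))
```

The trivial ramp contains harmonics at every multiple of `f0` without limit, so those above `fs / 2` fold back as inharmonic components; the additive version simply never generates them.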

  19. Example of Aliasing
     § Aliasing in Video
       – https://www.youtube.com/watch?v=QOqtdl2sJk0
       – https://www.youtube.com/watch?v=jHS9JGkEOmA
       (Note that the video frame rate corresponds to the sampling rate)

  20. Sampling Rates
     § Determined by the bandwidth of signals or hearing limits
       – Consumer audio products: 44.1 kHz (CD)
       – Professional audio gear: 48/96/192 kHz
       – Speech communication: 8/16 kHz

  21. Quantization
     § Discretizing the amplitude of real-valued signals
       – Round the amplitude to the nearest discrete step
       – The discrete steps are determined by the number of bits
         • Audio CD: 16 bits ($-2^{15}$ to $2^{15}-1$) ← $B$ bits ($-2^{B-1}$ to $2^{B-1}-1$)
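
The rounding step can be sketched as below. The scaling of a floating-point amplitude in [-1, 1] by $2^{B-1}$ and the clipping at the range limits are assumptions made for this example; the slides only specify the integer range.

```python
def quantize(x: float, bits: int = 16) -> int:
    """Round an amplitude in [-1, 1] to a signed integer code of the given width."""
    scale = 1 << (bits - 1)           # 2^(B-1) steps per unit amplitude
    lowest = -scale                   # -2^(B-1)
    highest = scale - 1               # 2^(B-1) - 1
    code = round(x * scale)           # Python's round() sends exact ties to even
    return max(lowest, min(highest, code))


if __name__ == "__main__":
    for x in (-1.0, -0.3, 0.0, 0.3, 1.0):
        print(x, quantize(x, bits=16), quantize(x, bits=8))
```

The positive end clips at $2^{B-1}-1$ because two's-complement codes have one more negative value than positive ones.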

  22. Quantization Error
     § Quantization causes noise
       – Average power of quantization noise: obtained from the probability density function (PDF) of the error, $p(e)$, which is uniform over $[-1/2, 1/2]$
       – Root mean square (RMS) of noise: $N_{rms} = \sqrt{\int_{-1/2}^{1/2} e^2 \, p(e) \, de} = \sqrt{1/12}$
     § Signal to Noise Ratio (SNR)
       – Based on average power, with the RMS of a full-scale sine wave $S_{rms} = 2^{B-1} / \sqrt{2}$:
         $20\log_{10}\frac{S_{rms}}{N_{rms}} = 20\log_{10}\frac{2^{B-1}/\sqrt{2}}{\sqrt{1/12}} = 6.02B + 1.76$ dB (with 16 bits, SNR = 98.08 dB)
       – Based on the max levels:
         $20\log_{10}\frac{S_{max}}{N_{max}} = 20\log_{10}\frac{2^{B-1}}{1/2} = 6.02B$ dB (with 16 bits, SNR = 96.32 dB)
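
Both SNR expressions can be evaluated directly as a check on the 6.02B rule of thumb. The function names are illustrative; the signal and noise levels are the ones defined on this slide (a full-scale sine wave and an error uniform over one quantization step).

```python
import math


def snr_rms_db(bits: int) -> float:
    """SNR from RMS levels: full-scale sine vs. uniform quantization noise."""
    s_rms = 2 ** (bits - 1) / math.sqrt(2.0)   # RMS of a full-scale sine wave
    n_rms = math.sqrt(1.0 / 12.0)              # RMS of error uniform on [-1/2, 1/2]
    return 20.0 * math.log10(s_rms / n_rms)


def snr_peak_db(bits: int) -> float:
    """SNR from maximum levels: peak amplitude vs. half a quantization step."""
    return 20.0 * math.log10(2 ** (bits - 1) / 0.5)


if __name__ == "__main__":
    for b in (8, 16, 24):
        print(f"{b:2d} bits: rms {snr_rms_db(b):.2f} dB, peak {snr_peak_db(b):.2f} dB")
```

Evaluated without rounding the constants, 16 bits gives roughly 98.09 dB and 96.33 dB; the slide's 98.08 dB and 96.32 dB come from using the rounded coefficients 6.02 and 1.76.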
