COMP 546 Lecture 19 Sound 2: frequency analysis Tues. March 27, 2018
Speed of Sound: Sound travels at about 340 m/s, or 34 cm/ms. (This depends on temperature and other factors.)
Wave equation
Pressure $= I_{atm} + I(X, Y, Z, t)$
$I(X, Y, Z, t)$ is not an arbitrary function. Rather:
$\frac{\partial^2 I}{\partial X^2} + \frac{\partial^2 I}{\partial Y^2} + \frac{\partial^2 I}{\partial Z^2} = \frac{1}{v^2} \frac{\partial^2 I}{\partial t^2}$, where $v = 340$ m/s.
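A quick check of what the wave equation allows: a plane wave travelling along X at the speed of sound $v$ satisfies it. The sinusoidal form and the wavenumber $k$ are assumptions chosen for illustration; they do not come from the slides.

```latex
\begin{align*}
I(X, Y, Z, t) &= \sin\big(k (X - v t)\big) \\
\frac{\partial^2 I}{\partial X^2} &= -k^2 \sin\big(k (X - v t)\big),
\qquad \frac{\partial^2 I}{\partial Y^2} = \frac{\partial^2 I}{\partial Z^2} = 0 \\
\frac{1}{v^2} \frac{\partial^2 I}{\partial t^2} &= \frac{1}{v^2} \left(-k^2 v^2\right) \sin\big(k (X - v t)\big)
 = -k^2 \sin\big(k (X - v t)\big)
\end{align*}
```

Both sides equal $-k^2 \sin(k(X - vt))$, so the equation holds for any $k$.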
The wave equation plus boundary conditions give complicated shadow and reflection effects. What happens when sound enters the ear? [Figures: plane wave + single slit; sea waves + islands.]
Musical sounds (brief introduction)
Example: guitar
Write one string displacement at t = 0 as a sum of sines. Modes are $\sin(\frac{\pi}{L} j x)$ where $L$ is the length of the string and $j$ is an integer.
Physics says: $\omega = \frac{c}{L}$, where the constant $c$ depends on physical properties of the string (mass density, tension).
Modes of a vibrating string each have fixed points which reduce the effective length: $L, \frac{L}{2}, \frac{L}{3}, \frac{L}{4}$.
Physics says: $\omega = \frac{c}{L}, \frac{2c}{L}, \frac{3c}{L}, \frac{4c}{L}$.
The first of these, $\omega_0 = \frac{c}{L}$, is the "fundamental" (1st harmonic); the rest are "overtones". The temporal frequency $j \omega_0$ is called the $j$-th harmonic.
For stringed instruments, most of the sound is produced by vibrations of the instrument body (neck, front and back plates). http://www.acs.psu.edu/drussell/guitars/hummingbird.html The lines in the sketches are the nodal points. They don't move. These are vibration modes, not harmonics. The guitar sound is a sum of these modes.
Difference of two frequencies $\omega_1$ and $\omega_2$: $\log_2 \frac{\omega_2}{\omega_1}$ octaves. e.g. 1 octave is a doubling of frequency.
(Western) Musical Notes: Each "octave" (A B C D E F G A) is divided into 12 "semitones", each spanning 1/12 of an octave. C-D, D-E, F-G, G-A, A-B are two semitones each; E-F, B-C are one semitone each.
Q: How many semitones are there from $\omega_0$ to $\omega$?
A: $12 \log_2 \frac{\omega}{\omega_0}$, where $\omega_0$ is the fundamental frequency of the note.
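The two formulas above (octaves and semitones between frequencies) can be sketched in a few lines. The function names and the example frequencies are hypothetical, chosen only for illustration.

```python
import math

def octaves(f1, f2):
    """Number of octaves from frequency f1 to frequency f2: log2(f2 / f1)."""
    return math.log2(f2 / f1)

def semitones(f0, f):
    """Number of semitones from f0 to f: 12 * log2(f / f0)."""
    return 12.0 * math.log2(f / f0)

# Illustrative values (not from the slides):
two_octaves = octaves(220.0, 880.0)    # a 4x frequency ratio is 2 octaves
one_octave = semitones(440.0, 880.0)   # a doubling is 12 semitones
fifth = semitones(440.0, 660.0)        # a 3:2 ratio is about 7.02 semitones
```

Note that a 3:2 frequency ratio does not land exactly on a whole number of semitones.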
88 fundamental frequencies (Hz) on a keyboard: The fundamental frequencies of successive notes define a geometric progression. This is different from the harmonics of a vibrating string, which define an arithmetic progression.
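A small sketch contrasting the two progressions. It assumes the common tuning convention that key 49 of the 88 is A at 440 Hz with equal temperament; that convention and the names below are assumptions, not taken from the slides.

```python
def key_frequency(n, a4_hz=440.0, a4_key=49):
    """Fundamental frequency (Hz) of key n (1..88).

    Assumes equal temperament with key `a4_key` tuned to `a4_hz`.
    """
    return a4_hz * 2.0 ** ((n - a4_key) / 12.0)

keyboard = [key_frequency(n) for n in range(1, 89)]

# Geometric progression: successive notes have a constant ratio, 2**(1/12).
ratios = [keyboard[i + 1] / keyboard[i] for i in range(87)]

# Arithmetic progression: harmonics j * f0 of one string have a constant difference, f0.
f0 = keyboard[0]
harmonics = [j * f0 for j in range(1, 6)]
differences = [harmonics[i + 1] - harmonics[i] for i in range(4)]
```

Under these assumptions the lowest key is 27.5 Hz and the highest is about 4186 Hz.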
Speech Sounds
What determines speech sounds?
• voiced vs. unvoiced: "zzzz" vs. "ssss", "vvvv" vs. "ffff"
• articulators (jaw, tongue, lips): "aaaa", "eeee", "oooo", …
Voiced sounds are produced by "glottal pulses", repeated with period $T_{glottal}$:
$\sum_{j=0}^{n_{glottal}-1} g(t - j T_{glottal})$
Exercise 16 Q7: $g(t - t_0) = g(t) * \delta(t - t_0)$
Therefore:
$\sum_{j=0}^{n_{glottal}-1} g(t - j T_{glottal}) = g(t) * \sum_{j=0}^{n_{glottal}-1} \delta(t - j T_{glottal})$
Decrease $T_{glottal}$ by increasing tension in vocal cords ⇒ increase frequency of pulses.
Let $a(t)$ be the impulse response function of the articulators (jaw, tongue, lips).
$I(t) = a(t) * g(t) * \sum_{j=0}^{n_{glottal}-1} \delta(t - j T_{glottal})$
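The identity used above, that a sum of shifted pulses equals one pulse convolved with an impulse train, can be checked numerically. This is a minimal sketch: the pulse shape `g` and the values of `n_glottal` and `T_glottal` are made up for illustration.

```python
def convolve(x, h):
    """Full linear convolution of two finite sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for k, hk in enumerate(h):
            y[i + k] += xi * hk
    return y

def impulse_train(n_pulses, period):
    """n_pulses unit impulses, `period` samples apart, starting at t = 0."""
    train = [0.0] * (n_pulses * period)
    for j in range(n_pulses):
        train[j * period] = 1.0
    return train

def sum_of_shifted(g, n_pulses, period):
    """Direct evaluation of sum_j g(t - j * period)."""
    out = [0.0] * (n_pulses * period + len(g) - 1)
    for j in range(n_pulses):
        for k, gk in enumerate(g):
            out[j * period + k] += gk
    return out

g = [0.0, 1.0, 0.5, 0.25]      # hypothetical glottal pulse shape
n_glottal, T_glottal = 3, 8    # hypothetical pulse count and period (samples)

lhs = sum_of_shifted(g, n_glottal, T_glottal)
rhs = convolve(g, impulse_train(n_glottal, T_glottal))
```

Here `lhs` and `rhs` hold the same samples: a copy of `g` starting at t = 0, 8, and 16.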
The oral and nasal cavities have resonant modes of vibration, as the air cavity in a guitar does.
[Figure: two panels, "Time domain" and "Temporal frequency domain".] Peaks in the temporal frequency domain are called "formants".
$F \sum_{j=0}^{n_{glottal}-1} \delta(t - j T_{glottal}) = \, ?$
$T_{glottal}$ is the period of the glottal pulse train. The pulse train has $n_{glottal}$ pulses in $T$ time steps, i.e. $n_{glottal} T_{glottal} = T$. Assume that the Fourier transform $F$ is taken over $T$ samples.
Assignment 3: Show
$F \sum_{j=0}^{n_{glottal}-1} \delta(t - j T_{glottal}) = n_{glottal} \sum_{j=0}^{T_{glottal}-1} \delta(\omega - j n_{glottal})$
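The claim can be checked numerically before proving it. The sketch below assumes the unnormalized forward DFT convention, $\sum_t x(t) e^{-i 2\pi \omega t / T}$; the sizes are arbitrary illustrative values. It is a sanity check, not a proof.

```python
import cmath

def dft(x):
    """Unnormalized forward DFT over len(x) samples (assumed convention)."""
    T = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / T) for t in range(T))
            for k in range(T)]

n_glottal, T_glottal = 4, 8          # hypothetical: 4 pulses, period 8 samples
T = n_glottal * T_glottal            # total number of samples

# Impulse train in time: one unit impulse every T_glottal samples.
pulse_train = [1.0 if t % T_glottal == 0 else 0.0 for t in range(T)]

spectrum = dft(pulse_train)

# Frequencies (cycles per T samples) where the spectrum is non-zero.
peaks = [k for k, X in enumerate(spectrum) if abs(X) > 1e-9]
peak_heights = [abs(spectrum[k]) for k in peaks]
```

With these values the spectrum is itself an impulse train: `T_glottal` peaks, spaced `n_glottal` apart, each of height `n_glottal`.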
Units of temporal frequency
$T_{glottal}$ is the period of the glottal pulse train. There are $n_{glottal}$ pulses in $T$ time samples. To convert $n_{glottal}$ to "pulses per second", we divide by $T$ (to get pulses per sample) and then multiply by "time samples per second". High quality audio uses 44,100 samples per second.
$n_{glottal}$ is the fundamental frequency of the voiced sound. It determines the "pitch". Adult males: 100-150 Hz. Adult females: 150-250 Hz. Children: over 250 Hz.
[Figure: glottal pulse spectrum, formant spectrum, and sound spectrum, for $\omega_0 = 100$ Hz and $\omega_0 = 200$ Hz.]
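The unit conversion for temporal frequency (pulses per $T$ samples to pulses per second) is a one-liner. The function name and the example numbers are hypothetical; only the 44,100 samples per second rate comes from the slides.

```python
def pulses_to_hz(n_pulses, T, samples_per_second=44100):
    """Convert `n_pulses` pulses in `T` samples to pulses per second (Hz)."""
    # (pulses / sample) * (samples / second), written to keep the arithmetic exact
    return n_pulses * samples_per_second / T

# Illustrative: 10 glottal pulses in a window of 4410 samples (0.1 s of audio)
pitch_hz = pulses_to_hz(10, 4410)   # 100.0 Hz, in the adult male range above
```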
Voiced vowel sounds
Unvoiced sounds: noise instead of glottal pulses. Flat amplitude spectrum on average ("white noise").
Consonants
Restrict flow of air by moving tongue, lips into contact with the teeth & palate.
Fricatives
- voiced: z, v, zh, th (the)
- unvoiced: s, f, sh, th (theta)
Stops
- voiced: b, d, g
- unvoiced: p, t, k
Nasals (closed mouth)
- m, n, ng
I did not have time to cover the following slides properly. I will present them again in lecture 22.
Spectrogram: Partition a sound signal into $B$ blocks of $T$ samples each (i.e. the sound has $BT$ samples in total). Take the Fourier transform of each block. Let $b$ be the block number, and let the units of $\omega$ be cycles per block.
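A minimal sketch of the spectrogram computation described above: split the signal into disjoint blocks of T samples and take the amplitude of each block's DFT. Disjoint (non-overlapping, unwindowed) blocks and the unnormalized DFT are assumptions here, and the test signal is made up.

```python
import cmath
import math

def dft(x):
    """Unnormalized forward DFT over len(x) samples (assumed convention)."""
    T = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / T) for t in range(T))
            for k in range(T)]

def spectrogram(signal, T):
    """Amplitude spectrum of each disjoint block of T samples.

    Returns a list indexed [b][omega]: b is the block number and omega is in
    cycles per block. Trailing samples that do not fill a block are dropped.
    """
    B = len(signal) // T
    return [[abs(X) for X in dft(signal[b * T:(b + 1) * T])] for b in range(B)]

# Hypothetical test signal: a cosine with exactly 5 cycles per block.
T, B = 32, 4
signal = [math.cos(2 * math.pi * 5 * t / T) for t in range(B * T)]

S = spectrogram(signal, T)

# Strongest frequency in each block, searching up to T/2 cycles per block.
peak_bins = [max(range(T // 2), key=lambda k: row[k]) for row in S]
```

Every block of this signal peaks at 5 cycles per block, as expected for a stationary tone.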
[Figure: spectrogram. Vertical axis: cycles per second (Hz), with $\omega_0$ marked. Horizontal axis: time (samples).]
e.g. at 44,100 samples per second:
T = 512 samples (≈ 12 ms), $\omega_0$ ≈ 86 Hz
T = 2048 samples (≈ 46 ms), $\omega_0$ ≈ 21.5 Hz
You cannot simultaneously localize the frequency and the time. This is a fundamental tradeoff. We have seen it before (recall the Gaussian).
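The numbers in this example follow from the block length alone, assuming the 44,100 samples per second rate mentioned earlier: one cycle per block corresponds to `fs / T` Hz, and a block lasts `T / fs` seconds. The function name is hypothetical.

```python
def block_resolution(T, samples_per_second=44100):
    """Return (block duration in ms, frequency spacing in Hz) for blocks of T samples."""
    duration_ms = 1000.0 * T / samples_per_second
    spacing_hz = samples_per_second / T   # one cycle per block, expressed in Hz
    return duration_ms, spacing_hz

short_block = block_resolution(512)    # about 11.6 ms, about 86.1 Hz
long_block = block_resolution(2048)    # about 46.4 ms, about 21.5 Hz
```

The product of duration (in seconds) and spacing (in Hz) is always 1, so improving one resolution worsens the other.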
Narrowband (good frequency resolution, poor temporal resolution, ~50 ms)
Wideband (poor frequency resolution, good temporal resolution)
Examples: Spectrograms of 10 vowel sounds