Sound 2: frequency analysis (Tues. March 27, 2018)



  1. COMP 546 Lecture 19. Sound 2: frequency analysis. Tues. March 27, 2018

  2. Speed of Sound: Sound travels at about 340 m/s, or 34 cm/ms. (This depends on temperature and other factors.)

  3. Wave equation: $\text{Pressure} = I_{atm} + I(X, Y, Z, t)$. $I(X, Y, Z, t)$ is not an arbitrary function. Rather: $\left( \frac{\partial^2}{\partial X^2} + \frac{\partial^2}{\partial Y^2} + \frac{\partial^2}{\partial Z^2} \right) I(X, Y, Z, t) = \frac{1}{v^2} \frac{\partial^2}{\partial t^2} I(X, Y, Z, t)$, where $v = 340$ m/s.

  4. The wave equation + boundary conditions give complicated shadow and reflection effects. What happens when sound enters the ear? [Figures: plane wave + single slit; sea waves + islands]

  5. Musical sounds (brief introduction)

  6. Example: guitar. Write one string displacement at $t = 0$ as a sum of sines. Modes are $\sin\left(\frac{\pi}{L} j x\right)$, where $L$ is the length of the string and $j$ is an integer.

  7. Physics says: $\omega = \frac{c}{L}$, where the constant $c$ depends on physical properties of the string (mass density, tension).

  8. Modes of a vibrating string each have fixed points which reduce the effective length: $L, \frac{L}{2}, \frac{L}{3}, \frac{L}{4}$. Physics says: $\omega = \frac{c}{L}, \frac{2c}{L}, \frac{3c}{L}, \frac{4c}{L}$.

  9. $\omega = \frac{c}{L}, \frac{2c}{L}, \frac{3c}{L}, \frac{4c}{L}$. The lowest frequency $\omega_0$ is the “fundamental” (1st harmonic); the higher ones are “overtones”. The temporal frequency $m \omega_0$ is called the $m$-th harmonic.

  10. For stringed instruments, most of the sound is produced by vibrations of the instrument body (neck, front and back plates). http://www.acs.psu.edu/drussell/guitars/hummingbird.html The lines in the sketches are the nodal points. They don't move. These are vibration modes, not harmonics. The guitar sound is a sum of these modes.

  11. Difference of two frequencies $\omega_1$ and $\omega_2$: $\log_2 \frac{\omega_2}{\omega_1}$ octaves. E.g. 1 octave is a doubling of frequency.

  12. (Western) Musical Notes: Each “octave” ABCDEFGA is divided into 12 “semitones”, each spanning 1/12 octave. C-D, D-E, F-G, G-A, A-B are two semitones each; E-F, B-C are one semitone each.

  13. Q: How many semitones are there from $\omega_0$ to $\omega$?

  14. Q: How many semitones are there from $\omega_0$ to $\omega$? A: $12 \log_2 \frac{\omega}{\omega_0}$. [Figure label: fundamental frequency of note, from $\omega_0$ to $\omega$]
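
The two interval formulas (octaves and semitones) can be checked numerically. This is a minimal sketch; the function names and the example frequencies are illustrative choices, not taken from the slides.

```python
import math

def octaves_between(f1: float, f2: float) -> float:
    """Interval from f1 to f2 in octaves: log2(f2 / f1)."""
    return math.log2(f2 / f1)

def semitones_between(f0: float, f: float) -> float:
    """Interval from f0 to f in semitones: 12 * log2(f / f0)."""
    return 12.0 * math.log2(f / f0)

# Doubling the frequency is one octave, i.e. twelve semitones.
print(octaves_between(220.0, 440.0))
print(semitones_between(220.0, 440.0))

# A frequency ratio of 3:2 is close to, but not exactly, seven semitones.
print(semitones_between(200.0, 300.0))
```

Because the scale is logarithmic, only the ratio of the two frequencies matters, not their absolute values.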

  15. 88 fundamental frequencies (Hz) on a keyboard. The fundamental frequencies of successive notes define a geometric progression. This is different from the harmonics of a vibrating string, which define an arithmetic progression.
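
The contrast between the two progressions can be illustrated with a short sketch. The slides do not state a tuning reference, so the code assumes the common convention that key 49 of 88 is A at 440 Hz; both constants and the function name are assumptions made for illustration.

```python
A4_HZ = 440.0   # assumed reference pitch (not given in the slides)
A4_KEY = 49     # assumed position of that A on an 88-key keyboard

def key_frequency(key: int) -> float:
    """Fundamental frequency of a key, one semitone (1/12 octave) per key."""
    return A4_HZ * 2.0 ** ((key - A4_KEY) / 12.0)

# Keyboard: successive notes have a constant RATIO (geometric progression).
keys = [key_frequency(k) for k in range(1, 89)]
ratios = [keys[i + 1] / keys[i] for i in range(len(keys) - 1)]

# String: successive harmonics m * w0 have a constant DIFFERENCE (arithmetic).
w0 = keys[0]
harmonics = [m * w0 for m in range(1, 6)]
differences = [harmonics[i + 1] - harmonics[i] for i in range(len(harmonics) - 1)]

print(min(ratios), max(ratios))   # both close to 2 ** (1 / 12)
print(differences)                # all equal to w0
```

Under these assumptions the lowest key comes out at 27.5 Hz and the highest at roughly 4186 Hz.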

  16. Speech Sounds

  17. What determines speech sounds? • voiced vs. unvoiced: ‘zzzz’ vs. ‘ssss’, ‘vvvv’ vs. ‘ffff’ • articulators (jaw, tongue, lips): ‘aaaa’, ‘eeee’, ‘oooo’, …

  18. Voiced sounds are produced by “glottal pulses”, spaced $T_{\text{glottal}}$ apart: $\sum_{j=0}^{n_{\text{glottal}}-1} g(t - j T_{\text{glottal}})$

  19. Exercise 16 Q7: $g(t - t_0) = g(t) * \delta(t - t_0)$
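
The identity says that convolving with a shifted delta shifts the signal. A minimal discrete sketch follows; the pulse values `g` and the helper names are made up for illustration.

```python
def convolve(x, y):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out

def delta(length, t0):
    """Discrete delta(t - t0): a single 1 at index t0."""
    d = [0.0] * length
    d[t0] = 1.0
    return d

g = [1.0, 0.5, 0.25]   # hypothetical pulse shape
t0 = 4

shifted = convolve(g, delta(8, t0))
print(shifted)   # g now starts at index t0, with zeros elsewhere
```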

  20. Voiced sounds are produced by “glottal pulses”: $\sum_{j=0}^{n_{\text{glottal}}-1} g(t - j T_{\text{glottal}}) = g(t) * \sum_{j=0}^{n_{\text{glottal}}-1} \delta(t - j T_{\text{glottal}})$

  21. $\sum_{j=0}^{n_{\text{glottal}}-1} g(t - j T_{\text{glottal}})$: decrease $T_{\text{glottal}}$ by increasing tension in the vocal cords ≡ increase the frequency of pulses.

  22. Let $a(t)$ be the impulse response function of the articulators (jaw, tongue, lips). $I(t) = a(t) * g(t) * \sum_{j=0}^{n_{\text{glottal}}-1} \delta(t - j T_{\text{glottal}})$
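
The model of a voiced sound (pulse train, convolved with a glottal pulse shape, convolved with the articulator response) can be sketched in a few lines. The shapes chosen for `g` and `a` below are invented placeholders, not measured data, and the helper names are hypothetical.

```python
import math

def convolve(x, y):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out

def pulse_train(n_glottal, T_glottal):
    """n_glottal unit impulses spaced T_glottal samples apart (T samples total)."""
    train = [0.0] * (n_glottal * T_glottal)
    for j in range(n_glottal):
        train[j * T_glottal] = 1.0
    return train

# Hypothetical glottal pulse shape g(t): a short decaying bump.
g = [math.exp(-t / 3.0) for t in range(10)]
# Hypothetical articulator impulse response a(t): a damped oscillation.
a = [math.exp(-t / 8.0) * math.cos(2.0 * math.pi * t / 6.0) for t in range(24)]

train = pulse_train(n_glottal=4, T_glottal=20)
source = convolve(g, train)    # sum over j of g(t - j * T_glottal)
sound = convolve(a, source)    # I(t) = a(t) * g(t) * pulse train

print(len(train), len(source), len(sound))
```

Convolution is associative and commutative, so the three factors can be combined in any order; the sketch applies the pulse train first only because that mirrors how the sound is produced.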

  23. [Figure; surviving label: $n_{\text{pulse}} - 1$]

  24. [Figure; surviving label: $n_{\text{pulse}} - 1$]


  26. The oral and nasal cavities have resonant modes of vibration, like the air cavity in a guitar does.

  27. [Figure panels: time domain; temporal frequency domain] Peaks are called “formants”.

  28. $\mathbf{F} \sum_{j=0}^{n_{\text{glottal}}-1} \delta(t - j T_{\text{glottal}}) = \; ?$ $T_{\text{glottal}}$ is the period of the glottal pulse train. The pulse train has $n_{\text{glottal}}$ pulses in $T$ time steps, i.e. $T_{\text{glottal}} \, n_{\text{glottal}} = T$. Assume that the Fourier transform is taken over $T$ samples.

  29. Assignment 3: Show $\mathbf{F} \sum_{j=0}^{n_{\text{glottal}}-1} \delta(t - j T_{\text{glottal}}) = n_{\text{glottal}} \sum_{m=0}^{T_{\text{glottal}}-1} \delta(\omega - m \, n_{\text{glottal}})$ [Figure labels: $T_{\text{glottal}}$, $n_{\text{glottal}}$]
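
The claimed transform of a pulse train can be checked numerically for small sizes. This sketch assumes an unnormalized discrete Fourier transform over $T$ samples with kernel $e^{-i 2\pi \omega t / T}$; the slides do not spell out the convention, and the sizes 4 and 8 are arbitrary.

```python
import cmath

def dft(x):
    """Unnormalized DFT over len(x) samples (assumed convention)."""
    T = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * w * t / T) for t in range(T))
        for w in range(T)
    ]

n_glottal, T_glottal = 4, 8
T = n_glottal * T_glottal

# Pulse train: a 1 at t = 0, T_glottal, 2 * T_glottal, ...
train = [1.0 if t % T_glottal == 0 else 0.0 for t in range(T)]
spectrum = dft(train)

# Frequencies (in cycles per T samples) where the spectrum is non-zero.
peaks = [w for w in range(T) if abs(spectrum[w]) > 1e-9]
print(peaks)                                 # multiples of n_glottal
print([abs(spectrum[w]) for w in peaks])     # each approximately n_glottal
```

The output is again a pulse train: $T_{\text{glottal}}$ peaks of height $n_{\text{glottal}}$, spaced $n_{\text{glottal}}$ apart in frequency.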

  30. Units of temporal frequency $\omega$: $T_{\text{glottal}}$ is the period of the glottal pulse train. There are $n_{\text{glottal}}$ pulses in $T$ time samples. To convert $n_{\text{glottal}}$ to ‘pulses per second’, we divide by $T$ (to get pulses per sample) and then multiply by ‘time samples per second’. High quality audio uses 44,100 samples per second.
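
The unit conversion is a one-line calculation. In this sketch the function name and the example values of `n_glottal` and `T` are illustrative; only the 44,100 samples per second comes from the slide.

```python
SAMPLES_PER_SECOND = 44100   # high quality audio rate quoted on the slide

def pulses_per_second(n_glottal, T, samples_per_second=SAMPLES_PER_SECOND):
    """Convert 'n_glottal pulses in T samples' to pulses per second (Hz)."""
    pulses_per_sample = n_glottal / T
    return pulses_per_sample * samples_per_second

# Example: 5 pulses in a window of 2205 samples (50 ms at this rate)
# corresponds to a pulse rate of about 100 Hz.
print(pulses_per_second(5, 2205))
```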

  31. $n_{\text{glottal}}$ is the fundamental frequency of the voiced sound. It determines the "pitch". Adult males: 100-150 Hz. Adult females: 150-250 Hz. Children: over 250 Hz.

  32. [Figure: glottal pulse spectrum, formant spectrum (“formants”), and sound spectrum, for $\omega_0 = 100$ Hz and $\omega_0 = 200$ Hz]

  33. Voiced vowel sounds

  34. Unvoiced sounds: noise instead of glottal pulses.

  35. Unvoiced sounds: noise instead of glottal pulses. Flat amplitude spectrum on average (‘white noise’).

  36. Consonants: Restrict the flow of air by moving the tongue and lips into contact with the teeth & palate. Fricatives: voiced z, v, zh, th (the); unvoiced ? Stops: voiced b, d, g; unvoiced ? Nasals (closed mouth): m, n, ng.

  37. Consonants: Restrict the flow of air by moving the tongue and lips into contact with the teeth & palate. Fricatives: voiced z, v, zh, th (the); unvoiced s, f, sh, th (theta). Stops: voiced b, d, g; unvoiced p, t, k. Nasals (closed mouth): m, n, ng.

  38. I did not have time to cover the following slides properly. I will present them again in lecture 22.

  39. Spectrogram: Partition a sound signal into $B$ blocks of $T$ samples each (i.e. the sound has $BT$ samples in total). Take the Fourier transform of each block.

  40. Spectrogram: Partition a sound signal into $B$ blocks of $T$ samples each (i.e. the sound has $BT$ samples in total). Take the Fourier transform of each block. Let $b$ be the block number, and let the units of $\omega$ be cycles per block.
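
The block-wise procedure can be sketched directly. This is a minimal version under stated assumptions: non-overlapping blocks with no windowing, an unnormalized DFT, and a made-up test signal whose frequency changes between the two blocks; all names are illustrative.

```python
import cmath
import math

def dft(x):
    """Unnormalized DFT over len(x) samples (assumed convention)."""
    T = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * w * t / T) for t in range(T))
        for w in range(T)
    ]

def spectrogram(signal, T):
    """Amplitude spectrum of each block: result[b][w], w in cycles per block.

    Trailing samples that do not fill a whole block are dropped.
    """
    B = len(signal) // T
    return [[abs(c) for c in dft(signal[b * T:(b + 1) * T])] for b in range(B)]

T = 16
first = [math.cos(2.0 * math.pi * 2 * t / T) for t in range(T)]    # 2 cycles per block
second = [math.cos(2.0 * math.pi * 5 * t / T) for t in range(T)]   # 5 cycles per block

S = spectrogram(first + second, T)

# Strongest frequency in the lower half of each block's spectrum.
strongest = [max(range(T // 2), key=lambda w: S[b][w]) for b in range(len(S))]
print(strongest)
```

Each row of `S` is one time block, so reading down the rows shows how the frequency content changes over time.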

  41. [Figure: spectrogram; axes: cycles per second (Hz), in steps of $\omega_0$, vs. time (samples)]

  42. E.g. $T = 512$ samples (12 ms), $\omega_0 = 86$ Hz; $T = 2048$ samples (48 ms), $\omega_0 = 21$ Hz.

  43. E.g. $T = 512$ samples (12 ms), $\omega_0 = 86$ Hz; $T = 2048$ samples (48 ms), $\omega_0 = 21$ Hz. You cannot simultaneously localize the frequency and the time. This is a fundamental tradeoff. We have seen it before (recall the Gaussian).
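
The example numbers can be reproduced approximately if the sampling rate is taken to be the 44,100 samples per second quoted earlier in the lecture; that the same rate applies here is an assumption, as are the function names.

```python
SAMPLES_PER_SECOND = 44100   # assumed: the audio rate quoted earlier in the lecture

def block_duration_ms(T):
    """Duration of a block of T samples, in milliseconds."""
    return 1000.0 * T / SAMPLES_PER_SECOND

def frequency_step_hz(T):
    """Spacing between frequencies in a block's spectrum: one cycle per block, in Hz."""
    return SAMPLES_PER_SECOND / T

for T in (512, 2048):
    print(T, round(block_duration_ms(T), 1), round(frequency_step_hz(T), 1))
```

Under this assumption the block durations come out slightly below the rounded 12 ms and 48 ms on the slide. The product of block duration (in seconds) and frequency step (in Hz) is always 1, so shrinking one necessarily grows the other.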

  44. Narrowband (good frequency resolution, poor temporal resolution, ~50 ms). Wideband (poor frequency resolution, good temporal resolution).

  45. Examples: Spectrograms of 10 vowel sounds
