CTP431 - Music and Audio Computing
Spectral Analysis
Graduate School of Culture Technology, KAIST
Juhan Nam
Waveform
§ Time-domain representation of sound
– Shows the amplitude over time
§ Amplitude envelope
– Short-term loudness: e.g. sound level meter
– Computed by various methods
• max-peak picking
• root-mean-square (RMS)
• Hilbert transform
– ADSR
• The amplitude envelope of a musical sound is often described in terms of attack, decay, sustain and release.
– Also used for dynamic range compression: e.g. compressor, expander
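Two of the envelope methods listed above, max-peak picking and RMS, can be sketched frame by frame as follows. This is a minimal illustration, not the course's implementation: the function name `frame_envelopes`, the frame and hop sizes, and the decaying 440 Hz test tone are all assumptions chosen for the example.

```python
import math

def frame_envelopes(x, frame_size, hop_size):
    """Return (peak, rms) envelopes with one value per frame."""
    peak, rms = [], []
    for start in range(0, len(x) - frame_size + 1, hop_size):
        frame = x[start:start + frame_size]
        # Max-peak picking: largest absolute sample in the frame.
        peak.append(max(abs(v) for v in frame))
        # RMS: square root of the mean of the squared samples.
        rms.append(math.sqrt(sum(v * v for v in frame) / frame_size))
    return peak, rms

# Hypothetical test signal: a 440 Hz sine with an exponential decay, 1 second long.
sr = 8000
x = [math.exp(-3.0 * n / sr) * math.sin(2 * math.pi * 440 * n / sr)
     for n in range(sr)]

peak, rms = frame_envelopes(x, frame_size=256, hop_size=128)
```

Both envelopes fall over time for this decaying tone, and the RMS value of a frame never exceeds its peak value.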
Example: Waveform and Amplitude Envelopes
[Figure: waveforms with amplitude envelopes; panels: Flute A4 note, Piano C4 note]
Spectrogram
§ Time/frequency-domain representation of sound
– Shows the amplitude envelope of individual frequency components over time
– A better representation for observing pitch and timbre characteristics
– Also called a “sonogram”
§ Visualization
– 2D color map or waterfall
Example: Spectrogram - 2D color map
[Figure: panels: Piano C4 note, Flute A4 note]
Example: Spectrogram - 3D waterfall
[Figure: panels: Piano C4 note, Flute A4 note]
Phasor
§ A complex number representing a sinusoidal function with
– Amplitude, angular frequency, initial phase
$$\omega = 2\pi f = 2\pi / T, \qquad T = 1/f$$
$$x(t) = e^{j(\omega t + \phi)} = \cos(\omega t + \phi) + j\sin(\omega t + \phi) \quad \text{(Euler's identity)}$$
[Figure: sinusoid plotted over time, annotated with the phase ωt + φ and the period T = 1/f]
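Euler's identity for a phasor can be checked numerically with Python's `cmath` module. The sketch below is only an illustration: the function name `phasor`, the 250 Hz frequency, the phase of π/3 and the sample time are hypothetical values, not taken from the slide.

```python
import cmath
import math

def phasor(t, f, phi=0.0, amplitude=1.0):
    """Value of amplitude * exp(j(omega*t + phi)) with omega = 2*pi*f."""
    omega = 2 * math.pi * f
    return amplitude * cmath.exp(1j * (omega * t + phi))

f = 250.0            # hypothetical frequency in Hz
T = 1.0 / f          # period in seconds
phi = math.pi / 3    # hypothetical initial phase
t = 0.001            # hypothetical time instant in seconds

z = phasor(t, f, phi)
omega = 2 * math.pi * f
# Real and imaginary parts follow cos and sin of the same phase argument.
real_error = abs(z.real - math.cos(omega * t + phi))
imag_error = abs(z.imag - math.sin(omega * t + phi))
# The phasor repeats after one period T.
period_error = abs(phasor(t + T, f, phi) - z)
```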
Fourier Series
§ Any signal x(t) with period T can be represented as a sum of phasors
– The periods of the phasors are T, T/2, T/3, ..., T/k, ...
$$x(t) = \operatorname{real}\left\{ \frac{1}{T} \sum_{k=0}^{\infty} r_k\, e^{j(2\pi k t / T + \phi_k)} \right\} = \operatorname{real}\left\{ \frac{1}{T} \sum_{k=0}^{\infty} c_k^{*}\, e^{j 2\pi k t / T} \right\}$$
$$c_k^{*} = r_k e^{j\phi_k}, \qquad a_k = r_k \cos(\phi_k), \qquad b_k = r_k \sin(\phi_k)$$
$$r_k = \sqrt{a_k^2 + b_k^2}, \qquad \phi_k = \arctan\left(\frac{b_k}{a_k}\right)$$
§ Web Audio Examples
– http://codepen.io/anon/pen/jPGJMK
§ How can you get the coefficients?
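The series can be tried out by summing a finite number of phasors. The sketch below assumes a textbook square-wave recipe (odd harmonics with amplitude 4/(πk) and phase −π/2); the function name `fourier_synth`, the 10 ms period and the choice of 100 phasors are illustrative assumptions, not part of the slide.

```python
import cmath
import math

def fourier_synth(r, phi, T, t):
    """Evaluate real{(1/T) * sum_k r[k] * exp(j(2*pi*k*t/T + phi[k]))} at time t."""
    total = sum(r_k * cmath.exp(1j * (2 * math.pi * k * t / T + phi_k))
                for k, (r_k, phi_k) in enumerate(zip(r, phi)))
    return (total / T).real

T = 0.01   # hypothetical period: 10 ms (100 Hz fundamental)
K = 100    # number of phasors, k = 0 .. 99

# Square-wave recipe: odd harmonics only, amplitude 4/(pi*k), phase -pi/2.
# The factor T cancels the 1/T in the series, so the result is close to +/-1.
r = [T * 4 / (math.pi * k) if k % 2 == 1 else 0.0 for k in range(K)]
phi = [-math.pi / 2] * K

x_quarter = fourier_synth(r, phi, T, T / 4)            # middle of the positive half
x_three_quarter = fourier_synth(r, phi, T, 3 * T / 4)  # middle of the negative half
x_next_period = fourier_synth(r, phi, T, T / 4 + T)    # one period later
```

Adding more phasors brings the sum closer to an ideal square wave, and the value one period later matches because every phasor has a period of the form T/k.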
Orthogonality of Sinusoids
§ The phasors are orthogonal to each other unless they have the same frequency
$$\int_{-T/2}^{T/2} e^{j 2\pi m t / T}\, e^{-j 2\pi n t / T}\, dt = \begin{cases} 0 & (m \neq n) \\ T & (m = n) \end{cases}$$
§ Using the orthogonality
$$c_k^{*} = \int_{-T/2}^{T/2} x(t)\, e^{-j 2\pi k t / T}\, dt$$
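The orthogonality integral can be approximated with a Riemann sum over one period. This is a rough numerical check under assumed values (T = 1 and 1000 integration steps); the function name `inner_product` is hypothetical.

```python
import cmath
import math

def inner_product(m, n, T=1.0, steps=1000):
    """Riemann-sum approximation of the integral of
    exp(j*2*pi*m*t/T) * exp(-j*2*pi*n*t/T) over [-T/2, T/2]."""
    dt = T / steps
    total = 0j
    for i in range(steps):
        t = -T / 2 + i * dt
        total += (cmath.exp(2j * math.pi * m * t / T)
                  * cmath.exp(-2j * math.pi * n * t / T) * dt)
    return total

different = inner_product(3, 5)   # different frequencies: close to 0
same = inner_product(4, 4)        # same frequency: close to T
```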
Discrete Fourier Transform (DFT)
§ Discrete-time version of the Fourier series
$$x(n) = [x_0, x_1, x_2, \ldots, x_{N-1}]$$
§ The number of discrete samples, N, corresponds to the period T
– We assume that the segment x(n) is repeated every N samples
§ Then the DFT and the inverse DFT follow directly
$$\text{DFT:} \quad X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}$$
$$\text{IDFT:} \quad x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k)\, e^{j 2\pi k n / N}$$
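The two sums translate almost literally into code. The sketch below is a direct, unoptimized transcription for illustration; the function names `dft` and `idft` and the four-sample test vector are assumptions made for the example.

```python
import cmath
import math

def dft(x):
    """X(k) = sum_{n=0}^{N-1} x(n) * exp(-j*2*pi*k*n/N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    """x(n) = (1/N) * sum_{k=0}^{N-1} X(k) * exp(j*2*pi*k*n/N)."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

x = [1.0, 2.0, 0.0, -1.0]   # hypothetical 4-sample segment
X = dft(x)
x_back = idft(X)            # recovers x up to rounding error
```

Applying `idft` to the output of `dft` returns the original samples, which is a quick way to confirm that the sign of the exponent and the 1/N factor are placed consistently.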
Discrete Fourier Transform
§ Discrete Fourier Transform
$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N} = X_R(k) + j X_I(k)$$
– Magnitude spectrum: $A(k) = \sqrt{X_R(k)^2 + X_I(k)^2}$
– Phase spectrum: $\Phi(k) = \arctan\left(\frac{X_I(k)}{X_R(k)}\right)$
§ We use the magnitude spectrum to display spectrograms
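Magnitude and phase can be read off each complex DFT value. In the sketch below, `abs` gives the magnitude and `cmath.phase` gives the phase; note that `cmath.phase` uses a two-argument arctangent, so it resolves the quadrant that a plain arctan of the ratio cannot. The helper `dft` and the test signal (a cosine on bin 2 with phase π/4, N = 8) are assumptions for the example.

```python
import cmath
import math

def dft(x):
    """Direct DFT: X(k) = sum_n x(n) * exp(-j*2*pi*k*n/N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

N = 8
# Hypothetical test signal: 2 cycles per N samples, initial phase pi/4.
x = [math.cos(2 * math.pi * 2 * n / N + math.pi / 4) for n in range(N)]

X = dft(x)
magnitude = [abs(X_k) for X_k in X]          # sqrt(X_R^2 + X_I^2)
phase = [cmath.phase(X_k) for X_k in X]      # atan2(X_I, X_R)
```

For this signal the energy sits in bins 2 and 6 (the positive and negative frequency), and the phase at bin 2 equals the initial phase of the cosine. Phase values at bins whose magnitude is near zero are dominated by rounding error and carry no meaning.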
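DFT Sinusoids
$$s_k^{*}(n) = e^{j 2\pi k n / N}, \qquad N = 8$$
[Figure: the DFT sinusoids for k = 3, 2, 1, 0, -1, -2, -3, -4, where $s_{-1}^{*}(n) = s_7^{*}(n)$, $s_{-2}^{*}(n) = s_6^{*}(n)$, $s_{-3}^{*}(n) = s_5^{*}(n)$, $s_{-4}^{*}(n) = s_4^{*}(n)$]
Source: the JOS DFT book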
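The equalities between negative and positive indices in the figure follow from the sinusoids being sampled at only N points, and they can be confirmed numerically. The sketch below uses N = 8 as on the slide; the function name `dft_sinusoid` is hypothetical.

```python
import cmath
import math

def dft_sinusoid(k, N):
    """Samples of exp(j*2*pi*k*n/N) for n = 0 .. N-1."""
    return [cmath.exp(2j * math.pi * k * n / N) for n in range(N)]

N = 8
# Largest sample-wise difference between index -k and index N-k, for k = 1 .. 4.
max_diff = max(abs(a - b)
               for k in range(1, N // 2 + 1)
               for a, b in zip(dft_sinusoid(-k, N), dft_sinusoid(N - k, N)))
```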
Fast Fourier Transform
§ Matrix multiplication view of DFT
[Figure: the DFT written as a matrix multiplication. Source: the JOS DFT book]
§ In practice, we do not compute this directly. There is a more efficient way, called the “Fast Fourier Transform (FFT)”
– Complexity reduction by FFT: O(N^2) → O(N log_2 N)
– Divide and conquer
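The two views can be compared side by side. The sketch below builds the N-by-N DFT matrix explicitly and also implements a recursive radix-2 decimation-in-time FFT, which is one common divide-and-conquer variant and assumes N is a power of two. It is not necessarily the algorithm the slide has in mind; the function names and the 16-sample test vector are illustrative.

```python
import cmath
import math

def dft_matrix_multiply(x):
    """O(N^2): build the DFT matrix W[k][n] = exp(-j*2*pi*k*n/N) and multiply."""
    N = len(x)
    W = [[cmath.exp(-2j * math.pi * k * n / N) for n in range(N)]
         for k in range(N)]
    return [sum(W[k][n] * x[n] for n in range(N)) for k in range(N)]

def fft_radix2(x):
    """O(N log2 N): recursive radix-2 FFT. len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])   # DFT of the even-indexed samples
    odd = fft_radix2(x[1::2])    # DFT of the odd-indexed samples
    out = [0j] * N
    for k in range(N // 2):
        twiddle = cmath.exp(-2j * math.pi * k / N) * odd[k]
        out[k] = even[k] + twiddle
        out[k + N // 2] = even[k] - twiddle
    return out

x = [float((7 * n) % 5) for n in range(16)]   # hypothetical test vector
slow = dft_matrix_multiply(x)
fast = fft_radix2(x)
max_error = max(abs(a - b) for a, b in zip(slow, fast))
```

Both functions return the same spectrum up to rounding error; the recursive version reuses the half-size transforms, which is where the saving comes from.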
Examples of DFT
[Figure: panels: Sine waveform, Drum, Flute]
Short-Time Fourier Transform (STFT)
§ DFT assumes that the signal is stationary
– It is not a good idea to apply DFT to a long and dynamically changing signal like music
– Instead, we segment the signal and apply DFT to each segment separately
§ Short-Time Fourier Transform
$$X(k, l) = \sum_{n=0}^{N-1} w(n)\, x(n + l \cdot h)\, e^{-j 2\pi k n / N}$$
– $h$: hop size
– $w(n)$: window
– $N$: FFT size
§ This produces a 2-D time-frequency representation
– Get the “spectrogram” from the magnitude
– Parameters: window size, window type, FFT size, hop size
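The STFT formula can be sketched directly as a loop over frames. The version below makes several simplifying assumptions that are not stated on the slide: the window size equals the FFT size N (no zero padding), the window is a periodic Hann window, each frame uses a direct DFT rather than an FFT, and the result is stored as `X[l][k]` (frame index first). The function names and the 1 kHz test tone are hypothetical.

```python
import cmath
import math

def hann(N):
    """Periodic Hann window of length N."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]

def stft(x, N, h):
    """Return X[l][k] = sum_n w(n) * x(n + l*h) * exp(-j*2*pi*k*n/N)."""
    w = hann(N)
    num_frames = 1 + (len(x) - N) // h
    X = []
    for l in range(num_frames):
        frame = [w[n] * x[n + l * h] for n in range(N)]
        X.append([sum(frame[n] * cmath.exp(-2j * math.pi * k * n / N)
                      for n in range(N))
                  for k in range(N)])
    return X

sr = 8000            # hypothetical sample rate in Hz
N, h = 64, 32        # FFT size and hop size (50% overlap)
x = [math.sin(2 * math.pi * 1000 * n / sr) for n in range(512)]

X = stft(x, N, h)
# Magnitude spectrogram, keeping bins 0 .. N/2 (the non-negative frequencies).
spectrogram = [[abs(X_lk) for X_lk in frame[:N // 2 + 1]] for frame in X]
```

With these values, a 1000 Hz tone falls on bin 1000 · 64 / 8000 = 8, so every frame of the spectrogram peaks at that bin.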
Windowing
§ Types of window functions
– Trade-off between the width of the main lobe and the level of the side lobes
[Figure: window spectra annotated with main-lobe width and side-lobe level]
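The trade-off can be measured for two common windows by zero-padding each window, taking its magnitude spectrum, and locating the first null. The sketch below compares a rectangular window with a periodic Hann window; the choice of these two windows, the window length of 32, the padded size of 1024 and the function names are all assumptions for illustration.

```python
import cmath
import math

def rectangular(M):
    return [1.0] * M

def hann(M):
    """Periodic Hann window of length M."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / M) for n in range(M)]

def lobe_stats(w, n_fft=1024):
    """Return (null-to-null main-lobe width in bins of n_fft,
    peak side-lobe level in dB relative to the main-lobe peak)."""
    # Zero-padded DFT magnitude for bins 0 .. n_fft/2.
    mag = [abs(sum(w_n * cmath.exp(-2j * math.pi * k * n / n_fft)
                   for n, w_n in enumerate(w)))
           for k in range(n_fft // 2 + 1)]
    k = 0
    while k + 1 < len(mag) and mag[k + 1] < mag[k]:
        k += 1                       # walk down the main lobe to its first null
    side_peak = max(mag[k:])         # highest point beyond the main lobe
    return 2 * k, 20 * math.log10(side_peak / mag[0])

rect_width, rect_side_db = lobe_stats(rectangular(32))
hann_width, hann_side_db = lobe_stats(hann(32))
```

The rectangular window comes out with the narrower main lobe but the higher side lobes, and the Hann window with the wider main lobe but much lower side lobes.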
Short-Time Fourier Transform (STFT)
[Figure: STFT framing with 50% overlap. Source: the JOS SASP book]
Example: Pop Music
[Figure]
Example: Deep Note
[Figure]
Time-Frequency Resolutions in STFT
§ Trade-off between time resolution and frequency resolution, set by the window size
– Short window: low frequency resolution, high time resolution
– Long window: high frequency resolution, low time resolution
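The trade-off can be put into numbers. Assuming the window size equals the FFT size (no zero padding), the spacing between frequency bins is the sample rate divided by the window size, and the time span covered by one frame is the window size divided by the sample rate. The function name `stft_resolution`, the 16 kHz sample rate and the two window sizes below are hypothetical.

```python
def stft_resolution(sample_rate, window_size):
    """Return (frequency bin spacing in Hz, frame duration in ms),
    assuming the FFT size equals the window size."""
    freq_res_hz = sample_rate / window_size
    time_span_ms = 1000.0 * window_size / sample_rate
    return freq_res_hz, time_span_ms

sr = 16000                                   # hypothetical sample rate in Hz
short_window = stft_resolution(sr, 256)      # coarse in frequency, fine in time
long_window = stft_resolution(sr, 4096)      # fine in frequency, coarse in time
```

Multiplying the two numbers for any window size gives the same product, which is why improving one resolution necessarily costs the other.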