CS 591 S1 – Computational Audio
Wayne Snyder, Computer Science Department, Boston University
Today: Analyzing Rhythm
§ Analyzing rhythm: basic notions and motivations
§ Onset detection, beat tracking
§ Rhythm analysis
§ Tempo estimation
§ Time warping to account for variations in tempo
What is Rhythm?
Rhythm Analysis: Breakdown of Phases
1. Onset Detection
§ Where precisely do notes start?
2. Beat Tracking
§ Given an audio recording of a piece of music, determine the periodic sequence of beat positions
3. Tempo and Meter Estimation
§ Interpreting the periodicity in musical terms (BPM, meter)
4. Analyzing Style and Musical Effects
§ Variations in tempo (intentional and unintentional)
§ Musical effects: anticipations, rubato, fermata, playing “behind the beat” (or ahead), swing
Note: There is something of a “chicken or egg” problem with 2 and 3. We’ll fold these together for simplicity.
Rhythm Analysis: Introduction
Example: Queen – Another One Bites The Dust
[Figure; horizontal axis: time (seconds)]
Rhythm Analysis: Introduction
Further examples:
§ If I Had You (Benny Goodman)
§ Shakuhachi flute
§ Liszt: Sonetto No. 104 del Petrarca
Where is the beat? Can you tap your foot to it? What is the meter? How do we find the underlying regular beat that the composer and/or performer varies for expressive effect?
Rhythm Analysis: Introduction Even when rhythm is regular, there is a complicated semantic problem: rhythm is hierarchical, consisting of many interrelated groupings: Pulse level: Measure
Rhythm Analysis: Introduction Pulse level: Tactus (beat)
Rhythm Analysis: Introduction Example: Happy Birthday to you Pulse level: Tatum (fastest unit of division) Note: “Tatum” was named after Art Tatum, one of the greatest of all jazz pianists, who played a lot of fast notes!
Rhythm Analysis: Introduction In a sophisticated piece of music, these various levels are exploited by the composer in complicated ways. How should it be notated and described precisely? What is the time signature? Example: Bach, WTC, Fugue #1 in C Major
Rhythm Analysis: Introduction
Challenges in beat tracking
§ Hierarchical levels often unclear
§ Global/slow tempo changes (all musicians do this!)
§ Local/sudden tempo changes (e.g., rubato)
§ Vague information (e.g., soft onsets, false positives)
§ Sparse information: not all beats occur! (often only note onsets are used)
Tasks in Rhythm Analysis
§ Onset detection
§ Beat tracking
§ Tempo estimation
[Figure labels: phase, period]
Tempo := 60 / period, with the period in seconds and the tempo in beats per minute (BPM)
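The relation Tempo := 60 / period can be made concrete with a small sketch. The function names and the idea of generating beat positions as phase + k · period are illustrative assumptions, not part of the slides; the period and phase are taken to be in seconds.

```python
def tempo_bpm(period_seconds: float) -> float:
    """Convert a beat period in seconds to a tempo in beats per minute."""
    if period_seconds <= 0:
        raise ValueError("period must be positive")
    return 60.0 / period_seconds


def beat_times(phase_seconds: float, period_seconds: float, duration_seconds: float) -> list:
    """Beat positions phase + k * period that fall inside [0, duration)."""
    times = []
    k = 0
    while phase_seconds + k * period_seconds < duration_seconds:
        times.append(phase_seconds + k * period_seconds)
        k += 1
    return times


# Hypothetical example: a beat every half second, first beat at 0.25 s.
example_tempo = tempo_bpm(0.5)
example_beats = beat_times(0.25, 0.5, 2.0)
```

A period of 0.5 s corresponds to 120 BPM; the phase only shifts where the beats fall and does not affect the tempo.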
Onset Detection
§ Finding start times of perceptually relevant acoustic events in a music signal
§ Onset is the time position where a note is played
§ Onset typically goes along with a change of the signal’s properties:
– energy or loudness
– pitch or harmony
– timbre
[Bello et al., IEEE-TASLP 2005]
Onset Detection (Amplitude or Energy-Based)
Steps:
1. Amplitude squaring: every sample becomes non-negative, as in full-wave rectification, and the result is the signal power.
2. Windowing (taking the mean or max in each window): gives the “energy envelope.”
3. Difference function (using an appropriate distance function), also called differentiation: captures changes in signal energy, giving the “novelty curve.”
4. Half-wave rectification (negative samples => 0.0): note onsets are indicated by increases in energy only.
5. Peak picking: peak positions indicate note onset candidates.
[Figures: waveform, squared waveform, energy envelope, and novelty curve, each plotted against time (seconds)]
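The five energy-based steps can be sketched as follows. This is a minimal illustration using only the standard library, not the course's implementation: the non-overlapping mean windows, the simple local-maximum peak picker, the threshold of half the maximum, and the synthetic test signal are all assumptions chosen to keep the example small.

```python
import math


def energy_novelty(signal, window_size):
    """Steps 1-4: square, window (mean), difference, half-wave rectify."""
    # 1. Amplitude squaring: every sample becomes non-negative.
    squared = [x * x for x in signal]
    # 2. Windowing: the mean of each non-overlapping window is the energy envelope.
    envelope = [
        sum(squared[i:i + window_size]) / window_size
        for i in range(0, len(squared) - window_size + 1, window_size)
    ]
    # 3. Difference function: change in energy from one window to the next.
    diff = [envelope[i] - envelope[i - 1] for i in range(1, len(envelope))]
    # 4. Half-wave rectification: keep only increases in energy.
    return [max(d, 0.0) for d in diff]


def pick_peaks(novelty, threshold):
    """Step 5: indices of local maxima that exceed a threshold."""
    peaks = []
    for i, value in enumerate(novelty):
        left = novelty[i - 1] if i > 0 else 0.0
        right = novelty[i + 1] if i + 1 < len(novelty) else 0.0
        if value > threshold and value > left and value >= right:
            peaks.append(i)
    return peaks


# Synthetic signal (an assumption for illustration): two decaying 440 Hz
# bursts starting at 0.25 s and 0.75 s, sampled at 8000 Hz for one second.
SAMPLE_RATE = 8000
ONSETS = (0.25, 0.75)
signal = []
for n in range(SAMPLE_RATE):
    t = n / SAMPLE_RATE
    sample = 0.0
    for onset in ONSETS:
        if t >= onset:
            sample += math.exp(-20.0 * (t - onset)) * math.sin(2 * math.pi * 440 * (t - onset))
    signal.append(sample)

WINDOW = 200  # 25 ms windows
novelty = energy_novelty(signal, WINDOW)
peaks = pick_peaks(novelty, threshold=0.5 * max(novelty))
# novelty[i] compares window i+1 with window i, so the onset lies at the start of window i+1.
onset_times = [(i + 1) * WINDOW / SAMPLE_RATE for i in peaks]
```

On this percussive-style signal the two peaks of the novelty curve fall at the two burst onsets. The window size sets the time resolution: onsets can only be located to the nearest window boundary.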
Energy based methods work well for percussive instruments, including piano: Example: Bach Well-Tempered Clavier, Book 1, Fugue #1 in C major (Glenn Gould)
Onset Detection § Energy curves often only work for percussive music § Many instruments have weak note onsets: wind, strings, voice. – Example: Shakuhachi Flute § Biggest problem: pitch or timbre changes may not correlate with energy changes (e.g., a singer may change the pitch without changing loudness). § More refined methods needed that capture changes in spectrum [Bello et al., IEEE-TASLP 2005]
Onset Detection (Spectral-Based)
Steps:
1. Spectrogram: magnitude spectrogram $|X|$
§ Aspects concerning pitch, harmony, or timbre are captured by the spectrogram
§ Allows for detecting local energy changes in certain frequency ranges
2. Logarithmic compression: compressed spectrogram $Y = \log(1 + C \cdot |X|)$
§ Accounts for the human logarithmic sensation of sound intensity
§ Dynamic range compression
§ Enhancement of low-intensity values
§ Often leading to enhancement of the high-frequency spectrum
3. Differentiation: spectral difference
§ First-order temporal difference
§ Captures changes of the spectral content
§ Only positive intensity changes considered
4. Accumulation: spectral differences summarized by a single number per frame
§ Frame-wise accumulation of all positive intensity changes
§ Encodes changes of the spectral content
§ Result: novelty curve
[Figures: spectrograms with time (seconds) on the horizontal axis and frequency (Hz) on the vertical axis; novelty curve plotted against time t]
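The four spectral-based steps can be sketched in the same spirit. This is an illustrative sketch under stated assumptions, not the course's code: it uses a Hann window, a naive DFT from the standard library (far too slow for real audio, where an FFT routine would be used), a compression constant C = 10, and a synthetic tone that jumps from 128 Hz to 256 Hz at 0.5 s without changing amplitude, which is the case where energy-based detection struggles.

```python
import cmath
import math


def magnitude_spectrogram(signal, frame_size, hop):
    """Step 1: |X|, one list of magnitudes (bins 0..frame_size/2) per frame."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size) for n in range(frame_size)]
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_size)]
        # Naive DFT of the windowed frame, keeping non-negative frequencies only.
        frames.append([
            abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                    for n in range(frame_size)))
            for k in range(frame_size // 2 + 1)
        ])
    return frames


def spectral_novelty(spectrogram, c=10.0):
    """Steps 2-4: log compression, temporal difference, positive accumulation."""
    # 2. Logarithmic compression: Y = log(1 + C * |X|).
    compressed = [[math.log(1.0 + c * mag) for mag in frame] for frame in spectrogram]
    novelty = []
    for prev, curr in zip(compressed, compressed[1:]):
        # 3. First-order temporal difference per bin, positive changes only.
        # 4. Accumulation: sum over all frequency bins gives one number per frame.
        novelty.append(sum(max(b - a, 0.0) for a, b in zip(prev, curr)))
    return novelty


# Synthetic signal (an assumption): constant amplitude, pitch change at 0.5 s.
SAMPLE_RATE = 1024
FRAME, HOP = 64, 32
signal = [
    math.sin(2 * math.pi * (128 if n < 512 else 256) * n / SAMPLE_RATE)
    for n in range(SAMPLE_RATE)
]
novelty = spectral_novelty(magnitude_spectrogram(signal, FRAME, HOP))
peak = max(range(len(novelty)), key=novelty.__getitem__)
# novelty[m] compares frame m+1 with frame m; report the centre of frame m+1.
change_time = ((peak + 1) * HOP + FRAME / 2) / SAMPLE_RATE
```

The largest novelty value appears where the first frame containing the new pitch is compared with the last frame of the old pitch, even though the loudness of the signal never changes.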
Digression: Difference/Distance Metrics
One of the most important issues in analyzing data, especially multidimensional and/or time-series data, is understanding how similar two pieces of data are (each typically represented by a vector or multidimensional array). There are two principal methods for such comparisons:
Distance Metrics: Similar data vectors are regarded as closer in a geometrical sense; the range is [0 .. ∞), where distance = 0 means the vectors are identical. D(a, b) = “distance” between a and b.
Dependence Metrics: Similar data vectors exhibit dependence: they “move together” in similar ways; the range of the coefficients is [-1 .. 1], where -1 indicates inverse dependence, 0 no dependence, and 1 strong dependence.
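Distance Metrics
A Distance Metric obeys typical geometric laws (the metric axioms). For all a, b, c:
§ Non-negativity: $D(a, b) \ge 0$
§ Identity: $D(a, b) = 0$ if and only if $a = b$
§ Symmetry: $D(a, b) = D(b, a)$
§ Triangle inequality: $D(a, c) \le D(a, b) + D(b, c)$
A set with an associated Distance Metric is called a Metric Space.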
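Distance Metrics
A variety of metrics have been developed, in fields as diverse as game playing and pattern recognition. The most important of these are as follows, for vectors $a$ and $b$ of length $N$:
Sum of Absolute Difference (Manhattan Distance): $\sum_{i=1}^{N} |a_i - b_i|$
Sum of Squared Difference: $\sum_{i=1}^{N} (a_i - b_i)^2$
Mean Absolute Error: $\frac{1}{N} \sum_{i=1}^{N} |a_i - b_i|$
Mean Squared Error: $\frac{1}{N} \sum_{i=1}^{N} (a_i - b_i)^2$
Euclidean Distance: $\sqrt{\sum_{i=1}^{N} (a_i - b_i)^2}$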
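The five measures named above can be sketched in a few lines. The function names and the example vectors are illustrative assumptions, and the formulas are the standard textbook definitions (sum or mean of absolute or squared element-wise differences, and the square root of the sum of squares for Euclidean distance).

```python
import math


def _diffs(a, b):
    """Element-wise differences; both vectors must have the same length."""
    if len(a) != len(b) or not a:
        raise ValueError("vectors must be non-empty and of equal length")
    return [x - y for x, y in zip(a, b)]


def sum_abs_diff(a, b):
    """Sum of Absolute Difference (Manhattan distance)."""
    return sum(abs(d) for d in _diffs(a, b))


def sum_sq_diff(a, b):
    """Sum of Squared Difference."""
    return sum(d * d for d in _diffs(a, b))


def mean_abs_error(a, b):
    """Mean Absolute Error: Manhattan distance divided by the length."""
    return sum_abs_diff(a, b) / len(a)


def mean_sq_error(a, b):
    """Mean Squared Error: sum of squared differences divided by the length."""
    return sum_sq_diff(a, b) / len(a)


def euclidean(a, b):
    """Euclidean distance: square root of the sum of squared differences."""
    return math.sqrt(sum_sq_diff(a, b))


# Hypothetical example vectors; the element-wise differences are -3, -4, 0.
a = [1.0, 2.0, 3.0]
b = [4.0, 6.0, 3.0]
```

For these vectors the Manhattan distance is 7, the sum of squared differences is 25, and the Euclidean distance is 5; every measure is 0 when a vector is compared with itself.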
Distance Metrics
These measures extend our common understanding of the notion of distance to complex mathematical domains (such as vector spaces) and give us tools to understand how similar or dissimilar two objects are.
Dependence Metrics
Two common dependence metrics are as follows:
Correlation (Pearson’s Product-Moment Correlation Coefficient): $\rho(X, Y) = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2} \, \sqrt{\sum_{i=1}^{N} (y_i - \bar{y})^2}}$
Correlation measures the linear dependence of two vectors or random variables X and Y.
Cosine Similarity: $\cos(\theta) = \frac{\sum_{i=1}^{N} x_i y_i}{\sqrt{\sum_{i=1}^{N} x_i^2} \, \sqrt{\sum_{i=1}^{N} y_i^2}}$
Cosine similarity measures the cosine of the angle between two vectors of length N in N-dimensional space.
NOTE that these are similar calculations, except that correlation subtracts the mean from each point. For musical signals of any length, the mean will be very close to 0, and so these are effectively the same.
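The note about the mean can be checked with a short sketch. The function names and example vectors are illustrative assumptions; correlation is written here as cosine similarity applied to mean-subtracted vectors, which is the standard definition of Pearson's coefficient.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length vectors."""
    if len(x) != len(y) or not x:
        raise ValueError("vectors must be non-empty and of equal length")
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def pearson_correlation(x, y):
    """Pearson correlation: cosine similarity after subtracting each mean."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine_similarity([a - mean_x for a in x], [b - mean_y for b in y])


# Hypothetical vectors with non-zero means: same shape, different offset.
x = [1.0, 2.0, 3.0, 4.0]
y = [11.0, 12.0, 13.0, 14.0]
corr_offset = pearson_correlation(x, y)
cos_offset = cosine_similarity(x, y)

# Hypothetical zero-mean vectors that move in opposite directions.
x0 = [-1.5, -0.5, 0.5, 1.5]
y0 = [3.0, 1.0, -1.0, -3.0]
corr_zero_mean = pearson_correlation(x0, y0)
cos_zero_mean = cosine_similarity(x0, y0)
```

With the offset vectors the two measures disagree: correlation ignores the offset and reports perfect dependence, while cosine similarity falls below 1. With the zero-mean vectors, which is the situation the note describes for audio signals, the two values coincide.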