

SLIDE 1

Musical Source Separation: Principles and State of the Art

Juan José Burred

Équipe Analyse/Synthèse, IRCAM burred@ircam.fr

2nd International Workshop on Learning Semantics of Audio Signals (LSAS), Paris, 21st June 2008

SLIDE 2

2 Juan José Burred. Musical Source Separation.

Presentation overview

1. Introduction

  • Paradigms, tasks, applications
  • Mixing models

2. Solving the linear mixing model

  • Joint and staged separation

3. Estimation of the mixing matrix

  • The need for sparsity
  • Independent Component Analysis
  • Clustering methods, other methods

4. Estimation of the sources

  • Norm minimization
  • Time-frequency masking

5. Methods using advanced source models

  • Adaptive basis decomposition methods
  • Sinusoidal methods
  • Supervised methods

6. Conclusions

SLIDE 4

Sound Source Separation

  • “Cocktail party effect” (E. C. Cherry, 1953)
  • Ability to concentrate attention on a specific sound source from within a mixture.
  • Even when the interfering energy is close to that of the desired source.
  • “Prince Shotoku Challenge”
  • The legendary Japanese prince Shotoku (6th century AD) could reportedly listen to and understand the petitions of ten people simultaneously.
  • Concentrate attention on several sources at the same time!
  • “Prince Shotoku Computer” (Okuno et al., 1997)
  • Both allegories imply an extra step of semantic understanding of the sources, beyond mere acoustic isolation.

  • [Cherry53] E. C. Cherry. Some Experiments on the Recognition of Speech, with One and Two Ears. Journal of the Acoustical Society of America, Vol. 25, 1953.
  • [Okuno97] H. G. Okuno, T. Nakatani and T. Kawabata. Understanding Three Simultaneous Speeches. Proc. Int. Joint Conference on Artificial Intelligence (IJCAI), Nagoya, Japan, 1997.

SLIDE 5

The paradigms of Musical Source Separation

  • (based on [Scheirer00])
  • Understanding without separation
  • Multipitch estimation, music genre classification.
  • “Glass ceiling” of traditional methods (MFCC, GMM) [Aucouturier&Pachet04].
  • Separation for understanding
  • First (partially) separate, then extract features.
  • Source separation as a way to break the glass ceiling?
  • Separation without understanding
  • BSS: Blind Source Separation (ICA, ISA, NMF).
  • “Blind” means that only very general statistical assumptions are made.
  • Understanding for separation
  • Supervised source separation (based on a training database).

  • [Scheirer00] E. D. Scheirer. Music-Listening Systems. PhD thesis, Massachusetts Institute of Technology, 2000.
  • [Aucouturier&Pachet04] J.-J. Aucouturier and F. Pachet. Improving Timbre Similarity: How High is the Sky? Journal of Negative Results in Speech and Audio Sciences, 1 (1), 2004.

SLIDE 6

Required sound quality

  • Regarding the quality of the separated sounds, source separation tasks can be divided into:
  • Audio Quality Oriented (AQO)
  • Aimed at full unmixing at the highest possible quality.
  • Applications:
  • Unmixing, remixing, upmixing
  • Hearing aids
  • Post-production
  • Significance Oriented (SO)
  • Separation quality just good enough to facilitate semantic analysis of complex signals.
  • Less demanding, more realistic.
  • Applications:
  • Music Information Retrieval
  • Polyphonic transcription
  • Object-based audio coding

SLIDE 7

Musical Source Separation Tasks

  • Classification according to the nature of the mixtures:
  • Classification according to available a priori information:

(Classification tables from the slide not reproduced)

SLIDE 8

Linear mixing model

  • Only amplitude scaling before mixing (summing)
  • Linear stereo recording setups:

XY stereo, MS stereo, close miking, direct injection

SLIDE 9

Delayed mixing model

  • Amplitude scaling and delay before mixing
  • Delayed stereo recording setups:

AB stereo, mixed stereo, close miking with delay, direct injection with delay

SLIDE 10

Convolutive mixing model

  • Filtering between sources and sensors
  • Convolutive stereo recording setups:

Reverberant environment, binaural, close miking with reverb, direct injection with reverb

SLIDE 11

Some terminology

  • System of linear equations: X = AS, where X are the mixtures, A the mixing matrix and S the sources.
  • Usual algebraic setting from high school: X known, A known, S unknown.
  • But in source separation: unknown variables (S, the sources) AND unknown coefficients (A, the mixing matrix).
  • Algebra terminology is retained for source separation:
  • More equations (mixtures) than unknowns (sources): overdetermined
  • Same number of equations (mixtures) as unknowns (sources): determined (square A)
  • Fewer equations (mixtures) than unknowns (sources): underdetermined
  • The underdetermined case is the most demanding, but also the most important for music!
  • Music is (still) mostly in stereo, usually with more than 2 instruments.
  • Overdetermined and determined situations are mainly of interest for microphone or sensor arrays (localization, tracking).
  • Alternative interpretation of the linear model: a linear transform from signal space to mixture space, with A the transformation matrix and the columns of A the transformation bases.
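As a numerical illustration of this terminology, here is a minimal sketch of an underdetermined linear mixture (2 channels, 3 sources); all sizes and gain values are arbitrary choices, not from the slides:

```python
import numpy as np

# Linear mixing model X = A S with 2 mixture channels and 3 sources:
# an underdetermined system, as is typical for stereo music.
rng = np.random.default_rng(0)
n_samples = 1000
S = rng.laplace(size=(3, n_samples))   # 3 source signals (one per row)
A = np.array([[1.0, 0.5, 0.2],         # mixing matrix: column j holds the
              [0.2, 0.5, 1.0]])        # stereo gain pair of source j
X = A @ S                              # 2 x n_samples stereo mixture

# A is 2x3 (rectangular), so it has no inverse: the sources cannot be
# recovered by plain matrix inversion.
assert X.shape == (2, n_samples)
```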

SLIDE 13

Solving the linear model

  • Direct way to tackle the problem: Mean Square Error (MSE) minimization

    min over A, S of ‖X − AS‖²_F

  • ‖·‖_F is the Frobenius norm (“matrix energy”).
  • BUT: this has infinitely many solutions.
  • One must assume probability distributions for the involved variables.
  • Maximum A Posteriori (MAP) approach: maximize P(A, S | X).
  • Applying Bayes’ theorem, P(A, S | X) ∝ P(X | A, S) P(A) P(S), and
  • Assuming A has a uniform distribution (all source positions are equally likely), and
  • Assuming the sources are statistically independent, this finally yields

    min over A, S of (1/2σ²) ‖X − AS‖²_F − Σ_{i,t} log p(s_i(t))

  • σ² is the noise variance (if any) and log p(·) is the assumed log-density of the sources.

SLIDE 14

Staged separation

  • However, such a joint estimation of A and S is:
  • Extremely computationally demanding
  • Unstable with respect to convergence
  • Most methods thus follow a staged approach: first estimate the mixing matrix, then estimate the sources.
  • Note that, if A is square (determined source separation) and invertible (virtually always for usual mixtures), then the sources can be readily obtained by Ŝ = A⁻¹X (ˆ denotes estimation).
  • In that case, source separation amounts to mixing matrix estimation!
  • In the underdetermined case, A is rectangular and thus non-invertible: a second source estimation stage is needed!
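A minimal numerical sketch of the determined case, assuming A is square, known and noise-free (the 2x2 values below are arbitrary):

```python
import numpy as np

# Determined (square-A) separation: once A is known or estimated,
# the sources follow directly from S_hat = inv(A) @ X.
rng = np.random.default_rng(1)
S = rng.laplace(size=(2, 500))          # two source signals
A = np.array([[0.9, 0.3],
              [0.3, 0.9]])              # square, invertible mixing matrix
X = A @ S                               # determined stereo mixture
S_hat = np.linalg.inv(A) @ X            # exact recovery (no noise)
```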

SLIDE 16

Mixing matrix estimation

  • Simple examples can be visualized by means of scatter plots.
  • The coordinates of each data point are the values of a given signal coefficient (time sample, time-frequency bin) in each of the mixtures.
  • Data points tend to concentrate around the vectors defined by the columns of the mixing matrix: the mixing directions.
  • The goal of mixing matrix estimation is thus to find such vectors.

(Scatter plots: determined mixture, 2 channels, 2 sources; underdetermined mixture, 2 channels, 3 sources)

SLIDE 17

The need for sparsity

  • A signal is said to be sparse if most of its coefficients (in some domain) are zero or close to zero.
  • Sparse signals have a peaked probability distribution.
  • Example: Laplacian signals are sparser than Gaussian signals.
  • Geometrical perspective:
  • The sparser the signals, the more their coefficients concentrate around the mixing directions, and the easier the detection of those directions becomes.
  • Analytical perspective:
  • Remember the MAP problem: for a Laplace distribution, p(s) ∝ exp(−|s|), the source prior term −Σ log p(s_i(t)) becomes the L1-norm Σ |s_i(t)|: a penalty favouring sparsity.
  • Measures of sparsity:
  • L1-norm
  • Kurtosis
  • Negentropy
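The Laplacian-vs-Gaussian comparison can be checked numerically, using excess kurtosis as the sparsity measure (sample sizes and seeds below are arbitrary):

```python
import numpy as np

# Excess kurtosis as a sparsity measure: ~0 for Gaussian samples,
# ~3 for Laplacian samples (both normalized to unit variance).
rng = np.random.default_rng(2)
n = 100_000
gauss = rng.normal(size=n)
laplace = rng.laplace(scale=1 / np.sqrt(2), size=n)   # unit variance

def excess_kurtosis(x):
    """Fourth standardized moment minus 3 (0 for a Gaussian)."""
    x = (x - x.mean()) / x.std()
    return np.mean(x ** 4) - 3.0

k_g = excess_kurtosis(gauss)
k_l = excess_kurtosis(laplace)   # larger: more peaked, sparser
```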

SLIDE 18

How to increase sparsity

  • Time-frequency domain much sparser than time domain
  • Short Time Fourier Transform (STFT)
  • Logarithmic resolution front-ends
  • Constant-Q Transform (CQT)
  • Discrete Wavelet Transform (DWT)
  • Auditory resolution front-ends
  • Bark
  • ERB (Equivalent Rectangular Bandwidth)
  • Mel
  • Adaptive signal decompositions
  • Basis Pursuit
  • Matching Pursuit

(Figures: spectrogram (|STFT|) and ERB representation)
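A small numerical sketch of why the frequency domain is sparser for tonal signals: the 99%-energy support fraction used below is an illustrative measure of our own choosing, not one from the slides:

```python
import numpy as np

# A two-sinusoid signal is dense in the time domain but sparse in the
# frequency domain: far fewer coefficients hold 99% of its energy.
fs, n = 8000, 4096
t = np.arange(n) / fs
x = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 660 * t)

def energy_fraction_support(c, frac=0.99):
    """Fraction of coefficients needed to hold `frac` of the energy."""
    e = np.sort(np.abs(c) ** 2)[::-1]
    k = np.searchsorted(np.cumsum(e), frac * e.sum()) + 1
    return k / len(c)

time_support = energy_fraction_support(x)            # most samples needed
freq_support = energy_fraction_support(np.fft.rfft(x))  # only a few bins
```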

SLIDE 19

Independent Component Analysis (1)

  • ICA tries to find the mixing directions by aligning the coefficient clusters with the (orthogonal) scatter axes.
  • Note that Principal Component Analysis (PCA), which finds the directions of greatest variance, is not enough for the alignment.
  • However, PCA is used as a first step of ICA because, when followed by whitening (variance normalization), it makes the mixing directions orthogonal; ICA then reduces to finding the remaining rotation.
  • Also, note that this is only possible for determined mixtures → not very useful for music!
  • Axis alignment corresponds to the sources being statistically independent.

(Figures: scatter plots after PCA, whitening and ICA)

SLIDE 20

Independent Component Analysis (2)

  • ICA works by maximizing some objective measure of statistical independence between candidate sources.
  • Methods based on maximizing nongaussianity of the sources
  • FastICA based on kurtosis or negentropy
  • Methods based on minimizing mutual information between sources
  • Methods based on Maximum Likelihood (ML) estimation
  • Bell-Sejnowski (BS) algorithm
  • Natural gradient algorithm
  • FastICA based on ML
  • Tensorial methods (“decorrelate” higher-order statistics)
  • FOBI (Fourth-Order Blind Identification)
  • JADE (Joint Approximate Diagonalization of Eigenmatrices)
  • Sound examples (Hyvärinen, Karhunen and Oja): original sources, mixtures, separated sources.
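The PCA-whitening-rotation pipeline can be sketched with a minimal kurtosis-based FastICA on a determined 2x2 mixture. This is a toy rendering under simplifying assumptions (fixed iteration count, no convergence test), not a production implementation:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000
S = rng.laplace(size=(2, n))            # independent, super-Gaussian sources
A = np.array([[0.8, 0.4],
              [0.3, 0.9]])              # square mixing matrix
X = A @ S

# 1) PCA + whitening: decorrelate the mixtures and normalize variance.
d, E = np.linalg.eigh(np.cov(X))
Z = E @ np.diag(d ** -0.5) @ E.T @ X    # Cov(Z) ~= identity

# 2) Kurtosis-based fixed-point iterations with deflation: after whitening,
#    only a rotation remains to be found.
W = np.zeros((2, 2))
for i in range(2):
    w = rng.normal(size=2)
    for _ in range(100):
        w = (Z * (w @ Z) ** 3).mean(axis=1) - 3 * w   # FastICA update
        w -= (w @ W[:i].T) @ W[:i]    # decorrelate from rows already found
        w /= np.linalg.norm(w)
    W[i] = w

S_hat = W @ Z    # estimated sources, up to permutation, sign and scale
```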

SLIDE 21

Clustering methods

  • Explore the mixture space to find the clusters.
  • Allows underdetermined separation!
  • Direct inspection of the scatter plot: sparsity is crucial!
  • Example: kernel-based angular clustering [Bofill&Zibulevsky01]
  • A kind of smoothed histogram.
  • Also: methods based on k-means, fuzzy c-means clustering, ...

(Figures: mixture scatter with found directions; estimated density in polar form)

  • [Bofill&Zibulevsky01] P. Bofill and M. Zibulevsky. Underdetermined Blind Source Separation Using Sparse Representations. Signal Processing, Vol. 81, 2001.
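Angular clustering can be sketched as a smoothed histogram of scatter-point angles. The constants below (bin count, kernel width, energy threshold, peak-separation rule) are illustrative choices, not the ones of [Bofill&Zibulevsky01]:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50_000
true_angles = np.array([0.3, 0.8, 1.3])                    # mixing directions (rad)
A = np.vstack([np.cos(true_angles), np.sin(true_angles)])  # 2 x 3 mixing matrix

# Sparse sources: each coefficient is active only ~20% of the time.
S = rng.laplace(size=(3, n)) * (rng.random((3, n)) < 0.2)
X = A @ S

# Keep only high-energy points; fold directions into [0, pi).
r = np.linalg.norm(X, axis=0)
theta = np.arctan2(X[1], X[0])[r > np.percentile(r, 75)] % np.pi

# Smoothed ("kernel-like") angular histogram; its peaks are the directions.
hist, _ = np.histogram(theta, bins=360, range=(0, np.pi))
kernel = np.hanning(9)
smooth = np.convolve(hist, kernel / kernel.sum(), mode="same")

# Take the 3 strongest bins separated by at least ~10 degrees.
peaks = []
for i in np.argsort(smooth)[::-1]:
    if all(min(abs(i - j), 360 - abs(i - j)) > 20 for j in peaks):
        peaks.append(int(i))
    if len(peaks) == 3:
        break
est_angles = np.sort((np.array(peaks) + 0.5) * np.pi / 360)
```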

SLIDE 22

Other methods for mixing matrix estimation

  • Phase cancellation methods
  • ADRess (Azimuth Discrimination and Resynthesis) [Barry04]
  • Artificial stereo panning retains phase and only changes amplitude between channels → phase cancellation in the inter-channel difference spectrogram. (Fig. from [Barry04])
  • Methods from image processing applied to the scatter plots
  • Example: application of the Hough transform to detect the straight lines created by the direction clusters [Lin97]

  • [Barry04] D. Barry, B. Lawlor and E. Coyle. Sound Source Separation: Azimuth Discrimination and Resynthesis. Proc. Int. Conf. on Digital Audio Effects (DAFX), Naples, Italy, 2004.
  • [Lin97] J. K. Lin, D. G. Grier and J. D. Cowan. Feature Extraction Approach to Blind Source Separation. Proc. IEEE Workshop on Neural Networks for Signal Processing (NNSP), 1997.

SLIDE 24

Source estimation by norm minimization

  • In the underdetermined case, A is rectangular and thus non-invertible: a second source estimation stage is needed!
  • Norm minimization methods
  • Recall (again) the MAP minimization problem.
  • Assuming no noise, known A and Laplacian (sparse) sources, it simplifies to an L1-norm minimization problem:

    min over S of Σ_{i,t} |s_i(t)|  subject to  AS = X

  • A realization thereof is the shortest-path algorithm.
  • Sound examples for angular kernel clustering plus shortest-path estimation: original sources, mixtures, separated sources (independent melodies; musical performance).
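The shortest-path idea can be sketched for a 2x3 mixture: at each coefficient at most two sources are assumed active, each pair of mixing directions yields an exact candidate solution, and the candidate with the smallest L1 norm is kept. This is a simplified per-coefficient rendering, with an arbitrary mixing matrix:

```python
import numpy as np

A = np.array([[1.0, 0.6, 0.1],
              [0.1, 0.6, 1.0]])        # 2 x 3 mixing matrix (known)
pairs = [(0, 1), (0, 2), (1, 2)]
inv = {p: np.linalg.inv(A[:, list(p)]) for p in pairs}

def estimate_sources(x):
    """L1-minimal solution of A s = x with at most 2 nonzero sources."""
    best, best_l1 = None, np.inf
    for p in pairs:
        s2 = inv[p] @ x                # exact two-source candidate
        if np.abs(s2).sum() < best_l1:
            s = np.zeros(3)
            s[list(p)] = s2
            best, best_l1 = s, np.abs(s2).sum()
    return best

# Sparse test coefficient: only sources 0 and 2 active.
s_true = np.array([1.5, 0.0, -0.7])
s_hat = estimate_sources(A @ s_true)   # recovers the sparse solution
```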

SLIDE 25

Time-frequency masking (1)

  • Goal: find a mask M that retrieves one source when used to filter a given time-frequency representation: Ŝ_j = M_j ∘ X, where ∘ is the Hadamard (element-wise) product.
  • Adaptive Wiener filtering
  • Binary time-frequency masking
  • DUET (Degenerate Unmixing Estimation Technique) [Yilmaz&Rickard04]
  • Histogram of Interchannel Intensity (IID) and Phase (IPD) Differences.
  • Binary mask created by selecting bins around histogram peaks.
  • Drawback of t-f masking: “musical noise” or “burbling” artifacts.

(Figs. from [Vincent06] and [Yilmaz&Rickard04])

  • [Yilmaz&Rickard04] Ö. Yilmaz and S. Rickard. Blind Separation of Speech Mixtures via Time-Frequency Masking. IEEE Trans. on Signal Processing, Vol. 52(7), July 2004.
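Binary masking can be sketched on an idealized single-frame example. The mask below is hand-chosen by frequency rather than derived from DUET-style histograms, and the sinusoid frequencies are picked to fall on exact FFT bins:

```python
import numpy as np

# Single-channel mixture of two sinusoids, separated by a binary
# frequency-domain mask (the element-wise product plays the role of the
# Hadamard product in the mask equation).
fs, n = 8000, 2048
t = np.arange(n) / fs
s1 = np.sin(2 * np.pi * 500 * t)       # source 1: 500 Hz (exact bin)
s2 = np.sin(2 * np.pi * 2000 * t)      # source 2: 2000 Hz (exact bin)
x = s1 + s2

X = np.fft.rfft(x)
freqs = np.fft.rfftfreq(n, 1 / fs)
M = (freqs < 1000).astype(float)       # binary mask: bins below 1 kHz
s1_hat = np.fft.irfft(M * X, n)        # masked reconstruction of source 1

err = np.linalg.norm(s1_hat - s1) / np.linalg.norm(s1)
```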

SLIDE 26

Time-frequency masking (2)

  • Human-assisted time-frequency masking [Vinyes06]
  • Human-assisted selection of the time-frequency bins out of the DUET-like histogram for creating the unmixing mask.
  • Implementation as a VST plugin (“Audio Scanner”).

  • [Vinyes06] M. Vinyes, J. Bonada and A. Loscos. Demixing Commercial Music Productions via Human-Assisted Time-Frequency Masking. 120th AES Convention, Paris, France, 2006.

SLIDE 28

Methods using advanced source models

  • Until now: blind approaches (only general statistical assumptions).
  • The use of (sometimes music-specific) advanced source models makes it possible to improve separation quality and to handle highly underdetermined situations (e.g. separation from mono mixtures).

  • Classification according to a priori knowledge
  • Supervised
  • Based on training the model with a sound example database.
  • Better quality and more demanding situations at the cost of less generality.
  • Unsupervised
  • Classification according to model type
  • Adaptive basis decompositions (ISA, NMF, NSC)
  • Sinusoidal Modeling
  • Classification according to mixture type
  • Monaural systems
  • Hybrid systems combining advanced source models with spatial diversity

SLIDE 29

Independent Subspace Analysis

  • Application of ISA to audio: Casey and Westner, 2000.
  • Application of ICA to the spectrogram of a mono mixture.
  • Each independent component corresponds to an independent subspace of the spectrogram.
  • Component-to-source clustering
  • The extracted components usually do not directly correspond to the sources.
  • They must be clustered together according to some similarity criterion.
  • Casey & Westner use a matrix of Kullback-Leibler divergences called the ixegram.

(Fig. from [Casey&Westner00])

  • [Casey&Westner00] M. Casey and A. Westner. Separation of Mixed Audio Sources by Independent Subspace Analysis. Proc. Int. Computer Music Conference (ICMC), Berlin, Germany, 2000.

SLIDE 30

Nonnegative Matrix Factorization

  • Matrix factorization (X ≈ WH) imposing non-negativity on the factors.
  • Needed when using magnitude or power spectrograms.
  • NMF does not aim at statistical independence, but:
  • It has been proven that, under some conditions, non-negativity is sufficient for separation.
  • NMF yields components that correspond very closely to the sources.
  • To date, there is no exact theoretical explanation of why that is so!
  • Use for transcription:
  • P. Smaragdis and J. C. Brown. Non-Negative Matrix Factorization for Polyphonic Music Transcription. Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, USA, 2003.
  • Use for separation:
  • B. Wang and M. D. Plumbley. Musical Audio Stream Separation by Non-Negative Matrix Factorization. Proc. UK Digital Music Research Network (DMRN) Summer Conf., 2005.
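NMF can be sketched with the standard multiplicative updates for the Euclidean cost: a generic textbook variant, not the exact algorithms of the cited papers. The matrix sizes and iteration count are arbitrary:

```python
import numpy as np

# Factorize a nonnegative "spectrogram" V (freq x time) into W H with
# W, H >= 0, using Lee-Seung multiplicative updates (Euclidean cost).
rng = np.random.default_rng(6)
F, T, K = 40, 100, 2

# Synthetic rank-2 data: two spectra with sparse activations.
true_W = rng.random((F, K))
true_H = rng.random((K, T)) * (rng.random((K, T)) < 0.5)
V = true_W @ true_H + 1e-6

W = rng.random((F, K)) + 0.1
H = rng.random((K, T)) + 0.1
eps = 1e-12                             # guards against division by zero
for _ in range(200):
    H *= (W.T @ V) / (W.T @ W @ H + eps)   # multiplicative H update
    W *= (V @ H.T) / (W @ H @ H.T + eps)   # multiplicative W update

rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

The multiplicative form keeps both factors nonnegative automatically, since each update only rescales existing nonnegative entries.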

SLIDE 31

Nonnegative Sparse Coding

  • Combination of non-negativity and sparsity constraints in the factorization.
  • [Virtanen03]: NSC optimized with an additional criterion of temporal continuity.
  • Measured by the absolute value of the overall amplitude difference between consecutive frames.
  • [Virtanen04]: Convolutive Sparse Coding
  • Improved temporal accuracy by modeling the sources as the convolution of spectrograms with a vector of onsets.

(Sound examples for each method: mixture, component 1, component 2)

  • [Virtanen03] T. Virtanen. Sound Source Separation Using Sparse Coding with Temporal Continuity Objective. Proc. Int. Computer Music Conference (ICMC), Singapore, 2003.
  • [Virtanen04] T. Virtanen. Separation of Sound Sources by Convolutive Sparse Coding. Proc. ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing (SAPA), Jeju, Korea, 2004.

SLIDE 32

Sinusoidal Methods

  • Sinusoidal modeling: detection and tracking of the sinusoidal partial peaks on the spectrogram.
  • Based on Auditory Scene Analysis (ASA) cues of good continuation, common fate and smoothness of sinusoidal tracks.
  • Overall, very good reduction of interfering sources, but moderate timbral quality.
  • Appropriate for Significance-Oriented applications.
  • [Virtanen&Klapuri02]: model of spectral smoothness of harmonic sounds
  • Based on basis decomposition of harmonic structures.
  • Additive resynthesis of partial parameters.
  • [Every&Szymanski06]
  • Spectral subtraction instead of additive resynthesis. (Fig. from [Every&Szymanski06])

(Sound examples: mixture, separated sources)

  • [Virtanen&Klapuri02] T. Virtanen and A. Klapuri. Separation of Harmonic Sounds Using Linear Models for the Overtone Series. Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), Orlando, USA, 2002.
  • [Every&Szymanski06] M. R. Every and J. E. Szymanski. Separation of Synchronous Pitched Notes by Spectral Filtering of Harmonics. IEEE Trans. on Audio, Speech and Signal Processing, Vol. 14(5), 2006.
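The first step of sinusoidal modeling, partial-peak detection on a magnitude spectrum, can be sketched for a single analysis frame. The harmonic tone, window and 10%-of-maximum threshold are illustrative choices:

```python
import numpy as np

# Detect sinusoidal partial peaks in one windowed spectrum frame.
fs, n = 8000, 4096
t = np.arange(n) / fs
partials = [220.0, 440.0, 660.0]        # simple harmonic tone (3 partials)
x = sum(np.sin(2 * np.pi * f * t) / (i + 1) for i, f in enumerate(partials))

mag = np.abs(np.fft.rfft(x * np.hanning(n)))   # Hann window limits leakage
freqs = np.fft.rfftfreq(n, 1 / fs)

# Peak picking: local maxima above a fraction of the strongest peak.
thresh = 0.1 * mag.max()
peaks = [i for i in range(1, len(mag) - 1)
         if mag[i] > mag[i - 1] and mag[i] >= mag[i + 1] and mag[i] > thresh]
est = freqs[peaks]    # estimated partial frequencies for this frame
```

A full sinusoidal model would then link such per-frame peaks into tracks using the continuity cues listed above.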

SLIDE 33

Supervised Methods (1)

  • Use of a training database to create a set of source models, each one modeling a specific instrument.
  • Better separation as a trade-off against generality.
  • Supervised sinusoidal methods
  • [Burred&Sikora07]
  • The source models are compact descriptions of the spectral envelope and its temporal evolution.
  • The detailed temporal evolution makes it possible to drop harmonicity constraints, so separation of chords and inharmonic sounds is possible.

(Sound examples: separation of chords; inharmonic separation)

  • [Burred&Sikora07] J. J. Burred and T. Sikora. Monaural Source Separation from Musical Mixtures Based on Time-Frequency Timbre Models. Proc. Int. Conf. on Music Information Retrieval (ISMIR), Vienna, Austria, September 2007.

SLIDE 34

Supervised Methods (2)

  • Bayesian networks [Vincent06]
  • Multilayered model describing note probabilities (state layer), spectral decomposition (source layer) and spatial information (mixture layer).
  • Trained on a database of isolated notes.
  • Allows separation of sounds with reverb.
  • Learnt priors for Wiener-based separation [Ozerov05]
  • Single-channel.
  • HMM models of singing voice and accompaniment.

(Sound examples: mixtures and separated sources)

  • [Vincent06] E. Vincent. Musical Source Separation Using Time-Frequency Source Priors. IEEE Trans. on Audio, Speech and Language Processing, Vol. 14 (1), 2006.
  • [Ozerov05] A. Ozerov, O. Philippe, R. Gribonval and F. Bimbot. One Microphone Singing Voice Separation Using Source-Adapted Models. Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, USA, 2005.

SLIDE 35

Conclusions

  • Still far from a fully general, audio-quality-oriented system.
  • More realistic: significance-oriented
  • Separation good enough to facilitate content analysis.
  • Methods based on adaptive models and time-frequency masking:
  • More realistic mixtures, but more artifacts and interferences.
  • Methods based on sinusoidal modeling:
  • More artificial timbre, but fewer interferences.
  • Current polyphony limitations:
  • Mono signals: up to 3-4 instruments
  • Stereo signals: up to 5-6 instruments

SLIDE 36

Literature

  • Very few overview materials on Musical Source Separation:
  • P. D. O'Grady, B. A. Pearlmutter and S. T. Rickard. Survey of Sparse and Non-Sparse Methods in Source Separation. International Journal of Imaging Systems and Technology, 15(1), 2005.
  • E. Vincent, M. G. Jafari, S. A. Abdallah, M. D. Plumbley and M. E. Davies. Model-Based Audio Source Separation. Technical Report C4DM-TR-05-01, Queen Mary University of London, UK, 2006.
  • T. Virtanen. Unsupervised Learning Methods for Source Separation in Monaural Music Signals. In A. Klapuri and M. Davy (Eds.), Signal Processing Methods for Music Transcription, Springer, 2006.
  • Stereo Audio Source Separation Evaluation Campaign: http://sassec.gforge.inria.fr