SVM based Classification of Instruments - Timbre Analysis

  1. Department of Statistics SVM based Classification of Instruments - Timbre Analysis Uwe Ligges and Sebastian Krey Department of Statistics, TU Dortmund Reisensburg, Statistical Computing 2009

  2. Talk outline: Introduction, Model Building, Timbre, Features/Classification, Results, Summary. Introduction: Why timbre analysis? Why classification of voices or instruments? Timbre generation; objective criteria for the assessment of the quality of vocal performance; support for singing teachers and students who try to improve voices; deriving properties related to performance quality from aspects of single tones, such as solidity, softness, or brilliance.

  3. Introduction: Music recommender systems: the quest for better features; widely used on server infrastructure that supports music listeners who download music from the web to their home computers and even their mobile devices. Speech recognition, where timbre should not make a difference. Hearing aids (and other audio compression tasks): 'Vowel Classification by a Perceptually Motivated Neurophysiologically Parameterized Auditory Model' (Szepannek et al., 2006); perception analysis.

  4. Introduction: Automatic transcription of (polyphonic) music is of interest for music publishers, music amateurs, and scientists (particularly those working in music psychology). Parts of transcription algorithms are heavily used in music recommender systems.

  5. Pitch estimation: Several methods for pitch estimation ($f_0$ tracking, ...) have been proposed: in the time domain (such as a model that follows shortly); in the frequency domain (such as our heuristic proposal); hybrid methods; and any combinations with, e.g., HMMs. None of them works really well on singing data, and none of them works on polyphonic data.

  6. Pitch estimation model (monophonic):
     $$y_t = \cos\left[2\pi t f_0 + \phi\right] + \epsilon_t$$
     where $f_0$ = fundamental frequency, the parameter of interest; $\epsilon_t$ = error; $t \in \left\{\frac{0}{S}, \frac{1}{S}, \ldots, \frac{T-1}{S}\right\}$ = time; $T$ = no. of observations; $\phi$ = phase displacement.
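To make the single-cosine model concrete, the following minimal sketch simulates $y_t$ and recovers $f_0$ from the largest periodogram peak. This is one common frequency-domain estimator and is not claimed to be the authors' method; the function names, the sampling rate, the noise level, and the tone at 440 Hz are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_tone(f0, phi, S, T, noise_sd, rng):
    """Draw y_t = cos(2*pi*t*f0 + phi) + eps_t at t = 0/S, 1/S, ..., (T-1)/S."""
    t = np.arange(T) / S
    eps = rng.normal(0.0, noise_sd, size=T)
    return np.cos(2 * np.pi * t * f0 + phi) + eps

def estimate_f0_periodogram(y, S):
    """Return the frequency (Hz) of the largest periodogram peak, skipping the DC bin."""
    spectrum = np.abs(np.fft.rfft(y)) ** 2
    freqs = np.fft.rfftfreq(len(y), d=1.0 / S)
    peak = 1 + np.argmax(spectrum[1:])
    return freqs[peak]

# Hypothetical settings: one second of signal sampled at 8 kHz.
rng = np.random.default_rng(0)
S, T = 8000, 8000
y = simulate_tone(f0=440.0, phi=0.3, S=S, T=T, noise_sd=0.1, rng=rng)
f0_hat = estimate_f0_periodogram(y, S)
```

The estimate is limited to the FFT grid (here 1 Hz spacing), so a finer search around the peak would be needed for frequencies that fall between bins.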

  7. Pitch estimation model (monophonic):
     $$y_t = \sum_{h=1}^{H} \cos\left[2\pi t f_0 h + \phi_h\right] + \epsilon_t$$
     where $f_0$ = fundamental frequency, the parameter of interest; $\epsilon_t$ = error; $t \in \left\{\frac{0}{S}, \frac{1}{S}, \ldots, \frac{T-1}{S}\right\}$ = time; $T$ = no. of observations; $\phi_h$ = phase displacement of the $h$-th partial; $H$ = no. of partials in the model.

  8. Pitch estimation model (monophonic):
     $$y_t = \sum_{h=1}^{H} B_h \cos\left[2\pi t f_0 h + \phi_h\right] + \epsilon_t$$
     where $f_0$ = fundamental frequency, the parameter of interest; $\epsilon_t$ = error; $t \in \left\{\frac{0}{S}, \frac{1}{S}, \ldots, \frac{T-1}{S}\right\}$ = time; $T$ = no. of observations; $\phi_h$ = phase displacement of the $h$-th partial; $H$ = no. of partials in the model; $B_h$ = amplitude of the $h$-th partial.

  9. Pitch estimation model (monophonic):
     $$y_t = \sum_{h=1}^{H} B_h \cos\left[2\pi t f_0 (h + \delta_h) + \phi_h\right] + \epsilon_t$$
     where $f_0$ = fundamental frequency, the parameter of interest; $\epsilon_t$ = error; $t \in \left\{\frac{0}{S}, \frac{1}{S}, \ldots, \frac{T-1}{S}\right\}$ = time; $T$ = no. of observations; $\phi_h$ = phase displacement of the $h$-th partial; $H$ = no. of partials in the model; $B_h$ = amplitude of the $h$-th partial; $\delta_h$ = frequency displacement of the $h$-th partial, where $\delta_1 := 0$.

  10. Pitch estimation model (monophonic):
      $$y_t = \sum_{h=1}^{H} \sum_{i=0}^{I} \Phi_i(t)\, B_{h,i} \cos\left[2\pi t f_0 (h + \delta_h) + \phi_h\right] + \epsilon_t$$
      where $B_{h,i}$ = amplitude of the $h$-th partial for the $i$-th basis function; $i$ = index of the $I + 1$ basis functions;
      $$\Phi_i(t) := \cos^2\left[\pi\, \frac{tS - i\Delta}{2\Delta}\right] \mathbb{1}_{[(i-1)\Delta,\,(i+1)\Delta]}(t)$$
      is the $i$-th basis function, defined on windows with 50% overlap; $\Delta := \frac{T-1}{I}$; $\mathbb{1}$ = indicator function; $S$ = sampling rate.
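The basis functions on this slide can be evaluated directly. The sketch below assumes the indicator is applied to the sample index $n = tS \in \{0, \ldots, T-1\}$ (an interpretation, since the slide writes the indicator in $t$), and the function name and the values of $T$ and $I$ are hypothetical. It also checks numerically that the 50%-overlapping $\cos^2$ windows add up to one at every sample, because neighbouring windows contribute $\cos^2$ and $\sin^2$ of the same argument.

```python
import numpy as np

def basis_functions(T, I):
    """Return an (I+1, T) array; row i holds Phi_i at sample indices n = t*S = 0..T-1."""
    delta = (T - 1) / I                   # window half-width Delta, in samples
    n = np.arange(T)                      # sample index, i.e. t * S
    phi = np.zeros((I + 1, T))
    for i in range(I + 1):
        # Indicator of [(i-1)*Delta, (i+1)*Delta], evaluated on the sample index.
        inside = (n >= (i - 1) * delta) & (n <= (i + 1) * delta)
        phi[i, inside] = np.cos(np.pi * (n[inside] - i * delta) / (2 * delta)) ** 2
    return phi

# Hypothetical sizes: 1001 samples, I = 10, hence Delta = 100 samples.
phi = basis_functions(T=1001, I=10)
window_sum = phi.sum(axis=0)              # equals 1 at every sample
```

Because the windows sum to one, constant amplitudes $B_{h,i} = B_h$ reduce this model to the previous one, while varying $B_{h,i}$ over $i$ lets the amplitude of each partial change smoothly over time.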

  11. Pitch estimation model (monophonic):
      $$y_t = \sum_{h=1}^{H} \sum_{i=0}^{I} \Phi_i(t)\, B_{h,i} \cos\left[2\pi t f_0 (h + \delta_h) + \phi_h + (h + \delta_h) A_v \sin(2\pi f_v t + \phi_v)\right] + \epsilon_t$$
      where $B_{h,i}$ = amplitude of the $h$-th partial for the $i$-th basis function; $i$ = index of the $I + 1$ basis functions;
      $$\Phi_i(t) := \cos^2\left[\pi\, \frac{tS - i\Delta}{2\Delta}\right] \mathbb{1}_{[(i-1)\Delta,\,(i+1)\Delta]}(t)$$
      is the $i$-th basis function, defined on windows with 50% overlap; $\Delta := \frac{T-1}{I}$; $\mathbb{1}$ = indicator function; $S$ = sampling rate; $f_v$ = frequency of vibrato; $A_v$ = amplitude of vibrato; $\phi_v$ = phase displacement of vibrato.

  12. Pitch estimation model (monophonic):
      $$y_t = \sum_{h=1}^{H} \sum_{i=0}^{I} \Phi_i(t)\, B_{h,i} \cos\left[2\pi t f_0 (h + \delta_h) + \phi_h + (h + \delta_h) A_v \sin(2\pi f_v t + \phi_v)\right] + \epsilon_t$$
      where $B_{h,i}$ = amplitude of the $h$-th partial for the $i$-th basis function; $i$ = index of the $I + 1$ basis functions;
      $$\Phi_i(t) := \cos^2\left[\pi\, \frac{tS - i\Delta}{2\Delta}\right] \mathbb{1}_{[(i-1)\Delta,\,(i+1)\Delta]}(t)$$
      is the $i$-th basis function, defined on windows with 50% overlap; $\Delta := \frac{T-1}{I}$; $\mathbb{1}$ = indicator function; $S$ = sampling rate; $f_v$ = frequency of vibrato; $A_v$ = amplitude of vibrato; $\phi_v$ = phase displacement of vibrato.
      $5 + 3H$ parameters to estimate, but $H$ might be $> 10$.
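As a rough illustration of how the terms of the full monophonic model fit together, the sketch below synthesizes the noise-free part of $y_t$ for given parameters. It only evaluates the formula and is not an estimation procedure; the function name, the array layout of `B`, `delta`, and `phi`, and every parameter value are assumptions made for the example, and the indicator in $\Phi_i$ is again applied to the sample index $tS$.

```python
import numpy as np

def synthesize_tone(f0, B, delta, phi, A_v, f_v, phi_v, S, T):
    """Noise-free part of the monophonic model at t = 0/S, ..., (T-1)/S.

    B: (H, I+1) amplitudes B_{h,i}; delta, phi: length-H arrays (delta[0] = 0).
    """
    B = np.asarray(B, dtype=float)
    H, n_basis = B.shape
    I = n_basis - 1                      # requires I >= 1
    n = np.arange(T)                     # sample index, i.e. t * S
    t = n / S
    width = (T - 1) / I                  # Delta on the slide

    # Basis functions Phi_i: cos^2 windows with 50% overlap.
    Phi = np.zeros((I + 1, T))
    for i in range(I + 1):
        inside = (n >= (i - 1) * width) & (n <= (i + 1) * width)
        Phi[i, inside] = np.cos(np.pi * (n[inside] - i * width) / (2 * width)) ** 2

    vibrato = A_v * np.sin(2 * np.pi * f_v * t + phi_v)
    y = np.zeros(T)
    for h in range(1, H + 1):
        k = h + delta[h - 1]             # (h + delta_h)
        envelope = B[h - 1] @ Phi        # sum_i Phi_i(t) * B_{h,i}
        y += envelope * np.cos(2 * np.pi * t * f0 * k + phi[h - 1] + k * vibrato)
    return y

# Hypothetical parameter values, chosen only for illustration.
S, T = 8000, 4001
B = np.outer([1.0, 0.5, 0.25], np.ones(5))   # H = 3 partials, I = 4, constant amplitudes
y = synthesize_tone(f0=220.0, B=B, delta=np.array([0.0, 0.01, 0.02]),
                    phi=np.zeros(3), A_v=0.5, f_v=5.0, phi_v=0.0, S=S, T=T)
```

Note that the vibrato term is scaled by $(h + \delta_h)$, so higher partials swing proportionally further in frequency, as they would for a real vibrato.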

  13. Pitch estimation model (POLYphonic):
      $$y_t = \sum_{j=1}^{J} \sum_{h=1}^{H} \sum_{i=0}^{I} \Phi_{i,j}(t)\, B_{h,i,j} \cos\left[2\pi t f_{0,j} (h_j + \delta_{h,j}) + \phi_{h,j} + (h_j + \delta_{h,j}) A_{v,j} \sin(2\pi f_{v,j} t + \phi_{v,j})\right] + \epsilon_t$$
      where $J$ = number of polyphonic tones. Identifiability?! Joint work in progress (?) with Katrin Sommer, Claus Weihs; cooperation with Technical University of Tampere.

  14. The Timbre Problem: Timbre classification (joint work with Sebastian Krey). Specific task: classification of instruments based on a given audio track of one tone. Data: McGill Instrument Database, 38 instruments played in 59 ways (e.g. bowed vs. pizz.), each with 6-88 differently pitched tones, altogether 1976 wave files (44100 Hertz, 16 bit, 3-5 seconds each).

  15. Let's start: Pre-emphasis filtering to increase higher partials:
      $$y_t = x_t - 0.97\, x_{t-1}$$
      Short Time Fourier Transformation (on overlapping windows):
      $$F(t, k) = \sum_{j=1-M}^{N-M} w(j - t)\, x_j \exp\left(\frac{-2 i \pi j k}{N}\right)$$
      Hamming windows (width: 25ms, overlap: 10ms):
      $$w(t) = \begin{cases} 0.54 - 0.46 \cos\left(\frac{2\pi t}{T}\right), & -\frac{T}{2} \le t \le \frac{T}{2} \\ 0, & \text{otherwise} \end{cases}$$
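The two preprocessing steps on this slide can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: the function names are made up, the slide's "overlap: 10ms" is taken literally (so consecutive 25 ms frames start 15 ms apart), and `np.hamming` is used for the window, which indexes the window from 0 to M-1 and peaks at the frame centre. The 16 kHz test tone is likewise an assumption; the data on the slides are sampled at 44100 Hz.

```python
import numpy as np

def pre_emphasis(x, alpha=0.97):
    """y_t = x_t - alpha * x_{t-1}; the first sample is passed through unchanged."""
    x = np.asarray(x, dtype=float)
    return np.append(x[0], x[1:] - alpha * x[:-1])

def stft_hamming(x, S, width_s=0.025, overlap_s=0.010):
    """Magnitude spectra of Hamming-windowed frames, shape (n_frames, n_bins)."""
    width = int(round(width_s * S))               # frame length in samples
    hop = width - int(round(overlap_s * S))       # frame advance in samples
    window = np.hamming(width)
    n_frames = 1 + (len(x) - width) // hop
    frames = np.stack([x[i * hop : i * hop + width] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

# Hypothetical input: one second of a 1 kHz sine sampled at 16 kHz.
S = 16000
x = np.sin(2 * np.pi * 1000.0 * np.arange(S) / S)
spec = stft_hamming(pre_emphasis(x), S)
```

If the 10 ms on the slide is meant as the step between frames rather than the overlap, passing `overlap_s=0.015` gives that variant.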

  16. Let's start: Mel scale: FFT frequencies are transformed to the Mel scale in order to model the perceptual sense of the human ear (for example, the ear's frequency resolution differs below and above about 1 kHz):
      $$\mathrm{Mel}(hz) = 2595 \log_{10}\left(1 + \frac{hz}{700}\right)$$
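The Mel formula above translates directly into code. The sketch below is a plain transcription of it, plus the algebraic inverse obtained by solving the same equation for the frequency; the function names are hypothetical.

```python
import math

def hz_to_mel(hz):
    """Mel(hz) = 2595 * log10(1 + hz / 700), as on the slide."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel, obtained by solving the formula for hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

mel_1k = hz_to_mel(1000.0)   # close to 1000 Mel
mel_8k = hz_to_mel(8000.0)   # well below 8000: the scale compresses high frequencies
```

With these constants 1000 Hz maps to approximately 1000 Mel, and frequencies above that are increasingly compressed.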
