M.Sc. Thesis Presentation
Bird Call Recognition with Artificial Neural Networks, Support Vector Machines, and Kernel Density Estimation
Derek Ross
Advisors: Dr. H. Card, Electrical and Computer Eng.; Dr. D. McNeill, Electrical and Computer Eng.
Committee: Dr. P. Yahampath, Electrical and Computer Eng.; Dr. J. Anderson, Computer Science
Department of Electrical and Computer Engineering
University of Manitoba, Winnipeg, Canada
March 20, 2006

Outline
• Goal.
• Why bird calls?
• Inspiration.
• Pre-processing.
• Classifiers.
• Post-processing.
• Results, analysis, conclusion.

Goal
• Take a recording of a bird call and automatically determine which of ten species it belongs to.
[Figure: a recorded call is labeled “ROBIN”.]

Applications
• Aviation industry, bird strikes.
• Electrical distribution.
• Wind turbines.
• Night-flight monitoring.
• Entertainment – “birding.”

Previous Efforts
• Not an active field.
• Evans (2005), heuristic system.
• Härmä and Somervuo (2004), sounds classified by harmonics.
• McIlraith and Card (1997), ANNs, statistical methods.

Inspiration
• Musical instrument recognition research.
• Tonal qualities:
  – Timbre;
  – Purity;
  – Dissonance;
  – Harshness.
• Can a bird be recognized by tonal qualities?

Bird Species
1. ALFL alder flycatcher
2. AMCR American crow
3. AMGO American goldfinch
4. AMRE American redstart
5. AMRO American robin
6. BAOR Baltimore oriole
7. BCCH black-capped chickadee
8. BCTI black-crested titmouse
9. BDOW barred owl
10. BLJA blue jay

Bird Species

Data Sets
• 900 recordings, 10 species.

Implementation

Overall Recognition Process

Pre-processing
• Audio data separated into frames of 512 samples each.
• Spectral parameters extracted.
• Cepstral parameters extracted.
• Derivatives of spectral and cepstral features.
• Short-term (1.5 s) amplitude envelope frequency.
• 20 features.

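The framing step and one spectral parameter can be sketched as follows. This is a minimal illustration, not the thesis code: the slide does not say whether frames overlap (none is assumed here), the spectral centroid merely stands in for the unnamed spectral parameters, and the function names and the 44.1 kHz sample rate in the usage note are hypothetical.

```python
import cmath
import math

FRAME_SIZE = 512  # samples per frame, as stated on the slide


def split_into_frames(samples, frame_size=FRAME_SIZE):
    """Cut a signal into consecutive frames.

    Assumption: frames do not overlap, and a trailing partial frame is dropped.
    """
    n_frames = len(samples) // frame_size
    return [samples[i * frame_size:(i + 1) * frame_size] for i in range(n_frames)]


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (a real system would use an FFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of one frame, in Hz."""
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0:
        return 0.0  # silent frame: no meaningful centroid
    bin_hz = sample_rate / len(frame)
    return sum(k * bin_hz * m for k, m in enumerate(mags)) / total
```

For a recording held in a list `samples`, `[spectral_centroid(f, 44100) for f in split_into_frames(samples)]` would give one value per frame; the other features would be computed per frame in the same way.
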
Pattern Recognition Techniques
• Artificial neural networks.
• Support vector machines.
• Kernel density estimation.

Artificial Neural Network
• Coded with GNU C++.
• Plain back-propagation algorithm (delta rule).
• Variable learning rate, decreased exponentially.
• Logistic neurons for hidden layer.
• Linear neurons for output layer.
• Training set is shuffled between epochs.

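The ingredients listed above (logistic hidden layer, linear output layer, plain back-propagation, exponentially decaying learning rate, shuffling between epochs) can be put together in a short sketch. The original was written in C++; this Python version is only an illustration. The squared-error loss, the single hidden layer, the exact decay formula, and all function and parameter names are assumptions.

```python
import math
import random


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Return (hidden activations, outputs) for one input vector."""
    hidden = [logistic(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    outputs = [sum(w * h for w, h in zip(row, hidden)) + b
               for row, b in zip(w_out, b_out)]
    return hidden, outputs


def train_step(x, target, w_hidden, b_hidden, w_out, b_out, rate):
    """One back-propagation update (squared error assumed), in place."""
    hidden, outputs = forward(x, w_hidden, b_hidden, w_out, b_out)
    # Linear output neurons: the error signal is just the output error.
    delta_out = [y - t for y, t in zip(outputs, target)]
    # Logistic hidden neurons: back-propagated error times h * (1 - h).
    delta_hidden = [h * (1.0 - h) * sum(d * w_out[k][j] for k, d in enumerate(delta_out))
                    for j, h in enumerate(hidden)]
    for k, d in enumerate(delta_out):
        for j, h in enumerate(hidden):
            w_out[k][j] -= rate * d * h
        b_out[k] -= rate * d
    for j, d in enumerate(delta_hidden):
        for i, xi in enumerate(x):
            w_hidden[j][i] -= rate * d * xi
        b_hidden[j] -= rate * d


def learning_rate(epoch, initial_rate, decay):
    """Exponential decay; the exact schedule used in the thesis is not given."""
    return initial_rate * math.exp(-decay * epoch)


def train(examples, weights, epochs, initial_rate, decay, seed=0):
    """weights is the tuple (w_hidden, b_hidden, w_out, b_out)."""
    rng = random.Random(seed)
    examples = list(examples)
    for epoch in range(epochs):
        rng.shuffle(examples)  # training set is shuffled between epochs
        rate = learning_rate(epoch, initial_rate, decay)
        for x, target in examples:
            train_step(x, target, *weights, rate)
```
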
Artificial Neural Network

Artificial Neural Network
• Training parameters:

Support Vector Machine
• Used LIBSVM library (based on SMO).
• C-SVC: C-support vector classification.
• Kernel is radial basis function, K(x, y) = exp(−γ‖x − y‖²).
• Probability estimates enabled for ROC.
• Internal cross-validation: four-fold.
• Termination threshold: ε = 0.0001.

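For reference, the radial basis function kernel as LIBSVM defines it, exp(−γ‖x − y‖²), is easy to evaluate by hand. The sketch below is plain Python and does not call LIBSVM; the function name is hypothetical, and the value of γ chosen in the thesis is not given on this slide.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel value exp(-gamma * ||x - y||^2) for two feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)
```

Identical vectors give a kernel value of 1, and the value falls toward 0 as the vectors move apart; a larger γ makes that fall-off sharper.
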
Support Vector Machine
• Grid search:

Support Vector Machine
• Model parameters:

Kernel Density Estimation
• Coded with GNU C++.
• Kernel is radial basis function (multivariate normal).
• “Bandwidth” selected with the normal reference rule:

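A KDE classifier of this kind can be sketched as below. The bandwidth formula shown on the slide was not recovered, so the code uses one common form of the multivariate normal reference rule, h_j = σ_j · (4 / ((d + 2) n))^(1/(d + 4)), as an assumption. Equal class priors, a product of one-dimensional normal kernels, and all names are likewise assumptions rather than the thesis implementation (which was in C++).

```python
import math
import statistics


def normal_reference_bandwidths(points):
    """Per-dimension bandwidths from a normal reference rule (assumed form)."""
    n, d = len(points), len(points[0])
    factor = (4.0 / ((d + 2) * n)) ** (1.0 / (d + 4))
    return [factor * statistics.stdev([p[j] for p in points]) for j in range(d)]


def density(x, points, bandwidths):
    """Kernel density estimate at x using a product of normal kernels."""
    total = 0.0
    for p in points:
        k = 1.0
        for xj, pj, h in zip(x, p, bandwidths):
            u = (xj - pj) / h
            k *= math.exp(-0.5 * u * u) / (h * math.sqrt(2.0 * math.pi))
        total += k
    return total / len(points)


def classify(x, training):
    """Pick the class whose estimated density at x is highest.

    training maps a class label to its list of feature vectors.
    Assumption: equal priors, so densities are compared directly.
    """
    best_label, best_density = None, -1.0
    for label, points in training.items():
        value = density(x, points, normal_reference_bandwidths(points))
        if value > best_density:
            best_label, best_density = label, value
    return best_label
```
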
Post-Processing
• Single frames
• Entire calls

Frame Post-Processing
• Recognizers will try to classify everything, even silence.
• Setting a high output threshold will reject low-confidence frames.
• Inspect ROC curve to find optimal threshold.
• Tradeoff: accuracy vs. rejection rate.

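The thresholding idea can be sketched in a few lines. The function names are hypothetical, a rejected frame is represented by `None`, and the threshold value is left to the caller (the slides choose it from the ROC curve). Accuracy is computed over accepted frames only, which matches the later note that accuracy calculations ignore rejected frames.

```python
def classify_frame(outputs, labels, threshold):
    """Return the label of the strongest output, or None if it is below threshold."""
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    if outputs[best] < threshold:
        return None  # low-confidence frame (e.g. silence) is rejected
    return labels[best]


def accuracy_and_rejection_rate(predictions, truths):
    """Accuracy over accepted frames only, plus the fraction of frames rejected."""
    kept = [(p, t) for p, t in zip(predictions, truths) if p is not None]
    rejection_rate = 1.0 - len(kept) / len(predictions)
    accuracy = sum(p == t for p, t in kept) / len(kept) if kept else 0.0
    return accuracy, rejection_rate
```

Raising the threshold tends to improve accuracy on the frames that remain while rejecting more of them, which is the tradeoff named on the slide.
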
Call Post-Processing
• Recognizers are trained to classify single frames.
• How do you select a single species after classifying multiple frames?
• Two techniques used here:
  – Voting.
  – Confusion matching: chi-square goodness-of-fit test.

Confusion Row
• Sum of all output vectors for a call.
• Multinomial distribution of species estimates.
• Example (simplified): frames of unknown call → species estimate for each frame → confusion row for call.

Confusion row for call:
ALFL  AMCR  AMGO  AMRE
5     40    50    5

Post-processing: Voting
• Winner is the species that was recognized in the most frames.
• Highest value of confusion row.

Confusion row for call:
ALFL  AMCR  AMGO  AMRE
5     40    50    5
Winner: AMGO (50 frames).

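Building a confusion row and voting on it might look like the sketch below. It assumes each frame has already been reduced to a single species label (or `None` if the frame was rejected), so the "sum of output vectors" becomes a count per species; the thesis may instead have summed raw output vectors. Function names are hypothetical.

```python
from collections import Counter


def confusion_row(frame_labels, species):
    """Count how many accepted frames were assigned to each species."""
    counts = Counter(label for label in frame_labels if label is not None)
    return [counts[s] for s in species]


def vote(frame_labels, species):
    """Return the species recognized in the most frames.

    Returns None when every frame was rejected, i.e. the call is rejected.
    """
    row = confusion_row(frame_labels, species)
    if sum(row) == 0:
        return None
    return species[row.index(max(row))]
```

With the simplified example from the slides (5 ALFL, 40 AMCR, 50 AMGO and 5 AMRE frames), `vote` returns AMGO.
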
Post-processing: Chi-Test
• Chi-square goodness-of-fit test can determine similarity of multinomial distributions.
• Scan through CM, find row that gives lowest χ² statistic.

Confusion row:
ALFL  AMCR  AMGO  AMRE
5     40    50    5

Confusion matrix (CM) from training set:
      ALFL  AMCR  AMGO  AMRE
ALFL  100   0     0     0
AMCR  0     75    25    0
AMGO  0     50    50    0
AMRE  0     0     0     100

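The confusion-matching step can be sketched as below, using the simplified numbers from the slide. One detail is an assumption: the training matrix contains zero cells, which would put a zero in the denominator of the χ² statistic, so the sketch adds a small constant to every reference cell before scaling. How the thesis handled zero cells is not stated, and the function names are hypothetical.

```python
def chi_square_statistic(observed, reference, smoothing=1.0):
    """Chi-square goodness-of-fit of an observed row against a reference row.

    Assumption: additive smoothing keeps expected counts above zero.
    """
    total_observed = sum(observed)
    if total_observed == 0:
        raise ValueError("confusion row is empty (all frames rejected)")
    smoothed = [r + smoothing for r in reference]
    scale = total_observed / sum(smoothed)  # match the observed total
    return sum((o - s * scale) ** 2 / (s * scale)
               for o, s in zip(observed, smoothed))


def chi_match(row, confusion_matrix):
    """Return the species whose training-set row gives the lowest statistic."""
    return min(confusion_matrix,
               key=lambda s: chi_square_statistic(row, confusion_matrix[s]))


# Simplified training-set confusion matrix from the slide.
CM = {
    "ALFL": [100, 0, 0, 0],
    "AMCR": [0, 75, 25, 0],
    "AMGO": [0, 50, 50, 0],
    "AMRE": [0, 0, 0, 100],
}
```

For the row 5/40/50/5, `chi_match` selects AMGO, the same answer as voting. Under these assumptions a row such as 0/55/45/0 would be voted AMCR but matched to AMGO, because a near-even split resembles AMGO's training row more than AMCR's 75/25 split.
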
Trials Run
• For frame recognition: 3 datasets × 7 classifiers.
• For call recognition: 3 datasets × 7 classifiers × 2 post-processors.
• Total: 84 trials.

Results

Results (Uninterpreted)
[Tables: frame results; call results.]
• (Need reduced subset of data.)

Results: Frame Accuracy
• Best accuracy: 95%.
• Accuracy floor: 36%.
• Average accuracy: 67%.

Results: Frame Rejection
• Rejections table has same format.
• Rejection rate seemed uncorrelated with other performance measures.

Results: Call Accuracy
• Best accuracy: 99%.
• Accuracy floor: 0%.
• Average accuracy: 76%.

Results: Condensed Format
• Accuracy, accuracy floor, rejection rate kept.

Analysis of Results

Single Frame Results

Single Frame Accuracy
• Best are NN-100, NN-500, SVMs.
• NN-500 shows overtraining.
• NN-20 shows high bias, low variance.
• KDE is example of biased estimator.

Single Frame Rejections
• High discrimination threshold intended to reject silence.
• Accuracy calculations ignore rejected frames.
• Not a strong correlation (0.51) between accuracy and rejection rate.

Frame Accuracy Floor
• Similar to accuracy, but greater variation (floor is outlier).
• NN-500, KDE have big gap between training set, CV set.
• This hints at overtraining.

Entire Call Results

Call Accuracy
• Calls use two types of post-processors: voting, chi-test.
• Concentrate on comparing performance w.r.t. post-processors.
• (Training set performs worse than superset; it has fewer frames.)

Call Rejections
• Calls have a much lower rejection rate.
• All frames in a call have to be rejected for a call to be rejected.
• This is unlikely.

Call Accuracy: Postproc. Effect
• Chi-test gives only moderate improvement for weaker classifiers.

Note
• I will usually refer to the cross-validation (CV) results from here on.

Call Accuracy Floor (Voting)
• NN-20, KDE have one species with 0% score for all 3 data sets.
• SVM-FAR: score of 7% is less than chance (10% for ten species).
• Best is SVM-MID, with 40% accuracy.

Acc. Floor (Voting vs. Chi-Test)
• Big improvement in accuracy floor.
[Figure panels: Voting; Chi-Test.]