Pixelwise classification for music document analysis

Pixelwise classification for music document analysis. Jorge Calvo-Zaragoza, Center for Interdisciplinary Research in Music Media and Technology, Schulich School of Music, McGill University, Montréal (Canada). SIMSSA Workshop XII (Aug 2017).


  1. Pixelwise classification for music document analysis Jorge Calvo-Zaragoza Center for Interdisciplinary Research in Music Media and Technology Schulich School of Music McGill University, Montréal (Canada) SIMSSA Workshop XII (Aug 2017) 1 / 31

  2. Introduction 2 / 31

  5. Introduction ◮ Music archives and libraries preserve music over the centuries ◮ Computational tools for music analysis are of great interest ◮ Large amounts of content in symbolic format are required ◮ Manual transcription from source implies a high cost ◮ Automatic transcription systems become valuable tools 3 / 31

  6. Introduction Optical Music Recognition (OMR) ◮ From score image to symbolic encoding 4 / 31

  8. Introduction Optical Music Recognition (OMR) ◮ Several interdisciplinary steps: Score image → Document processing → Symbol classification → Music reconstruction → Music encoding → Symbolic score 5 / 31

  9. Introduction ◮ Most document-processing stages focus on content separation 6 / 31

  13. Introduction ◮ Poor generalization of the existing strategies ◮ Music documents have a high level of heterogeneity 7 / 31

  14. Introduction Framework ◮ Machine learning framework for music document processing ◮ Regardless of the specific characteristics of the source ◮ Detection of the different layers at the same time 8 / 31

  15. Framework 9 / 31

  16. Framework Pixelwise classification approach ◮ Categorization of each pixel within the input image ◮ Allows detecting small and thin elements present in music notation 10 / 31

  18. Framework ◮ Machine learning for avoiding hand-crafted procedures ◮ We make use of Convolutional Neural Networks (CNN) ◮ Great performance in image-related tasks ◮ Good generalization 11 / 31

  19. Framework Convolutional Neural Networks ◮ Series of hierarchical transformations (convolutions) ◮ Transformations not fixed but learned through training ◮ Less dependent on human intervention 12 / 31
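
To make the idea of a convolution concrete, here is a minimal NumPy sketch that is not taken from the slides. The helper name `convolve2d_valid`, the toy image, and the hand-written kernel are illustrative assumptions. In a CNN the kernel values would be learned during training rather than fixed by hand.

```python
import numpy as np


def convolve2d_valid(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide `kernel` over `image` (no padding) and sum the elementwise products.

    This is the cross-correlation form commonly used by CNN libraries.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=float)
    for y in range(out_h):
        for x in range(out_w):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out


# Toy 5x5 image with a single horizontal line of ones (a stand-in for a staff line).
image = np.zeros((5, 5))
image[2, :] = 1.0

# Hand-written kernel that responds strongly to horizontal lines (illustrative only).
kernel = np.array([[-1.0, -1.0, -1.0],
                   [2.0, 2.0, 2.0],
                   [-1.0, -1.0, -1.0]])

response = convolve2d_valid(image, kernel)
```

The middle row of `response` is positive where the kernel is centred on the line and negative in the rows next to it. A trained network stacks many such filters, which is one way to read "hierarchical transformations".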

  20. Framework Pixelwise classification ◮ Straightforward approach: classify every single pixel of the input image: I(x, y) → {background, staff line, symbol, text, ...} 13 / 31

  21. Framework Pixelwise classification ◮ To train the CNN we need ground truth ◮ Documents whose categories have been correctly separated 14 / 31
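
One possible way to store such ground truth for training is a single label image built from one binary mask per category. The sketch below assumes that layout. The class list, the helper `masks_to_labels`, and the rule that later categories overwrite earlier ones are illustrative assumptions, not the format used in the talk.

```python
import numpy as np

# Hypothetical category order; index 0 is the default label (background).
CLASSES = ["background", "staff line", "symbol", "text"]


def masks_to_labels(masks: dict, shape: tuple) -> np.ndarray:
    """Combine per-category boolean masks into one integer label image."""
    labels = np.zeros(shape, dtype=np.uint8)
    for index, name in enumerate(CLASSES):
        if index == 0 or name not in masks:
            continue
        # Later categories overwrite earlier ones where masks overlap.
        labels[masks[name].astype(bool)] = index
    return labels


# Toy masks: a horizontal staff line crossed by a vertical symbol stroke.
staff = np.zeros((4, 6), dtype=bool)
staff[1, :] = True
symbol = np.zeros((4, 6), dtype=bool)
symbol[0:3, 2] = True

labels = masks_to_labels({"staff line": staff, "symbol": symbol}, (4, 6))
```

Each entry of `labels` is then the target category for the pixel at the same position in the page image.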

  22. Framework Pixelwise classification ◮ Ground-truth example (Salzinnes Antiphonal manuscript, CDM-Hsmu M2149.14) ◮ One page ∼ 30 million pixels 15 / 31

  23. Framework Pixelwise classification ◮ CNN is provided with the surrounding region of the pixel to be classified 16 / 31
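
A minimal sketch of how the surrounding region of a pixel could be cut out, assuming a 2-D grayscale image stored as a NumPy array. The function name `extract_patch`, the window size, and the mirrored border handling are assumptions made for illustration; the slides do not state these details.

```python
import numpy as np


def extract_patch(image: np.ndarray, y: int, x: int, size: int = 5) -> np.ndarray:
    """Return the size x size window centred on pixel (y, x).

    The image is mirrored at its borders so that edge pixels also get a full window.
    """
    if size % 2 == 0:
        raise ValueError("size must be odd so that the window has a centre pixel")
    half = size // 2
    padded = np.pad(image, half, mode="reflect")
    # After padding, original pixel (y, x) sits at (y + half, x + half),
    # so the window starts at (y, x) in padded coordinates.
    return padded[y:y + size, x:x + size]


page = np.arange(100).reshape(10, 10)  # toy stand-in for a scanned page
centre_patch = extract_patch(page, 4, 4, size=5)
corner_patch = extract_patch(page, 0, 0, size=5)
```

The classifier would receive one such window per pixel and predict the category of the centre pixel only.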

  24. Framework Pixelwise classification ◮ Estimation of a probability for each possible category 17 / 31
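
A common way to turn the raw scores of a network into one probability per category is the softmax function. The slides do not say which output function was used, so the sketch below is an assumption, and the score values are made up for illustration.

```python
import numpy as np

# Hypothetical category list, following the examples given in the slides.
CLASSES = ["background", "staff line", "symbol", "text"]


def softmax(scores: np.ndarray) -> np.ndarray:
    """Map raw scores to non-negative values that sum to 1."""
    shifted = scores - np.max(scores)  # subtract the maximum for numerical stability
    exp = np.exp(shifted)
    return exp / np.sum(exp)


scores = np.array([0.5, 2.0, 1.0, -1.0])  # made-up network outputs for one pixel
probabilities = softmax(scores)
predicted = CLASSES[int(np.argmax(probabilities))]
```

Here `predicted` is the category with the highest probability, which is the label assigned to the pixel.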

  27. Framework Pixelwise classification ◮ Relevant issues ◮ Ground truth creation ◮ Pixel.js 18 / 31

  28. Framework Pixel.js ◮ Web-based tool for ground truth creation 19 / 31

  31. Framework Pixelwise classification ◮ Relevant issues ◮ Ground truth creation ◮ Pixel.js ◮ Computational cost ◮ Image-to-image approach 20 / 31

  32. Framework Image-to-image classification ◮ Image-to-image pixelwise classification ◮ Classify a whole region at the same time ◮ We need to split the document into patches of equal size 21 / 31
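
Splitting a page into patches of equal size can be sketched as follows. The non-overlapping tiling, the zero padding of the right and bottom edges, and the helper names `split_into_patches` and `merge_patches` are assumptions made for illustration, not details from the talk.

```python
import numpy as np


def split_into_patches(image: np.ndarray, patch: int):
    """Pad a 2-D image to a multiple of `patch` and cut it into square tiles."""
    height, width = image.shape
    pad_h = (-height) % patch
    pad_w = (-width) % patch
    padded = np.pad(image, ((0, pad_h), (0, pad_w)), mode="constant")
    rows = padded.shape[0] // patch
    cols = padded.shape[1] // patch
    patches = (padded.reshape(rows, patch, cols, patch)
                     .swapaxes(1, 2)
                     .reshape(rows * cols, patch, patch))
    return patches, (rows, cols)


def merge_patches(patches: np.ndarray, grid: tuple, shape: tuple) -> np.ndarray:
    """Reassemble tiles in row-major order and crop away the padding."""
    rows, cols = grid
    patch = patches.shape[1]
    padded = (patches.reshape(rows, cols, patch, patch)
                     .swapaxes(1, 2)
                     .reshape(rows * patch, cols * patch))
    return padded[:shape[0], :shape[1]]


page = np.arange(7 * 10).reshape(7, 10)  # toy page whose size is not a multiple of 4
patches, grid = split_into_patches(page, patch=4)
restored = merge_patches(patches, grid, page.shape)
```

In an image-to-image setting, each tile would be passed through the network once, and the per-tile predictions would be merged back into a full-page result in the same way.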

  33. Framework Image-to-image classification ◮ Similar accuracy ◮ Much more efficient (from several hours to a few minutes) ◮ Usually needs a bigger training set 22 / 31
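
A rough back-of-the-envelope count helps explain the difference in running time. The page dimensions (chosen to give the roughly 30 million pixels mentioned earlier) and the patch size are hypothetical values, not figures from the talk.

```python
import math

# Hypothetical page size giving 30 million pixels, and a hypothetical patch size.
height, width = 6000, 5000
patch = 256

# Per-pixel approach: one network evaluation for every pixel.
per_pixel_passes = height * width

# Image-to-image approach: one network evaluation per patch.
patch_passes = math.ceil(height / patch) * math.ceil(width / patch)

ratio = per_pixel_passes / patch_passes
print(per_pixel_passes, patch_passes, ratio)  # 30000000 480 62500.0
```

The number of evaluations is not the same as running time, since each image-to-image evaluation processes a larger input, but it indicates why the per-pixel approach is so much slower.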

  34. Deployment 23 / 31

  35. Deployment General use ◮ Full workflow for a new type of document ◮ Ground-truth creation with Pixel.js ◮ Model training and document processing as Rodan jobs 24 / 31

  36. Deployment Resources ◮ Training models: very slow, needs high-performance computing ◮ Classification: fast with the image-to-image approach 25 / 31

  37. Deployment DEMO 26 / 31

  38. Conclusions 27 / 31

  39. Conclusions Summary ◮ Generalizable music document analysis with machine learning ◮ Research on effective and efficient strategies ◮ Usability through the Rodan framework 28 / 31

  40. Conclusions Future work ◮ Integrate with the rest of the OMR workflow ◮ Make efforts towards faster adaptation to new document types ◮ Efficient ground truth creation with Pixel.js ◮ Study of model adaptation techniques 29 / 31

  41. Thank you! 30 / 31

  42. Pixelwise classification for music document analysis Jorge Calvo-Zaragoza Center for Interdisciplinary Research in Music Media and Technology Schulich School of Music McGill University, Montréal (Canada) SIMSSA Workshop XII (Aug 2017) 31 / 31
