Deep Learning Techniques for Music Generation – 3. Generation by Feedforward Architectures (PowerPoint PPT Presentation)


  1. Deep Learning Techniques for Music Generation – 3. Generation by Feedforward Architectures
     Jean-Pierre Briot – Jean-Pierre.Briot@lip6.fr
     Laboratoire d'Informatique de Paris 6 (LIP6), Sorbonne Université – CNRS
     Programa de Pós-Graduação em Informática (PPGI), UNIRIO

  2. Direct Use – Feedforward – Ex 1
  • Feedforward Architecture
  • Prediction Task
  • Ex 1: Predicting the chord associated with a melody segment (scale/mode -> tonality)
    [Diagram: the pitches of a 4-note melody segment (1st to 4th) are fed in; the network outputs the pitch of a chord, e.g. F]
  – Training on a corpus/dataset of <melody, chord> pairs
  – Production (Prediction)

  3. Direct Use – Feedforward – Ex 1
  • Feedforward Architecture
  • Classification Task
  • Ex 1: Predicting the chord associated with a melody segment (scale/mode -> tonality)
    [Diagram: the 4-note melody segment is fed in; the output is a one-hot pitch class (A, A#, B, …, G#) for the chord]
  – Training on a corpus/dataset of <melody, chord> pairs
  – Production (Classification)
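
To make the setup concrete, here is a minimal Keras sketch of such a chord classifier; the 4-note melody input, the 12 pitch-class output and the layer sizes are illustrative assumptions, not the exact architecture used in the lecture.

```python
# Hypothetical sketch of the chord-classification example above.
# Assumption: 4 melody notes in, each one-hot over 12 pitch classes;
# output: a softmax over the 12 possible chord pitch classes (A, A#, ..., G#).
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(4 * 12,)),            # 4 melody notes x 12 pitch classes
    layers.Dense(64, activation='relu'),      # hidden layer (size chosen arbitrarily)
    layers.Dense(12, activation='softmax'),   # one probability per chord pitch class
])
model.compile(optimizer='adam', loss='categorical_crossentropy')

# Training on a <melody, chord> corpus, then production (classification):
# model.fit(X_melodies, y_chords, epochs=50)
# chord_index = np.argmax(model.predict(one_melody), axis=-1)
```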

  4. Softmax – [Diagram: a softmax output layer converting logits into probabilities]

  5. Softmax and Sigmoid
  • Softmax is the generalization of Sigmoid
  • From Binary classification to Categorical (Multiclass – Single Label) classification
  – Sigmoid: probability ∈ [0, 1], σ(z) = 1 / (1 + e^(−z))
  – Softmax: probabilities summing to 1, softmax(z)_i = e^(z_i) / Σ_j e^(z_j)
  [Diagram: logits passed through softmax to yield probabilities]
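
A plain NumPy sketch of the two activations, using the standard definitions given above:

```python
# Sigmoid and softmax, written out directly from the formulas above.
import numpy as np

def sigmoid(z):
    # Binary classification: maps one logit to a probability in [0, 1].
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Multiclass single-label: maps a vector of logits to probabilities summing to 1.
    e = np.exp(z - np.max(z))   # subtract the max for numerical stability
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])
print(sigmoid(0.5))     # ~0.622
print(softmax(logits))  # ~[0.659, 0.242, 0.099], sums to 1
```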

  6. Softmax and Sigmoid
  • Step function and Argmax are NOT differentiable
  • No gradient -> no possibility of back-propagation
  – Step function (Perceptron): output ∈ {0, 1}
  [Diagram: step function contrasted with softmax, logits to probabilities]

  7. Direct Use – Feedforward – Ex: ForwardBach
  • Feedforward Architecture
  • Classification Task
  • Ex: Counterpoint (Chorale) generation
  • Training on the set of (389) J.S. Bach Chorales (Choralgesänge)
  [Diagram: one melody as input, 3 melodies (the other voices) as output]

  8. Representation
  • Audio
    – Waveform
    – Spectrogram (Fourier Transform)
    – Other (ex: MFCC)
  • Symbolic
    – Note – Rest – Note hold – Duration – Chord – Rhythm
    – Piano Roll
    – MIDI
    – ABC, XML…

  9. Representation – [Figure: a score, its piano roll (pitches G to C), and the corresponding one-hot encoding]

  10. Encoding of Features (ex: Note Pitch)
  • Value – analog
  • One-Hot – digital
  • Embedding – constructed
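
A small Python sketch contrasting the three encoding strategies for a single note pitch; the use of MIDI numbers and the embedding size are assumptions for illustration.

```python
# Three ways to encode one note pitch (MIDI numbers assumed for illustration).
import numpy as np
from tensorflow.keras import layers

midi_pitch = 60  # middle C

# 1. Value ("analog"): use the number directly, e.g. scaled to [0, 1]
value_encoding = midi_pitch / 127.0

# 2. One-hot ("digital"): a 128-dimensional vector with a single 1
one_hot = np.zeros(128)
one_hot[midi_pitch] = 1.0

# 3. Embedding ("constructed"): a dense vector learned during training
embedding_layer = layers.Embedding(input_dim=128, output_dim=16)
embedded = embedding_layer(np.array([midi_pitch]))  # shape (1, 16)
```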

  11. Encoding
  • Rest
    – Zero-hot » but ambiguity with low-probability notes
    – One more one-hot element
    – …
  • Hold
    – One more one-hot element » but only for monophonic melodies
    – Replay matrix
    – …

  12. Representation – [Figure: one-hot encoding matrix of a melody over time slices, with pitch rows (e.g. A, C) plus extra 'hold' and 'rest' rows; one column per time slice]

  13. Representation – If time slice = sixteenth note – [Figure: the same one-hot matrix, now with one column per sixteenth-note time slice]
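
A possible Python sketch of this time-slice encoding, assuming a vocabulary of 12 pitch classes plus the extra 'rest' and 'hold' symbols, with one row per sixteenth-note slice and one column per symbol:

```python
# One-hot, piano-roll-like encoding of a monophonic melody with rest/hold symbols.
import numpy as np

PITCHES = ['A', 'A#', 'B', 'C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#']
VOCAB = PITCHES + ['rest', 'hold']            # one extra element for each symbol
INDEX = {symbol: i for i, symbol in enumerate(VOCAB)}

# A melody as a list of sixteenth-note slices (hypothetical example):
melody = ['A', 'hold', 'rest', 'C', 'hold', 'hold']

def encode(slices):
    matrix = np.zeros((len(slices), len(VOCAB)))
    for t, symbol in enumerate(slices):
        matrix[t, INDEX[symbol]] = 1.0        # exactly one 1 per time slice
    return matrix

print(encode(melody).shape)  # (6, 14)
```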

  14. Representation Choices
  • More alternative possibilities for Note:
    – Pitch class one-hot + octave number
    – Binary encoding of the MIDI number
    – …
  • The most compact representation is not always the best for training/generation
  • Hold as a special note symbol
    – Risk of a skewed (imbalanced) class, because it appears too often

  15. Music / Representation / Network – [Diagram: feedforward network with the soprano voice as the input layer, hidden layers, and the alto, tenor and bass voices as the output layer]
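
As a rough Keras sketch of this single-network design; the vocabulary size, number of time slices and hidden-layer sizes are assumptions, not the actual ForwardBach settings:

```python
# Hypothetical single-network version: one melody (soprano) in,
# three voices (alto, tenor, bass) out.
from tensorflow import keras
from tensorflow.keras import layers

VOCAB = 14           # e.g. 12 pitch classes + rest + hold (assumption)
SLICES = 64          # number of time slices per example (assumption)

inputs = keras.Input(shape=(SLICES * VOCAB,), name='soprano')
h = layers.Dense(256, activation='relu')(inputs)
h = layers.Dense(256, activation='relu')(h)

def voice_head(name):
    # One independent output head per voice: a softmax over the vocabulary
    # for each time slice (multi multiclass single label).
    logits = layers.Dense(SLICES * VOCAB)(h)
    reshaped = layers.Reshape((SLICES, VOCAB))(logits)
    return layers.Softmax(axis=-1, name=name)(reshaped)

model = keras.Model(inputs, [voice_head('alto'), voice_head('tenor'), voice_head('bass')])
model.compile(optimizer='adam', loss='categorical_crossentropy')
```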

  16. Code
  • Python (3)
  • Keras
  • Theano or TensorFlow
  • Music21
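
For the data-preparation side, a small music21 sketch that loads one Bach chorale and reads its voices; it assumes BWV 344 (the training example shown later) is available under 'bach/bwv344' in the music21 corpus:

```python
# Load a Bach chorale from the built-in music21 corpus and inspect its four voices.
from music21 import corpus

chorale = corpus.parse('bach/bwv344')         # assumed corpus path for BWV 344
for part in chorale.parts:
    # Collect the MIDI pitch numbers of the single notes in each voice.
    pitches = [n.pitch.midi for n in part.recurse().notes if n.isNote]
    print(part.partName, pitches[:8])         # first few pitches per voice
```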

  17. Direct Use – Feedforward – Ex 2: ForwardBach
  • Feedforward Architecture
  • Prediction Task
  • Ex 2: Counterpoint (Chorale) generation
  • Training on the set of (389) J.S. Bach Chorales (Choralgesänge)
  [Diagram: one melody as input, 3 melodies (the other voices) as output]

  18. ForwardBach – Bach BWV 344 Chorale (Training Example) – [Scores: original vs. regenerated]

  19. ForwardBach – Bach BWV 423 Chorale (Test Example) – [Scores: original vs. regenerated]

  20. Music / Representation / Network – Alternative: 3-Models Architecture [Cotrim & Briot, 2019] – [Diagram: three separate feedforward networks, each taking the soprano voice as input and producing one of the alto, tenor or bass voices as output]
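
A corresponding sketch of the alternative 3-models design, again with illustrative sizes: three independent networks, one per output voice, all fed the same soprano input.

```python
# Hypothetical "3 models" variant: one independent feedforward network per voice.
from tensorflow import keras
from tensorflow.keras import layers

VOCAB, SLICES = 14, 64   # same illustrative sizes as in the single-network sketch

def make_voice_model(voice_name):
    inputs = keras.Input(shape=(SLICES * VOCAB,), name='soprano')
    h = layers.Dense(256, activation='relu')(inputs)
    logits = layers.Dense(SLICES * VOCAB)(h)
    outputs = layers.Softmax(axis=-1)(layers.Reshape((SLICES, VOCAB))(logits))
    model = keras.Model(inputs, outputs, name=voice_name)
    model.compile(optimizer='adam', loss='categorical_crossentropy')
    return model

# Each voice gets its own weights and is trained separately:
alto_model, tenor_model, bass_model = (make_voice_model(v) for v in ('alto', 'tenor', 'bass'))
```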

  21. Forward3Bach [Cotrim & Briot, 2019] – Bach BWV 423 Chorale (Test Example) – [Scores: original, regenerated by the single architecture, and regenerated by the triple architecture]

  22. Comparison? What happened?

  23. Overfitting Limitations
  • Musical accuracy is not that good (yet)
  • Regeneration of a training example is better than regeneration of a test/validation example
  • A case of overfitting

  24. Techniques
  • Limit accuracy and control overfitting
  • More examples (augment the corpus)
    – Keeping a good style representation, coverage and consistency
    – More consistency and coverage
    – Transpose (align) all chorales to only one key (ex: C)
  • More synthetic examples
    – More coverage
    – Transpose all chorales into all (12) keys
  • Regularization (see the Keras sketch below)
    – Weight-based » L1, L2
    – Connection-based » Dropout
    – Epoch-based » Early stopping
    – Analysis of learning curves
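
The regularization items above map directly onto standard Keras facilities; a minimal sketch, with illustrative sizes and hyperparameters:

```python
# L2 weight penalty (weight-based), dropout (connection-based),
# and early stopping (epoch-based), as listed on the slide.
from tensorflow import keras
from tensorflow.keras import layers, regularizers

model = keras.Sequential([
    keras.Input(shape=(64 * 14,)),                           # illustrative input size
    layers.Dense(256, activation='relu',
                 kernel_regularizer=regularizers.l2(1e-4)),  # weight-based (L2)
    layers.Dropout(0.3),                                     # connection-based
    layers.Dense(14, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy')

early_stop = keras.callbacks.EarlyStopping(
    monitor='val_loss', patience=5, restore_best_weights=True)  # epoch-based
# Learning curves come from the returned history:
# history = model.fit(X_train, y_train, validation_split=0.2,
#                     epochs=200, callbacks=[early_stop])
```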

  25. Softmax – [Diagram: a softmax output layer converting logits into probabilities]

  26. Softmax + Cross-Entropy
  • Cross-entropy measures the dissimilarity between two probability distributions (the prediction and the target/true value) [Ng 2019]
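
A tiny NumPy illustration of categorical cross-entropy between a one-hot target and a softmax prediction (the numbers are made up):

```python
# Categorical cross-entropy between a one-hot target and a softmax output.
import numpy as np

target = np.array([0.0, 1.0, 0.0])          # true class is the 2nd one (one-hot)
prediction = np.array([0.2, 0.7, 0.1])      # softmax output of the network

cross_entropy = -np.sum(target * np.log(prediction))
print(cross_entropy)   # ~0.357; it would be 0 if the prediction matched the target exactly
```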

  27. Output Activation and Cost/Loss Functions
  Task | Type of the output (ŷ) | Encoding of the target (y) | Output activation function | Cost (loss) function | Application
  Regression | Real (ℝ) | | Identity (linear) | Mean squared error |
  Classification | Binary | {0, 1} | Sigmoid | Binary cross-entropy |
  Classification | Multiclass single label | One-hot | Softmax | Categorical cross-entropy | Monophony
  Classification | Multiclass multilabel | Many-hot | Sigmoid | Binary cross-entropy | Polyphony
  Multiple classification | Multi multiclass multilabel | Multi many-hot | Sigmoid | Binary cross-entropy |
  Multiple classification | Multi multiclass single label | Multi one-hot | Multi softmax | Categorical cross-entropy | Multivoice
  Ex. multiclass single label: classification among a set of possible notes for a monophonic melody, with only one single possible note choice (single label)
  Ex. multiclass multilabel: classification among a set of possible notes for a single-voice polyphonic melody, therefore with several possible note choices (several labels)
  Ex. multi multiclass single label: multiple classification among a set of possible notes for multivoice monophonic melodies, therefore with only one single possible note choice for each voice; or multiple classification among a set of possible notes for a set of time slices of a monophonic melody, therefore with only one single possible note choice for each time slice

  28. Output Activation and Cost/Loss Functions
  Task | Type of the output (ŷ) | Encoding of the target (y) | Output activation function | Cost (loss) function | Interpretation
  Regression | Real (ℝ) | | Identity (linear) | Mean squared error | none
  Classification | Binary | {0, 1} | Sigmoid | Binary cross-entropy | none
  Classification | Multiclass single label | One-hot | Softmax | Categorical cross-entropy | argmax or sampling
  Classification | Multiclass multilabel | Many-hot | Sigmoid | Binary cross-entropy | argsort and > threshold & max-notes
  Multiple classification | Multi multiclass multilabel | Multi many-hot | Sigmoid | Binary cross-entropy |
  Multiple classification | Multi multiclass single label | Multi one-hot | Multi softmax | Categorical cross-entropy | argmax or sampling
  Other cost functions: mean absolute error, Kullback-Leibler (KL) divergence…
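
In Keras terms, the table's activation/loss pairings look as follows; the 12-class output and the input size are illustrative assumptions:

```python
# Matching output activation and loss function, as in the table above,
# for a hypothetical 12-class note output.
from tensorflow import keras
from tensorflow.keras import layers

def head(activation):
    return keras.Sequential([keras.Input(shape=(32,)), layers.Dense(12, activation=activation)])

# Multiclass single label (monophony): softmax + categorical cross-entropy
mono = head('softmax')
mono.compile(optimizer='adam', loss='categorical_crossentropy')

# Multiclass multilabel (polyphony): sigmoid + binary cross-entropy
poly = head('sigmoid')
poly.compile(optimizer='adam', loss='binary_crossentropy')

# Regression (a real-valued output): linear + mean squared error
reg = keras.Sequential([keras.Input(shape=(32,)), layers.Dense(1, activation='linear')])
reg.compile(optimizer='adam', loss='mean_squared_error')
```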

  29. Output Activation and Cost/Loss Functions [figure]

  30. Output Activation and Cost/Loss Functions [figure]

  31. Output Activation and Cost/Loss Functions [figure]

  32. Output Activation and Cost/Loss Functions [figure]
