

  1. Deep Learning Techniques for Music Generation – 3. Generation by Feedforward Architectures
     Jean-Pierre Briot <Jean-Pierre.Briot@lip6.fr>
     Laboratoire d'Informatique de Paris 6 (LIP6), Sorbonne Université – CNRS
     Programa de Pós-Graduação em Informática (PPGI), UNIRIO

  2. Direct Use – Feedforward – Ex 1
     • Feedforward Architecture
     • Prediction Task
     • Ex. 1: Predicting a chord associated to a melody segment (scale/mode -> tonality)
       [Diagram: pitches of the melody's 1st to 4th notes as input; pitch of a chord (e.g. F) as output]
       – Training on a corpus/dataset of <melody, chord> pairs
       – Production (Prediction)

  3. Direct Use – Feedforward – Ex 1
     • Feedforward Architecture
     • Classification Task
     • Ex. 1: Predicting a chord associated to a melody segment (scale/mode -> tonality)
       [Diagram: pitches of the melody's 1st to 4th notes as input; output is a chord pitch class among A, A#, B, …, G#]
       – Training on a corpus/dataset of <melody, chord> pairs
       – Production (Classification)

  4. Softmax
     [Diagram: softmax output layer mapping logits to probabilities]

  5. Softmax and Sigmoid
     • Softmax is the generalization of Sigmoid
     • From Binary classification to Categorical (Multiclass) classification
       – Sigmoid: Probability ∈ [0, 1]
       – Softmax: Σ Probabilities = 1
     [Diagram: sigmoid and softmax layers mapping logits to probabilities]
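A minimal NumPy sketch of the two activations; the logit values and the pitch-class reading are illustrative assumptions, not taken from the slides:

```python
import numpy as np

def sigmoid(z):
    """Binary classification: squashes one logit into a probability in [0, 1]."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    """Multiclass classification: maps a vector of logits to probabilities
    that are positive and sum to 1 (max subtracted for numerical stability)."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])   # e.g. scores for three candidate pitch classes
probs = softmax(logits)
print(probs, probs.sum())            # probabilities, summing to 1.0
print(sigmoid(2.0))                  # a single probability, ~0.88
```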

  6. Softmax and Sigmoid
     • Step function and Argmax are NOT differentiable
     • No gradient -> No possibility of backpropagation
       – Step function (Perceptron): Probability ∈ {0, 1}
       – Argmax: Probability(Argmax) = 1
     [Diagram: step function and argmax applied to logits/probabilities]

  7. Representation
     • Audio
       – Waveform
       – Spectrogram (Fourier Transform)
       – Other (ex: MFCC)
     • Symbolic
       – Note, Rest, Note hold, Duration
       – Chord, Rhythm
       – Piano Roll, MIDI, ABC, XML…

  8. Representation
     [Figure: a score excerpt, its piano-roll representation (pitches G to C), and its one-hot encoding]
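A minimal sketch of the one-hot piano-roll encoding in plain NumPy; the pitch vocabulary and the melody are illustrative assumptions:

```python
import numpy as np

# Hypothetical pitch vocabulary for the one-hot encoding (not from the slides).
PITCHES = ['G', 'G#', 'A', 'A#', 'B', 'C']

def one_hot(pitch):
    """Encode a pitch name as a one-hot vector over the vocabulary."""
    v = np.zeros(len(PITCHES))
    v[PITCHES.index(pitch)] = 1.0
    return v

# A short melody becomes a piano-roll-like matrix: one one-hot column per time slice.
melody = ['A', 'C', 'C', 'G']
piano_roll = np.stack([one_hot(p) for p in melody], axis=1)
print(piano_roll)   # shape: (pitches, time slices)
```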

  9. Encoding of Features (ex: Note Pitch)
     • Value – Analog
     • One-Hot – Digital
     • Embedding – Constructed

  10. Encoding
      • Rest
        – Zero-hot
          » But ambiguity with low-probability notes
        – One more one-hot element
        – …
      • Hold
        – One more one-hot element
          » But only for monophonic melodies
        – Replay matrix
        – …

  11. Representation
      [Figure: one-hot matrix over time slices with added 'hold' and 'rest' rows, e.g. an A followed by holds, a rest, then a C; marked '?' – what duration does each time slice represent?]

  12. Representation
      [Figure: the same matrix, now with the time slice set to a sixteenth note]
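A sketch of the "one more one-hot element" option for rest and hold, assuming a sixteenth-note time slice as above; the token vocabulary and the melody are illustrative:

```python
import numpy as np

# Assumed vocabulary: pitches plus two extra one-hot elements, 'rest' and 'hold'
# (one of the encoding options listed above; valid for monophonic melodies only).
TOKENS = ['A', 'A#', 'B', 'C', 'rest', 'hold']

def encode(token):
    v = np.zeros(len(TOKENS))
    v[TOKENS.index(token)] = 1.0
    return v

# With a sixteenth-note time slice, a dotted-eighth A is 'A' followed by two 'hold's;
# 'rest' is now unambiguous instead of an all-zero ("zero-hot") column.
melody = ['A', 'hold', 'hold', 'rest', 'C', 'hold']
matrix = np.stack([encode(t) for t in melody], axis=1)
print(matrix.shape)   # (6 tokens, 6 sixteenth-note time slices)
```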

  13. Music / Representation / Network
      [Diagram: feedforward network – soprano voice as input layer, hidden layers, alto, tenor and bass voices as output layer]

  14. Code
      • Python (3)
      • Keras
      • Theano or TensorFlow
      • Music21
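With this stack, the Ex. 1 chord classifier could look like the following Keras (tf.keras API) sketch. The sizes (4 melody notes, 12 pitch classes, 64 hidden units) and the random placeholder data are illustrative assumptions, not values from the slides:

```python
import numpy as np
from tensorflow import keras

n_melody_notes, n_pitch_classes = 4, 12

# Feedforward network: one-hot melody notes in, softmax over chord pitch classes out.
model = keras.Sequential([
    keras.layers.Dense(64, activation='relu',
                       input_shape=(n_melody_notes * n_pitch_classes,)),  # hidden layer
    keras.layers.Dense(n_pitch_classes, activation='softmax'),            # chord pitch class
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Training on a <melody, chord> corpus (random placeholder data here):
x = np.random.rand(100, n_melody_notes * n_pitch_classes)
y = keras.utils.to_categorical(np.random.randint(n_pitch_classes, size=100),
                               n_pitch_classes)
model.fit(x, y, epochs=5, verbose=0)
```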

  15. Direct Use – Feedforward – Ex 2: ForwardBach
      • Feedforward Architecture
      • Prediction Task
      • Ex. 2: Counterpoint (Chorale) generation
      • Training on the set of (389) J.S. Bach Chorales (Choralgesänge)
        [Diagram: input – a melody (soprano); output – 3 melodies]

  16. ForwardBach
      [Scores: Bach BWV 344 Chorale (training example) – original vs. regenerated]

  17. ForwardBach
      [Scores: Bach BWV 423 Chorale (test example) – original vs. regenerated]

  18. Music / Representation / Network – Alternative 3-Models Architecture [Cotrim & Briot, 2019]
      [Diagram: three separate feedforward networks, each with the soprano voice as input and one of the alto, tenor or bass voices as output]
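A hypothetical Keras sketch contrasting the two options: one network predicting all three voices at once, versus three independent networks, one per voice. Layer sizes and the sigmoid output encoding are assumptions for illustration, not details from [Cotrim & Briot, 2019]:

```python
from tensorflow import keras

n_in, n_hidden, n_voice = 16, 64, 16   # illustrative sizes, not from the slides

def single_model():
    """One network: soprano in, the three other voices out (concatenated)."""
    return keras.Sequential([
        keras.layers.Dense(n_hidden, activation='relu', input_shape=(n_in,)),
        keras.layers.Dense(3 * n_voice, activation='sigmoid'),
    ])

def triple_model():
    """Three networks, each predicting one voice from the same soprano input."""
    return [keras.Sequential([
        keras.layers.Dense(n_hidden, activation='relu', input_shape=(n_in,)),
        keras.layers.Dense(n_voice, activation='sigmoid'),
    ]) for _voice in ('alto', 'tenor', 'bass')]
```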

  19. Forward3Bach [Cotrim & Briot, 2019]
      [Scores: Bach BWV 423 Chorale (test example) – original, single-architecture regeneration, triple-architecture regeneration]

  20. Comparison? What happened?

  21. Overfitting Limitations
      • Musical accuracy is not that good (yet)
      • Regeneration of a training example is better than regeneration of a test/validation example
      • A case of overfitting

  22. Techniques
      • Improve Accuracy and Control Overfitting
      • More Examples (Augment the Corpus)
        – Keeping a Good Style Representation, Coverage and Consistency
        – More Consistency and Coverage: Transpose (Align) All Chorales to a Single Key (ex: C)
      • More Synthetic Examples
        – More Coverage: Transpose All Chorales into All (12) Keys
      • Regularization (see the sketch below)
        – Weight-based: L1, L2
        – Connection-based: Dropout
        – Epochs-based: Early Stop
        – Analysis of Learning Curves
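A minimal Keras sketch of the regularization options just listed: an L2 weight penalty, dropout, and early stopping against a validation set. Layer sizes and hyperparameter values are illustrative assumptions:

```python
from tensorflow import keras
from tensorflow.keras import regularizers

model = keras.Sequential([
    keras.layers.Dense(64, activation='relu', input_shape=(48,),
                       kernel_regularizer=regularizers.l2(1e-4)),  # weight-based (L2)
    keras.layers.Dropout(0.3),                                     # connection-based
    keras.layers.Dense(12, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy')

# Epochs-based: stop when the validation loss no longer improves.
early_stop = keras.callbacks.EarlyStopping(monitor='val_loss', patience=5,
                                           restore_best_weights=True)
# model.fit(x, y, validation_split=0.1, epochs=200, callbacks=[early_stop])
```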

  23. Softmax
      [Diagram: softmax output layer mapping logits to probabilities]

  24. Softmax + Cross-Entropy
      • Cross-Entropy measures the dissimilarity between two probability distributions (the prediction and the target/true value) [Ng 2019]
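A tiny NumPy illustration of how categorical cross-entropy rewards a confident correct prediction and penalizes a confident wrong one; the distributions are made-up values:

```python
import numpy as np

def categorical_cross_entropy(y_true, y_pred):
    """Dissimilarity between the one-hot target and the predicted softmax
    distribution: -sum_i y_i * log(p_i)."""
    return -np.sum(y_true * np.log(y_pred + 1e-12))  # epsilon avoids log(0)

target = np.array([0.0, 1.0, 0.0])       # true class is the 2nd note
good   = np.array([0.1, 0.8, 0.1])       # confident, correct prediction
bad    = np.array([0.7, 0.2, 0.1])       # confident, wrong prediction
print(categorical_cross_entropy(target, good))   # ~0.22 (low loss)
print(categorical_cross_entropy(target, bad))    # ~1.61 (high loss)
```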

  25. Output Activation and Cost/Loss Functions

      Task                    | Type of the output (ŷ)        | Encoding of the target (y) | Output activation function | Cost (loss) function      | Application
      Regression              | Real (ℝ)                      | Value                      | Identity (Linear)          | Mean squared error        |
      Classification          | Binary                        | {0, 1}                     | Sigmoid                    | Binary cross-entropy      |
      Classification          | Multiclass single label       | One-hot                    | Softmax                    | Categorical cross-entropy | Monophony
      Classification          | Multiclass multilabel         | Many-hot                   | Sigmoid                    | Binary cross-entropy      | Polyphony
      Multiple Classification | Multi Multiclass multilabel   | Multi Many-hot             | Multi Sigmoid              | Binary cross-entropy      | Multivoice
      Multiple Classification | Multi Multiclass single label | Multi One-hot              | Multi Softmax              | Categorical cross-entropy | Multivoice

      • Ex. multiclass single label: classification among a set of possible notes for a monophonic melody, with only one single possible note choice (single label)
      • Ex. multiclass multilabel: classification among a set of possible notes for a single-voice polyphonic melody, therefore with several possible note choices (several labels)
      • Ex. multi multiclass single label: multiple classification among a set of possible notes for multivoice monophonic melodies, therefore with only one single possible note choice for each voice; or multiple classification among a set of possible notes for a set of time slices of a monophonic melody, therefore for each time slice with only one single possible note choice

  26. Output Activation and Cost/Loss Functions

      Task                    | Type of the output (ŷ)        | Encoding of the target (y) | Output activation function | Cost (loss) function      | Interpretation
      Regression              | Real (ℝ)                      | Value                      | Identity (Linear)          | Mean squared error        | none
      Classification          | Binary                        | {0, 1}                     | Sigmoid                    | Binary cross-entropy      | none
      Classification          | Multiclass single label       | One-hot                    | Softmax                    | Categorical cross-entropy | argmax or sampling
      Classification          | Multiclass multilabel         | Many-hot                   | Sigmoid                    | Binary cross-entropy      | argsort and > threshold & max-notes
      Multiple Classification | Multi Multiclass multilabel   | Multi Many-hot             | Multi Sigmoid              | Binary cross-entropy      |
      Multiple Classification | Multi Multiclass single label | Multi One-hot              | Multi Softmax              | Categorical cross-entropy | argmax or sampling

      • Other cost functions: Mean absolute error, Kullback-Leibler (KL) divergence…
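A NumPy sketch of the interpretation strategies in the last column; the probability values, threshold and max-notes settings are illustrative assumptions:

```python
import numpy as np
rng = np.random.default_rng(0)

# Multiclass single label (softmax output): one note per time slice.
probs = np.array([0.6, 0.25, 0.1, 0.05])
note = int(np.argmax(probs))                  # argmax: always the most probable note
note = int(rng.choice(len(probs), p=probs))   # sampling: draw from the distribution

# Multiclass multilabel (sigmoid output): keep notes above a threshold,
# most probable first (argsort), up to max-notes simultaneous notes.
sig = np.array([0.9, 0.2, 0.7, 0.4])          # independent note probabilities
threshold, max_notes = 0.5, 2                 # illustrative values
ranked = np.argsort(sig)[::-1]                # indices, most probable first
chosen = [i for i in ranked if sig[i] > threshold][:max_notes]
print(chosen)                                 # e.g. [0, 2]
```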

  27. Output Activation and Cost/Loss Functions [figure-only slide]

  28. Output Activation and Cost/Loss Functions [figure-only slide]

  29. Output Activation and Cost/Loss Functions [figure-only slide]

  30. Output Activation and Cost/Loss Functions [figure-only slide]

  31. (Summary of) Principles of Loss Functions
      • Probability theory + Information theory (see also the Maximum likelihood principle)
      • Intuition:
        – Information content of a likely event: low
        – Information content of an unlikely event: high
      • Self-information: I(x) = log(1/P(x)) = −log P(x)
        – Ex: I(note = B) = −log P(note = B)
      • Entropy of a probability distribution: the sum of I(note = Noteᵢ) over all i, weighted by P(note = Noteᵢ):
        – H(note) = Σᵢ P(note = Noteᵢ) I(note = Noteᵢ)
      • Expectation-based alternative definition:
        – Expectation: mean value of f(x) when x ~ P: E_{x~P}[f(x)] = Σₓ P(x) f(x)
        – H(note) = E_{note~P}[I(note)] = E_{note~P}[−log P(note)] = −E_{note~P}[log P(note)]
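A small NumPy check of these definitions, using base-2 logarithms so the quantities come out in bits; the note distribution is an illustrative assumption:

```python
import numpy as np

# Probability distribution over four notes: P(note = Note_i)
p = np.array([0.5, 0.25, 0.125, 0.125])

self_information = -np.log2(p)            # I(x) = -log P(x): low for likely notes
entropy = np.sum(p * self_information)    # H = sum_i P_i * I_i = E[-log P]
print(self_information)                   # [1. 2. 3. 3.] bits
print(entropy)                            # 1.75 bits
```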
