MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment
Hao-Wen Dong*, Wen-Yi Hsiao*, Li-Chia Yang, Yi-Hsuan Yang (*these authors contributed equally to this work)
Research Center for IT Innovation, Academia Sinica
Demo Page: https://salu133445.github.io/musegan/
Outline
• Goals & Challenges
• Data
• Proposed Model
• Results & Evaluation
• Future Works
Source Code: https://github.com/salu133445/musegan
Goals
Generate pop music
• of multiple tracks
• in piano-roll format
• using GAN with CNNs
Challenge I: Multitrack Interdependency
Approach: multi-track GAN
Tracks: vocal, piano, strings, bass, drums (music & clip by phycause)
Challenge II: Music Texture
Approach: convolutional neural networks
Texture elements: melody, chord (harmony)
Challenge III: Temporal Structure
song → paragraphs (1 to 3) → phrases (1 to 4) → bars (1 to 4) → beats (1 to 4, in 4/4 time) → steps (1 to 24)
Challenge III: Temporal Structure
Approach: convolutional neural networks over a fixed structure
bars (1 to 4) → beats (1 to 4, in 4/4 time) → steps (1 to 24)
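The fixed structure above (4 bars, 4 beats per bar, 24 steps per beat) implies 96 steps per bar and 384 steps per 4-bar phrase. A minimal sketch of that arithmetic follows; the function names are hypothetical and the 1-based numbering simply mirrors the slide labels.

```python
# Temporal resolution taken from the slide: 4 bars, 4 beats per bar (4/4 time),
# 24 steps per beat.
BARS_PER_PHRASE = 4
BEATS_PER_BAR = 4
STEPS_PER_BEAT = 24
STEPS_PER_BAR = BEATS_PER_BAR * STEPS_PER_BEAT        # 96
STEPS_PER_PHRASE = BARS_PER_PHRASE * STEPS_PER_BAR    # 384


def to_step_index(bar, beat, step):
    """Map a 1-based (bar, beat, step) position to a 0-based step index
    counted from the start of the phrase. Hypothetical helper."""
    if not (1 <= bar <= BARS_PER_PHRASE
            and 1 <= beat <= BEATS_PER_BAR
            and 1 <= step <= STEPS_PER_BEAT):
        raise ValueError("position outside the 4-bar phrase")
    return ((bar - 1) * STEPS_PER_BAR
            + (beat - 1) * STEPS_PER_BEAT
            + (step - 1))


def from_step_index(index):
    """Inverse of to_step_index: 0-based index to 1-based (bar, beat, step)."""
    if not 0 <= index < STEPS_PER_PHRASE:
        raise ValueError("index outside the 4-bar phrase")
    bar, within_bar = divmod(index, STEPS_PER_BAR)
    beat, step = divmod(within_bar, STEPS_PER_BEAT)
    return bar + 1, beat + 1, step + 1
```

Because every bar has the same length, a position can be addressed by plain integer arithmetic, which is what lets a convolutional network treat each bar as a fixed-size grid.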
Data Representation: Piano-roll (with symbolic timing)
polyphonic, multi-track
[Figure: piano-roll of bars 1 to 4; vertical axis: pitch (e.g. A3); horizontal axis: time in time steps (t0, t1, ...)]
Data Representation: Multi-track Piano-roll (with symbolic timing)
polyphonic, multi-track
[Figure: stacked piano-rolls; axes: pitch, time, tracks]
Data Representation
5 tracks: Bass, Drums, Strings, Piano, Guitar
4 bars × 96 time steps per bar × 84 pitches × 5 tracks: a 4 × 96 × 84 × 5 tensor
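To make the 4 × 96 × 84 × 5 layout concrete, here is a minimal pure-Python sketch that builds an empty phrase and switches on the cells covered by one note. The track order, the lowest MIDI pitch (`LOWEST_PITCH = 24`) and the helper names are assumptions for illustration; the slides do not specify them.

```python
# Dimensions stated on the slide: bars x time steps x pitches x tracks.
N_BARS, STEPS_PER_BAR, N_PITCHES, N_TRACKS = 4, 96, 84, 5

# Assumptions (not given on the slide): track order and lowest pitch.
TRACKS = ["Bass", "Drums", "Guitar", "Piano", "Strings"]
LOWEST_PITCH = 24  # MIDI note number mapped to pitch index 0 (assumed)


def empty_phrase():
    """Return a 4 x 96 x 84 x 5 nested list filled with False."""
    return [[[[False] * N_TRACKS for _ in range(N_PITCHES)]
             for _ in range(STEPS_PER_BAR)]
            for _ in range(N_BARS)]


def add_note(phrase, track, midi_pitch, start_step, duration):
    """Switch on every cell covered by one note.

    start_step is counted from the start of the phrase (0 to 383);
    notes running past the end of the phrase are truncated.
    """
    track_index = TRACKS.index(track)
    pitch_index = midi_pitch - LOWEST_PITCH
    if not 0 <= pitch_index < N_PITCHES:
        raise ValueError("pitch outside the 84-pitch range")
    end_step = min(start_step + duration, N_BARS * STEPS_PER_BAR)
    for step in range(start_step, end_step):
        bar, position = divmod(step, STEPS_PER_BAR)
        phrase[bar][position][pitch_index][track_index] = True


phrase = empty_phrase()
# A3 (MIDI 57) on the piano, starting near the end of bar 1 and
# sustained across the bar line for 12 time steps.
add_note(phrase, "Piano", 57, start_step=90, duration=12)
```

A real pipeline would normally hold this data in a NumPy boolean array of the same shape; nested lists are used here only to keep the sketch dependency-free.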
Data
LPD (Lakh Pianoroll Dataset)
• >170,000 multi-track piano-rolls
• Derived from Lakh MIDI Dataset
• Mainly pop songs
[Dataset] https://salu133445.github.io/musegan/dataset
Pypianoroll (Python package)
• Manipulation & Visualization
• Efficient Save/Load
• Parse/Write MIDI files
• On PyPI (pip installable)
[Pypianoroll] https://salu133445.github.io/pypianoroll/
Generative Adversarial Networks
random noise z ~ p(z) → G → fake data G(z) → D → 1/0
real data X → D → 1/0
Generative Adversarial Networks
Goal of G: make G(z) indistinguishable from real data for D; objective to minimize: log(1-D(G(z)))
Goal of D: distinguish G(z) being fake from X being real; objective to minimize: log(1-D(X)) + log(D(G(z)))
random noise z ~ p(z) → G → fake data G(z) → D → 1/0
real data X → D → 1/0
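The two objectives above can be evaluated directly on discriminator outputs. The sketch below assumes D outputs a probability in (0, 1) that its input is real and that both expressions are losses to be minimized; the function names and the clipping constant `EPS` are illustrative choices, not part of the slides.

```python
import math

EPS = 1e-12  # assumed clipping constant to keep log() finite


def _clip(p):
    """Keep a probability strictly inside (0, 1)."""
    return min(max(p, EPS), 1.0 - EPS)


def generator_loss(d_fake):
    """Mean of log(1 - D(G(z))); small when D believes the fakes are real."""
    return sum(math.log(1.0 - _clip(d)) for d in d_fake) / len(d_fake)


def discriminator_loss(d_real, d_fake):
    """Mean of log(1 - D(X)) plus mean of log(D(G(z))).

    Small when D scores real data near 1 and generated data near 0.
    """
    real_term = sum(math.log(1.0 - _clip(d)) for d in d_real) / len(d_real)
    fake_term = sum(math.log(_clip(d)) for d in d_fake) / len(d_fake)
    return real_term + fake_term


# A discriminator that separates real from fake well ...
sharp = discriminator_loss(d_real=[0.9, 0.95], d_fake=[0.1, 0.05])
# ... scores lower (better, from D's point of view) than one that cannot.
confused = discriminator_loss(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])
```

The two losses pull D(G(z)) in opposite directions, which is the adversarial part: D wants it near 0, G wants it near 1.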
Generative Adversarial Networks
Generator: random noise z ~ p(z) → G → fake data G(z)
Discriminator: critic D (WGAN-GP) → real/fake
Real data X: 4-bar phrases of 5 tracks
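For readers unfamiliar with WGAN-GP, the critic loss in its standard formulation is E[D(G(z))] - E[D(X)] + λ E[(‖∇D(x̂)‖₂ - 1)²], where x̂ is a random interpolation between a real and a generated sample. The sketch below is a toy illustration of that formula only, not the MuseGAN training code: it assumes a critic given as a plain Python function on flat lists, estimates the gradient by finite differences instead of automatic differentiation, and uses the commonly cited default λ = 10.

```python
import math
import random


def numerical_gradient(critic, x, eps=1e-5):
    """Central-difference estimate of the gradient of critic at x."""
    grad = []
    for i in range(len(x)):
        hi, lo = list(x), list(x)
        hi[i] += eps
        lo[i] -= eps
        grad.append((critic(hi) - critic(lo)) / (2.0 * eps))
    return grad


def wgan_gp_critic_loss(critic, real_batch, fake_batch, lam=10.0, seed=0):
    """Wasserstein critic loss plus gradient penalty.

    real_batch and fake_batch are equally sized lists of flat samples.
    """
    rng = random.Random(seed)
    n = len(real_batch)
    wasserstein = (sum(critic(f) for f in fake_batch)
                   - sum(critic(r) for r in real_batch)) / n
    penalty = 0.0
    for real, fake in zip(real_batch, fake_batch):
        alpha = rng.random()  # random point on the line between real and fake
        x_hat = [alpha * r + (1.0 - alpha) * f for r, f in zip(real, fake)]
        grad = numerical_gradient(critic, x_hat)
        norm = math.sqrt(sum(g * g for g in grad))
        penalty += (norm - 1.0) ** 2
    return wasserstein + lam * penalty / n


# Toy linear critic whose gradient has norm 1 everywhere, so the penalty vanishes.
unit_critic = lambda x: 0.6 * x[0] + 0.8 * x[1]
real = [[1.0, 1.0], [2.0, 0.0]]
fake = [[0.0, 0.0], [0.0, 1.0]]
loss = wgan_gp_critic_loss(unit_critic, real, fake)
```

The negative of this quantity (without the penalty term) is what is usually plotted as the "negative critic loss" when monitoring WGAN training.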
MuseGAN – An Overview
1 random noise → temporal generator G_temp → 4 latent variables → bar generator G_bar → 4 piano-roll matrices
Generator: Bar Generator
[Figure: random vectors z fed to the bar generators G]
Generator: Bar Generator
• Coordination: track-independent z
• No coordination: track-dependent z
Generator: Bar Generator
[Figure: random vectors z fed to the bar generators G]
Generator: Bar Generator inputs
                    Time-dependent   Time-independent
Track-dependent     Melody           Groove
Track-independent   Chords           Style
[Figure: the Style, Chords, Groove and Melody random vectors z fed to the bar generators G]
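One way to read the table above: each bar generator call, for a given track and bar, receives four random vectors that differ in how widely they are shared. The sketch below illustrates only that sharing pattern. It assumes the four vectors are concatenated, samples the time-dependent vectors directly rather than through a temporal generator, and uses a hypothetical latent size `Z_DIM`; none of these details come from the slides.

```python
import random

TRACKS = ["Bass", "Drums", "Guitar", "Piano", "Strings"]
N_BARS = 4
Z_DIM = 8  # hypothetical latent size, chosen only for illustration


def sample_vector(rng, dim=Z_DIM):
    """Draw one Gaussian random vector."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def sample_latents(seed=0):
    """Sample the four kinds of random vectors named on the slide."""
    rng = random.Random(seed)
    return {
        # track-independent, time-independent: one vector for the whole phrase
        "style": sample_vector(rng),
        # track-independent, time-dependent: one vector per bar
        "chords": [sample_vector(rng) for _ in range(N_BARS)],
        # track-dependent, time-independent: one vector per track
        "groove": {t: sample_vector(rng) for t in TRACKS},
        # track-dependent, time-dependent: one vector per track and bar
        "melody": {t: [sample_vector(rng) for _ in range(N_BARS)] for t in TRACKS},
    }


def bar_generator_input(latents, track, bar):
    """Concatenate (assumed) the four vectors seen by one bar generator call."""
    return (latents["style"]
            + latents["chords"][bar]
            + latents["groove"][track]
            + latents["melody"][track][bar])


latents = sample_latents()
bass_bar0 = bar_generator_input(latents, "Bass", 0)
piano_bar0 = bar_generator_input(latents, "Piano", 0)
bass_bar1 = bar_generator_input(latents, "Bass", 1)
```

Two tracks in the same bar share the Style and Chords parts, while the same track in two different bars shares the Style and Groove parts.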
MuseGAN
Results
Sample 1, Sample 2 (labels: bass line, drum pattern, chords)
More samples on the demo page: https://salu133445.github.io/musegan/
[Figure: generated piano-rolls of the Bass, Drums, Guitar, Strings and Piano tracks at training steps 0, 700, 2500, 6000 and 7900]
Monitor the Training
Objective metrics
• UPC: number of used pitch classes per bar
• QN: ratio of qualified notes
[Figure: negative critic loss (log scale), UPC and QN plotted against training step (0 to 8000)]
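Both metrics can be computed from a binary piano-roll. The sketch below makes several assumptions that the slide does not state: a bar is a list of time steps, each a list of booleans over pitches; pitch index 0 is a C, so that the index modulo 12 gives the pitch class; and a note counts as "qualified" when it lasts at least `min_steps` time steps, with 3 used as an illustrative default.

```python
def used_pitch_classes(bar):
    """UPC: number of distinct pitch classes (0 to 12) sounding in one bar."""
    classes = set()
    for step in bar:
        for pitch, is_on in enumerate(step):
            if is_on:
                classes.add(pitch % 12)  # assumes pitch index 0 is a C
    return len(classes)


def note_durations(roll):
    """Lengths of consecutive runs of active cells, per pitch.

    A binary piano-roll cannot tell a held note from a repeated one,
    so each uninterrupted run is treated as a single note.
    """
    durations = []
    for pitch in range(len(roll[0])):
        run = 0
        for step in roll:
            if step[pitch]:
                run += 1
            elif run:
                durations.append(run)
                run = 0
        if run:
            durations.append(run)
    return durations


def qualified_note_ratio(roll, min_steps=3):
    """QN: share of notes lasting at least min_steps (assumed threshold)."""
    durations = note_durations(roll)
    if not durations:
        return 0.0
    return sum(1 for d in durations if d >= min_steps) / len(durations)


# Toy bar: 96 time steps x 84 pitches with three notes.
bar = [[False] * 84 for _ in range(96)]
for t in range(0, 12):
    bar[t][0] = True       # long note on pitch index 0
bar[20][4] = True          # one-step fragment on pitch index 4
for t in range(30, 33):
    bar[t][12] = True      # three-step note one octave above index 0

upc = used_pitch_classes(bar)
qn = qualified_note_ratio(bar)
```

A low QN suggests the generator is producing fragmented, overly short notes, which is why it is useful to track alongside the critic loss.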
User Study
Models compared: jamming, composer, hybrid
Criteria: H: harmonious; R: rhythmic; MS: musically structured; C: coherent; OR: overall rating
Accompaniment System (conditional GAN)
• Generation from scratch: nothing → 5-track
• Accompaniment system: single-track → 5-track
Summary
• MuseGAN
  ◦ a novel GAN for multi-track sequence generation
  ◦ multi-track, polyphonic music
  ◦ human-AI cooperative scenario
• Lakh Pianoroll Dataset (LPD) (new dataset!!)
• Pypianoroll (new package!!)
Future Works: Full Song Generation
Hierarchical temporal structure: song → paragraphs (1 to 3) → phrases (1 to 4) → bars (1 to 4) → beats (1 to 4) → steps (1 to 24)
Future Works: Cross-modal Generation
• Music + Video
• Music + Lyrics
• Video + Text
Music Information Research (MIR)
• Analysis
  ◦ music → features
  ◦ e.g. chord recognition, beat/downbeat detection, music transcription, source separation
• Retrieval
  ◦ query → music
  ◦ e.g. query by humming, similarity search, music recommendation, playlist generation
• Generation
  ◦ X → music
  ◦ e.g. generation, accompaniment, style transfer, mashup, remix
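Music and Audio Computing Lab (MACLab), Research Center for IT Innovation, Academia Sinica
Lab Director: Dr. Yi-Hsuan Yang
[Lab Website] http://mac.citi.sinica.edu.tw/
• Vocal separation: using machine learning to extract the vocal part and the music part from a song (separated music, separated vocals)
• Music highlight extraction
• Music generation
  ◦ Music puzzle game (application: music medley generation); demo: https://remyhuang.github.io/
  ◦ Composition system for the MIDI music format (search for MuseGAN, MidiNet)
  ◦ Accompaniment system
  ◦ Multi-track / multi-instrument model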
AAAI 2018
New Orleans
Mardi Gras
Q&A