Advanced Machine Learning (CS 7140), Spring 2018
Lecture 20: Generative Adversarial Networks
Jan-Willem van de Meent
Slide credits: Ian Goodfellow
Variational Autoencoders
[Figure: VAE architecture. Input images (28 x 28, 784 units) → hidden units (256) → encoding mean and std dev (2-50 units, sampled randomly) → hidden units (256) → reconstructed images (784 units, 28 x 28).]
Variational Autoencoders
Assume a prior $p(z)$ on the encoding $z$, with approximate posterior $q(z \mid x)$.
Objective: Evidence Lower Bound (ELBO)
$$\mathcal{L}(x) = \mathbb{E}_{q(z \mid x)}\left[\log p(x \mid z)\right] - \mathrm{KL}\left(q(z \mid x) \,\|\, p(z)\right) = \log p(x) - \mathrm{KL}\left(q(z \mid x) \,\|\, p(z \mid x)\right)$$
• First form: reconstruction log-likelihood minus the KL between the approximate posterior and the prior.
• Second form: log marginal likelihood minus the KL between the approximate posterior and the posterior.
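The first form of the ELBO can be estimated with a single reparameterized sample. The sketch below is only illustrative: it assumes a standard normal prior p(z) = N(0, I), a diagonal Gaussian q(z | x), and a Bernoulli likelihood over binary pixels, none of which are stated on the slide. The function names and the linear `decode` stand-in are hypothetical.

```python
import numpy as np

def kl_diag_gaussian_to_standard_normal(mu, sigma):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * np.sum(sigma**2 + mu**2 - 1.0 - 2.0 * np.log(sigma), axis=-1)

def bernoulli_log_likelihood(x, probs, eps=1e-7):
    """log p(x | z) for binary pixels x, given decoder output probabilities."""
    probs = np.clip(probs, eps, 1.0 - eps)
    return np.sum(x * np.log(probs) + (1.0 - x) * np.log(1.0 - probs), axis=-1)

def elbo_estimate(x, mu, sigma, decode, rng):
    """Single-sample Monte Carlo estimate of the ELBO (reconstruction minus KL)."""
    z = mu + sigma * rng.standard_normal(mu.shape)  # reparameterized z ~ q(z | x)
    reconstruction = bernoulli_log_likelihood(x, decode(z))
    return reconstruction - kl_diag_gaussian_to_standard_normal(mu, sigma)

rng = np.random.default_rng(0)
x = (rng.random(784) > 0.5).astype(float)        # fake binary 28 x 28 image
W = 0.01 * rng.standard_normal((2, 784))         # hypothetical linear decoder weights
decode = lambda z: 1.0 / (1.0 + np.exp(-(z @ W)))
value = elbo_estimate(x, np.zeros(2), np.ones(2), decode, rng)
```

With `mu = 0` and `sigma = 1` the KL term vanishes, so `value` is just the sampled reconstruction log-likelihood.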
Generative Adversarial Networks
[Figure: GAN framework. Left path: x sampled from data → differentiable function D → D(x) tries to be near 1. Right path: input noise z → differentiable function G → x sampled from model → D, where D tries to make D(G(z)) near 0 and G tries to make D(G(z)) near 1.]
Implicit Generative Models
A generator maps noise $z$ to a sample $x = G(z; \theta)$.
• Can generate samples $x \sim p_G(x)$
• Cannot evaluate density $p_G(x)$
• Requirement: $G(z; \theta)$ is differentiable
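A minimal sketch of what "implicit" means in practice: the model exposes a sampling procedure but no density. The `ToyGenerator` class, its tanh-affine form, and the standard normal noise distribution are assumptions made for illustration only.

```python
import numpy as np

class ToyGenerator:
    """Hypothetical generator G(z; theta) = tanh(z @ W + b), differentiable in theta = (W, b)."""

    def __init__(self, latent_dim, data_dim, rng):
        self.latent_dim = latent_dim
        self.W = rng.standard_normal((latent_dim, data_dim))
        self.b = np.zeros(data_dim)
        self.rng = rng

    def __call__(self, z):
        return np.tanh(z @ self.W + self.b)

    def sample(self, n):
        # Assumed noise distribution: z ~ N(0, I).
        z = self.rng.standard_normal((n, self.latent_dim))
        return self(z)

rng = np.random.default_rng(0)
G = ToyGenerator(latent_dim=2, data_dim=5, rng=rng)
samples = G.sample(3)  # draws x ~ p_G(x) without ever computing p_G(x)
```

Note that the class deliberately has no `log_prob`-style method: sampling is cheap, but nothing here gives the value of p_G at a chosen x.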
Maximum Likelihood
$$\theta^* = \arg\max_\theta \, \mathbb{E}_{x \sim p_{\text{data}}} \log p_{\text{model}}(x \mid \theta)$$
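As a small worked instance of this objective, the sketch below replaces the expectation over p_data with an average over samples and searches a grid of θ values. The unit-variance Gaussian model, the synthetic data, and the grid are assumptions chosen so the answer is easy to verify: the maximizer should land on the sample mean.

```python
import numpy as np

def average_log_likelihood(data, mean, std=1.0):
    """Empirical estimate of E_{x ~ p_data} log p_model(x | theta), with theta = mean."""
    return np.mean(-0.5 * np.log(2.0 * np.pi * std**2) - 0.5 * ((data - mean) / std) ** 2)

rng = np.random.default_rng(0)
data = rng.normal(loc=3.0, scale=1.0, size=1000)  # stand-in for samples from p_data

candidates = np.linspace(0.0, 6.0, 601)           # grid over theta, step 0.01
scores = np.array([average_log_likelihood(data, m) for m in candidates])
theta_star = candidates[np.argmax(scores)]        # arg max over the grid
```

For this model the grid maximizer agrees with `data.mean()` up to the grid spacing, which matches the closed-form maximum likelihood estimate for a Gaussian mean.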
Objective: Minimax Game
• Generator $G(z)$ generates fake images.
• Discriminator $D(x)$ classifies real images as 1 and fake images as 0.
• Objective for $G$: Maximize probability that $G$ fools $D$.
• Objective for $D$: Minimize probability that $D$ is fooled.
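The bullets above describe a two-player game over a single value function. The sketch below assumes the standard GAN form V(D, G) = E_{x ~ p_data}[log D(x)] + E_z[log(1 - D(G(z)))], which D maximizes and G minimizes; that formula is not recoverable from the slide text itself, and the toy one-dimensional `G`, `D_sharp`, and `D_fooled` are hypothetical.

```python
import numpy as np

def value_function(D, G, real_x, z):
    """Monte Carlo estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))]."""
    return np.mean(np.log(D(real_x))) + np.mean(np.log(1.0 - D(G(z))))

rng = np.random.default_rng(0)
real_x = rng.normal(loc=2.0, scale=1.0, size=5000)  # stand-in for real data
z = rng.standard_normal(5000)                       # input noise

G = lambda z: z - 2.0                               # hypothetical generator: fakes centered at -2
D_sharp = lambda x: 1.0 / (1.0 + np.exp(-4.0 * x))  # discriminator that separates the two modes
D_fooled = lambda x: np.full_like(x, 0.5)           # discriminator that cannot tell real from fake

v_sharp = value_function(D_sharp, G, real_x, z)
v_fooled = value_function(D_fooled, G, real_x, z)   # equals 2 * log(0.5) = -log(4)
```

A discriminator that is rarely fooled scores higher than one that always outputs 0.5, which is why D pushes V up while G tries to push it down.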
Exercise
• Suppose that we could evaluate $p_{\text{data}}(x)$ and $p_G(x)$.
• Can you express $D(x)$ in terms of $p_{\text{data}}(x)$ and $p_G(x)$?
SOLUTION:
$$D^*(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_G(x)}$$
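One way to check the ratio form D*(x) = p_data(x) / (p_data(x) + p_G(x)): assuming the standard GAN value function, V can be written as an integral over x of p_data(x) log D(x) + p_G(x) log(1 - D(x)), so D can be optimized separately at each x. Setting the derivative with respect to d = D(x) to zero gives p_data(x) / d = p_G(x) / (1 - d), which rearranges to the ratio. The sketch below confirms this numerically for hypothetical density values at a single point.

```python
import numpy as np

def pointwise_objective(d, p_data, p_g):
    """Integrand of V(D, G) at one fixed x, as a function of d = D(x)."""
    return p_data * np.log(d) + p_g * np.log(1.0 - d)

d_grid = np.linspace(0.001, 0.999, 999)  # candidate values of D(x), step 0.001

def best_d(p_data, p_g):
    """Grid maximizer of the pointwise objective."""
    return d_grid[np.argmax(pointwise_objective(d_grid, p_data, p_g))]

# Hypothetical density values at some point x.
numeric = best_d(p_data=0.3, p_g=0.1)
closed_form = 0.3 / (0.3 + 0.1)
```

When the two densities are equal the same check returns 0.5, meaning the discriminator can do no better than guessing.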