Compositional Methods for Learning and Inference in Deep Probabilistic Programs
Jan-Willem van de Meent, Eli Sennesh, Sam Stites, Hao Wu, Heiko Zimmermann
Deep Learning Success Stories

Computer Vision: 14M images (ImageNet), annotations available
Natural Language: very large corpora of text (can self-supervise)
Reinforcement Learning: 4.9M games (self-play), clear definition of success

Ingredients for success:
1. Abundance of (labeled) data and compute
2. A well-defined, general notion of utility
Do we still need models?

The Bitter Lesson (Rich Sutton, March 13, 2019):
"The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin."
http://www.incompleteideas.net/IncIdeas/BitterLesson.html

Do we still need models, or just more data and compute? (Max Welling, April 20, 2019):
"When you need to generalize to new domains, i.e. extrapolate away from the data, you will need a generative model."
https://staff.fnwi.uva.nl/m.welling/wp-content/uploads/Model-versus-Data-AI-1.pdf
Do we still need models?
When are models useful?

Science & Engineering: high-quality models and/or limited data
Autonomous Vehicles: generalization to long-tail events
Recommendation: large collection of small-data problems

We need inductive biases that
1. improve generalization
2. safeguard against overconfident predictions
Deep Probabilistic Models

Deep Learning:
• High-capacity models
• Scalable to large datasets
• Easy to try new models
• SGD + AutoDiff (very general)

Probabilistic Programming:
• Programs as inductive biases
• Structured, interpretable
• Also easy to try new models
• Monte Carlo methods (more model-specific)

Stochastic Variational Inference: learn proposals using neural networks
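For reference, the objective that stochastic variational inference maximizes in this setting is the standard evidence lower bound (ELBO), written here for data x, latents z, model parameters θ, and proposal (inference network) parameters φ:

    \mathcal{L}(\theta, \phi) \;=\; \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x, z) - \log q_\phi(z \mid x)\big] \;\le\; \log p_\theta(x)

Gradients of this bound with respect to both θ and φ can be estimated from samples of q_φ (e.g. via reparameterization), which is why the SGD + AutoDiff machinery from deep learning applies directly to these probabilistic programs.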
Structured Variational Autoencoders

Generative Model (Decoder): p_θ(x | y, z) p(y) p(z), with digit y and style z generating pixels x
Inference Model (Encoder): q_φ(y, z | x) q(x), where q(x) is the data distribution

Goal: learn a "disentangled" representation for y and z
Assume independence between digit y and style z
Infer y from pixels x, and z from y and x

[Kingma, Mohamed, Jimenez-Rezende, Welling, NIPS 2014]
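Stated as densities, this is a restatement of the factorization on this slide, with the conditional structure "infer y from pixels x, and z from y and x" made explicit:

    p_\theta(x, y, z) \;=\; p_\theta(x \mid y, z)\, p(y)\, p(z)
    q_\phi(y, z \mid x) \;=\; q_\phi(y \mid x)\, q_\phi(z \mid x, y)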
Deep Probabilistic Programs

Generative Model (Decoder):

    class Decoder(torch.nn.Module):
        def __init__(self, x_sz, h_sz, y_sz, z_sz):
            # initializes layers: h, x_mean, ...
            ...

        def forward(self, x, q):
            p = probtorch.Trace()
            y = p.concrete(self.y_log_weights, 0.66,
                           value=q['y'], name='y')
            z = p.normal(0.0, 1.0,
                         value=q['z'], name='z')
            h = self.h(torch.cat([y, z], -1))
            x = p.loss(self.bce, self.x_mean(h), x,
                       name='x')
            return p

Inference Model (Encoder):

    class Encoder(torch.nn.Module):
        def __init__(self, x_sz, h_sz, y_sz, z_sz):
            # initializes layers: h, y_log_weights, ...
            ...

        def forward(self, x, y_values=None):
            q = probtorch.Trace()
            h = self.h(x)
            y = q.concrete(self.y_log_weights(h), 0.66,
                           value=y_values, name='y')
            hy = torch.cat([h, y], -1)
            z = q.normal(self.z_mean(hy), self.z_std(hy),
                         name='z')
            return q

Libraries: Edward (https://github.com/blei-lab/edward), Probabilistic Torch (https://github.com/probtorch/probtorch), Pyro (https://github.com/uber/pyro)
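To close the loop between these two traces, a training step runs the encoder to obtain a guide trace q, scores the same samples under the decoder's trace p, and ascends a Monte Carlo estimate of the ELBO. The following is a minimal sketch, not taken from the talk: it assumes the Decoder and Encoder above are filled in with concrete layers, uses an illustrative random batch in place of a data loader, and treats the exact objective import (shown here as probtorch.objectives.montecarlo.elbo) as an assumption about the Probabilistic Torch API.

    import torch
    import probtorch
    # Assumed objective: a Monte Carlo ELBO computed from a (q, p) trace pair.
    from probtorch.objectives.montecarlo import elbo

    x_sz, h_sz, y_sz, z_sz = 784, 256, 10, 50   # illustrative MNIST-style sizes
    enc = Encoder(x_sz, h_sz, y_sz, z_sz)
    dec = Decoder(x_sz, h_sz, y_sz, z_sz)
    opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)

    x = torch.rand(128, x_sz)        # stand-in for a batch of digit images
    for step in range(100):
        q = enc(x)                   # guide trace: samples y and z given x
        p = dec(x, q)                # generative trace scored at q's samples
        loss = -elbo(q, p)           # negative Monte Carlo ELBO (assumed scalar)
        opt.zero_grad()
        loss.backward()
        opt.step()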
Learned Representations (Unsupervised)

[Figure: learned style variables (slant, width, height, thickness) and generalization across styles 1-3]

Inductive Bias: Style features are uncorrelated with the digit label, as well as with other features.

[Esmaeili, Wu, Jain, Bozkurt, Siddharth, Paige, Brooks, Dy, van de Meent, AISTATS 2019]
Model Composition

[Figure: recurrent recognition loop, decomposition, and reconstructions for multi-digit images]

Idea: Embed the model for individual MNIST digits in a recurrent model for multiple-object detection; the encoder uses a recurrent network to recognize the objects one at a time.

[Siddharth*, Paige*, van de Meent*, Desmaison, Wood, Goodman, Kohli, Torr, NIPS 2017]
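As a rough illustration of what "embedding the single-digit model in a recurrent model" means, the sketch below reuses the per-digit prior from the earlier Decoder inside a loop over K object slots and superimposes the resulting digits on one canvas. It is illustrative only: digit_mean is a hypothetical stand-in for the per-digit decoder network, and the attention and placement machinery of the actual NIPS 2017 model is omitted.

    import torch
    import probtorch

    def multi_digit_prior(digit_mean, K=3, y_sz=10, z_sz=50):
        # digit_mean(y, z) is a hypothetical network mapping a digit label y and
        # style z to a 28x28 image mean, as in the single-digit Decoder above.
        p = probtorch.Trace()
        canvas = torch.zeros(28, 28)
        for k in range(K):
            y = p.concrete(torch.zeros(y_sz), 0.66, name='y_%d' % k)
            z = p.normal(torch.zeros(z_sz), torch.ones(z_sz), name='z_%d' % k)
            canvas = canvas + digit_mean(y, z)   # superimpose digits on one canvas
        return p, canvas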
Example: Modeling Aspects in Reviews

[Architecture diagram. Item Encoder: x_i : V → h_i : H → ρ_i : A⨉K. User Encoder: x_u : V → h_u : H → ρ_u : A⨉K. Sentence Encoder: x_i,u,s : V → h_i,u,s : H → z_i,u,s : A (Concrete distribution). Sentence Decoder reconstructs x_i,u,s : V from ψ_i,u,s : A⨉K and ω_i,u,s : A. Legend: ⨉ element-wise product, ⨉* broadcast product, c Concrete distribution; ρ_i,u : A⨉K and ψ_i,u : A⨉K are intermediate combinations.]

Learn aspect-based representations of users, items, and reviews (fully unsupervised)

[Esmaeili, Huang, Wallace, van de Meent, AISTATS 2019]
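A very loose sketch of the combination operators the legend names, not the paper's actual implementation: item and user aspect matrices are merged with an element-wise product, a per-sentence aspect assignment is drawn from a Concrete distribution, and the result is combined with a broadcast product. The q.concrete call pattern follows the earlier Encoder example; everything else (names, shapes) is hypothetical.

    import torch
    import probtorch

    def combine_aspects(q, rho_i, rho_u, sentence_logits, temperature=0.66):
        # rho_i, rho_u: item/user aspect matrices of shape (A, K)
        # sentence_logits: per-sentence aspect logits of shape (A,)
        rho_iu = rho_i * rho_u                                   # element-wise product, (A, K)
        z = q.concrete(sentence_logits, temperature, name='z')   # Concrete aspect assignment, (A,)
        psi = z.unsqueeze(-1) * rho_iu                           # broadcast product, (A, K)
        return psi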
Example: Modeling Aspects in Reviews

Data: Beer reviews

"Amber brown in color with very little head but a nice ring. Nicely carbonated. Smells like a camp fire, malts have a good sweet character with an abundance of smoke. Taste is quite good with smokiness being pungent but not overwelming. A sweet tasting bock with smokiness coming through around mid drink with a smooth mellow finish. A good warming smoky beer."

Aspects: Look, Mouthfeel, Aroma, Taste, Overall

[Esmaeili, Huang, Wallace, van de Meent, AISTATS 2019]