Memory Augmented Policy Optimization (MAPO) for Program Synthesis and Semantic Parsing
Chen Liang, Mohammad Norouzi, Jonathan Berant, Quoc Le, Ni Lao
Program Synthesis / Semantic Parsing
Question: how many more passengers flew to los angeles than to saskatoon?
Answer: 12,467
Latent program (only the answer is observed, so the reward is sparse):
(filter in rows ['saskatoon'] r.city)
(filter in rows ['los angeles'] r.city)
(diff v1 v0 r.passengers)
Policy Gradient (on-policy): the actor draws samples from the current policy, and the learner uses them to produce an updated policy. Unbiased => converges to an optimal solution, but high variance => slow training.
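A minimal sketch of this on-policy estimator, assuming a PyTorch-style policy with hypothetical sample_program / reward_fn interfaces (not the actual MAPO code):

import torch

def reinforce_loss(policy, contexts, reward_fn, num_samples=4):
    # Unbiased but high-variance on-policy estimate of -E[R(a)].
    # policy.sample_program(ctx) -> (program, log_prob)   (assumed interface)
    # reward_fn(ctx, program)    -> 0.0 or 1.0            (sparse binary reward)
    losses = []
    for ctx in contexts:
        for _ in range(num_samples):
            program, log_prob = policy.sample_program(ctx)
            reward = reward_fn(ctx, program)  # most samples score 0, so the gradient signal is noisy
            # Score-function (REINFORCE) term: the gradient of -R(a) * log pi(a) is unbiased.
            losses.append(-reward * log_prob)
    return torch.stack(losses).mean()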
Imitation Learning: the actor follows demonstrations, and the learner uses them to produce an updated policy. Low variance => fast training, but biased => suboptimal solution, and it requires human supervision.
MAPO: a memory buffer stores high-reward samples; the actor produces samples both inside and outside the memory, and the learner uses both to produce an updated policy. Unbiased => optimal solution, and low variance => fast training.
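A minimal sketch of the memory buffer idea, assuming a per-question dictionary of correct programs (the names here are illustrative, not the actual implementation):

from collections import defaultdict

class HighRewardMemory:
    # Per-question buffer of programs that have already obtained reward 1.
    def __init__(self):
        self._buffer = defaultdict(set)  # question id -> set of program strings

    def add(self, question_id, program, reward):
        # Only fully correct programs (sparse reward == 1) are memorized.
        if reward == 1.0:
            self._buffer[question_id].add(program)

    def programs(self, question_id):
        # Programs in memory are enumerated exactly during training; everything else is sampled.
        return sorted(self._buffer[question_id])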
Gradient estimate by sampling: the expectation of the gradient over the program space is approximated with Monte Carlo samples. Unbiased, but high variance.
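This sampling estimate is the standard score-function (REINFORCE) estimator; with the notation assumed here (a is a program, R(a) its binary reward, K the number of samples):

\nabla_\theta \, \mathbb{E}_{a \sim \pi_\theta}[R(a)]
  = \mathbb{E}_{a \sim \pi_\theta}\big[R(a)\,\nabla_\theta \log \pi_\theta(a)\big]
  \approx \frac{1}{K} \sum_{k=1}^{K} R(a^{(k)})\,\nabla_\theta \log \pi_\theta(a^{(k)}),
  \qquad a^{(k)} \sim \pi_\theta .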
MAPO gradient estimate: enumerate the programs inside the memory and sample only from the programs outside the memory. This stratified sampling draws samples from a smaller space, which reduces variance while keeping the estimate unbiased.
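A rough sketch of this stratified estimator, using the same hypothetical PyTorch-style interface as above: the stratum inside the memory is enumerated and weighted by its exact probability mass, and the stratum outside it is estimated by sampling, so the combined estimate stays unbiased (refinements such as memory-weight clipping are omitted):

import torch

def mapo_loss(policy, ctx, memory_programs, reward_fn, num_outside_samples=4):
    # Stratified estimate of -E[R(a)]: exact enumeration inside memory, sampling outside.
    # policy.log_prob(ctx, program) and policy.sample_program(ctx) are assumed interfaces.

    # 1) Enumerate programs inside memory; their total probability mass is pi_B.
    inside_log_probs = [policy.log_prob(ctx, p) for p in memory_programs]
    if inside_log_probs:
        pi_b = torch.stack(inside_log_probs).exp().sum().detach()
        # Every memory program has reward 1; weight its log-prob by its (detached) probability.
        inside_term = -sum(lp.exp().detach() * lp for lp in inside_log_probs)
    else:
        pi_b = torch.tensor(0.0)
        inside_term = torch.tensor(0.0)

    # 2) Sample programs outside the memory (simple rejection against the buffer).
    memory_set = set(memory_programs)
    outside_losses = []
    for _ in range(num_outside_samples):
        program, log_prob = policy.sample_program(ctx)
        if program in memory_set:
            continue  # this stratum is already covered exactly by the enumeration above
        reward = reward_fn(ctx, program)
        outside_losses.append(-reward * log_prob)
    outside_term = torch.stack(outside_losses).mean() if outside_losses else torch.tensor(0.0)

    # 3) Combine the strata, weighting the sampled one by its remaining probability mass.
    return inside_term + (1.0 - pi_b) * outside_term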
MAPO objective: maximize the expected reward E_{a ~ π_θ}[R(a)], where a is a program and R(a) indicates whether it is correct or not.
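Written out with notation assumed here (B is the memory buffer of high-reward programs, pi_B the total probability the policy assigns to it), the expectation decomposes over the buffer as:

\mathbb{E}_{a \sim \pi_\theta}[R(a)]
  = \sum_{a \in \mathcal{B}} \pi_\theta(a)\,R(a)
  + (1 - \pi_{\mathcal{B}})\,\mathbb{E}_{a \sim \pi_\theta,\; a \notin \mathcal{B}}[R(a)],
  \qquad \pi_{\mathcal{B}} = \sum_{a \in \mathcal{B}} \pi_\theta(a).

The first term is computed exactly by enumeration; only the second term is estimated by sampling.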
WikiTableQuestions: the first state-of-the-art result achieved using RL.
WikiSQL: comparing weakly supervised MAPO against strongly supervised models.
● MAPO converges more slowly than iterative maximum likelihood training, but reaches a better solution.
● REINFORCE makes little progress (<10% accuracy).
An efficient policy optimization method for learning to generate sequences from sparse rewards.
Code: https://github.com/crazydonkey200/neural-symbolic-machines
Paper: https://arxiv.org/abs/1807.02322
Poster: Room 517 AB #137
http://crazydonkey200.github.io/