Public Policy and Deep Reinforcement Learning on AWS Emily Webber | - PowerPoint PPT Presentation

Public Policy and Deep Reinforcement Learning on AWS Emily Webber | Machine Learning Specialist at Amazon Web Services | To-be-open-sourced research project

Public Policy Has Unique Challenges
• Structural Inefficiency
• Lack of single goal
• Synthesize Information
• Leadership Turnover

What if we used machine learning to optimize public policy?
• Decades of Normalized Economic Data
• Collaborative Policy Data
• Reinforcement Learning
• Personalized, Collaborative, Transparent

Data-Driven Public Policy Analysis Is Not New
• Causal Inference
• Counterfactual analysis
• Intuitively, what would have happened if the policy (or, treatment) had not been applied?
• Can we convince ourselves that the two groups were nearly identical otherwise?

                      Before   After
Treatment: Illinois   100      250
Control: New York     100      150

Did the treatment cause this difference?

Z = \gamma_0 + \gamma_1 Y_1 + \gamma_2 Y_2 + \gamma_U Y_U + \dots + \vartheta
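One common way to read a before/after table for a treatment group and a control group is a difference-in-differences estimate: the change in the treated state minus the change in the control state. The sketch below applies that arithmetic to the Illinois and New York numbers from the slide. Treating the slide's table this way is an assumption, and the function name `diff_in_diff` is hypothetical.

```python
# Outcome values from the slide's before/after table.
outcomes = {
    "Illinois": {"before": 100, "after": 250},  # treatment group
    "New York": {"before": 100, "after": 150},  # control group
}


def diff_in_diff(treated, control):
    """Change in the treated group minus change in the control group."""
    treated_change = treated["after"] - treated["before"]
    control_change = control["after"] - control["before"]
    return treated_change - control_change


effect = diff_in_diff(outcomes["Illinois"], outcomes["New York"])
print(effect)  # 100
```

The estimate only supports a causal reading if the two states would have moved in parallel without the policy, which is the "nearly identical otherwise" question the slide raises.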

Learning Theory Fundamentals
• Machine Learning: Use Case, Data, Model
• Reinforcement Learning: Use Case, Actions, Rewards

Mathematically speaking
Bellman Equation for Reinforcement Learning
Equation annotations:
• Available action
• Current state
• Adjacent state, iterable
• Utility per state
• Discount factor
• Transition value
• Recursive call or value on utility function
• Reward per state, a real number
• For each possible adjacent state
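The annotations on this slide line up with the textbook form of the Bellman equation, U(s) = R(s) + \gamma \max_a \sum_{s'} P(s' \mid s, a) U(s'), where R(s) is the reward per state, \gamma is the discount factor, a ranges over available actions, and the sum runs over each possible adjacent state s'. That this is the exact form shown on the slide is an assumption. The sketch below solves the equation by value iteration on a tiny made-up problem; the state names, action names, probabilities, and rewards are all hypothetical.

```python
def value_iteration(states, actions, transition, reward, gamma=0.9, tol=1e-6):
    """Repeatedly apply the Bellman update until utilities stop changing."""
    utility = {s: 0.0 for s in states}
    while True:
        new_utility = {}
        for s in states:
            # Expected utility of the best available action in state s.
            best = max(
                sum(p * utility[s2] for s2, p in transition[s][a].items())
                for a in actions
            )
            new_utility[s] = reward[s] + gamma * best
        delta = max(abs(new_utility[s] - utility[s]) for s in states)
        utility = new_utility
        if delta < tol:
            return utility


# Hypothetical two-state example: an economy that is "low" or "high".
states = ["low", "high"]
actions = ["stay", "invest"]
reward = {"low": 0.0, "high": 1.0}
# transition[state][action] maps each adjacent state to its probability.
transition = {
    "low": {"stay": {"low": 1.0}, "invest": {"high": 0.8, "low": 0.2}},
    "high": {"stay": {"high": 1.0}, "invest": {"high": 1.0}},
}

utility = value_iteration(states, actions, transition, reward)
print(utility)
```

With these made-up numbers the "high" state settles near 10, since it earns a reward of 1 every step discounted by 0.9, and "low" settles a little lower because investing only sometimes succeeds.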

Our reward function
“Pareto”
• A deep learning model maps the economic variables to a policy suggestion
• The simulator picks treatment and control states and runs a regression on historical data
• We use the estimated effect of the policy as our reward signal, scaled by validity of the experiment

Validity of the experiment:
• Ask, are they similar? T-test
• Use logical reasoning
• Eventually, scale with another ML model using data labelled by experts

Reinforcement Learning | Policy Estimation | Causal Inference
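A minimal sketch of the reward signal described above: an estimated policy effect multiplied by a validity score for the experiment. The slide names a t-test as the similarity check but does not say how the result becomes a scaling factor, so the mapping `1 / (1 + |t|)` used here is purely an assumption, as are the function names and the sample data.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Welch's t-statistic for two independent samples."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    se = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    if se == 0:
        # No spread in either sample: identical means give t = 0,
        # otherwise the groups are treated as maximally different.
        return 0.0 if mean_a == mean_b else math.inf
    return (mean_a - mean_b) / se


def validity(pre_treated, pre_control):
    """Hypothetical similarity score between 0 and 1 (1 = same pre-treatment mean)."""
    return 1.0 / (1.0 + abs(welch_t(pre_treated, pre_control)))


def reward(estimated_effect, pre_treated, pre_control):
    """Estimated policy effect scaled by how comparable the two groups were."""
    return estimated_effect * validity(pre_treated, pre_control)


# Made-up pre-treatment observations for a treated and a control state.
similar = reward(100, [98, 101, 100, 103], [99, 100, 102, 101])
dissimilar = reward(100, [1, 2, 3], [10, 11, 12])
print(similar, dissimilar)
```

When the two groups look alike before the policy, the reward stays close to the full estimated effect; when they differ sharply, the same estimate is heavily discounted.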

But how do we pick the right way to optimize?

Philosophical Foundations
• Kantian Rights: Universal Rights
• Utilitarianism: Personal Value
• Egalitarianism: Equality
• Libertarianism: Freedom

Pareto Improvements
Improve at least one person, without making anyone worse off
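The Pareto improvement definition above can be checked mechanically. The sketch below assumes each person's situation is summarized by a single number where higher is better; the function name and the sample people are hypothetical.

```python
def is_pareto_improvement(before, after):
    """True if no one is worse off and at least one person is better off."""
    if before.keys() != after.keys():
        raise ValueError("before and after must cover the same people")
    no_one_worse = all(after[p] >= before[p] for p in before)
    someone_better = any(after[p] > before[p] for p in before)
    return no_one_worse and someone_better


before = {"ana": 3, "ben": 5, "cho": 2}
print(is_pareto_improvement(before, {"ana": 4, "ben": 5, "cho": 2}))  # True
print(is_pareto_improvement(before, {"ana": 9, "ben": 4, "cho": 2}))  # False
```

The second case fails even though the total rises, because one person ends up worse off.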

There is no single best optimization strategy.
What we can do is use data to automatically suggest policies based on user-defined preferences.

What do you want to see in public policy?
• Personal Freedom
• Equality of outcomes
• Less crime
• Access to education
• Access to social services
• Less waste
• Equality of opportunity
• Less traffic
• Better health care
[Submit]

What do you think impacts crime the most?
"In my neighborhood, people commit crimes because there are no jobs here."
[Confirm?]

Given your views, we recommend evaluating:
• Outcomes: Crime, Income
• Indicators: Employment, Savings

These policies are impacting you today.
• Bill 789: 13.45, Reducing income
• Bill 238: 42.66, Creating jobs
• Bill 121: .05, Increasing traffic

Your policy recommendations
• Bill 789: Reduce taxation
• Bill 238: Continue investment
• Bill 121: Build more highways

Here’s how to engage your elected officials
"Please correct bill 789, it is lowering my income" [Email]

What if we could step into someone else’s shoes?
Would you like to see another point of view?

            Your policy recommendations   Another point of view
Bill 789    Reduce taxation               Increase taxation
Bill 238    Continue investment           Continue investment
Bill 121    Build more highways           More Public Transit
            Personal Freedom Increase     Equality Overall Increase

Technically speaking:

    for ism in philosophical_frameworks:
        utility = define_utility(ism)
        data = update_data(utility)
        model = get_pareto(data)

How should we handle air traffic delays?

Kantian Rights: Uphold human rights
• Uphold the human sanctity of travelers
• Provide food, lodging, respectful notice
• Make reasonable attempts to avoid delays

Utilitarianism: Do whatever increases overall utility
• Different people value timeliness differently
• Need multiple ways of defining utility for diverse stakeholders
• Use testing and surveys to get a numerical estimate for how different people value certain outcomes

Egalitarianism: Do what increases overall equality
• Don’t prioritize airline status travelers
• Don’t let people pay more for perks
• Don’t do special favors
• Treat each traveler, airliner, and airport the same

Libertarianism: Preserve Freedom
• Let people pick for themselves
• Don’t automatically make decisions for travelers
• Let travelers switch across airliners
• Ensure freedom of airliners and airports

There is fundamental overlap between the philosophical frameworks. This overlap can be scaled by reward functions.

There is no single right answer
We need a computational system that can:
• Synthesize different points of view
• Weight these based on criteria, like population size
• Be transparent, collaborative, timely
• Change with the times
To efficiently support existing governing bodies
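As one illustration of weighting points of view by population size, the sketch below combines per-group scores for a single policy into one number, with each group's weight set to its share of the total population. The group names, scores, and populations are made up, and a population-share weighted average is only one possible weighting rule, not a method stated in the slides.

```python
def weighted_score(scores, populations):
    """Average the groups' scores, weighting each by its population share."""
    total = sum(populations[group] for group in scores)
    return sum(scores[group] * populations[group] / total for group in scores)


# Hypothetical scores each group assigns to one candidate policy.
scores = {"urban": 0.8, "rural": 0.2}
populations = {"urban": 3000, "rural": 1000}

combined = weighted_score(scores, populations)
print(round(combined, 2))
```

With these numbers the larger group's view dominates, giving a combined score of roughly 0.65 rather than the unweighted midpoint of 0.5.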

Thank you!
Emily Webber | Amazon Web Services | LinkedIn
effective-policies@amazon.com ← email me to collaborate!
