The Big λ Project
Aws Albarghouthi · Calvin Smith
University of Wisconsin–Madison
[Figure: MapReduce dataflow — each input item i is mapped to m(i), the results are shuffled, and parallel reduce steps combine them into the output data.]
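The dataflow in the diagram (map each item i to m(i), shuffle, then reduce) can be sketched in sequential Python; a real runtime would distribute both phases across machines:

```python
from functools import reduce

def map_reduce(m, r, xs):
    """Map each input item i to m(i), then combine all results with r."""
    mapped = [m(i) for i in xs]   # map phase (conceptually parallel)
    return reduce(r, mapped)      # reduce phase (after the shuffle)

# Example: sum of squares over an input partition.
result = map_reduce(lambda i: i * i, lambda a, b: a + b, [1, 2, 3])
print(result)  # 14
```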
Big λ: Analyses from Examples [PLDI16]
Synthesize data-parallel programs from input/output examples.
Challenges
Non-determinism → generate proven-deterministic solutions
Variety of domains → parameterize by extensible APIs
Sparse search space → syntactically restrict to data-parallel programs
Higher-order sketches
Bias the search heavily towards data-parallel programs, e.g.:
  map x . reduce x
  flatmap x . reduce x . apply x
  map x . reduceByKey x . filter x
Big λ uses 8 templates, gathered from reference implementations.
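As a rough illustration (not Big λ's actual search), a higher-order sketch can be modeled as a skeleton with holes that synthesis fills in:

```python
from functools import reduce

# The template "map x . reduce x . apply x" as a skeleton with holes m, r, f.
def sketch_map_reduce_apply(m, r, f):
    return lambda xs: f(reduce(r, map(m, xs)))

# Filling the holes yields a concrete program, e.g. "longest string":
longest = sketch_map_reduce_apply(
    m=lambda s: (len(s), s),      # tag each string with its length
    r=lambda x, y: max(x, y),     # keep the larger tagged pair
    f=lambda p: p[1],             # project the string back out
)
print(longest(["ab", "abcd", "abc"]))  # abcd
```

Fixing the skeleton means the synthesizer only searches over the holes, which is what makes the sparse space tractable.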
Who uses the most #hashtags?

@Alice: “Hello AAIP #aaip #germany”
@Bob: “Coffee machine refilled yet? #caffeine #java #4thcup #zzz”
@Claire: “Torn between wine cellar and seminar #wine #seminar #zzz”

Hashtag counts 2, 4, 3 … must be @Bob!

let p = map m . reduce r . apply f
where m = λt. (len(filter(is_hashtag, t)), author(t))
      r = λx,y. max(x, y)
      f = λp. snd(p)

map m sends the tweets to {2, @Alice}, {4, @Bob}, {3, @Claire};
reduce r keeps the maximum pair, {4, @Bob};
apply f projects out the author: @Bob.
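The synthesized program can be rendered directly in Python; the (author, text) encoding of tweets is an assumption made for illustration:

```python
from functools import reduce

# Tweets as (author, text) pairs -- a hypothetical encoding of the slide's data.
tweets = [
    ("@Alice",  "Hello AAIP #aaip #germany"),
    ("@Bob",    "Coffee machine refilled yet? #caffeine #java #4thcup #zzz"),
    ("@Claire", "Torn between wine cellar and seminar #wine #seminar #zzz"),
]

def is_hashtag(word):
    return word.startswith("#")

m = lambda t: (len(list(filter(is_hashtag, t[1].split()))), t[0])  # (count, author)
r = lambda x, y: max(x, y)  # keep the pair with the higher hashtag count
f = lambda p: p[1]          # snd: project out the author

winner = f(reduce(r, map(m, tweets)))
print(winner)  # @Bob
```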
Synthesis modulo differential privacy? [in progress]
map m . reduce r
compute sensitivity → add noise → charge price
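One concrete reading of this pipeline uses the standard Laplace mechanism; in this sketch the sensitivity is supplied by hand rather than computed by a type system:

```python
import math
import random
from functools import reduce

def private_map_reduce(m, r, xs, sensitivity, epsilon):
    """Release reduce(r, map(m, xs)) with Laplace noise scaled to sensitivity/epsilon."""
    true_value = reduce(r, map(m, xs))
    scale = sensitivity / epsilon
    u = random.random() - 0.5           # u in [-0.5, 0.5)
    u = max(u, -0.4999999)              # avoid log(0) at the boundary
    # Sample Laplace(0, scale) via the inverse-CDF method.
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_value + noise           # the "price" charged is epsilon

# Counting query: each record contributes at most 1, so sensitivity = 1.
noisy_count = private_map_reduce(lambda x: 1, lambda a, b: a + b,
                                 range(100), sensitivity=1.0, epsilon=0.5)
```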
Key idea
Use a linear type system to track sensitivity and induce the cheapest program.
How can we automatically learn relational specifications? [FSE17, best paper award]
add(x, y) = z ⇐⇒ add(y, x) = z

Observations of add:
  i1  i2   r
   1   2   3
   3   4   7
   5   6  11
   4   3   7
   …   …   …

Unsupervised learning: learn constraints consistent with the observations.
Exploratory evaluation
Applied the technique to learn specifications of Python APIs, using ~1000 randomly sampled inputs per function.

Strings: concat(y, reverse(y)) = x ⇒ reverse(x) = x
Z3:      valid(x) = p ∧ valid(y) = p ⇒ valid(and(x, y)) = p
Trig:    x = y + π/2 ⇒ (sin(x) = z ⇐⇒ cos(y) = z)
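The sampling methodology can be sketched as: draw random inputs, evaluate each candidate specification, and keep the ones that are never falsified (function names here are illustrative, not the paper's tool):

```python
import random

def holds_on_samples(spec, gen, n=1000):
    """True iff the candidate spec holds on n randomly generated inputs."""
    return all(spec(*gen()) for _ in range(n))

# Commutativity of add:  add(x, y) = z  <=>  add(y, x) = z
add = lambda x, y: x + y
commutative = lambda x, y: add(x, y) == add(y, x)
gen_ints = lambda: (random.randint(-100, 100), random.randint(-100, 100))

# String spec from the slide:  concat(y, reverse(y)) = x  =>  reverse(x) = x
def palindrome_spec(y):
    x = y + y[::-1]
    return x[::-1] == x
gen_str = lambda: ("".join(random.choice("abc") for _ in range(6)),)

print(holds_on_samples(commutative, gen_ints))     # True
print(holds_on_samples(palindrome_spec, gen_str))  # True
```

A spec that fails on even one sample is discarded, so random testing quickly filters out false candidates such as commutativity of subtraction.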
Other directions
Synthesis of Datalog programs (graph analytics)
Synthesis of fair decision-making programs
Active-learning-based user interaction
Proofs as programs
…