  1. Sample Complexity for Data Driven Algorithm Design Maria-Florina (Nina) Balcan Carnegie Mellon University

  2. Analysis and Design of Algorithms
Classic algo design: solve a worst-case instance.
• Easy domains have optimal poly-time algos, e.g., sorting, shortest paths.
• Most domains are hard, e.g., clustering, partitioning, subset selection, auction design, …
Data-driven algo design: use learning & data for algo design.
• Suited when we repeatedly solve instances of the same algorithmic problem.

  3. Data Driven Algorithm Design
Data-driven algo design: use learning & data for algo design.
Different methods work better in different settings.
• Large family of methods – what’s best in our application?
• Prior work: largely empirical.
  • Artificial Intelligence: e.g., [Xu-Hutter-Hoos-LeytonBrown, JAIR 2008]
  • Computational Biology: e.g., [DeBlasio-Kececioglu, 2018]
  • Game Theory: e.g., [Likhodedov and Sandholm, 2004]

  4. Data Driven Algorithm Design
Data-driven algo design: use learning & data for algo design.
Different methods work better in different settings.
• Large family of methods – what’s best in our application?
• Prior work: largely empirical.
Our work: data-driven algos with formal guarantees.
• Several case studies of widely used algo families.
• General principles: push boundaries of algorithm design and machine learning.
Related in spirit to hyperparameter tuning, AutoML, meta-learning.

  5. Structure of the Talk
• Data-driven algo design as batch learning: a formal framework.
• Case studies: clustering, partitioning problems, auction problems.
• General sample complexity theorem.

  6. Example: Clustering Problems
Clustering: Given a set of objects, organize them into natural groups.
• E.g., cluster news articles, or web pages, or search results by topic.
• Or, cluster customers according to purchase history.
• Or, cluster images by who is in them.
Often need to solve such problems repeatedly.
• E.g., clustering news articles (Google News).

  7. Example: Clustering Problems
Clustering: Given a set of objects, organize them into natural groups.
Objective-based clustering:
• k-means: Input: set of objects S and metric d. Output: centers {c_1, c_2, …, c_k} minimizing Σ_p min_i d²(p, c_i).
• k-median: minimize Σ_p min_i d(p, c_i).
• k-center/facility location: minimize the maximum radius.
Finding OPT is NP-hard, so there is no universal efficient algo that works on all domains.
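To make these objectives concrete, here is a minimal Python sketch (not from the slides) that evaluates the k-means, k-median, and k-center costs for a given set of centers; the NumPy-based helpers and array layout are illustrative assumptions.

import numpy as np

def _point_to_center_dists(points, centers):
    # points: (n, dim) array; centers: (k, dim) array -> (n, k) distance matrix
    return np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)

def kmeans_cost(points, centers):
    """k-means objective: sum over points of squared distance to the nearest center."""
    return float((_point_to_center_dists(points, centers).min(axis=1) ** 2).sum())

def kmedian_cost(points, centers):
    """k-median objective: sum over points of distance to the nearest center."""
    return float(_point_to_center_dists(points, centers).min(axis=1).sum())

def kcenter_cost(points, centers):
    """k-center objective: maximum distance from any point to its nearest center."""
    return float(_point_to_center_dists(points, centers).min(axis=1).max())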

  8. Algorithm Selection as a Learning Problem
Goal: given a family of algos F and a sample of typical instances from the domain (unknown distribution D), find an algo that performs well on new instances from D.
• Large family F of algorithms, e.g., MST + Dynamic Programming, Greedy + Farthest Location, …
• Sample of typical inputs (Input 1, Input 2, …, Input N) for each problem, e.g., clustering, facility location.

  9. Sample Complexity of Algorithm Selection
Goal: given a family of algos F and a sample of typical instances from the domain (unknown distribution D), find an algo that performs well on new instances from D.
Approach: ERM, find a near-optimal algorithm Â over the set of samples.
Key question: will Â do well on new instances beyond those seen?
Sample complexity: how large should our sample of typical instances be in order to guarantee good performance on new instances?

  10. Sample Complexity of Algorithm Selection
Goal: given a family of algos F and a sample of typical instances from the domain (unknown distribution D), find an algo that performs well on new instances from D.
Approach: ERM, find a near-optimal algorithm Â over the set of samples.
Key tools from learning theory:
• Uniform convergence: for any algo in F, average performance over samples is “close” to its expected performance.
• This implies that Â has high expected performance.
• N = O(dim(F)/ε²) instances suffice to be ε-close.
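A minimal sketch of the ERM approach described above, assuming the algorithm family is indexed by a single parameter evaluated over a finite grid; run_algo and cost are hypothetical helpers standing in for the parametrized algorithm and the performance measure.

import numpy as np

def erm_select(param_grid, train_instances, run_algo, cost):
    """Empirical risk minimization over a parametrized algorithm family.

    param_grid: candidate parameter values indexing the family F (assumed finite here).
    train_instances: sample of typical problem instances drawn from D.
    run_algo(param, instance) -> solution   (hypothetical helper)
    cost(solution, instance) -> float, lower is better   (hypothetical helper)
    Returns the parameter with the best average cost on the sample.
    """
    avg_costs = [np.mean([cost(run_algo(p, inst), inst) for inst in train_instances])
                 for p in param_grid]
    return param_grid[int(np.argmin(avg_costs))]

For the linkage + DP family discussed later, for instance, param_grid could be a grid of α values and cost the k-means objective of the returned clustering.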

  11. Sample Complexity of Algorithm Selection
Goal: given a family of algos F and a sample of typical instances from the domain (unknown distribution D), find an algo that performs well on new instances from D.
Key tools from learning theory:
• N = O(dim(F)/ε²) instances suffice to be ε-close.
• dim(F) (e.g., pseudo-dimension): the ability of fns in F to fit complex patterns.
[Figure: a curve overfitting a training set of points x_1, …, x_7.]
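For completeness, here is a standard form of the uniform convergence bound referenced above, stated via the pseudo-dimension; the cost range [0, H] and the confidence parameter δ are details the slide leaves implicit, so this is a textbook statement rather than the talk's exact theorem. If every cost function f ∈ F maps instances to [0, H], then
\[
N = O\!\Big( \big(\tfrac{H}{\epsilon}\big)^{2}\,\big(\mathrm{Pdim}(F) + \ln\tfrac{1}{\delta}\big) \Big)
\]
i.i.d. instances x_1, …, x_N from D suffice so that, with probability at least 1 − δ,
\[
\Big| \tfrac{1}{N}\textstyle\sum_{i=1}^{N} f(x_i) \;-\; \mathbb{E}_{x \sim D}[f(x)] \Big| \le \epsilon \quad \text{for all } f \in F.
\]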

  12. Sample Complexity of Algorithm Selection
Goal: given a family of algos F and a sample of typical instances from the domain (unknown distribution D), find an algo that performs well on new instances from D.
Key tools from learning theory:
• N = O(dim(F)/ε²) instances suffice to be ε-close.
Challenge: analyzing dim(F). Due to the combinatorial & modular nature of algorithms, “nearby” programs/algos can have drastically different behavior.
[Figure: labeled examples (+/−) in classic machine learning vs. our work.]
Challenge: design a computationally efficient meta-algorithm.

  13. Formal Guarantees for Algorithm Selection
Prior work: [Gupta-Roughgarden, ITCS’16 & SICOMP’17] proposed the model; analyzed greedy algos for subset selection problems (knapsack & independent set).
Our results:
• New algorithm classes applicable to a wide range of problems (e.g., clustering, partitioning, alignment, auctions).
• General techniques for sample complexity based on properties of the dual class of fns.

  14. Formal Guarantees for Algorithm Selection
Our results: new algo classes applicable to a wide range of problems.
• Clustering: Linkage + Dynamic Programming [Balcan-Nagarajan-Vitercik-White, COLT 2017] [Balcan-Dick-Lang, 2019]
  DATA → {α-weighted combination, single linkage, complete linkage, Ward’s alg, …} → DP for k-means / k-median / k-center → CLUSTERING
• Clustering: Greedy Seeding + Local Search (parametrized Lloyd’s methods) [Balcan-Dick-White, NeurIPS 2018]
  DATA → {random seeding, farthest-first traversal, kmeans++, D^α sampling, …} + {β-local search, L²-local search, …} → CLUSTERING
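As an illustration of the linkage family above, here is a minimal sketch of a merge criterion that interpolates between single and complete linkage with a parameter α ∈ [0, 1]; this exact parametrization is an illustrative assumption, not necessarily the one analyzed in the cited papers.

def alpha_weighted_linkage(cluster_a, cluster_b, dist, alpha):
    """Interpolated cluster distance: alpha * complete linkage + (1 - alpha) * single linkage.

    cluster_a, cluster_b: lists of points; dist(p, q) -> float is the base metric.
    alpha = 0 recovers single linkage (min pairwise distance),
    alpha = 1 recovers complete linkage (max pairwise distance).
    """
    pairwise = [dist(p, q) for p in cluster_a for q in cluster_b]
    return alpha * max(pairwise) + (1.0 - alpha) * min(pairwise)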

  15. Formal Guarantees for Algorithm Selection
Our results: new algo classes applicable to a wide range of problems.
• Partitioning problems via IQPs: SDP + Rounding [Balcan-Nagarajan-Vitercik-White, COLT 2017]
  Integer Quadratic Programming (IQP), e.g., Max-Cut, Max-2SAT, Correlation Clustering → Semidefinite Programming Relaxation (SDP) → rounding (GW rounding, 1-linear rounding, s-linear rounding, …) → feasible solution to the IQP
• Computational biology (e.g., string alignment, RNA folding): parametrized dynamic programming. [Balcan-DeBlasio-Dick-Kingsford-Sandholm-Vitercik, 2019]
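For concreteness, here is a minimal sketch of Goemans-Williamson hyperplane rounding for Max-Cut, the simplest member of the rounding family named above (the 1-linear and s-linear roundings generalize it); the SDP solution is assumed to already be available as unit vectors, and the function name is illustrative.

import numpy as np

def gw_round(sdp_vectors, rng=None):
    """Goemans-Williamson hyperplane rounding for Max-Cut.

    sdp_vectors: (n, d) array of unit vectors, one per vertex, from the SDP relaxation.
    Returns a +/-1 assignment: the sign of each vector's projection onto a
    random Gaussian direction (a uniformly random hyperplane through the origin).
    """
    rng = np.random.default_rng() if rng is None else rng
    g = rng.standard_normal(sdp_vectors.shape[1])  # normal of the random hyperplane
    return np.where(sdp_vectors @ g >= 0, 1, -1)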

  16. Formal Guarantees for Algorithm Selection
Our results: new algo classes applicable to a wide range of problems.
• Branch and Bound techniques for solving MIPs [Balcan-Dick-Sandholm-Vitercik, ICML’18]
  MIP instance: Max c · x s.t. Ax ≤ b, x_i ∈ {0,1} for all i ∈ I.
  E.g., Max (40, 60, 10, 10, 30, 20, 60) · x s.t. (40, 50, 30, 10, 10, 40, 30) · x ≤ 100, x ∈ {0,1}^7.
  Branch and bound repeatedly: choose a leaf of the search tree (best-bound, depth-first, …), choose a variable to branch on (α-linear, product, most fractional, …), then fathom and terminate if possible.
  [Figure: branch-and-bound search tree for the example, with the LP relaxation solution and value at each node.]
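A minimal sketch of the kind of tunable variable-selection rule this line of work studies: a branching score that interpolates, with a parameter μ, between the LP-value change from fixing a fractional variable to 0 and from fixing it to 1. The lp_bound helper and this particular interpolation are illustrative assumptions, not the exact rule from the ICML’18 paper.

def choose_branching_variable(instance, lp_solution, mu, lp_bound):
    """Pick a variable to branch on using a mu-interpolated score.

    lp_solution: LP relaxation values of the binary variables at the current node.
    lp_bound(instance, i, v) -> float: LP value after additionally fixing x_i = v
                                       (hypothetical helper; i=None means no extra fix).
    For each fractional x_i, compute the LP-value decrease from fixing it to 0
    and to 1, and score it as mu * min(...) + (1 - mu) * max(...).
    Returns the index of the highest-scoring fractional variable, or None.
    """
    current = lp_bound(instance, None, None)  # LP value with no extra variable fixed
    best_i, best_score = None, float("-inf")
    for i, x in enumerate(lp_solution):
        if x in (0.0, 1.0):                   # integral already; cannot branch here
            continue
        drop0 = current - lp_bound(instance, i, 0)
        drop1 = current - lp_bound(instance, i, 1)
        score = mu * min(drop0, drop1) + (1.0 - mu) * max(drop0, drop1)
        if score > best_score:
            best_i, best_score = i, score
    return best_i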

  17. Formal Guarantees for Algorithm Selection
Our results: new algo classes applicable to a wide range of problems.
• General techniques for sample complexity based on properties of the dual class of fns. [Balcan-DeBlasio-Kingsford-Dick-Sandholm-Vitercik, 2019]
• Automated mechanism design for revenue maximization [Balcan-Sandholm-Vitercik, EC 2018]: generalized parametrized VCG auctions, posted prices, lotteries.

  18. Formal Guarantees for Algorithm Selection
Our results: new algo classes applicable to a wide range of problems.
• Online and private algorithm selection. [Balcan-Dick-Vitercik, FOCS 2018] [Balcan-Dick-Pegden, 2019] [Balcan-Dick-Sharma, 2019]

  19. Clustering Problems
Clustering: Given a set of objects (news articles, customer surveys, web pages, …), organize them into natural groups.
Objective-based clustering:
• k-means: Input: set of objects S and metric d. Output: centers {c_1, c_2, …, c_k} minimizing Σ_p min_i d²(p, c_i).
• Or minimize the distance to a ground-truth clustering.

  20. Clustering: Linkage + Dynamic Programming
Family of poly-time 2-stage algorithms:
1. Use a greedy linkage-based algorithm to organize the data into a hierarchy (tree) of clusters.
2. Use dynamic programming over this tree to identify the pruning of the tree corresponding to the best clustering.
[Figure: a hierarchy over points A–F and a pruning of it into clusters.]
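A minimal sketch of step 2: a dynamic program over the cluster tree that finds the minimum-cost pruning into exactly k clusters. The tree-node representation and the per-cluster cost function are illustrative assumptions; memoizing on (node, k) turns the recursion into the standard polynomial-time DP.

def best_pruning(node, k, cluster_cost):
    """Minimum-cost pruning of the cluster tree rooted at `node` into exactly k clusters.

    node: has attributes .left, .right (both None for a leaf) and .points (list of points).
    cluster_cost(points) -> float: cost of keeping `points` as a single cluster,
        e.g. the k-means cost of its best center (hypothetical helper).
    Returns (cost, clusters), or (inf, None) if no pruning into k clusters exists.
    """
    INF = float("inf")
    if k < 1:
        return INF, None
    if k == 1:
        return cluster_cost(node.points), [node.points]
    if node.left is None or node.right is None:   # a leaf cannot be split further
        return INF, None
    best = (INF, None)
    # Split the budget of k clusters between the two subtrees.
    for k_left in range(1, k):
        cost_l, clus_l = best_pruning(node.left, k_left, cluster_cost)
        cost_r, clus_r = best_pruning(node.right, k - k_left, cluster_cost)
        if clus_l is not None and clus_r is not None and cost_l + cost_r < best[0]:
            best = (cost_l + cost_r, clus_l + clus_r)
    return best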

  21. Clustering: Linkage + Dynamic Programming
1. Use a linkage-based algorithm to get a hierarchy.
2. Use dynamic programming to find the best pruning.
Both steps can be done efficiently.
DATA → {α-weighted combination, single linkage, complete linkage, Ward’s algo, …} → DP for k-means / k-median / k-center → CLUSTERING

  22. Linkage Procedures for Hierarchical Clustering
Bottom-Up (agglomerative):
• Start with every point in its own cluster.
• Repeatedly merge the “closest” two clusters.
Different definitions of “closest” give different algorithms.
[Figure: example topic hierarchy: all topics → sports (tennis, soccer) and fashion (Lacoste, Gucci).]
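A minimal sketch of the bottom-up procedure just described, with the notion of “closest” passed in as a function (e.g. single linkage, complete linkage, or the α-weighted interpolation sketched earlier); the data structures are illustrative and not optimized.

def agglomerative_cluster(points, linkage_dist):
    """Bottom-up (agglomerative) hierarchical clustering.

    points: list of points.
    linkage_dist(cluster_a, cluster_b) -> float defines "closest"
        (min pairwise distance = single linkage, max = complete linkage, ...).
    Returns the merge history, a list of (cluster_a, cluster_b, merged) triples,
    which encodes the hierarchy handed to the dynamic-programming step.
    """
    clusters = [[p] for p in points]     # every point starts in its own cluster
    history = []
    while len(clusters) > 1:
        # Find the closest pair of clusters under the supplied linkage.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage_dist(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        history.append((clusters[i], clusters[j], merged))
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)] + [merged]
    return history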
