ECLIPSE: An Extreme-Scale Linear Program Solver for Web-Applications
Kinjal Basu (LinkedIn AI), Amol Ghoting (LinkedIn AI), Rahul Mazumder (MIT), Yao Pan (LinkedIn AI)
Agenda
1. Overview
2. ECLIPSE: Extreme Scale LP Solver
3. Applications
4. System Architecture
5. Experimental Results
Overview
Introduction
Large-Scale Linear Programs (LPs) have several applications on the web.
Problems of Extreme Scale: Billions to Trillions of Variables
● Ad-hoc solutions: splitting the problem into smaller sub-problems → no guarantee of optimality
● Our approach: exploit the structure of the problem and solve a perturbation of the primal problem
  ● Smooth gradient
  ● Efficient computation
Motivating Example: Friend or Connection Matching Problem
Maximize value, subject to:
● Total invites sent is greater than a threshold
● Limit on invitations per member, to prevent overwhelming members
Models:
● q_1 - value model
● q_2 - invitation model
● y_ij - probability of showing user j to user i
Scale: J and K are both large, with n = JK ≈ 10^12 (1 trillion decision variables)
General Framework

min_x c^T x   s.t.   Ax ≤ b,   x_i ∈ C_i, i ∈ [I]

● Users j, items k, and y_jk is the association between (j, k)
● n = JK can range from 100s of millions to 10s of trillions
● C_i are simple constraints (i.e., they allow for efficient projections)

A = [ A^(1) ; A^(2) ], where

● A^(1): global and cohort-level constraints (e.g., the total-invite constraint)
● A^(2): item-level constraints (e.g., limits on invitations per user), with block structure

A^(2) = [ D_11      …  D_1I
          ⋮         ⋱  ⋮
          D_{m_2 1} …  D_{m_2 I} ]
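A tiny, fully hypothetical instance of this framework can make the structure concrete. The sketch below builds a toy matching LP (J members, K items, x[j,k] the probability of showing item k to member j) with one global invite-threshold row and block member-level cap rows, and solves it with `scipy.optimize.linprog`; all model scores, thresholds, and sizes are made-up stand-ins, not values from the talk.

```python
# Toy instance of the general framework: global + block-diagonal constraints.
import numpy as np
from scipy.optimize import linprog

J, K = 4, 3                       # tiny stand-ins for huge J, K (n = J*K variables)
rng = np.random.default_rng(0)
value = rng.random((J, K))        # hypothetical q_1: value-model scores
invite = rng.random((J, K))       # hypothetical q_2: invitation-model scores

c = -value.ravel()                # linprog minimizes, so negate to maximize value
# A^(1): one global row, -sum(invite * x) <= -threshold  (total invites >= threshold)
A_global = -invite.ravel()[None, :]
b_global = np.array([-1.0])
# A^(2): block rows, sum_k x[j, k] <= cap for each member j (here cap = 2)
A_member = np.kron(np.eye(J), np.ones((1, K)))
b_member = np.full(J, 2.0)

res = linprog(c,
              A_ub=np.vstack([A_global, A_member]),
              b_ub=np.concatenate([b_global, b_member]),
              bounds=[(0, 1)] * (J * K),     # C_i: simple box constraints
              method="highs")
```

The block-diagonal shape of `A_member` is exactly what makes the per-member projections on the real problem cheap and parallelizable.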
ECLIPSE: Extreme Scale LP Solver
Solving The Problem

Primal LP:  P*_0 := min_x c^T x   s.t.  Ax ≤ b,  x_i ∈ C_i, i ∈ [I]

Old idea: perturbation of the LP (Mangasarian & Meyer '79; Nesterov '05; Osher et al. '11; …)

Primal QP:  P*_γ := min_x c^T x + (γ/2) x^T x   s.t.  Ax ≤ b,  x_i ∈ C_i, i ∈ [I]

Dualize:

Dual QP:  g_γ(λ) := min_{x ∈ ∏ C_i} { c^T x + (γ/2) x^T x + λ^T (Ax − b) }

Key observation: length(λ) is small.

Solve the Dual QP:  g*_γ := max_{λ ≥ 0} g_γ(λ) = P*_γ   (strong duality)
Solving The Problem

Primal:  P*_0 := min_x c^T x   s.t.  Ax ≤ b,  x_i ∈ C_i, i ∈ [I]

x*_γ ∈ argmin_x { c^T x + (γ/2) x^T x   s.t.  Ax ≤ b,  x_i ∈ C_i, i ∈ [I] }

● Observation 1: Exact Regularization (Mangasarian & Meyer '79; Friedlander & Tseng '08)
  ∃ γ̄ > 0 such that x*_γ solves the LP for all γ ≤ γ̄

Dual:  g_γ(λ) := min_{x ∈ ∏ C_i} { c^T x + (γ/2) x^T x + λ^T (Ax − b) },   g*_γ := max_{λ ≥ 0} g_γ(λ)

● Observation 2: Error Bound (Nesterov '05)
  |g*_γ − P*_0| = O(γ)
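Both observations can be checked numerically on a two-variable toy LP. The sketch below (all numbers invented; SLSQP stands in for a QP solver) solves the unperturbed LP, then the γ-perturbed QP for shrinking γ: the value gap |g*_γ − P*_0| shrinks linearly with γ, and for small γ the QP minimizer is already an LP solution.

```python
# Numerical check of the two observations on a tiny LP with C_i = [0, 1] boxes:
#   min c^T x  s.t.  a^T x <= b0,  x in [0, 1]^2.
import numpy as np
from scipy.optimize import linprog, minimize

c = np.array([-1.0, -2.0])
a = np.array([1.0, 1.0])
b0 = 1.5

lp = linprog(c, A_ub=a[None, :], b_ub=[b0], bounds=[(0, 1)] * 2, method="highs")
P0 = lp.fun                                   # P*_0, the unperturbed LP value

def qp_value(gamma):
    """Solve P*_gamma = min c^T x + (gamma/2) x^T x under the same constraints."""
    res = minimize(lambda x: c @ x + 0.5 * gamma * x @ x,
                   x0=np.array([0.5, 0.5]),
                   jac=lambda x: c + gamma * x,
                   bounds=[(0, 1), (0, 1)],
                   constraints=[{"type": "ineq", "fun": lambda x: b0 - a @ x,
                                 "jac": lambda x: -a}],
                   method="SLSQP")
    return res.fun, res.x

# Error bound: gap shrinks roughly linearly in gamma (Observation 2);
# for small gamma, qp_value's minimizer solves the LP (Observation 1).
gaps = [abs(qp_value(g)[0] - P0) for g in (1.0, 0.1, 0.01)]
```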
Solving The Problem

ECLIPSE Algorithm: solve  max_{λ ≥ 0} g_γ(λ)  with proximal-gradient-based methods (acceleration, restarts), at optimal convergence rates.

● Observation 1: The dual objective is smooth (implicitly defined) [Nesterov '05]
  λ ↦ g_γ(λ) is O(1/γ)-smooth.

● Observation 2: Gradient expression (Danskin's Theorem)
  ∇g_γ(λ) = A x̂(λ) − b,  where  x̂(λ) ∈ argmin_{x ∈ ∏ C_i} { c^T x + (γ/2) x^T x + λ^T (Ax − b) }
  x̂_i(λ) = Π_{C_i}( −(1/γ) (A^T λ + c)_i )

● Key bottleneck: matrix-vector multiplication
● Simple projection operation
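The gradient expression is easy to sanity-check. In the special case where each C_i is a [0, 1] box, the projection Π_{C_i} is a clip, so x̂(λ) has a closed form; the sketch below (toy random data) compares the Danskin gradient A x̂(λ) − b against a finite-difference estimate of g_γ.

```python
# Check grad g_gamma(lam) = A x_hat(lam) - b against finite differences,
# assuming C_i = [0, 1] boxes so that the projection is a simple clip.
import numpy as np

rng = np.random.default_rng(1)
m, n, gamma = 3, 6, 0.5
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
c = rng.standard_normal(n)

def x_hat(lam):
    """Closed-form inner minimizer: project -(A^T lam + c)/gamma onto the boxes."""
    return np.clip(-(A.T @ lam + c) / gamma, 0.0, 1.0)

def g(lam):
    """Dual objective g_gamma(lam), evaluated via the inner minimizer."""
    x = x_hat(lam)
    return c @ x + 0.5 * gamma * x @ x + lam @ (A @ x - b)

lam = np.abs(rng.standard_normal(m))
grad = A @ x_hat(lam) - b                     # Danskin gradient
eps = 1e-6
fd = np.array([(g(lam + eps * np.eye(m)[i]) - g(lam - eps * np.eye(m)[i])) / (2 * eps)
               for i in range(m)])
```

Note the two costs visible here: one matrix-vector product per gradient (the bottleneck at trillion scale) and a cheap elementwise projection.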
Overall Algorithm

Input: A, b, c, γ, step size η. Initialize the dual λ^0 ≥ 0.

At iteration k:
● Get primal:  x̂_i(λ^k) = Π_{C_i}( −(1/γ) (A^T λ^k + c)_i )
● Compute gradient:  ∇g_γ(λ^k) = A x̂(λ^k) − b
● Update dual (next iteration):
  GD:  λ^{k+1} = ( λ^k + η ∇g_γ(λ^k) )_+
  AGD: the same step applied at an extrapolated (momentum) point
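The loop above can be sketched in a few lines. This is a minimal, hypothetical implementation assuming C_i = [0, 1] boxes and the plain GD dual update (the talk's AGD variant adds Nesterov momentum and restarts); the step size uses the O(1/γ)-smoothness of the dual.

```python
# Minimal sketch of the overall ECLIPSE loop (GD variant, box constraints).
import numpy as np

def eclipse_gd(A, b, c, gamma=0.1, iters=2000):
    """Projected gradient ascent on the dual g_gamma; returns (primal, dual)."""
    lam = np.zeros(A.shape[0])                          # length(lam) is small
    step = gamma / np.linalg.norm(A, 2) ** 2            # 1/L for the (||A||^2/gamma)-smooth dual
    for _ in range(iters):
        x = np.clip(-(A.T @ lam + c) / gamma, 0.0, 1.0) # get primal (projection)
        lam = np.maximum(lam + step * (A @ x - b), 0.0) # gradient step, keep lam >= 0
    return x, lam

# Tiny instance: maximize x1 + 2*x2 subject to x1 + x2 <= 1.5, 0 <= x <= 1.
x, lam = eclipse_gd(np.array([[1.0, 1.0]]), np.array([1.5]), np.array([-1.0, -2.0]))
```

By exact regularization, running this with a small enough γ recovers an exact LP solution, not just an approximation.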
Applications
Volume Optimization

Maximize sessions, subject to:
● Total number of emails / notifications bounded
● Clicks above a threshold
● Disablement below a threshold

Generalized from global to cohort-level and member-level systems.
Multi-Objective Optimization

● Maximize Metric 1
● Metric 2 is greater than a minimum
● Metric 3 is bounded
● …

Covers most product applications:
● Engagement vs. Revenue
● Sessions vs. Notification / Email Volume
● Member Value vs. Annoyance
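The pattern above maps directly onto the LP framework: the primary metric becomes the objective and the other metrics become rows of A. A hedged sketch with invented per-decision scores and thresholds:

```python
# Multi-objective pattern as an LP: maximize Metric 1 while Metric 2 stays
# above a floor and Metric 3 stays below a cap. All numbers are toy values.
import numpy as np
from scipy.optimize import linprog

m1 = np.array([3.0, 1.0, 2.0])   # Metric 1 score per decision (e.g., engagement)
m2 = np.array([1.0, 2.0, 1.0])   # Metric 2 score per decision (e.g., sessions)
m3 = np.array([2.0, 1.0, 2.0])   # Metric 3 score per decision (e.g., annoyance)

res = linprog(-m1,                           # linprog minimizes, so negate Metric 1
              A_ub=np.vstack([-m2, m3]),     # m2 @ x >= 1.5  and  m3 @ x <= 4.0
              b_ub=[-1.5, 4.0],
              bounds=[(0, 1)] * 3, method="highs")
```

Adding another metric is just another row, which is why most product trade-offs fit this template.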
System Infrastructure
System Architecture

● Data is collected from different sources and restructured to form the input (A, b, c)
● The solver is called and runs the overall iterations
● The data is split across multiple executors, which perform matrix-vector multiplications in parallel
● The driver collects the dual and broadcasts it back to continue the iterations
● On convergence, the final duals are returned and used in online serving
Detailed Spark Implementation

Data Representation:
● Customized DistributedMatrix API: BlockMatrix API from Apache MLlib
● DistributedVector API implemented using RDD[(index, Vector)], leveraging the diagonal structure

Estimating Primal:
● Component-wise matrix multiplications and projections are done in parallel
● The overall complexity to get the primal is O(K)

Estimating Gradient:
● The most computationally expensive step; the worst-case complexity is O(n) = O(JK)
● We cache A in the executors and broadcast the duals to minimize communication cost
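The executor-side computation can be illustrated without a cluster. The sketch below simulates the data layout with numpy instead of Spark (block sizes and data are invented): A is split into column blocks that stay "cached" on executors, the small dual λ is "broadcast" each iteration, each block computes its primal slice and partial gradient A_i x_i locally, and the driver reduces the partials.

```python
# Numpy simulation of the distributed gradient step: cache column blocks of A,
# broadcast the (small) dual, reduce the partial products on the driver.
import numpy as np

rng = np.random.default_rng(3)
m, n, gamma, n_blocks = 4, 12, 0.5, 3
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
c = rng.standard_normal(n)
blocks = np.array_split(np.arange(n), n_blocks)       # column partition, one per "executor"

lam = np.abs(rng.standard_normal(m))                  # broadcast once per iteration

partials = []
for idx in blocks:                                    # would run in parallel on executors
    A_i, c_i = A[:, idx], c[idx]                      # cached on the executor
    x_i = np.clip(-(A_i.T @ lam + c_i) / gamma, 0, 1) # local primal block (projection)
    partials.append(A_i @ x_i)                        # local partial gradient
grad = sum(partials) - b                              # driver-side reduce

# matches the centralized computation
x_full = np.clip(-(A.T @ lam + c) / gamma, 0, 1)
```

Only the length-m vectors λ and A_i x_i cross the network; the huge A and x never move, which is what makes the per-iteration communication cheap.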
Experimental Results
Comparative Results

● We compare with the state-of-the-art technique of splitting the problem into sub-problems.
● Please see the full paper for other comparisons.
Real Data Results

● Tested on large-scale volume optimization and matching problems
● Spark 2.3 with up to 800 executors
● The 1-trillion-variable use case converged within 12 hours
● SCS baseline: O'Donoghue et al. (2016)
Key Takeaways
Key Takeaways

● A framework for solving structured LP problems arising in several applications from the internet industry
● Most multi-objective optimization problems can be framed through this.
● Given the computational resources, we can scale to extremely large problems.
● We can easily scale up to 1 trillion variables on real data.
Thank you