Stay With Me: Lifetime Maximization Through Heteroscedastic Linear Bandits With Reneging


  1. Stay With Me: Lifetime Maximization Through Heteroscedastic Linear Bandits With Reneging. Ping-Chun Hsieh¹, Xi Liu¹, Anirban Bhattacharya², and P. R. Kumar¹. ¹Department of ECE, Texas A&M University; ²Department of Statistics, Texas A&M University. ICML 2019 Poster @ Pacific Ballroom #124.

  2. Lifetime Maximization: Continuing The Play
     • A finite game is played for the purpose of winning.
     • An infinite game is played for the purpose of continuing the play.
     • Goal: lifetime maximization.

  3. Why Lifetime Maximization?
     • Applications: medical treatments, portfolio selection, cloud services.
     • Salient features of these applications:
       1. Each participant has a satisfaction level.
       2. A participant drops if the outcomes are not satisfactory.
       3. The outcomes depend heavily on the contextual information of the participant.

  4. Model: Linear Bandits With Reneging
     1. {x_{t,a}}_{a ∈ A} are pairwise participant-action contexts (observed by the platform when participant t arrives).
     2. The outcome r_{t,a} is conditionally independent given the context and has mean θ_*^⊤ x_{t,a}.
     3. Participant t keeps interacting with the platform as long as r_{t,a} ≥ β_t; otherwise, the participant drops.

  5. Heteroscedastic Outcomes
     • Heteroscedasticity: outcome variations can be wildly different across different participants and actions.
     • Example (figure): two actions, 1 (red) and 2 (blue); participant satisfaction level = β.
     • Heteroscedasticity is widely studied in econometrics and is usually captured through regression on the variance.

  6. Model: Heteroscedastic Bandits With Reneging
     1. {x_{t,a}}_{a ∈ A} are pairwise participant-action contexts (observed by the platform when participant t arrives).
     2. The outcome r_{t,a} is conditionally independent given the context and satisfies r_{t,a} ∼ N(θ_*^⊤ x_{t,a}, f(φ_*^⊤ x_{t,a})).
     3. Participant t keeps interacting with the platform as long as r_{t,a} ≥ β_t; otherwise, the participant drops. (A small simulation sketch of this model follows below.)
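For concreteness, here is a minimal simulation sketch of the reneging model above. The deck leaves the variance link f unspecified beyond mapping φ_*^⊤ x_{t,a} to a positive variance, so exp(·) is used here purely as an illustrative choice; the function and variable names are likewise hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(z):
    # Illustrative variance link: maps the linear predictor to a positive
    # variance. The deck only requires f to produce a valid variance.
    return np.exp(z)

def simulate_lifetime(theta_star, phi_star, beta, contexts, policy, max_rounds=10_000):
    """Play one participant: apply the policy's chosen action each round until
    the outcome falls below the satisfaction level beta (reneging)."""
    lifetime = 0
    for _ in range(max_rounds):
        a = policy(contexts)                     # index of the chosen action
        x = contexts[a]
        mean = theta_star @ x                    # E[r | x] = theta_*^T x
        std = np.sqrt(f(phi_star @ x))           # Var[r | x] = f(phi_*^T x)
        r = rng.normal(mean, std)
        if r < beta:                             # outcome below satisfaction level:
            break                                # the participant reneges
        lifetime += 1
    return lifetime
```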

  7. Oracle Policy and Regret
     • The oracle policy π_oracle already knows θ_* and φ_*.
     • For each participant t, π_oracle keeps choosing the action that minimizes the reneging probability P{r_{t,a} < β_t | x_{t,a}} (a sketch of this choice follows below).
     • Hence, π_oracle is a fixed policy.
     • For T participants, define Regret_π(T) = (total expected lifetime under π_oracle) − (total expected lifetime under π).
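Under the Gaussian model of slide 6, the reneging probability has the closed form Φ((β_t − θ_*^⊤ x_{t,a}) / √f(φ_*^⊤ x_{t,a})), so the oracle's choice can be sketched as follows; scipy's norm.cdf supplies Φ, and the exp link for f remains an illustrative assumption.

```python
import numpy as np
from scipy.stats import norm

def oracle_action(theta_star, phi_star, beta, contexts, f=np.exp):
    """Pick the action minimizing the reneging probability
    P{r_{t,a} < beta | x_{t,a}} = Phi((beta - theta_*^T x) / sqrt(f(phi_*^T x)))."""
    reneging_probs = [
        norm.cdf((beta - theta_star @ x) / np.sqrt(f(phi_star @ x)))
        for x in contexts
    ]
    return int(np.argmin(reneging_probs))
```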

  8. Proposed Algorithm: HR-UCB
     • When participant t arrives, obtain estimators θ̂, φ̂ with confidence intervals C_θ, C_φ based on past observations.
     • For each action a, construct a UCB index as
           Q_t^HR(x_{t,a}) = [Φ((β_t − θ̂^⊤ x_{t,a}) / √f(φ̂^⊤ x_{t,a}))]^{−1} + ∆(C_θ, C_φ, x_{t,a}),    (1)
       where the first term is the estimated expected lifetime and the second term is the confidence interval for the lifetime.
     • Apply the action argmax_a Q_t^HR(x_{t,a}). (A sketch of this rule follows below.)
     Main technical challenges:
       1. Design estimators θ̂, φ̂ under heteroscedasticity.
       2. Derive the confidence intervals C_θ, C_φ for θ̂, φ̂.
       3. Convert C_θ, C_φ into a confidence interval for the lifetime.
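A minimal sketch of the index in Eq. (1) and the resulting action choice. The estimated expected lifetime is one over the estimated reneging probability; the width term ∆ is passed in as a function (its form is given by the theorem on slide 10), and f = exp remains an illustrative assumption.

```python
import numpy as np
from scipy.stats import norm

def hr_ucb_action(theta_hat, phi_hat, beta, contexts, delta_width, f=np.exp):
    """HR-UCB index from Eq. (1): estimated expected lifetime plus an
    optimism bonus delta_width(x) derived from the confidence intervals."""
    def index(x):
        p_renege = norm.cdf((beta - theta_hat @ x) / np.sqrt(f(phi_hat @ x)))
        est_lifetime = 1.0 / max(p_renege, 1e-12)   # guard against division by zero
        return est_lifetime + delta_width(x)
    return int(np.argmax([index(x) for x in contexts]))
```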

  9. Estimators of θ_* and φ_* (Challenge 1)
     • Generalized least squares estimator (Wooldridge, 2015): with any n outcome observations,
           θ̂_n = (X_n^⊤ X_n + λI)^{−1} X_n^⊤ r,
           φ̂_n = (X_n^⊤ X_n + λI)^{−1} X_n^⊤ f^{−1}(ε̂ ∘ ε̂),
       where X_n is the matrix of the n applied contexts, r is the vector of the n observed outcomes, and ε̂(x_{t,a}) = r_{t,a} − θ̂_n^⊤ x_{t,a} is the estimated residual with respect to θ̂_n (f^{−1} is applied element-wise to the squared residuals ε̂ ∘ ε̂). A sketch follows below.
     • Nice property (Abbasi-Yadkori et al., 2011): let V_n = X_n^⊤ X_n + λI. For any δ > 0, with probability at least 1 − δ, for all n ∈ ℕ,
           ‖θ̂_n − θ_*‖_{V_n} ≤ C_θ(δ, n) = O(√(log(1/δ) + log n)).
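A minimal sketch of the two ridge-regularized regressions above. As before, f = exp (so f^{−1} = log) is purely an illustrative assumption; X is the n × d matrix of applied contexts and r the vector of observed outcomes.

```python
import numpy as np

def estimate_theta_phi(X, r, lam=1.0, f_inv=np.log):
    """theta_hat: ridge regression of outcomes on contexts.
    phi_hat:   ridge regression of f^{-1}(squared residuals) on contexts."""
    n, d = X.shape
    V = X.T @ X + lam * np.eye(d)
    theta_hat = np.linalg.solve(V, X.T @ r)
    eps = r - X @ theta_hat                      # estimated residuals eps_hat
    z = f_inv(np.maximum(eps * eps, 1e-12))      # guard log(0); illustrative only
    phi_hat = np.linalg.solve(V, X.T @ z)
    return theta_hat, phi_hat, V
```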

  10. Main Technical Contributions (Challenges 2 & 3)
     Theorem. For any δ > 0, with probability at least 1 − 2δ,
           ‖φ̂_n − φ_*‖_{V_n} ≤ C_φ(δ, n) = O(√(log(1/δ) + log n)),  ∀ n ∈ ℕ.    (2)
     • The proof is more involved since φ̂_n depends on the residual ε̂.
     Theorem. ∆(C_θ(n, δ), C_φ(n, δ), x) := (k_1 C_θ(n, δ) + k_2 C_φ(n, δ)) · ‖x‖_{V_n^{−1}} is a confidence interval with respect to the lifetime, where k_1 and k_2 are constants independent of the past history and of x. (A sketch of this width term follows below.)
     Theorem. Under the HR-UCB policy, Regret(T) = O(√(T (log T)^3)).
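To connect the second theorem to the delta_width function used in the HR-UCB sketch on slide 8, here is a hedged sketch of the width term; the constants k_1, k_2 and the radii C_θ, C_φ are treated as given numbers (the deck only states their order of growth).

```python
import numpy as np

def make_delta_width(V, c_theta, c_phi, k1=1.0, k2=1.0):
    """Width term Delta(C_theta, C_phi, x) = (k1*C_theta + k2*C_phi) * ||x||_{V^{-1}}.
    k1 and k2 are placeholders for the theorem's (unstated) constants."""
    V_inv = np.linalg.inv(V)
    scale = k1 * c_theta + k2 * c_phi
    def delta_width(x):
        return scale * float(np.sqrt(x @ V_inv @ x))
    return delta_width
```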
