
In-Database Machine Learning: Using Gradient Descent and Tensor Algebra - PowerPoint PPT Presentation

Chair III: Database Systems, Professorship for Data Mining and Analytics; Chair XXV: Data Science and Engineering; Department of Informatics, Technical University of Munich. In-Database Machine Learning: Using Gradient Descent and Tensor Algebra.


  1. In-Database Machine Learning: Using Gradient Descent and Tensor Algebra. Maximilian E. Schüle, Frédéric Simonis, Thomas Heyenbrock, Alfons Kemper, Stephan Günnemann, Thomas Neumann. Chair III: Database Systems, Professorship for Data Mining and Analytics; Chair XXV: Data Science and Engineering; Department of Informatics, Technical University of Munich. Rostock, 4 March 2019.

  2. What Do Database Systems Need for ML? Database systems and machine learning: why not use HyPer?

  3. What Do Database Systems Need for ML? Machine learning needs data in tensors and a parametrised loss function, so HyPer needs tensors and gradient descent (image: CC BY-SA 3.0, TimothyRias, https://commons.wikimedia.org/w/index.php?curid=14729540). Advantage: optimisation problems become solvable in the core of the database server. Goal: make database systems more attractive. What it is: an architectural blueprint for the integration of optimisation models into a DBMS. What it is not: a study of the quality of different optimisation problems.

  4. What Is Gradient Descent? Example: linear regression that predicts the median value (MEDV) from the number of rooms (RM). Model function: m_{a,b}(rm) = a * rm + b ≈ medv. Loss function: l_{rm,medv}(a,b) = (m_{a,b}(rm) - medv)^2. The training data consists of (RM, MEDV) pairs, the test data of RM values to be labelled. How to optimise the initial weights a and b? Gradient descent. How to label the test data? With the optimal weights.
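To make the optimisation step concrete, here is a minimal batch gradient descent sketch in Python for the model above; the training pairs, learning rate, and iteration count are made-up placeholder values, not data from the talk:

    # Batch gradient descent for m_{a,b}(rm) = a * rm + b with squared loss.
    training = [(6.5, 26.0), (5.9, 21.0), (7.2, 34.0)]  # hypothetical (rm, medv) pairs
    a, b = 0.0, 0.0              # initial weights
    lr, iterations = 0.01, 100   # learning rate and max. number of iterations

    for _ in range(iterations):
        grad_a = grad_b = 0.0
        for rm, medv in training:
            error = (a * rm + b) - medv   # m_{a,b}(rm) - medv
            grad_a += 2 * error * rm      # d loss / d a
            grad_b += 2 * error           # d loss / d b
        a -= lr * grad_a / len(training)
        b -= lr * grad_b / len(training)

    print(a, b)  # optimised weights, later used to label the test data

In the setting of the talk, this loop runs inside a relational operator rather than in client code.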

  5. Approach. Integration as operators in relational algebra in HyPer (concept of pipelines); representation of mathematical functions on relations. Gradient descent: the gradient is needed, obtained via automatic differentiation. Representation of tensors: either one relation represents one tensor, or an own tensor data type (image: CC BY-SA 3.0, TimothyRias, https://commons.wikimedia.org/w/index.php?curid=14729540).
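As a rough illustration of the "one relation represents one tensor" option (the table layout here is an assumption for illustration, not the paper's schema), a matrix can be stored as (row, column, value) tuples, and a matrix-vector product then becomes a join plus aggregation:

    # Matrix A as a relation A(i, j, val), vector x as a relation X(j, val).
    A = [(0, 0, 2.0), (0, 1, 1.0), (1, 1, 3.0)]
    x = [(0, 4.0), (1, 5.0)]

    # Relational reading:
    # SELECT A.i, SUM(A.val * X.val) FROM A JOIN X ON A.j = X.j GROUP BY A.i
    result = {}
    for i, j, a_val in A:
        for j_x, x_val in x:
            if j == j_x:
                result[i] = result.get(i, 0.0) + a_val * x_val

    print(result)  # {0: 13.0, 1: 15.0}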

  6. Integration in Relational Algebra. Operator tree: operators for labelling and for gradient descent. Model/loss function: representation of a loss function as well as of a model function as λ-expressions. Pipelining: integration as a pipeline breaker, with pipelines for weights and data. Model function m: m_w(x) = Σ_{i ∈ m} x_i * w_i ≈ y. Loss function l: l_{x,y}(w) = (m_w(x) - y)^2.
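For reference, the gradient that automatic differentiation has to provide for this squared loss, and the resulting update rule, are the standard ones (a textbook derivation, not copied from the slides):

    % Partial derivative of the loss with respect to one weight:
    \frac{\partial l_{x,y}}{\partial w_j}(w)
      = 2\,\bigl(m_w(x) - y\bigr)\,\frac{\partial m_w(x)}{\partial w_j}
      = 2\,\bigl(m_w(x) - y\bigr)\,x_j
    % One gradient descent step with learning rate \gamma
    % (0.05 in the SQL example on slide 9):
    w_j \leftarrow w_j - \gamma\,\frac{\partial l_{x,y}}{\partial w_j}(w)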

  7. Integration in Relational Algebra: Operator Tree. Two operators are needed: gradient descent to optimise the weights of a parametrised loss function, and a labelling operator to label predicted values. Gradient descent: initial weights and training data as input, optimised weights as output; a lambda expression supplies the loss function to be optimised. Labelling: input is the test dataset and the optimal weights; the label is the evaluated lambda expression (the model function) for each tuple. Operator tree: training data and initial weights feed the gradient descent operator with its λ loss function; the calculated weights and the test data feed the labelling operator with its λ model function.
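The labelling operator's effect can be sketched in a few lines of Python (illustrative only, with hypothetical weights and test values; the real operator evaluates the λ model function per tuple inside the DBMS):

    # Apply the model function with the optimised weights to every test tuple.
    def label(test_data, weights, model):
        return [(x, model(weights, x)) for x in test_data]   # tuple extended by its label

    optimised = (4.5, -3.0)        # (a, b) as produced by the gradient descent operator
    test_rm = [5.8, 6.4, 7.1]      # test tuples (RM only)
    print(label(test_rm, optimised, lambda w, x: w[0] * x + w[1]))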

  8. Integration in Rel. Algebra: Lambda Functions. A lambda expression injects user-defined code into an operator. Example: the k-means operator, with the input points in the right pipeline and the injected code (the Euclidean distance) in the left pipeline: λ(a, b) sqrt((a.x - b.x)^2 + (a.y - b.y)^2). In SQL:
select * from kmeans((table points), λ(a, b) sqrt((a.x - b.x)^2 + (a.y - b.y)^2), 2);
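A rough Python analogue of this design (not HyPer code, everything below is illustrative) keeps the clustering routine generic and passes the distance function in as a lambda, just as the SQL above injects it into the kmeans operator:

    import random

    def kmeans(points, distance, k, iterations=10):
        centres = random.sample(points, k)
        for _ in range(iterations):
            clusters = [[] for _ in range(k)]
            for p in points:
                # assign each point to the centre the injected lambda deems closest
                idx = min(range(k), key=lambda i: distance(p, centres[i]))
                clusters[idx].append(p)
            centres = [tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centres[i]
                       for i, cl in enumerate(clusters)]
        return centres

    points = [(1.0, 1.0), (1.2, 0.8), (8.0, 8.0), (7.8, 8.4)]
    euclidean = lambda a, b: ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
    print(kmeans(points, euclidean, k=2))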

  9. Integration in Rel. Algebra: Lambda Functions. Notation and the corresponding relations/lambda functions:
Weights: w = (w_1, w_2, ..., w_m), as relation W{[w_1, w_2, ..., w_m]}.
Data: x = (x_1, x_2, ..., x_m, y), i.e. n tuples with m attributes and label y, as relation X{[x_1, x_2, ..., x_m, y]}.
Model function: m_w(x) = Σ_{i ∈ m} x_i * w_i ≈ y, as λ(W, X)(W.w_1 * X.x_1 + ... + W.w_m * X.x_m).
Loss function: l_{x,y}(w) = (m_w(x) - y)^2, as λ(W, X)(W.w_1 * X.x_1 + ... + W.w_m * X.x_m - X.y)^2.
Lambda functions in SQL:
create table trainingdata (x float, y float);
create table testdata (x float);
create table weights (a float, b float);
insert into trainingdata …
insert into weights …
select * from gradientdescent(
  -- loss function as λ-expression
  λ(data, weights)(weights.a * data.x + weights.b - data.y)^2,
  -- training set and initial weights
  (select x, y from trainingdata),
  (select a, b from weights),
  -- learning rate and max. number of iterations
  0.05, 100);
select * from labeling(
  -- model function as λ-expression
  λ(data, weights)(weights.a * data.x + weights.b),
  -- test set and weights
  (select x from testdata),
  (select a, b from weights));

  10. Integration in Relational Algebra: Pipelining. [Figure: three operator-tree variants over the training data and initial weights, each with a λ loss function. Materialising: a single Batch/Stochastic GD operator with sub- and main pipelines, run for max. iterations. Pipelined: a chain of stochastic gradient descent operators, one per iteration (1 ... max. iterations). Combined: stochastic gradient descent in the pipelines for the first iteration, then a Batch/Stochastic GD operator for the remaining max. iterations - 1.]

  11. Integration in Relational Algebra: Pipelining. Materialising: materialisation of all tuples (parallel or serial) in a Batch/Stochastic GD operator; any optimisation method is possible; parallelism via parallel_for. Pipelined: no materialisation; stochastic gradient descent only; the tuples are distributed to the pipelines; downside: multiple copies of the operator tree, one per iteration (1 ... max. iterations). Combined: the first iteration runs in the pipelines (stochastic gradient descent), the remaining max. iterations - 1 in the main thread (Batch/Stochastic GD).
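The behavioural difference between the materialising and the pipelined variant is essentially batch versus stochastic gradient descent. A minimal sketch of the tuple-at-a-time update the pipelined variant performs (plain Python with hypothetical values, not a generated pipeline):

    # Stochastic gradient descent: update the weights once per incoming tuple,
    # without materialising the training data first.
    def sgd_pipelined(tuples, a, b, lr):
        for rm, medv in tuples:            # tuple-at-a-time, as in a pipeline
            error = (a * rm + b) - medv
            a -= lr * 2 * error * rm
            b -= lr * 2 * error
        return a, b

    training = [(6.5, 26.0), (5.9, 21.0), (7.2, 34.0)]   # hypothetical (rm, medv) pairs
    print(sgd_pipelined(training, 0.0, 0.0, 0.01))       # one pass = one iteration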
