Introduction to Large-Scale ML
Shan-Hung Wu (shwu@cs.nthu.edu.tw)
Department of Computer Science, National Tsing Hua University, Taiwan
Outline
1. What’s Machine Learning?
2. About this Course...
3. FAQ
Prior vs. Posteriori Knowledge

To solve a problem, we need an algorithm
- E.g., sorting
- A priori knowledge is enough

For some problems, however, we do not have the a priori knowledge
- E.g., telling whether an email is spam or not
- The correct answer varies over time and from person to person

Machine learning algorithms use a posteriori knowledge to solve problems
- Learnt from examples (as extra input)
Example Data $\mathbb{X}$ as Extra Input

- Unsupervised: $\mathbb{X} = \{x^{(i)}\}_{i=1}^{N}$, where $x^{(i)} \in \mathbb{R}^D$
  - E.g., $x^{(i)}$ an email
- Supervised: $\mathbb{X} = \{(x^{(i)}, y^{(i)})\}_{i=1}^{N}$, where $x^{(i)} \in \mathbb{R}^D$ and $y^{(i)} \in \mathbb{R}^K$
  - E.g., $y^{(i)} \in \{0, 1\}$ a spam label
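In code, these two dataset forms are just arrays of the shapes above. A minimal sketch, assuming NumPy and toy numbers (a real $x^{(i)}$ would be a feature vector extracted from an email, e.g., word counts):

```python
import numpy as np

N, D = 5, 3  # N examples, each with D features (toy sizes, illustrative only)

# Unsupervised dataset: just the feature matrix, one row per x^(i) in R^D
X = np.random.rand(N, D)

# Supervised dataset: each x^(i) is paired with a label y^(i); here K = 1 and
# y^(i) in {0, 1} is a spam label
y = np.array([0, 1, 0, 0, 1])

print(X.shape, y.shape)  # (5, 3) (5,)
```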
General Types of Learning (1/2)

- Supervised learning: learn to predict the labels of future data points
  - Training points $X \in \mathbb{R}^{N \times D}$ with labels $y \in \mathbb{R}^{N \times K}$; for a future point $x' \in \mathbb{R}^D$, predict its label $y' \in \mathbb{R}^K$
  - [Figure: handwritten digits labeled by the one-hot vectors $e^{(6)}, e^{(1)}, e^{(9)}, e^{(4)}, e^{(2)}$; what is $y'$ for an unseen digit?]
- Unsupervised learning: learn patterns or latent factors in $X$
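The one-hot vector $e^{(k)}$ has a 1 in position $k$ and 0 elsewhere, so a label matrix $y \in \mathbb{R}^{N \times K}$ stacks one such row per example. A minimal sketch, assuming $K = 10$ classes as in digit recognition (the indexing convention is an assumption of this illustration):

```python
import numpy as np

def one_hot(k, K=10):
    """Return the one-hot vector e^(k): 1 at index k, 0 elsewhere."""
    e = np.zeros(K)
    e[k] = 1.0
    return e

# Label matrix y in R^{N x K} for the digit labels 6, 1, 9, 4, 2
labels = [6, 1, 9, 4, 2]
y = np.stack([one_hot(k) for k in labels])
print(y.shape)  # (5, 10)
```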
General Types of Learning (2/2)

- Reinforcement learning: learn from “good”/“bad” feedback on actions (instead of correct labels) to maximize a goal
- AlphaGo [1] is a hybrid of reinforcement learning and supervised learning
  - The latter is used to tell how good a “move” performed by an agent is
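To see how feedback differs from labels, here is a minimal multi-armed-bandit sketch (an illustration of the feedback idea only, not AlphaGo’s method; the reward probabilities and the epsilon-greedy rule are assumptions of this example). The learner never sees the correct action, only a reward for the action it took:

```python
import numpy as np

rng = np.random.default_rng(1)

# 3 actions with unknown reward probabilities; the learner only observes
# "good"/"bad" (1/0) feedback for the action it takes, never a correct label
true_reward_prob = np.array([0.2, 0.5, 0.8])
estimates = np.zeros(3)  # running estimate of each action's reward
counts = np.zeros(3)

for t in range(1000):
    # epsilon-greedy: mostly exploit the best-looking action, sometimes explore
    a = rng.integers(3) if rng.random() < 0.1 else int(np.argmax(estimates))
    reward = float(rng.random() < true_reward_prob[a])  # feedback, not a label
    counts[a] += 1
    estimates[a] += (reward - estimates[a]) / counts[a]  # running average

print("estimated reward per action:", estimates.round(2))
```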
General Machine Learning Steps

1. Data collection, preprocessing (e.g., integration, cleaning, etc.), and exploration
   1. Split the dataset into training and testing datasets
2. Model development
   1. Assume a model $\{f\}$, a collection of candidate functions $f$ (representing the a posteriori knowledge) we want to discover; $f$ may be parametrized by $\mathbf{w}$
   2. Define a cost function $C(\mathbf{w})$ (or functional $C[f]$) that measures how well a particular $f$ explains the training data
3. Training: employ an algorithm that finds the best (or a good enough) function $f^*$ in the model that minimizes the cost function over the training dataset
4. Testing: evaluate the performance of the learned $f^*$ using the testing dataset
5. Apply the model to the real world

(A minimal end-to-end code sketch of steps 1–4 follows.)
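A minimal sketch of steps 1–4, assuming scikit-learn and a synthetic dataset (the library and dataset are assumptions of this illustration, not a requirement of the course): a logistic-regression classifier stands in for the model $\{f\}$, its built-in loss for $C(\mathbf{w})$, `fit()` for training, and `score()` for testing.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Step 1: collect data (here: synthetic) and split into training/testing sets
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Step 2: choose a model {f} -- here linear functions parametrized by w
model = LogisticRegression()

# Step 3: training -- fit() minimizes the model's cost over the training set
model.fit(X_train, y_train)

# Step 4: testing -- evaluate the learned f* on the held-out data
print("test accuracy:", model.score(X_test, y_test))
```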
Example for Spam Detection

1. Randomly split your past emails and labels
   1. Training dataset: $\mathbb{X} = \{(x^{(i)}, y^{(i)})\}_i$
   2. Testing dataset: $\mathbb{X}' = \{(x'^{(i)}, y'^{(i)})\}_i$
2. Model development
   1. Model: $\{f : f(x; \mathbf{w}) = \mathbf{w}^\top x\}$
   2. Cost function: $C(\mathbf{w}) = \sum_i \mathbb{1}\big(f(x^{(i)}; \mathbf{w}) \neq y^{(i)}\big)$
3. Training: solve $\mathbf{w}^* = \arg\min_{\mathbf{w}} \sum_i \mathbb{1}\big(f(x^{(i)}; \mathbf{w}) \neq y^{(i)}\big)$
4. Testing: accuracy $\frac{1}{|\mathbb{X}'|} \sum_i \mathbb{1}\big(f(x'^{(i)}) = y'^{(i)}\big)$
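A minimal sketch of this pipeline in NumPy, on synthetic stand-ins for email feature vectors (the data, sizes, and training rule are assumptions of this illustration). The 0–1 cost above is hard to minimize directly, so the classic perceptron update is used here as a stand-in that fits the same linear decision rule $\mathbf{w}^\top x$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for email feature vectors (D = 4) and spam labels in {0, 1}
X = rng.normal(size=(100, 4))
true_w = np.array([1.5, -2.0, 0.5, 1.0])
y = (X @ true_w > 0).astype(int)

# Step 1: random split into training and testing datasets
X_train, y_train = X[:75], y[:75]
X_test, y_test = X[75:], y[75:]

# Step 2: the model's decision rule f(x; w) = 1 if w^T x > 0 else 0
def f(X, w):
    return (X @ w > 0).astype(int)

# Step 3: training -- perceptron updates as a tractable stand-in for
# minimizing the 0-1 cost; w moves only when f misclassifies an example
w = np.zeros(4)
for _ in range(50):
    for x_i, y_i in zip(X_train, y_train):
        pred = 1 if x_i @ w > 0 else 0
        w += (y_i - pred) * x_i

# Step 4: testing -- accuracy = (1/|X'|) * sum_i 1(f(x'^(i)) = y'^(i))
accuracy = np.mean(f(X_test, w) == y_test)
print("test accuracy:", accuracy)
```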