Introduction to Large-Scale ML
Shan-Hung Wu (shwu@cs.nthu.edu.tw)
Department of Computer Science, National Tsing Hua University, Taiwan
Outline
1. What’s Machine Learning?
2. About this Course...
3. FAQ
Prior vs. Posteriori Knowledge

To solve a problem, we need an algorithm
- E.g., sorting
- A priori knowledge is enough

For some problems, however, we do not have the a priori knowledge
- E.g., telling whether an email is spam or not
- The correct answer varies over time and from person to person

Machine learning algorithms use a posteriori knowledge to solve problems
- Learnt from examples (as extra input)
Example Data $\mathbb{X}$ as Extra Input

- Unsupervised: $\mathbb{X} = \{x^{(i)}\}_{i=1}^{N}$, where $x^{(i)} \in \mathbb{R}^D$
  - E.g., $x^{(i)}$ an email
- Supervised: $\mathbb{X} = \{(x^{(i)}, y^{(i)})\}_{i=1}^{N}$, where $x^{(i)} \in \mathbb{R}^D$ and $y^{(i)} \in \mathbb{R}^K$
  - E.g., $y^{(i)} \in \{0, 1\}$ a spam label
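In code, these two dataset forms are just arrays of the shapes above. A minimal sketch, assuming NumPy and toy numbers (a real $x^{(i)}$ would be a feature vector extracted from an email, e.g., word counts):

```python
import numpy as np

N, D = 5, 3  # N examples, each with D features (toy sizes, illustrative only)

# Unsupervised dataset: just the feature matrix, one row per x^(i) in R^D
X = np.random.rand(N, D)

# Supervised dataset: each x^(i) is paired with a label y^(i); here K = 1 and
# y^(i) in {0, 1} is a spam label
y = np.array([0, 1, 0, 0, 1])

print(X.shape, y.shape)  # (5, 3) (5,)
```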
General Types of Learning (1/2)

- Supervised learning: learn to predict the labels of future data points
  - Training points $X \in \mathbb{R}^{N \times D}$ with labels $y \in \mathbb{R}^{N \times K}$; for a future point $x' \in \mathbb{R}^D$, predict its label $y' \in \mathbb{R}^K$
  - [Figure: handwritten digits labeled by the one-hot vectors $e^{(6)}, e^{(1)}, e^{(9)}, e^{(4)}, e^{(2)}$; what is $y'$ for an unseen digit?]
- Unsupervised learning: learn patterns or latent factors in $X$
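The one-hot vector $e^{(k)}$ has a 1 in position $k$ and 0 elsewhere, so a label matrix $y \in \mathbb{R}^{N \times K}$ stacks one such row per example. A minimal sketch, assuming $K = 10$ classes as in digit recognition (the indexing convention is an assumption of this illustration):

```python
import numpy as np

def one_hot(k, K=10):
    """Return the one-hot vector e^(k): 1 at index k, 0 elsewhere."""
    e = np.zeros(K)
    e[k] = 1.0
    return e

# Label matrix y in R^{N x K} for the digit labels 6, 1, 9, 4, 2
labels = [6, 1, 9, 4, 2]
y = np.stack([one_hot(k) for k in labels])
print(y.shape)  # (5, 10)
```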
General Types of Learning (2/2)

- Reinforcement learning: learn from “good”/“bad” feedback on actions (instead of correct labels) to maximize a goal
- AlphaGo [1] is a hybrid of reinforcement learning and supervised learning
  - The latter is used to tell how good a “move” performed by an agent is
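To see how feedback differs from labels, here is a minimal multi-armed-bandit sketch (an illustration of the feedback idea only, not AlphaGo’s method; the reward probabilities and the epsilon-greedy rule are assumptions of this example). The learner never sees the correct action, only a reward for the action it took:

```python
import numpy as np

rng = np.random.default_rng(1)

# 3 actions with unknown reward probabilities; the learner only observes
# "good"/"bad" (1/0) feedback for the action it takes, never a correct label
true_reward_prob = np.array([0.2, 0.5, 0.8])
estimates = np.zeros(3)  # running estimate of each action's reward
counts = np.zeros(3)

for t in range(1000):
    # epsilon-greedy: mostly exploit the best-looking action, sometimes explore
    a = rng.integers(3) if rng.random() < 0.1 else int(np.argmax(estimates))
    reward = float(rng.random() < true_reward_prob[a])  # feedback, not a label
    counts[a] += 1
    estimates[a] += (reward - estimates[a]) / counts[a]  # running average

print("estimated reward per action:", estimates.round(2))
```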
General Machine Learning Steps

1. Data collection, preprocessing (e.g., integration, cleaning, etc.), and exploration
   1. Split the dataset into training and testing datasets
2. Model development
   1. Assume a model $\{f\}$, a collection of candidate functions $f$ (representing the a posteriori knowledge) we want to discover; $f$ may be parametrized by $\mathbf{w}$
   2. Define a cost function $C(\mathbf{w})$ (or functional $C[f]$) that measures how well a particular $f$ explains the training data
3. Training: employ an algorithm that finds the best (or a good enough) function $f^*$ in the model that minimizes the cost function over the training dataset
4. Testing: evaluate the performance of the learned $f^*$ using the testing dataset
5. Apply the model to the real world

(A minimal end-to-end code sketch of steps 1–4 follows.)
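A minimal sketch of steps 1–4, assuming scikit-learn and a synthetic dataset (the library and dataset are assumptions of this illustration, not a requirement of the course): a logistic-regression classifier stands in for the model $\{f\}$, its built-in loss for $C(\mathbf{w})$, `fit()` for training, and `score()` for testing.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Step 1: collect data (here: synthetic) and split into training/testing sets
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Step 2: choose a model {f} -- here linear functions parametrized by w
model = LogisticRegression()

# Step 3: training -- fit() minimizes the model's cost over the training set
model.fit(X_train, y_train)

# Step 4: testing -- evaluate the learned f* on the held-out data
print("test accuracy:", model.score(X_test, y_test))
```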
Example for Spam Detection

1. Randomly split your past emails and labels
   1. Training dataset: $\mathbb{X} = \{(x^{(i)}, y^{(i)})\}_i$
   2. Testing dataset: $\mathbb{X}' = \{(x'^{(i)}, y'^{(i)})\}_i$
2. Model development
   1. Model: $\{f : f(x; \mathbf{w}) = \mathbf{w}^\top x\}$
   2. Cost function: $C(\mathbf{w}) = \sum_i \mathbb{1}\big(f(x^{(i)}; \mathbf{w}) \neq y^{(i)}\big)$
3. Training: solve $\mathbf{w}^* = \arg\min_{\mathbf{w}} \sum_i \mathbb{1}\big(f(x^{(i)}; \mathbf{w}) \neq y^{(i)}\big)$
4. Testing: accuracy $\frac{1}{|\mathbb{X}'|} \sum_i \mathbb{1}\big(f(x'^{(i)}) = y'^{(i)}\big)$
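A minimal sketch of this pipeline in NumPy, on synthetic stand-ins for email feature vectors (the data, sizes, and training rule are assumptions of this illustration). The 0–1 cost above is hard to minimize directly, so the classic perceptron update is used here as a stand-in that fits the same linear decision rule $\mathbf{w}^\top x$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for email feature vectors (D = 4) and spam labels in {0, 1}
X = rng.normal(size=(100, 4))
true_w = np.array([1.5, -2.0, 0.5, 1.0])
y = (X @ true_w > 0).astype(int)

# Step 1: random split into training and testing datasets
X_train, y_train = X[:75], y[:75]
X_test, y_test = X[75:], y[75:]

# Step 2: the model's decision rule f(x; w) = 1 if w^T x > 0 else 0
def f(X, w):
    return (X @ w > 0).astype(int)

# Step 3: training -- perceptron updates as a tractable stand-in for
# minimizing the 0-1 cost; w moves only when f misclassifies an example
w = np.zeros(4)
for _ in range(50):
    for x_i, y_i in zip(X_train, y_train):
        pred = 1 if x_i @ w > 0 else 0
        w += (y_i - pred) * x_i

# Step 4: testing -- accuracy = (1/|X'|) * sum_i 1(f(x'^(i)) = y'^(i))
accuracy = np.mean(f(X_test, w) == y_test)
print("test accuracy:", accuracy)
```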