Machine Learning sanparith.marukatat@nectec.or.th
Today • Example of intelligent system: OCR • k-Nearest Neighbor Classifier • Generative model • Maximum likelihood • Naïve Bayes model • Gaussian model
What is OCR? • Optical Character Recognition – Input: scanned images, photos, video frames – Output: text file • Alternative input – Electronic stylus, e.g. on a PDA – Online handwriting recognition
OCR process (1) • Preprocessing – Image enhancement, denoise, deskew, ... – Binarization • Layout analysis and character segmentation • Character recognition • Spell correction
[Figure: layout analysis example – a two-column page image segmented into a text line and a single character; noise regions should be removed]
OCR process (2) • Preprocessing uses image processing techniques. • Layout analysis uses rule-based methods + some statistics. • Character recognition – Classifier (trained from a training corpus) – Look-up table: class number → ASCII or Unicode code (given by the system designer) • Spell correction uses a dictionary + some statistics + some NLP techniques.
How to build the recognition module? (1) • Technical choices – All characters separated: Neural Network, SVM – Few touching characters: add classes for the touching characters – Some broken characters (e.g. อำ ): add classes for sub-characters – Rule-based segmentation – Many touching characters (e.g. Arabic, Urdu): 2D-HMM
How to build the recognition module? (2) • Normalize the character image (reduce variation, get a fixed size) • Select features – High-level features, e.g. head, tail • Contain more information • Cannot be reliably detected – Low-level features: pixel colors, edges • A single feature is not meaningful • Can be easily detected • Can be improved: PCA, LDA, NLDA, ...
How to build the recognition module? (2) • Design the classes • Build a feature extractor, e.g. a vector of pixel colors • Construct a training corpus – 1 example = 1 vector and 1 class – Very large number of examples – Cover all conditions: dpi, fonts, font sizes, styles (e.g. slant, bold), writing styles, pen styles, with and without noise
How to build the recognition module? (3) • Building the corpus – Handwritten • Collect samples • Segment from forms or segment manually – Printed • Print different fonts, font sizes, ... • Scan, scan of a copy, ... • Time consuming
How to build the recognition module? (4) • Select a classifier – Select tools • SNNS or FANN for Neural Networks • libsvm or SVMlight for SVM • Weka – Format of the training corpus – Parameters and their values – How to use it in your code
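For instance, libsvm and SVMlight both read training examples in a simple sparse text format, one example per line (the numbers below are made up for illustration):

    # <class label> <feature index>:<value> <feature index>:<value> ...
    # Indices start at 1 and must be increasing; zero-valued features may be omitted.
    1 1:0.00 2:0.85 7:0.12
    5 3:0.40 4:0.33 6:0.91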
What is a Neural Network? • Biologically inspired multi-class classifier • Set of nodes with directed connections, e.g. Multi-Layer Perceptron, Diabolo network, Recurrent network
Using a neural network • Try an MLP with 1 hidden layer first • 1 architecture parameter = number of hidden nodes • Training with gradient descent • 1 training parameter = learning rate
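A minimal sketch of this recipe (not from the slides): a 1-hidden-layer MLP trained by stochastic gradient descent with scikit-learn, where the hidden-layer size and the learning rate are the two parameters mentioned above. The digits dataset is used here only as a stand-in for a normalized character-image corpus.

    # Sketch: 1-hidden-layer MLP trained by gradient descent (scikit-learn).
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    X, y = load_digits(return_X_y=True)                # 8x8 images flattened to 64 pixel values
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    mlp = MLPClassifier(hidden_layer_sizes=(50,),      # architecture parameter: hidden nodes
                        solver='sgd',                  # gradient-descent training
                        learning_rate_init=0.01,       # training parameter: learning rate
                        max_iter=500, random_state=0)
    mlp.fit(X_train, y_train)
    print("test accuracy:", mlp.score(X_test, y_test))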
What is SVM? • Linear classifier using the kernel trick, trained to trade off between error and generalization – Linear classifier • output is a linear combination of the input features • y = sign(w^T x) • Use multiple linear classifiers for multi-class problems – Kernel trick • Replace all dot products with a kernel function • K(x1, x2) = <g(x1), g(x2)> for some (possibly unknown) mapping g
Using SVM • Kernel functions – Radial Basis Function (RBF): K(x, y) = exp(-γ ||x − y||^2) – Polynomial: K(x, y) = (<x, y> + 1)^d • Trade-off parameter C – Small C = generalization is more important than error – Large C = error is more important
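A small sketch (not from the slides) of the two kernels above in NumPy; the function names are illustrative.

    # Sketch: RBF and polynomial kernels between two feature vectors.
    import numpy as np

    def rbf_kernel(x, y, gamma=0.005):
        return np.exp(-gamma * np.sum((x - y) ** 2))   # exp(-gamma * ||x - y||^2)

    def poly_kernel(x, y, d=3):
        return (np.dot(x, y) + 1) ** d                 # (<x, y> + 1)^d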
Exercise • MNIST dataset with libsvm, using the RBF kernel • SVM parameters – gamma = inverse of the area of influence around each example; try 0.005 – C = trade-off parameter between error on the training set and generalization; try 1000
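One possible way to run this exercise in Python (a sketch, not the slides' solution): scikit-learn's SVC wraps libsvm, so the same RBF kernel and parameters apply. Loading MNIST via fetch_openml, scaling the pixels, and subsampling the training set for speed are assumptions added here.

    # Sketch: RBF-kernel SVM on MNIST with the parameters suggested in the exercise.
    from sklearn.datasets import fetch_openml
    from sklearn.svm import SVC

    X, y = fetch_openml('mnist_784', version=1, return_X_y=True, as_frame=False)
    X = X / 255.0                                    # scale pixel values to [0, 1]
    X_train, y_train = X[:10000], y[:10000]          # subsample: training on all 60k is slow
    X_test, y_test = X[60000:], y[60000:]            # standard 10k test images

    clf = SVC(kernel='rbf', gamma=0.005, C=1000)     # gamma and C from the exercise
    clf.fit(X_train, y_train)
    print("test accuracy:", clf.score(X_test, y_test))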
How to build the recognition module? (5) • Multi-pass classifier – Rough classification: upper vowels, mid-level characters, lower vowels – Fine classification: กถภ , ปฝฟ , ... – Finer classification
1-Nearest Neighbor classifier • Prototype-based (template-based) classifier • Needs a distance function • Useful when – We have a very limited number of training examples and cannot train another classifier – We have a large number of training examples and just want a baseline • What is the performance of this model? – When n → ∞, the 1-NN error is less than 2 × the Bayes error
Bayes error ● If we know P(Class_1|x), ..., P(Class_M|x), then the Bayes classification rule f(x) = arg max_{i=1,...,M} P(Class_i|x) produces the minimum possible expected error, called the Bayes error. [Figure: posterior curves P(Y=1|x) and P(Y=2|x) over x, with the expected-error region highlighted]
k-NN and Bayes classification rule ● P(x) ≈ k / (N·V) ● N = number of examples in the training set ● V = volume of a small region around x ● k = number of training points in V ● P(x|y) ≈ k_y / (N_y·V) ● k_y = number of examples of class y in V ● N_y = number of examples of class y in the training set ● P(y) ≈ N_y / N ● P(y|x) ≈ k_y / k (why??)
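One way to see the last step (a short derivation, not spelled out on the slide): plug the three estimates into Bayes' rule.

    P(y|x) = P(x|y) P(y) / P(x)
           ≈ [k_y / (N_y·V)] · [N_y / N] / [k / (N·V)]
           = k_y / k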
k-NN algorithm • Put the input object into the same class as the majority of its k nearest neighbors • Implementation – Compute the distance between the input and each training example – Sort in increasing order – Count the number of instances from each class amongst the k nearest neighbors • There is no k which is always optimal
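A minimal sketch of this algorithm in NumPy (function and variable names are illustrative; with k = 1 it reduces to the 1-NN classifier above).

    # Minimal k-NN classifier following the three implementation steps above.
    import numpy as np

    def knn_predict(x, X_train, y_train, k=3):
        dists = np.linalg.norm(X_train - x, axis=1)    # 1) distance to every training example
        nearest = np.argsort(dists)[:k]                # 2) sort, keep the k closest
        labels, counts = np.unique(y_train[nearest], return_counts=True)
        return labels[np.argmax(counts)]               # 3) majority vote among the k neighbors

    # Toy usage with two classes
    X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
    y_train = np.array([0, 0, 1, 1])
    print(knn_predict(np.array([0.2, 0.1]), X_train, y_train, k=3))   # -> 0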
Some useful distances • Norm-p distance: ||x − y||_p = ( Σ_{i=1}^{n} |x_i − y_i|^p )^{1/p} • Mahalanobis distance: dist(x, y) = (x − y)^T Σ^{-1} (x − y), where Σ is the covariance matrix • Kernel-based distance: dist(x, y)^2 = K(x, x) + K(y, y) − 2 K(x, y) – WHY?
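A small sketch of these distances in NumPy (names are illustrative; the kernel-based distance takes any kernel function, e.g. the RBF kernel defined earlier).

    # Sketch: the three distances above, implemented with NumPy.
    import numpy as np

    def norm_p_distance(x, y, p=2):
        return np.sum(np.abs(x - y) ** p) ** (1.0 / p)

    def mahalanobis_distance(x, y, cov):
        d = x - y
        return float(d @ np.linalg.inv(cov) @ d)       # (x - y)^T Sigma^{-1} (x - y)

    def kernel_distance_sq(x, y, K):
        # squared distance in the feature space induced by the kernel K
        return K(x, x) + K(y, y) - 2.0 * K(x, y)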
Generative approach (1) • Solving a classification problem = building P(class|input) • P(class_i|input) = P(input|class_i) P(class_i) / P(input) • P(input) = Σ_i P(input|class_i) P(class_i) • P(class_i) = percentage of examples from class i in the training set • So solving the classification problem reduces to building P(input|class_i) • P(input|class_i) = likelihood of class i • P(class_i|input) = posterior probability of class i
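A tiny numeric illustration of these formulas (the numbers are invented for illustration only).

    # Toy posterior computation via Bayes' rule.
    import numpy as np

    likelihood = np.array([0.02, 0.10])       # P(input | class_i) for classes 1 and 2
    prior      = np.array([0.70, 0.30])       # P(class_i), fractions in the training set
    evidence   = np.sum(likelihood * prior)   # P(input) = sum_i P(input|class_i) P(class_i)
    posterior  = likelihood * prior / evidence
    print(posterior)                          # P(class_i | input); predict the argmax -> class 2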
Generative approach (2) • To build P(input|class_i) we usually make an assumption about – How the data from class i are distributed, e.g. Binomial, Gaussian, Mixture of Gaussians – How the data are generated, e.g. HMM • Example: Document classification – How is each input, i.e. a document, represented? – What is the likelihood model for these data?
Document classification (1) • Spam/not-spam • Document = set of words • Preprocessing – word segmentation – remove stop-words – stemming – word selection
Document classification (2) • Naïve Bayes assumption: all words are independent given the class • P(w_1,...,w_n|Spam) = Π_i P(w_i|Spam) • Same hypothesis for all classes • How to compute P(w_i|Spam), and why? • What is the process of building a Naïve Bayes model for spam classification?
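A minimal sketch of that process (not from the slides; the vocabulary and documents are invented, and Laplace smoothing is an added assumption to avoid zero probabilities).

    # Sketch: word-level Naive Bayes for spam / not-spam with Laplace smoothing.
    import numpy as np

    vocab = ["viagra", "meeting", "free", "report"]
    docs = [(["viagra", "free"], "spam"), (["free", "free", "viagra"], "spam"),
            (["meeting", "report"], "ham"), (["report", "meeting", "free"], "ham")]

    classes = ["spam", "ham"]
    prior = {c: sum(1 for _, y in docs if y == c) / len(docs) for c in classes}
    counts = {c: {w: 1 for w in vocab} for c in classes}       # start at 1: Laplace smoothing
    for words, y in docs:
        for w in words:
            counts[y][w] += 1
    p_word = {c: {w: counts[c][w] / sum(counts[c].values()) for w in vocab} for c in classes}

    def classify(words):
        # log P(class) + sum_i log P(w_i | class), then take the argmax
        scores = {c: np.log(prior[c]) + sum(np.log(p_word[c][w]) for w in words if w in vocab)
                  for c in classes}
        return max(scores, key=scores.get)

    print(classify(["free", "viagra"]))     # -> "spam"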
Maximum Likelihood (1) • x_1,...,x_N are i.i.d. according to P(x|θ), where θ is the parameter of this model • Q: What is the proper value for θ? • A: The value which maximizes P(x_1,...,x_N|θ) • Q: We know P(x|θ); how do we compute P(x_1,...,x_N|θ)? • A: P(x_1,...,x_N|θ) = Π_i P(x_i|θ). Why? • Q: How to find the maximum value? • Q: How to get rid of the product?
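A short note on the last question (standard reasoning, not spelled out on the slide): because log is monotonically increasing, maximizing the likelihood is the same as maximizing the log-likelihood, which turns the product into a sum; the maximum is then found by setting the derivative with respect to θ to zero, or numerically when no closed form exists.

    log P(x_1,...,x_N | θ) = log Π_i P(x_i | θ) = Σ_i log P(x_i | θ)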
Maximum Likelihood (2) • Exercise: Binomial distribution – Q: What does this mean? What is P(w|Spam)? – word "viagra": {T, F, F, T, T, T, F, T, F, T} – Find the proper parameter for P(w|Spam)
Maximum Likelihood (3) • Exercise: coin-toss – {H, T, H, H, T, H, H, H, T, H} – Q: What is the parameter of Binomial distribution that fits this data? – Q: What is the conclusion?
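A numeric sketch for these two exercises (not part of the slides): score candidate values of θ by the likelihood of the observed sequence and pick the best; the grid search only illustrates that the empirical frequency maximizes the likelihood.

    # Sketch: maximum likelihood for a Bernoulli parameter, checked by grid search.
    import numpy as np

    data = np.array([1, 0, 1, 1, 0, 1, 1, 1, 0, 1])           # H = 1, T = 0 from the coin-toss exercise
    thetas = np.linspace(0.01, 0.99, 99)
    log_lik = data.sum() * np.log(thetas) + (len(data) - data.sum()) * np.log(1 - thetas)
    print("best theta:", thetas[np.argmax(log_lik)])           # equals the frequency of heads
    print("frequency of heads:", data.mean())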
Maximum A Posteriori • Sometimes we have prior knowledge about the model, i.e. we have P(θ) • We search for the maximum of P(θ|x_1,...,x_N) instead • Q: How to compute P(θ|x_1,...,x_N) from P(θ) and P(x|θ)? • Exercise: coin-toss problem, where θ is distributed as a Gaussian with mean 5/10 and standard deviation 1/10 – Q: What is the Gaussian model? – Q: What is the proper value for θ?
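A numeric sketch for this exercise (one way to set it up, not the slide's solution): by Bayes' rule P(θ|x_1,...,x_N) ∝ P(x_1,...,x_N|θ) P(θ), so we can add the log-prior to the log-likelihood and maximize over a grid of θ values.

    # Sketch: MAP estimate for the coin-toss exercise with a Gaussian prior on theta.
    import numpy as np

    data = np.array([1, 0, 1, 1, 0, 1, 1, 1, 0, 1])
    thetas = np.linspace(0.01, 0.99, 99)
    log_lik   = data.sum() * np.log(thetas) + (len(data) - data.sum()) * np.log(1 - thetas)
    log_prior = -0.5 * ((thetas - 0.5) / 0.1) ** 2             # Gaussian prior: mean 0.5, sd 0.1
    print("ML  estimate:", thetas[np.argmax(log_lik)])
    print("MAP estimate:", thetas[np.argmax(log_lik + log_prior)])   # pulled toward the prior mean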
ML or MAP? • ML is good when we have enough data • MAP is preferred when we have little data • The prior can also be estimated from data