Introduction to Big Data and Machine Learning: Classification


  1. Introduction to Big Data and Machine Learning: Classification. Dr. Mihail. September 19, 2019

  2. Linear models for classification. Goal of classification: take an input vector x and assign it to one of K discrete classes C_k, where k = 1, ..., K. The input space is therefore divided into decision regions whose boundaries are called "decision boundaries" or "decision surfaces". Here we consider linear models, in which the decision boundaries are linear functions of the input vector x and hence are defined by (D − 1)-dimensional hyperplanes within the D-dimensional input space. Data sets that can be separated exactly by linear decision surfaces are said to be "linearly separable".
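A minimal sketch of such a linear decision rule for D = 2, assuming a hypothetical weight vector w and bias b (neither is given on the slide); the decision surface is the (D − 1)-dimensional hyperplane where w·x + b = 0.

```python
import numpy as np

# Hypothetical weight vector w and bias b defining a hyperplane w.x + b = 0
# in a D = 2 dimensional input space (a (D-1)-dimensional decision surface).
w = np.array([1.5, -2.0])
b = 0.5

def classify(x):
    """Assign x to class C1 if it lies on the positive side of the hyperplane,
    otherwise to class C2 (two-class linear discriminant)."""
    return "C1" if w @ x + b > 0 else "C2"

print(classify(np.array([2.0, 0.1])))   # C1
print(classify(np.array([0.0, 1.0])))   # C2
```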

  3. Probabilistic models. For probabilistic models, the most convenient representation in the case of two-class problems is the binary one, in which there is a single target variable t ∈ {0, 1}. For K > 2 classes, it is convenient to use a 1-of-K coding scheme, in which t is a vector of length K such that if the class is C_j, all elements t_k are zero except t_j, which equals 1. For instance, with 5 classes, a pattern from class 2 would be given by the target vector t = (0, 1, 0, 0, 0)^T.
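A small sketch of the 1-of-K coding scheme described above; the helper name one_of_k is ours, and the slide's "class 2" corresponds to zero-based index 1.

```python
import numpy as np

def one_of_k(class_index, K):
    """Return the 1-of-K (one-hot) target vector for a zero-based class index."""
    t = np.zeros(K)
    t[class_index] = 1.0
    return t

# The slide's example: 5 classes, pattern from class 2 -> t = (0, 1, 0, 0, 0)^T
print(one_of_k(1, 5))   # [0. 1. 0. 0. 0.]
```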

  4. Using Bayes' theorem. Model the posterior class probability: p(C_k | x) = p(x | C_k) p(C_k) / p(x). Notice the denominator is not a function of C_k. Prior class distribution: p(C_k). Class-conditional density: p(x | C_k).
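A short sketch of this computation with made-up numbers (none come from the slides), showing that dividing by the class-independent denominator p(x) turns the products p(x | C_k) p(C_k) into posteriors that sum to one.

```python
import numpy as np

# Illustrative values only: priors and class-conditional likelihoods for K = 3.
prior = np.array([0.5, 0.3, 0.2])          # p(C_k)
likelihood = np.array([0.10, 0.40, 0.05])  # p(x | C_k) for one particular x

evidence = np.sum(likelihood * prior)      # p(x), the same for every class
posterior = likelihood * prior / evidence  # p(C_k | x)

print(posterior, posterior.sum())          # posteriors sum to 1.0
```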

  5. Discriminative models. A discriminative model learns P(c | x) directly. To train a discriminative classifier, all training examples of the different classes must be used jointly to build a single classifier. A probabilistic classifier outputs K probabilities for the K class labels, while a non-probabilistic classifier produces a single label.
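As a hedged illustration of a discriminative probabilistic classifier, a sketch using scikit-learn's logistic regression on made-up toy data (the slides do not name a specific model): one classifier is fitted jointly on examples of all classes and returns K posterior probabilities per input.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data, made up for illustration: 6 points, K = 3 class labels.
X = np.array([[0.0, 0.2], [0.1, 0.9], [1.0, 1.1], [0.9, 0.1], [2.0, 2.1], [2.1, 1.9]])
y = np.array([0, 0, 1, 1, 2, 2])

# One model, trained jointly on all classes.
clf = LogisticRegression().fit(X, y)

print(clf.predict_proba([[1.0, 1.0]]))   # K probabilities P(c | x), one per class
print(clf.predict([[1.0, 1.0]]))         # single most probable label
```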

  6. Discriminative classifier (figure)

  7. Generative classifier. Model P(x | c) for c = c_1, ..., c_K, with x = (x_1, ..., x_n). K probabilistic models have to be trained independently, each trained on only the examples of its own label. For a given input, the K models output K probabilities. "Generative" means the model can produce data by sampling from its distribution.
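A sketch of a generative classifier on made-up data, assuming one Gaussian class-conditional density per class (the slides do not specify the density family): each model is fitted only on its own class's examples, and can also generate data by sampling.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Made-up training examples, grouped by class label.
X = {
    "c1": np.array([[0.1, 0.2], [0.2, 0.1], [0.0, 0.3]]),
    "c2": np.array([[2.0, 2.1], [2.2, 1.9], [1.9, 2.0]]),
}
prior = {c: len(Xc) / 6 for c, Xc in X.items()}   # P(c_k) from class counts

# Learning: K independent models, each sees only the examples of its own class.
models = {c: multivariate_normal(Xc.mean(axis=0), np.cov(Xc.T) + 1e-6 * np.eye(2))
          for c, Xc in X.items()}

x_new = np.array([0.15, 0.2])
scores = {c: models[c].pdf(x_new) * prior[c] for c in models}  # P(x | c) P(c)
print(scores)

# Sampling from a class-conditional model shows why it is called "generative".
print(models["c1"].rvs(size=2))
```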

  8. Generative classifier (figure)

  9. Maximum a posteriori (MAP). For an input x, find the largest of the K probabilities output by a discriminative probabilistic classifier, P(c_1 | x), ..., P(c_K | x), and assign x to label c* if P(c* | x) is the largest. Generative classification with the MAP rule: P(c_i | x) = P(x | c_i) P(c_i) / P(x) ∝ P(x | c_i) P(c_i). (1)
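A minimal sketch of the MAP rule, assuming the unnormalized scores P(x | c_i) P(c_i) have already been computed (the numbers are made up):

```python
# Scores proportional to P(c_i | x); the shared denominator P(x) can be ignored.
scores = {"c1": 0.012, "c2": 0.045, "c3": 0.003}

c_star = max(scores, key=scores.get)   # argmax over classes
print(c_star)                          # "c2"
```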

  10. Naïve Bayes. Bayes classification: P(c | x) ∝ P(x | c) P(c) = P(x_1, ..., x_n | c) P(c), for c = c_1, ..., c_K. (2)

  11. Naïve Bayes. Bayes classification: P(c | x) ∝ P(x | c) P(c) = P(x_1, ..., x_n | c) P(c), for c = c_1, ..., c_K. (2) Problem: the joint probability P(x_1, ..., x_n | c) is not feasible to learn.

  12. Naïve Bayes. Bayes classification: P(c | x) ∝ P(x | c) P(c) = P(x_1, ..., x_n | c) P(c), for c = c_1, ..., c_K. (2) Problem: the joint probability P(x_1, ..., x_n | c) is not feasible to learn. Solution: assume all input features are class-conditionally independent!

  13. Bayes model. Applying the product rule and then the class-conditional independence assumption, recursively: P(x_1, x_2, ..., x_n | c) = P(x_1 | x_2, ..., x_n, c) P(x_2, ..., x_n | c) = P(x_1 | c) P(x_2, ..., x_n | c) = ... = P(x_1 | c) P(x_2 | c) ... P(x_n | c). (3)
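A tiny sketch of the factorization in equation (3), with made-up per-feature probability tables (not taken from the slides):

```python
import numpy as np

# Hypothetical per-feature conditionals P(x_j | c) for one fixed class c.
p_feature_given_c = [
    {"sunny": 0.2, "rain": 0.8},   # P(x1 | c)
    {"hot": 0.5, "cool": 0.5},     # P(x2 | c)
]

def joint_given_c(x):
    """P(x1, ..., xn | c) under the class-conditional independence assumption."""
    return np.prod([p_feature_given_c[j][xj] for j, xj in enumerate(x)])

print(joint_given_c(("rain", "cool")))   # 0.8 * 0.5 = 0.4
```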

  14. Algorithm (discrete-valued features). Learning phase: given a training set S with F features and K classes, for each target value c_i (c_i = c_1, ..., c_K), estimate P̂(c_i) from the examples in S; for every feature value x_jk of each feature x_j (j = 1, ..., F; k = 1, ..., N), estimate P̂(x_j = x_jk | c_i) from the samples in S. Output: F × K conditional probabilistic (generative) models. Test phase: given an unknown instance x' = (a'_1, ..., a'_n), assign label c* to x' if [P̂(a'_1 | c*) ... P̂(a'_n | c*)] P̂(c*) > [P̂(a'_1 | c_i) ... P̂(a'_n | c_i)] P̂(c_i) (4) for all c_i ≠ c*, c_i = c_1, ..., c_K.
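A hedged sketch of both phases on a made-up discrete-feature training set (not the slide's example): priors and per-feature conditionals are estimated by counting, then the decision rule of equation (4) is applied.

```python
from collections import Counter, defaultdict

# Tiny illustrative training set: (feature tuple, class label).
train = [
    (("sunny", "hot"),  "no"),
    (("sunny", "cool"), "yes"),
    (("rain",  "cool"), "yes"),
    (("rain",  "hot"),  "no"),
    (("sunny", "hot"),  "no"),
]

# Learning phase: estimate P(c_i) and P(x_j = x_jk | c_i) by counting.
class_counts = Counter(c for _, c in train)
prior = {c: n / len(train) for c, n in class_counts.items()}

cond_counts = defaultdict(Counter)        # (feature index, class) -> value counts
for x, c in train:
    for j, v in enumerate(x):
        cond_counts[(j, c)][v] += 1

def cond_prob(j, v, c):
    """Estimated P(x_j = v | c) (no smoothing, as on the slide)."""
    return cond_counts[(j, c)][v] / class_counts[c]

# Test phase: MAP decision using the factored probabilities of equation (4).
def classify(x_new):
    scores = {}
    for c in prior:
        score = prior[c]
        for j, v in enumerate(x_new):
            score *= cond_prob(j, v, c)
        scores[c] = score
    return max(scores, key=scores.get), scores

print(classify(("sunny", "cool")))   # ('yes', {...})
```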

  15. Example (figure)

  16. Learning phase (figure)

  17. Test phase. Given a new instance, predict its label: x' = (Outlook = Sunny, Temperature = Cool, Humidity = High, Wind = Strong). Look up the estimated probability tables and make the decision with the MAP rule.
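A worked sketch of this test-phase lookup. The slide's probability tables appear only as images, so the numbers below are the classic PlayTennis estimates (9 "yes" and 5 "no" days) and are assumed rather than taken from the text.

```python
# Assumed estimates from the standard PlayTennis data, not read from the slides.
prior = {"yes": 9/14, "no": 5/14}
cond = {
    "yes": {"Sunny": 2/9, "Cool": 3/9, "High": 3/9, "Strong": 3/9},
    "no":  {"Sunny": 3/5, "Cool": 1/5, "High": 4/5, "Strong": 3/5},
}

x_new = ("Sunny", "Cool", "High", "Strong")
scores = {}
for c in prior:
    score = prior[c]
    for v in x_new:
        score *= cond[c][v]          # look up P(feature value | class)
    scores[c] = score

print(scores)                        # {'yes': ~0.0053, 'no': ~0.0206}
print(max(scores, key=scores.get))   # MAP decision: 'no'
```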
