Machine Learning
George Konidaris
gdk@cs.duke.edu
Spring 2016
Machine Learning
Subfield of AI concerned with learning from data.

Broadly, using:
• Experience
• To improve performance
• On some task

(Tom Mitchell, 1997)
ML vs. Statistics vs. Data Mining
Why?
Developing effective learning methods has proved difficult. Why bother?

Autonomous discovery
• We don't know something, and want to find out.

Hard to program
• Easier to specify the task and collect data.

Adaptive behavior
• Our agents should adapt to new data and unforeseen circumstances.
Types
Depends on the feedback available:

Labeled data:
• Supervised learning

No feedback, just data:
• Unsupervised learning

Sequential data, weak labels:
• Reinforcement learning
Supervised Learning
Input (training data):
inputs $X = \{x_1, \ldots, x_n\}$
labels $Y = \{y_1, \ldots, y_n\}$

Learn to predict new labels. Given x: y?
Unsupervised Learning
Input:
inputs $X = \{x_1, \ldots, x_n\}$

Try to understand the structure of the data.

E.g., how many types of cars are there? How can they vary?
Reinforcement Learning
Learning counterpart of planning.

Find a policy $\pi : S \rightarrow A$ that maximizes the return:
$$\max_{\pi} R = \sum_{t=0}^{\infty} \gamma^t r_t$$
Today: Supervised Learning
Formal definition:

Given training data:
inputs $X = \{x_1, \ldots, x_n\}$
labels $Y = \{y_1, \ldots, y_n\}$

Produce:
decision function $f : X \rightarrow Y$

that minimizes error:
$$\sum_i \mathrm{err}(f(x_i), y_i)$$
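A minimal sketch of this objective in Python. The function name and the choice of 0-1 loss are my illustrative assumptions; the slides leave err unspecified:

```python
def total_error(f, X, Y):
    """Sum of per-example errors of decision function f on data (X, Y)."""
    # err is taken to be 0-1 loss: 1 when the prediction disagrees with
    # the label, 0 otherwise (an assumption; suitable for classification).
    return sum(1 for x, y in zip(X, Y) if f(x) != y)
```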
Classification vs. Regression
If the set of labels Y is discrete:
• Classification
• Minimize the number of errors

If Y is real-valued:
• Regression
• Minimize the sum of squared errors

Today we focus on classification.
Key Ideas
Class of functions F, from which to find f.
• F is known as the hypothesis space.

E.g., if-then rules (a concrete sketch follows below):
if condition then class1 else class2

Learning:
• Search over F to find the f that minimizes error.
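As a concrete illustration (my sketch in Python; the attribute name and classes are made up, not from the slides), one hypothesis f from this if-then rule class might be:

```python
def f(x):
    # One member of the if-then hypothesis space:
    # "if condition then class1 else class2".
    # The condition tested here is a hypothetical attribute of x.
    if x["color"] == "red":
        return "class1"
    return "class2"
```

Learning then amounts to searching over conditions (and class assignments) for the rule with the lowest error.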
Test/Train Split
Minimize error measured on what?
• We don't get to see future data.
• We could measure error on the data we have… but it may not generalize.

General principle:
Do not measure error on the data you train on!

Methodology:
• Split the data into a training set and a test set.
• Fit f using the training set.
• Measure error on the test set.

Always do this.
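A minimal sketch of this methodology, assuming the data are parallel lists of inputs and labels (function and parameter names are mine, not the lecture's):

```python
import random

def train_test_split(X, Y, test_fraction=0.2, seed=0):
    """Shuffle the examples once, then split them into train and test sets."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)          # fixed seed for reproducibility
    n_test = int(len(idx) * test_fraction)
    test, train = idx[:n_test], idx[n_test:]
    return ([X[i] for i in train], [Y[i] for i in train],
            [X[i] for i in test],  [Y[i] for i in test])
```

Fit on the first pair of lists, and report error only on the second.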
Decision Trees
Let's assume:
• Discrete inputs.
• Two classes (true and false).
• Each input x is a vector of attribute values.

A relatively simple classifier:
• A tree of tests.
• Evaluate the test at each node on the corresponding attribute x_i, and follow the branch.
• Leaves are class labels.
Decision Trees
[Figure: an example decision tree for input x = [a, b, c], each attribute boolean. The root tests a?; internal nodes test b? and c?; each branch is labeled true or false, and each leaf is a class label, y=1 or y=2.]
Decision Trees
How do we build one?

Given $X = \{x_1, \ldots, x_n\}$, $Y = \{y_1, \ldots, y_n\}$:

repeat:
• If all the labels are the same, we have a leaf node.
• Otherwise, pick an attribute and split the data on it.
• Recurse on each subset.

If we run out of attributes to split on, and the data are not perfectly in one class, take the majority label.
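A minimal sketch of this recursion in Python, assuming boolean attributes and examples given as dicts from attribute name to value (all names are illustrative). Attribute selection is covered on the next slides, so this version just takes the first unused attribute:

```python
from collections import Counter

def build_tree(X, Y, attributes):
    """Recursively build a decision tree.

    X: list of examples, each a dict mapping attribute name -> True/False.
    Y: list of class labels. attributes: names not yet split on.
    Returns a leaf label, or a tuple (attribute, true_subtree, false_subtree).
    """
    if len(set(Y)) == 1:               # all labels the same: leaf node
        return Y[0]
    if not attributes:                 # out of splits: take the majority label
        return Counter(Y).most_common(1)[0][0]
    a, rest = attributes[0], attributes[1:]   # placeholder choice; see information gain below
    t = [(x, y) for x, y in zip(X, Y) if x[a]]
    f = [(x, y) for x, y in zip(X, Y) if not x[a]]
    if not t or not f:                 # split puts everything on one side: skip it
        return build_tree(X, Y, rest)
    return (a,
            build_tree([x for x, _ in t], [y for _, y in t], rest),
            build_tree([x for x, _ in f], [y for _, y in f], rest))

def classify(tree, x):
    """Follow the tests down the tree until a leaf (a label) is reached."""
    while isinstance(tree, tuple):
        a, true_branch, false_branch = tree
        tree = true_branch if x[a] else false_branch
    return tree
```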
Decision Trees
Example dataset:

A  B  C  L
T  F  T  1
T  T  F  1
T  F  F  1
F  T  F  2
F  T  T  2
F  T  F  2
F  F  T  1
F  F  F  1

[Figures: building the tree step by step. Splitting on a? first, the a = true branch contains only label 1, so it becomes the leaf y=1. On the a = false branch, split on b?: b = true gives the leaf y=2, and b = false gives the leaf y=1.]
Attribute Picking
Key question:
• Which attribute should we split on?

Information contained in a dataset A with class fractions $f_1$ and $f_2$:
$$I(A) = -f_1 \log_2 f_1 - f_2 \log_2 f_2$$

How many "bits" of information do we need to determine the label in the dataset?

Pick the attribute with the maximum information gain:
$$\mathrm{Gain}(B) = I(A) - \sum_i f_i I(B_i)$$

where splitting A on attribute B yields subsets $B_i$, and $f_i$ is the fraction of examples falling in subset $B_i$.
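A short sketch of these formulas in Python (function names are mine; written for any number of classes, which reduces to the two-class formula above):

```python
from collections import Counter
from math import log2

def information(labels):
    """I(A): bits needed to determine a label drawn from this dataset."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gain(X, Y, attribute):
    """Information gain from splitting dataset (X, Y) on one attribute."""
    total = information(Y)
    for value in set(x[attribute] for x in X):
        subset = [y for x, y in zip(X, Y) if x[attribute] == value]
        total -= (len(subset) / len(Y)) * information(subset)
    return total
```

In build_tree above, the placeholder choice would become max(attributes, key=lambda a: gain(X, Y, a)).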
Example

A  B  C  L
T  F  T  1
T  T  F  1
T  F  F  1
F  T  F  2
F  T  T  2
F  T  F  2
F  F  T  1
F  F  F  1
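Working one instance of the formulas on this dataset (my arithmetic, not from the slides): there are five examples of class 1 and three of class 2, so

$$I(A) = -\tfrac{5}{8}\log_2\tfrac{5}{8} - \tfrac{3}{8}\log_2\tfrac{3}{8} \approx 0.954 \text{ bits.}$$

For a split on attribute A: the A = T subset (3 examples) is all class 1, so its information is 0; the A = F subset (5 examples) has three 2s and two 1s, with information $-\tfrac{3}{5}\log_2\tfrac{3}{5} - \tfrac{2}{5}\log_2\tfrac{2}{5} \approx 0.971$. Hence

$$\mathrm{Gain}(A) = 0.954 - \tfrac{3}{8}(0) - \tfrac{5}{8}(0.971) \approx 0.347.$$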
Decision Trees
What if the inputs are real-valued?
• Use inequality tests rather than equality tests.

[Figure: a tree with threshold tests. The root tests a > 3.1; one branch is the leaf y=1, and the other tests b < 0.6, with leaves y=2 and y=1.]
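One common way to generate such tests (a sketch under my assumptions, not necessarily the lecture's prescription) is to take candidate thresholds midway between consecutive sorted values of an attribute, then score each resulting inequality test with the same information gain as before:

```python
def candidate_thresholds(values):
    """Midpoints between consecutive distinct sorted values of one attribute."""
    vs = sorted(set(values))
    return [(lo + hi) / 2 for lo, hi in zip(vs, vs[1:])]
```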
Hypothesis Class
What is the hypothesis class of a decision tree?
• With discrete inputs?
• With real-valued inputs?