  1. Machine Learning & Decision Trees CS16: Introduction to Data Structures & Algorithms Spring 2020

  2. Outline
  ‣ Motivation
  ‣ Supervised learning
  ‣ Decision Trees
  ‣ ML Bias

  3. Machine Learning
  ‣ Algorithms that use data to design algorithms
    (diagram: input data → learning algorithm → output algorithm)
  ‣ Allows us to design algorithms
    ‣ that predict the future (e.g., picking stocks)
    ‣ even when we don’t know how (e.g., facial recognition)

  4. CS 147

  5. Applications of ML
  ‣ Agriculture
  ‣ Advertising
  ‣ Astronomy
  ‣ Self-driving cars
  ‣ Bioinformatics
  ‣ Recommendation systems (e.g., Netflix)
  ‣ Classifying DNA
  ‣ Search engines
  ‣ Computer Vision
  ‣ Translations
  ‣ Finance
  ‣ Robotics
  ‣ Linguistics
  ‣ Risk assessment
  ‣ Medical diagnostics
  ‣ Drug discovery
  ‣ Insurance
  ‣ Fraud discovery
  ‣ Economics
  ‣ Computational Anatomy

  6. Classes of ML
  ‣ Supervised learning
    ‣ learn to make accurate predictions from training data
  ‣ Unsupervised learning
    ‣ find patterns in data without training data
  ‣ Reinforcement learning
    ‣ improve performance with positive and negative feedback

  7. Supervised Learning
  ‣ Make accurate predictions/classifications
    ‣ Is this email spam?
    ‣ Will the snowstorm cancel class?
    ‣ Will this flight be delayed?
    ‣ Will this candidate win the next election?
  ‣ How can our algorithm predict the future?
    ‣ We train it using “training data”, which are past examples
    ‣ Examples of emails classified as spam and of emails classified as non-spam
    ‣ Examples of snowstorms that have led to cancellations and of snowstorms that have not
    ‣ Examples of flights that have been delayed and of flights that have left on time
    ‣ Examples of candidates that have won and of candidates that have lost

  8. Supervised Learning
  ‣ Training data is a collection of examples
    ‣ an example includes an input and its classification
    ‣ inputs: flights, snowstorms, candidates, …
    ‣ classifications: delayed/non-delayed, canceled/not canceled, win/lose
  ‣ But how do we represent inputs for our algorithm?
    ‣ What is a student? What is a flight? What is an email?
    ‣ We have to choose attributes that describe the inputs
    ‣ a flight is represented by: source, destination, airline, number of passengers, …
    ‣ a snowstorm is represented by: duration, expected inches, winds, …
    ‣ a candidate is represented by: district, political affiliation, experience, …

  9. Example: Waiting for a Table
  ‣ Design an algorithm that predicts whether a patron will wait for a table
  ‣ What are the inputs?
    ‣ the “context” of the patron’s decision
  ‣ What are the attributes of this context?
    ‣ Is the patron hungry? Is the line long?

  10. Example: Waiting for a Table
  Input attributes:
  ‣ A1: Alternatives = {Yes, No}
  ‣ A2: Bar = {Yes, No}
  ‣ A3: Fri/Sat = {Yes, No}
  ‣ A4: Hungry = {Yes, No}
  ‣ A5: Patrons = {None, Some, Full}
  ‣ A6: Price = {$, $$, $$$}
  ‣ A7: Raining = {Yes, No}
  ‣ A8: Reservation = {Yes, No}
  ‣ A9: Type = {French, Italian, Thai, Burger}
  ‣ A10: Wait = {10-30, 30-60, >60}
  ‣ Classification: {Yes, No}
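  To make the representation concrete, here is a minimal sketch (not from the slides) of how one training example from this restaurant domain could be encoded in Python; the attribute names mirror the slide, while the dict-and-label encoding itself is just an illustrative assumption.

  ```python
  # One hypothetical training example for the restaurant domain:
  # the input attributes plus its classification (will the patron wait?).
  example = {
      "Alternatives": "Yes",
      "Bar": "No",
      "Fri/Sat": "No",
      "Hungry": "Yes",
      "Patrons": "Some",
      "Price": "$",
      "Raining": "No",
      "Reservation": "Yes",
      "Type": "Thai",
      "Wait": "10-30",
  }
  label = "Yes"  # classification: the patron waited

  # Training data is then just a list of (attributes, classification) pairs.
  training_data = [(example, label)]
  ```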

  11. Training Data
  (table of training examples from S. Russell & P. Norvig, Artificial Intelligence: A Modern Approach)

  12. Supervised Learning
  ‣ Classification
    ‣ if classifications are from a finite set
    ‣ ex: spam/not spam, delayed/not delayed
  ‣ Regression
    ‣ if classifications are real numbers
    ‣ ex: temperature

  13. Outline
  ‣ Motivation
  ‣ Supervised learning
  ‣ Decision Trees
  ‣ Algorithmic Bias

  14. Decision Trees
  ‣ A decision tree maps
    ‣ inputs represented by attributes…
    ‣ …to a classification
  ‣ Examples
    ‣ snowstorm_dt(12h, 8”, strong winds) returns Yes
    ‣ flight_dt(DL, PVD, Paris, night, no_storm, …) returns No
    ‣ restaurant_dt(estimate, hungry, patrons, …) returns No
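  One common way to encode such a tree in code is as nested attribute tests with classifications at the leaves. The sketch below is illustrative only (the tree shape, attribute names, and representation are assumptions, not taken from the slides), but it matches the mapping described above: attributes in, classification out.

  ```python
  # Illustrative decision-tree representation (not from the slides):
  # an internal node is ("attribute", {value: subtree, ...}),
  # a leaf is just the classification string.
  tree = (
      "Patrons",
      {
          "None": "No",                        # leaf: don't wait
          "Some": "Yes",                       # leaf: wait
          "Full": ("Hungry", {"Yes": "Yes",    # test another attribute
                              "No": "No"}),
      },
  )

  def classify(tree, example):
      """Follow attribute tests from the root down to a leaf."""
      if isinstance(tree, str):        # leaf node: return its classification
          return tree
      attribute, children = tree
      return classify(children[example[attribute]], example)

  # e.g. classify(tree, {"Patrons": "Full", "Hungry": "Yes"}) returns "Yes"
  ```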

  15. Decision Tree Example

  16. Decision Tree Example (Activity #1, 2 min)

  17. Decision Tree Example (Activity #1, 1 min)

  18. Decision Tree Example (Activity #1, 0 min)

  19. Decision Tree Example

  20. Decision Tree Example

  21. Our Goal: Learning a Decision Tree
  (diagram: training data → Learn → decision tree)

  22. What is a Good Decision Tree?
  ‣ Consistent with training data
    ‣ classifies training examples correctly
  ‣ Performs well on future examples
    ‣ classifies future inputs correctly
  ‣ As small as possible
    ‣ efficient classification
  ‣ How can we find a small decision tree?
    ‣ there are Ω(2^(2^n)) possible decision trees (see the counting sketch below)
    ‣ so brute force is not possible
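  A rough counting sketch of where that bound comes from, assuming all n attributes are Boolean (Yes/No); this derivation is added here for context and does not appear on the slide.

  ```latex
  % n Boolean attributes give 2^n distinct inputs; a classifier assigns
  % Yes or No to each input independently, so
  \[
    \#\{\text{distinct classifiers}\} \;=\; 2^{\,2^{n}},
  \]
  % and every classifier is computed by at least one decision tree, hence
  \[
    \#\{\text{decision trees}\} \;=\; \Omega\!\bigl(2^{\,2^{n}}\bigr).
  \]
  ```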

  23. Iterative Dichotomizer 3 (ID3)
  (diagram: data → ID3 algorithm → learned decision tree; ID3 is due to Ross Quinlan)

  24. ID3
  ‣ Starting at root
    ‣ a node is either an attribute node or a classification node (leaf)
    ‣ outgoing edges are labeled with attribute values
    ‣ children are either a classification node or another attribute node
  ‣ Tree should be as small as possible

  25. ID3
  (diagram: splitting the 12 examples (6 Yes, 6 No) on Type. Each child is still mixed: French 1 Yes / 1 No, Italian 1 Yes / 1 No, Thai 2 Yes / 2 No, Burger 2 Yes / 2 No, so there is still uncertainty about whether we should wait or not.)

  26. ID3
  (diagram: splitting the same 12 examples on Patrons instead. None: 2 No and Some: 4 Yes are unmixed, so no uncertainty there; Full: 2 Yes / 4 No is a mixed subproblem, so recurse! Compare with the Type split, where every child stays mixed.)

  27. ID3
  ‣ Start at the root with the entire training data
  ‣ Choose the attribute that creates a “good split”
    ‣ an attribute “splits” the data into subsets
    ‣ good split: children with subsets that are unmixed (all with the same classification)
    ‣ bad split: children with subsets that are mixed (with different classifications)
  ‣ Children with unmixed subsets lead to a classification
  ‣ Children with mixed subsets are handled with recursion (see the sketch below)
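  A compact sketch of the recursion described above, using the nested attribute-node/leaf representation from the earlier sketch. The attribute-scoring step is left as a parameter (the next slides fill it in with information gain); names and structure are illustrative, not the course's stencil code.

  ```python
  from collections import Counter

  def id3(examples, attributes, choose_attribute):
      """Learn a decision tree from (attribute_dict, label) pairs.

      choose_attribute(examples, attributes) picks the attribute with the
      'best split', e.g. the one with the highest information gain.
      """
      labels = [label for _, label in examples]
      # Base case 1: unmixed subset -> leaf with that classification
      if len(set(labels)) == 1:
          return labels[0]
      # Base case 2: no attributes left -> leaf with the majority classification
      if not attributes:
          return Counter(labels).most_common(1)[0][0]

      best = choose_attribute(examples, attributes)
      remaining = [a for a in attributes if a != best]
      children = {}
      # Split the data on the chosen attribute; recurse on each subset
      for value in {ex[best] for ex, _ in examples}:
          subset = [(ex, label) for ex, label in examples if ex[best] == value]
          children[value] = id3(subset, remaining, choose_attribute)
      return (best, children)
  ```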

  28. ID3
  How do we distinguish “bad” attributes from “good” attributes?
  (diagram: the Patrons split yields mostly unmixed subsets, while the Type split yields only mixed subsets)

  29. ID3
  ‣ How do we decide if an attribute is good?
    ‣ Compute the entropy of each child
      ‣ quantifies how mixed/alike it is
      ‣ quantifies the amount of certainty/uncertainty
    ‣ Combine the entropies of all the children
    ‣ Compare the combined entropy of the children to the entropy of the node
      ‣ this is called the information gain (see the sketch below)
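  A minimal sketch of the standard entropy / information-gain computation the slide refers to (Shannon entropy, with children weighted by subset size). This is the usual formulation and is assumed rather than copied from the course materials; it plugs into the id3 sketch above as choose_attribute.

  ```python
  from math import log2
  from collections import Counter

  def entropy(labels):
      """Shannon entropy of a list of classifications (0 = unmixed)."""
      counts = Counter(labels)
      total = len(labels)
      return -sum((c / total) * log2(c / total) for c in counts.values())

  def information_gain(examples, attribute):
      """Entropy of the node minus the size-weighted entropy of its children."""
      labels = [label for _, label in examples]
      node_entropy = entropy(labels)
      children_entropy = 0.0
      for value in {ex[attribute] for ex, _ in examples}:
          subset = [label for ex, label in examples if ex[attribute] == value]
          children_entropy += (len(subset) / len(examples)) * entropy(subset)
      return node_entropy - children_entropy

  # ID3 then picks the attribute with the highest information gain, e.g.:
  # choose_attribute = lambda exs, attrs: max(attrs, key=lambda a: information_gain(exs, a))
  ```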
