Policies Themes Plans Introductions
Course Policies & Themes
CS 795/895 Machine Learning
Steven J Zeil
Old Dominion Univ.
Fall 2010

Outline
1 Policies
2 Themes
    What is Machine Learning?
    Major Machine Learning Problems
    How do Machines Learn?
3 Plans
    Projects
    Problem Sets
4 Introductions

Where & When
- Meets: Monday & Wednesday 9:30-10:45
- Website: http://www.cs.odu.edu/~zeil/cs795ML.html
- Most of course content is on Blackboard
  - Includes wiki & discussion board (forum)
- Syllabus: on the website
  - All students are responsible for reading the syllabus.

Pre-requisites
- Graduate standing
- Programming skills
- Mathematics: probability and statistics, linear algebra

Academic Honesty
Everything turned in for grading in this course must be your own work.
In the term project, normal professional standards regarding quotation and citation will be strictly enforced.
- If you use someone else’s thoughts, conclusions, or ideas, you must cite them.
- If you use someone else’s words, you must quote and cite them.

Grading
- Assignments: 15%
- Term Project: 60%
  - experiment (20%)
  - paper (20%)
  - presentation (20%)
- Midterm exam: 10%
- Final exam: 15%

What is Machine Learning?
Programming computers to find approximate solutions based on sample or past data

Sample Problems for Machine Learning
- computer vision, speech recognition
- data mining
- named entity recognition

Machine Learning Draws From...
- A.I.
- Numerical Analysis (approximation theory)
- Information Retrieval (clustering)
- Statistics

We Turn to Machine Learning when...
- we have lots of data
  - with lots of relevant(?) features
  - that interact in ways only partially understood
- direct algorithmic approaches are ineffective

We Turn to Machine Learning when... (personal observation)
- you require 90% accuracy of some function
- you have a complicated algorithm drowning in nested ifs, exceptions, and special cases
  - that only gets you 75% accuracy
  - successive rewrites get you a different 75% accuracy

Major Machine Learning Problems
- Classification: Given a set of data $X$ and a set of classes $C$, determine to which $C_i$ a given $X_j$ belongs, or the posterior probability $P(C_i \mid X_j)$ for all $i, j$.
- Regression: Given a set of data pairs $(X_i, r_i)$ viewed as a sample from an unknown function $f$, estimate the value of $f(X')$.

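Both problem statements can be made concrete with a toy example. The sketch below is only an illustration under stated assumptions: the data, the function names `class_posteriors` and `fit_line`, and the choice of techniques (relative-frequency counting for the posterior, a least-squares straight line for the regression) are not taken from the course; they were picked because they fit in a few lines.

```python
from collections import Counter

def class_posteriors(examples, x):
    """Estimate P(C_i | X = x) from labeled (x, class) pairs by relative frequency."""
    counts = Counter(c for xi, c in examples if xi == x)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("x was never observed; no frequency estimate available")
    return {c: n / total for c, n in counts.items()}

def fit_line(pairs):
    """Least-squares fit of r = a*x + b to (x_i, r_i) pairs; returns (a, b)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_r = sum(r for _, r in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxr = sum((x - mean_x) * (r - mean_r) for x, r in pairs)
    a = sxr / sxx
    b = mean_r - a * mean_x
    return a, b

# Classification: hypothetical one-feature observations, each with a known class.
labeled = [("sunny", "play"), ("sunny", "play"), ("sunny", "stay"), ("rainy", "stay")]
post = class_posteriors(labeled, "sunny")   # estimated P(class | "sunny")

# Regression: hypothetical samples (X_i, r_i) of an unknown f, then an estimate of f(4.0).
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
a, b = fit_line(samples)
estimate = a * 4.0 + b

print(post)
print(estimate)
```

Counting only works when the same feature value recurs in the data; with real feature vectors a model that generalizes to unseen inputs is needed, which is presumably where the rest of the course comes in.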
The Data
- The $X_i$ in these problems are seldom simple quantities.
- More often, each $X_i$ is a vector of features (a.k.a. attributes).
- Each $X_i$ is called an observation, example, or instance.
- The number of features is often quite large,
  - and it is often unclear whether all of them are relevant.

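As a small illustration of this vocabulary, the sketch below stores each observation as a fixed-length feature vector and flags features that never vary. The feature names and values are invented for the example, and a zero-variance check is only the crudest possible relevance test (a constant feature cannot help distinguish observations); it is not a method the slides prescribe.

```python
# Each observation X_i is a vector with one entry per feature (attribute).
feature_names = ["height_cm", "weight_kg", "units_flag"]   # hypothetical features
observations = [
    [170.0, 65.0, 1.0],
    [182.0, 80.0, 1.0],
    [158.0, 52.0, 1.0],
]

def column(data, j):
    """Return the values of feature j across all observations."""
    return [row[j] for row in data]

def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

# A feature that takes the same value in every observation carries no information.
constant_features = [
    name
    for j, name in enumerate(feature_names)
    if variance(column(observations, j)) == 0.0
]

print(constant_features)
```

Features that do vary may still be irrelevant to the class or regression value, so a check like this can only rule features out, never confirm that they matter.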
Secondary Problems in Machine Learning
- dimensionality reduction
- scaling & pre-conditioning

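One common form of scaling is standardization, which rescales every feature to zero mean and unit standard deviation so that features measured in large units do not dominate distance computations. The sketch below assumes that form; the slides do not say which scaling the course uses, and the function name `standardize` and the data are illustrative.

```python
import math

def standardize(data):
    """Rescale each feature (column) to zero mean and unit standard deviation."""
    n = len(data)
    dims = len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(dims)]
    stds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in data) / n)
        for j in range(dims)
    ]
    # A constant feature has zero spread; map it to 0.0 instead of dividing by zero.
    return [
        [(row[j] - means[j]) / stds[j] if stds[j] > 0 else 0.0 for j in range(dims)]
        for row in data
    ]

# Two hypothetical features carrying the same pattern on very different scales.
raw = [[1.0, 1000.0], [2.0, 2000.0], [3.0, 3000.0]]
scaled = standardize(raw)

for row in scaled:
    print(row)
```

After scaling, the two columns are numerically the same, which shows that the original factor-of-1000 difference was purely a matter of units.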
How do Machines Learn?
- Supervised Learning: Start with a training set of examples for which the class/regression value is known. “Learn” how to classify/regress arbitrary inputs.
- Unsupervised Learning: “Learn” from the same data set that we want to classify/regress.
- Reinforcement Learning: “Learn” policies for generating sequences of outputs from evaluations of the sequence
  - e.g., game playing
  - not covered in this course

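The difference between the first two modes shows up in what the learner is given. In the sketch below, the supervised routine receives labeled training pairs, while the unsupervised routine receives only the unlabeled points it has to organize. Both techniques (1-nearest-neighbor and a two-cluster k-means on 1-D data) and all names and data are assumptions chosen for brevity, not methods named on the slide.

```python
def nearest_neighbor(train, x):
    """Supervised: predict the label of the closest labeled training point."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]

def two_means(points, iterations=10):
    """Unsupervised: find two cluster centers for unlabeled 1-D points (k-means, k=2)."""
    lo, hi = min(points), max(points)          # initial guesses for the two centers
    for _ in range(iterations):
        near_lo = [p for p in points if abs(p - lo) <= abs(p - hi)]
        near_hi = [p for p in points if abs(p - lo) > abs(p - hi)]
        # Move each center to the mean of the points currently assigned to it.
        if near_lo:
            lo = sum(near_lo) / len(near_lo)
        if near_hi:
            hi = sum(near_hi) / len(near_hi)
    return lo, hi

# Supervised: the class of each training example is known in advance.
train = [(1.0, "small"), (2.0, "small"), (9.0, "large"), (10.0, "large")]
predicted = nearest_neighbor(train, 8.0)

# Unsupervised: no labels; structure is inferred from the data being analyzed.
unlabeled = [1.0, 2.0, 9.0, 10.0, 1.5, 9.5]
centers = two_means(unlabeled)

print(predicted)
print(centers)
```

The unsupervised routine discovers two groups but cannot name them; attaching meaning such as "small" or "large" to a cluster still requires outside knowledge.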
Projects
- Term project: experiment evaluating or comparing the effectiveness of different ML techniques
  - problem sets provided by the instructor or other Dept-related research
- Prepare a paper in the style of an ACM/IEEE conference submission
- Present paper to class

Time Frame
- Try to run through topics in 1st 2/3 of semester
- Allow some weeks without lectures for working on project
- Presentations in last week of class & exam week

Problem Sets
- Assign subject taxonomy keywords to article-length documents
- Named-Entity Recognition: given paragraphs pulled from cover page(s) of a document, extract personal (author) names
- Text reconstruction: given positions of (blocks of) characters on a page (from OCR or PDF document), group the characters into words/tokens.

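To give a feel for the text reconstruction problem, here is a hand-written baseline that groups the characters of a single text line into words by looking at the horizontal gap between neighbors. Everything in it is an assumption made for illustration: the input format `(character, x_left, width)`, the function name `group_into_words`, and the fixed `gap_threshold`. The actual problem set data may look quite different, and a learned approach would presumably replace the fixed threshold with something estimated from examples.

```python
def group_into_words(chars, gap_threshold):
    """Group characters on one text line into words.

    chars: list of (character, x_left, width) tuples (hypothetical format).
    A new word starts whenever the horizontal gap to the previous
    character exceeds gap_threshold.
    """
    words = []
    current = ""
    prev_right = None
    for ch, x_left, width in sorted(chars, key=lambda c: c[1]):
        if prev_right is not None and x_left - prev_right > gap_threshold:
            words.append(current)
            current = ""
        current += ch
        prev_right = x_left + width
    if current:
        words.append(current)
    return words

# Four characters: small gaps inside words, one large gap between them.
chars = [("t", 0.0, 5.0), ("o", 5.5, 5.0), ("b", 14.0, 5.0), ("e", 19.5, 5.0)]
words = group_into_words(chars, gap_threshold=2.0)

print(words)
```

A single threshold breaks down as soon as font size, kerning, or justification varies across a page, which is the kind of special-case explosion that motivates treating this as a learning problem.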
Introductions
Please say a few words, indicating
- name
- time in Dept
- current research/academic status
- background level in statistics
- why you signed up for this course