
Machine Learning 10-701, Tom M. Mitchell, Machine Learning Department



  1. Machine Learning 10-701
     Tom M. Mitchell, Machine Learning Department, Carnegie Mellon University
     April 5, 2011

     Today:
     • Kernels
     • Latent Dirichlet Allocation topic models
     • optional: Social network analysis based on latent probabilistic models
     • Kernel regression

     Readings:
     • Bishop Ch. 6.1
     • Bishop Ch. 6.2, 6.3
     • "Kernel Methods for Pattern Analysis", Shawe-Taylor & Cristianini, Chapter 2

     Supervised Dimensionality Reduction

  2. Supervised Dimensionality Reduction
     • Neural nets: learn a hidden layer representation, designed to optimize network prediction accuracy
     • PCA: unsupervised, minimizes reconstruction error
       – but sometimes people use PCA to re-represent the original data before classification (to reduce dimension, to reduce overfitting); see the sketch after this slide
     • Fisher Linear Discriminant
       – like PCA, learns a linear projection of the data
       – but supervised: it uses labels to choose the projection

     Fisher Linear Discriminant
     • A method for projecting data into a lower dimension, to hopefully improve classification
     • We'll consider the 2-class case

     Project data onto the vector that connects the class means?
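As a concrete illustration of the PCA re-representation mentioned in the bullet above, here is a minimal numpy sketch (not from the slides); the data matrix X, the number of components k, and the random data are illustrative assumptions.

```python
import numpy as np

def pca_project(X, k):
    """Project rows of X onto the top-k principal components (unsupervised)."""
    Xc = X - X.mean(axis=0)                  # center the data
    # SVD of the centered data: rows of Vt are the principal directions
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                     # lower-dimensional re-representation

# e.g. reduce 100-dimensional points to 10 dimensions before training a classifier
X = np.random.randn(500, 100)                # illustrative data
Z = pca_project(X, k=10)
print(Z.shape)                               # (500, 10)
```

The classifier would then be trained on Z instead of X; the projection itself never looks at the labels, which is exactly the contrast with the Fisher Linear Discriminant below.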

  3. Fisher Linear Discriminant
     Project data onto one dimension, to help classification.

     Define class means:
       m_1 = (1/N_1) Σ_{n ∈ C_1} x_n,   m_2 = (1/N_2) Σ_{n ∈ C_2} x_n

     Could choose w according to:
       maximize w^T (m_2 − m_1),  i.e.  w ∝ (m_2 − m_1)

     Instead, Fisher Linear Discriminant chooses:
       maximize J(w) = (w^T (m_2 − m_1))^2 / (w^T S_W w),  giving  w ∝ S_W^{-1} (m_2 − m_1),
       where S_W = Σ_{n ∈ C_1} (x_n − m_1)(x_n − m_1)^T + Σ_{n ∈ C_2} (x_n − m_2)(x_n − m_2)^T
       is the within-class scatter matrix.

     Summary: Fisher Linear Discriminant
     • Choose an (n−1)-dimensional projection for an n-class classification problem
     • Use within-class covariances to determine the projection
     • Optimizes a different criterion: it maximizes the projected between-class separation relative to the projected within-class variances
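The two-class Fisher direction above can be computed directly. A minimal sketch, assuming numpy; the toy dataset and labels are illustrative, not from the lecture.

```python
import numpy as np

def fisher_direction(X, y):
    """Two-class Fisher Linear Discriminant: w ∝ S_W^{-1} (m_2 - m_1)."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # within-class scatter matrix
    S_w = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    w = np.linalg.solve(S_w, m1 - m0)
    return w / np.linalg.norm(w)

# illustrative two-class data in 2-D
X = np.vstack([np.random.randn(50, 2) + [0, 0],
               np.random.randn(50, 2) + [2, 1]])
y = np.array([0] * 50 + [1] * 50)

w = fisher_direction(X, y)
projection = X @ w          # one-dimensional representation used for classification
```

Projecting onto w rather than onto (m_2 − m_1) is what accounts for the within-class covariance in the criterion above.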

  4. Example topics induced from a large collection of text
     [Table of example topics: each column lists the highest-probability words of one learned topic.
      The eight columns correspond roughly to jobs/careers (JOB, WORK, JOBS, CAREER, ...), science
      (SCIENCE, STUDY, SCIENTISTS, ...), ball games (BALL, GAME, TEAM, FOOTBALL, ...), stories
      (STORY, STORIES, TELL, CHARACTER, ...), magnetic fields (FIELD, MAGNETIC, MAGNET, WIRE, ...),
      disease (DISEASE, BACTERIA, DISEASES, GERMS, ...), the mind (MIND, DREAM, DREAMS, THOUGHT, ...),
      and water/sea life (WATER, FISH, SEA, SWIM, ...).]
     [Tennenbaum et al]

     What about Probabilistic Approaches? Supervised? Unsupervised?

  5. Example topics induced from a large collection of text
     [Same table of example topics as on the previous slide.] [Tennenbaum et al]

     Plate Notation

  6. Latent Dirichlet Allocation Model
     Clustering words into topics with Latent Dirichlet Allocation [Blei, Ng, Jordan 2003]

     Probabilistic model for a document set:
     For each of the D documents:
     1. Pick θ_d ~ P(θ | α) to define P(z | θ_d)
     2. For each of the N_d words w_n:
        • Pick topic z_n ~ P(z | θ_d)
        • Pick word w_n ~ P(w | z_n, φ)
     (A toy sketch of this generative process follows the slide.)

     Training this model defines the topics (i.e., φ, which defines P(W | Z)).

     Also extended to the case where the number of topics is not known in advance
     (hierarchical Dirichlet processes – [Blei et al, 2004]).
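A toy numpy sketch of the generative process on this slide, assuming symmetric Dirichlet priors over θ and φ and Poisson-distributed document lengths; the vocabulary size V, topic count K, and hyperparameters α, β are illustrative choices, not values from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

V, K, D = 1000, 5, 3           # vocabulary size, number of topics, number of documents
alpha, beta = 0.1, 0.01        # Dirichlet hyperparameters (illustrative)

# phi[k] = P(w | z = k): one word distribution per topic
phi = rng.dirichlet(beta * np.ones(V), size=K)

documents = []
for d in range(D):
    theta_d = rng.dirichlet(alpha * np.ones(K))   # 1. pick theta_d ~ P(theta | alpha)
    N_d = rng.poisson(50)                         # document length (illustrative, not part of LDA proper)
    words = []
    for _ in range(N_d):
        z_n = rng.choice(K, p=theta_d)            # 2a. pick topic z_n ~ P(z | theta_d)
        w_n = rng.choice(V, p=phi[z_n])           # 2b. pick word w_n ~ P(w | z_n, phi)
        words.append(w_n)
    documents.append(words)
```

Training runs this process in reverse: given only the observed words, inference recovers φ (the topics) and a θ_d for each document.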

  7. Example topics induced from a large collection of text
     [Same table of example topics as above.] [Tennenbaum et al]

     Significance:
     • Learned topics reveal implicit semantic categories of words within the documents
     • In many cases, we can represent documents with 10^2 topics instead of 10^5 words
     • Especially important for short documents (e.g., emails). Topics overlap when words don't!

     Analyzing topic distributions in email

  8. Author-Recipient-Topic model for Email
     Latent Dirichlet Allocation (LDA) [Blei, Ng, Jordan, 2003]
     Author-Recipient-Topic (ART) [McCallum, Corrada, Wang, 2005]

     Enron Email Corpus
     • 250k email messages
     • 23k people

     Example message:
       Date: Wed, 11 Apr 2001 06:56:00 -0700 (PDT)
       From: debra.perlingiere@enron.com
       To: steve.hooser@enron.com
       Subject: Enron/TransAltaContract dated Jan 1, 2001
       Please see below. Katalin Kiss of TransAlta has requested an electronic copy of our final
       draft? Are you OK with this? If so, the only version I have is the original draft without
       revisions. DP
       Debra Perlingiere, Enron North America Corp., Legal Department,
       1400 Smith Street, EB 3885, Houston, Texas 77002, dperlin@enron.com

  9. Topics, and prominent senders/receivers discovered by ART [McCallum et al, 2005]
     (For each topic: top words within the topic, and top author-recipient pairs exhibiting this topic.)

     Topics, and prominent senders/receivers discovered by ART
     Beck = "Chief Operations Officer"
     Dasovich = "Government Relations Executive"
     Shapiro = "Vice President of Regulatory Affairs"
     Steffes = "Vice President of Government Affairs"

 10. Discovering Role Similarity
     connection strength (A, B) =
       Traditional SNA: similarity in the recipients they sent email to
       ART: similarity in authored topics, conditioned on recipient

     Discovering Role Similarity: Tracy Geaconne ⇔ Dan McCarty
       Traditional SNA: Similar (send email to the same individuals)
       ART: Different (discuss different topics)
     Geaconne = "Secretary"
     McCarty = "Vice President"
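A hedged sketch of the two connection-strength notions contrasted above: traditional SNA compares whom two people send email to, while ART compares the topic distributions of the messages they author (simplified here by ignoring the conditioning on recipient). Cosine similarity and all the toy vectors are illustrative assumptions, not details from the papers.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two non-negative vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def sna_similarity(recipients_a, recipients_b):
    """Traditional SNA: compare recipient-count vectors of A and B."""
    return cosine(recipients_a, recipients_b)

def art_similarity(topics_a, topics_b):
    """ART-style: compare topic distributions of messages A and B authored."""
    return cosine(topics_a, topics_b)

# toy example: 4 possible recipients, 3 topics (illustrative numbers)
a_recipients = np.array([5.0, 0.0, 2.0, 0.0])
b_recipients = np.array([4.0, 1.0, 3.0, 0.0])
a_topics = np.array([0.7, 0.2, 0.1])
b_topics = np.array([0.1, 0.3, 0.6])

print(sna_similarity(a_recipients, b_recipients))  # high: same correspondents
print(art_similarity(a_topics, b_topics))          # low: different topics
```

This is the Geaconne/McCarty pattern on the slide: similar under SNA, different under ART.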

 11. Discovering Role Similarity: Lynn Blair ⇔ Kimberly Watson
       Traditional SNA: Different (send to different individuals)
       ART: Similar (discuss the same topics)
     Blair = "Gas pipeline logistics"
     Watson = "Pipeline facilities planning"

     What you should know
     • Unsupervised dimension reduction using all features
       – Principal Components Analysis
         • Minimize reconstruction error
       – Singular Value Decomposition
         • Efficient PCA
       – Independent components analysis
       – Canonical correlation analysis
       – Probabilistic models with latent variables
     • Supervised dimension reduction
       – Fisher Linear Discriminant
         • Project to n−1 dimensions to discriminate n classes
       – Hidden layers of Neural Networks
         • Most flexible, local minima issues
     • LOTS of ways of combining discovery of latent features with classification tasks

 12. Kernel Functions
     • Kernel functions provide a way to manipulate data as though it were projected into a higher
       dimensional space, by operating on it in its original space
     • This leads to efficient algorithms
     • And they are a key component of algorithms such as
       – Support Vector Machines
       – kernel PCA
       – kernel CCA
       – kernel regression
       – …

     Linear Regression
     Wish to learn f: X → Y, where X = <X_1, …, X_n>, Y real-valued.
     Learn f(x) = w^T x = Σ_j w_j x_j, where w is chosen to minimize the squared training error
     Σ_l (f(x^l) − y^l)^2.
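To make the kernel idea concrete, here is a minimal sketch of a kernelized (ridge-regularized) version of the linear regression set up above, operating only through inner products k(x, x′) rather than explicit high-dimensional features; the Gaussian kernel, the regularizer lam, and the toy data are illustrative assumptions, not something given on the slide.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix: k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

def kernel_ridge_fit(X, y, lam=1e-2, gamma=1.0):
    """Dual solution alpha = (K + lam*I)^{-1} y; explicit features are never formed."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def kernel_ridge_predict(X_train, alpha, X_new, gamma=1.0):
    """f(x) = sum_l alpha_l * k(x_l, x)."""
    return rbf_kernel(X_new, X_train, gamma) @ alpha

# 1-D toy regression problem (illustrative)
X = np.linspace(0, 3, 40)[:, None]
y = np.sin(2 * X[:, 0]) + 0.1 * np.random.randn(40)

alpha = kernel_ridge_fit(X, y)
y_hat = kernel_ridge_predict(X, alpha, X)
```

Swapping rbf_kernel for a plain inner product recovers (regularized) linear regression, which is the sense in which kernels generalize the linear model while still only touching the data through pairwise inner products.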
