Using Machine Learning to Study the Neural Representations of Language Meanings
Tom M. Mitchell, Carnegie Mellon University
June 2017
How does neural activity encode word meanings? How does the brain combine word meanings into sentence meanings?
Neurosemantics Research Team
Research Scientists: Marcel Just, Tom Mitchell, Erika Laing, Kai-Min Chang, Dan Howarth
Recent/Current PhD Students: Leila Wehbe, Dan Schwartz, Alona Fyshe, Mariya Toneva, Mark Palatucci, Gustavo Sudre, Nicole Rafidi
Funding: NSF, NIH, IARPA, Keck
Functional MRI
Typical stimuli
fMRI activation for “bottle”: [Figure: the “bottle” activation; the mean activation averaged over 60 different stimuli; and “bottle” minus the mean activation. Color scale runs from high to below average.]
Classifiers are trained to decode the stimulus word (“Hammer” vs. “Bottle”) from the fMRI activation: SVM, logistic regression, deep net, Bayesian classifier, … The trained classifier acts as a virtual sensor of mental state.
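This kind of virtual sensor can be sketched with scikit-learn. Everything below (trial counts, voxel counts, the synthetic “hammer”/“bottle” data) is illustrative, not the study’s actual data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trials, n_voxels = 120, 300

# Synthetic fMRI images: "bottle" trials add a small signal to a
# subset of voxels, so the classes are separable but noisy.
labels = np.repeat([0, 1], n_trials // 2)      # 0 = hammer, 1 = bottle
images = rng.normal(size=(n_trials, n_voxels))
images[labels == 1, :30] += 0.5                # class-specific voxels

# Cross-validated logistic regression as a "virtual sensor".
clf = LogisticRegression(max_iter=1000)
scores = cross_val_score(clf, images, labels, cv=5)
print(scores.mean())                           # well above chance (0.5)
```

Chance here is 0.5; in the study, per-participant accuracies are compared against a p < 0.05 significance threshold.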
Classification task: is the person viewing a “tool” or a “building”? [Figure: classification accuracy for participants p1–p12 (chance = 0.5), with the threshold for statistical significance (p < 0.05) marked.]
Are neural representations similar across people? Can we train classifiers on one group of people, then decode from a new person?
Are representations similar across people? YES. [Figure: rank accuracy when classifying which of 60 items, using classifiers trained on other people.]
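Rank accuracy can be computed as the normalized rank of the true item in the classifier’s ordering. This helper uses the common 1-best/0-worst convention, which may differ in detail from the study’s exact formula:

```python
import numpy as np

def rank_accuracy(scores, true_index):
    """1.0 if the true item is ranked first, 0.0 if last, 0.5 at chance."""
    order = np.argsort(-np.asarray(scores))     # indices, best score first
    rank = int(np.flatnonzero(order == true_index)[0])
    return 1.0 - rank / (len(scores) - 1)

# 60 candidate items; the true item gets the 3rd-highest score.
scores = np.linspace(1.0, 0.0, 60)
print(rank_accuracy(scores, true_index=2))      # 57/59, approx. 0.966
```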
Lessons from fMRI word classification
Neural representations are similar across: people; language; word vs. picture presentation.
Easier to decode: concrete nouns; emotion nouns.
Harder to decode: abstract nouns; verbs* (*except when placed in context).
Predictive model? Given an arbitrary noun, can we predict its fMRI activity?
Predictive model [Mitchell et al., Science, 2008]: for an input noun w (e.g., “telephone”), retrieve a vector of 25 features representing the word’s meaning from statistics over a trillion-word text collection, then predict the fMRI activity of each voxel v as

  v = Σ_{i=1..25} f_i(w) · c_vi

where the coefficients c_vi are trained on fMRI data for other words.
Represent the stimulus noun by its co-occurrences with 25 verbs*:

Semantic feature values for “celery”: eat 0.8368; taste 0.3461; fill 0.3153; see 0.2430; clean 0.1145; open 0.0600; smell 0.0586; touch 0.0286; …; drive 0.0000; wear 0.0000; lift 0.0000; break 0.0000; ride 0.0000

Semantic feature values for “airplane”: ride 0.8673; see 0.2891; say 0.2851; near 0.1689; open 0.1228; hear 0.0883; run 0.0771; lift 0.0749; …; smell 0.0049; wear 0.0010; taste 0.0000; rub 0.0000; manipulate 0.0000

* in a trillion-word text collection
Predicted activation is a sum of feature contributions. For “celery”:

  predicted image = 0.84 · c_eat + 0.35 · c_taste + 0.32 · c_fill + …

where f_eat(celery) = 0.84 comes from corpus statistics and each c_i is a learned per-voxel image. Equivalently, the prediction for voxel v is v = Σ_{i=1..25} f_i(w) · c_vi, with 500,000 learned c_vi parameters in total.
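The model is linear regression from the 25 semantic features to each voxel. A minimal sketch with synthetic stand-ins (feature values, voxel count, and activations are made up; ridge regression is used here for stability):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)
n_words, n_feats, n_voxels = 58, 25, 200   # e.g., train on 58 of 60 nouns

F = rng.random(size=(n_words, n_feats))          # f_i(w): semantic features
C_true = rng.normal(size=(n_feats, n_voxels))    # hidden "true" c_vi
Y = F @ C_true + 0.1 * rng.normal(size=(n_words, n_voxels))  # fMRI images

# Learn one coefficient per (feature, voxel): v = sum_i f_i(w) * c_vi
model = Ridge(alpha=1.0).fit(F, Y)

# Predict a whole fMRI image for a new word from its features alone.
f_new = rng.random(size=(1, n_feats))
predicted_image = model.predict(f_new)           # shape (1, n_voxels)
```

With 25 features and 200 voxels this toy model has 5,000 coefficients; the 500,000 figure on the slide reflects the real voxel count.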
[Figure: predicted and observed fMRI images for “celery” and “airplane” after training on other nouns. Mitchell et al., Science, 2008]
Evaluating the computational model
• Leave two words out during training (e.g., “celery” and “airplane”), then test whether the model can tell which held-out word produced which observed image.
• 1770 test pairs in leave-2-out (all pairs of the 60 words):
  – Random guessing: 0.50 accuracy
  – Accuracy above 0.61 is significant (p < 0.05)
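The leave-2-out score asks whether the two predicted images match the two held-out observed images better in the correct pairing than in the swapped one. This sketch scores the match with cosine similarity over all voxels of synthetic images (the paper restricts scoring to a subset of stable voxels):

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def leave_two_out_correct(pred1, pred2, obs1, obs2):
    """True if the correct pairing scores higher than the swapped pairing."""
    correct = cosine(pred1, obs1) + cosine(pred2, obs2)
    swapped = cosine(pred1, obs2) + cosine(pred2, obs1)
    return correct > swapped

rng = np.random.default_rng(2)
obs1, obs2 = rng.normal(size=300), rng.normal(size=300)
pred1 = obs1 + 0.5 * rng.normal(size=300)   # noisy but informative predictions
pred2 = obs2 + 0.5 * rng.normal(size=300)
print(leave_two_out_correct(pred1, pred2, obs1, obs2))
```

Averaged over all C(60, 2) = 1770 pairs, random guessing gives 0.50.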
Learned activations associated with meaning components (participant P1):
• Semantic feature “eat” → “gustatory cortex”: pars opercularis (z = 24 mm)
• Semantic feature “push” → “somato-sensory cortex”: postcentral gyrus (z = 30 mm)
• Semantic feature “run” → “biological motion”: superior temporal sulcus (posterior) (z = 12 mm)
Alternative semantic feature sets
Predefined corpus features → mean accuracy:
• 25 verb co-occurrences: .79
• 486 verb co-occurrences: .79
• 50,000 word co-occurrences: .76
• 300 Latent Semantic Analysis features: .73
• 50 corpus features from Collobert & Weston, ICML 2008: .78
Alternative semantic feature sets
Predefined features → mean accuracy:
• 25 verb co-occurrences: .79
• 486 verb co-occurrences: .79
• 50,000 word co-occurrences: .76
• 300 Latent Semantic Analysis features: .73
• 50 corpus features from Collobert & Weston, ICML 2008: .78
• 218 features collected using Mechanical Turk: .83

The 218 question features were authored by Dean Pomerleau; feature values range from 1 to 5, and each feature was collected from at least three people via Amazon’s Mechanical Turk. Example questions: Is it heavy? Is it flat? Is it curved? Is it colorful? Is it hollow? Is it smooth? Is it fast? Is it bigger than a car? Is it usually outside? Does it have corners? Does it have moving parts? Does it have seeds? Can it break? Can it swim? Can it change shape? Can you sit on it? Can you pick it up? Could you fit inside of it? Does it roll? Does it use electricity? Does it make a sound? Does it have a backbone? Does it have roots? Do you love it? …
Alternative semantic feature sets
Predefined features → mean accuracy:
• 25 verb co-occurrences: .79
• 486 verb co-occurrences: .79
• 50,000 word co-occurrences: .76
• 300 Latent Semantic Analysis features: .73
• 50 corpus features from Collobert & Weston, ICML 2008: .78
• 218 features collected using Mechanical Turk*: .83
• 20 features discovered from the data**: .86
* developed by Dean Pomerleau  ** developed by Indra Rustandi
Discovering a shared semantic basis [Rustandi et al., 2009]
1. Use CCA to discover latent features shared across subjects. Each study/subject (subjects 1–9: word + picture; subjects 10–20: word only) has its own CCA abstraction mapping that subject’s fMRI image x to the 20 learned latent features f(w): f_k(w) = Σ_v x_v · c_vk.
[Figure: each column is one fMRI image. Slide courtesy of Indra Rustandi]
Discovering a shared semantic basis [Rustandi et al., 2009]
1. Use CCA to discover latent features: for each study/subject, f_k(w) = Σ_v x_v · c_vk maps that subject’s fMRI image to the 20 learned latent features.
2. Train a regression, independent of study/subject, to predict the latent features from the 218 MTurk features b(w) of word w: f_i(w) = Σ_k b_k(w) · c_ik.
Discovering a shared semantic basis [Rustandi et al., 2009]
1. Use CCA to discover latent features.
2. Train a regression to predict them from the 218 MTurk features of word w, independent of study/subject: f_i(w) = Σ_k b_k(w) · c_ik.
3. Invert the CCA mapping to predict each study/subject’s fMRI representation: v = Σ_i f_i(w) · c_vi for every voxel v.
CCA components: the stimulus words that most activate each component
• Component 1: apartment, church, closet, house, barn (things that shelter?)
• Component 2: screwdriver, pliers, refrigerator, knife, hammer (manipulation?)
• Component 3: telephone, butterfly, bicycle, beetle, dog
• Component 4: pants, dress, glass, coat, chair (things that touch my body?)
Timing?
MEG: Stimulus “hand” (word plus line drawing) [Sudre et al., NeuroImage 2012]
Decodable features over time after stimulus onset (0–800 ms) [Sudre et al., NeuroImage 2012]:
• 50–100 ms: word length
• 100 ms: word length; right diagonalness; verticality
• 150 ms: aspect ratio; word length; internal details
• 200 ms: aspect ratio; internal details; IS IT HAIRY?
• 250 ms: white pixel count; horizontalness; IS IT HOLLOW?; IS IT MADE OF WOOD?; IS IT HAIRY?; IS IT AN ANIMAL?
• 300 ms: WAS IT EVER ALIVE?; IS IT MAN-MADE?; DOES IT GROW?; IS IT ALIVE?; CAN IT BITE OR STING?; CAN YOU PICK IT UP?; CAN YOU HOLD IT?; IS IT BIGGER THAN A CAR?
• 350 ms: IS IT MAN-MADE?; COULD YOU FIT INSIDE IT?; WAS IT EVER ALIVE?; DOES IT HAVE FOUR LEGS?; CAN YOU PICK IT UP?; CAN YOU HOLD IT?; CAN YOU HOLD IT IN ONE HAND?; IS IT ALIVE?; CAN IT BEND?