Plausible reasoning based on qualitative entity embeddings
Steven Schockaert (joint work with Joaquin Derrac, Shoaib Jameel, Thomas Ager)
School of Computer Science & Informatics, Cardiff University, Cardiff, UK
SchockaertS1@cardiff.ac.uk
http://users.cs.cf.ac.uk/S.Schockaert
Plausible inference patterns

Similarity based reasoning
- Mary enjoys hiking in the Alps
- The Alps are similar to the Pyrenees
- Conclusion: Mary enjoys hiking in the Pyrenees

Category based induction
- BBC is regulated by Ofcom
- ITV is regulated by Ofcom
- BBC and ITV are representative examples of British broadcasters
- Conclusion: All British broadcasters are regulated by Ofcom
Interpolation
- Sandwich shops in Wales are required to display food hygiene ratings
- Restaurants in Wales are required to display food hygiene ratings
- Cafes are conceptually between sandwich shops and restaurants
- Conclusion: Cafes in Wales are required to display food hygiene ratings

A fortiori reasoning
- University staff are not permitted to travel in business class
- The underlying reason is that business class is too expensive
- First class is more expensive than business class
- Conclusion: University staff are not permitted to travel in first class
Borderline effects
- The item for sale is original art
- The item for sale is a poster
- The concepts original art and poster are disjoint
- Limited-edition art print is a borderline case of poster
- Limited-edition art print is a borderline case of original art
- Conclusion: The item for sale is a limited-edition art print
Motivation
[Figure: diagram with the labels "unstructured data", "domain theory" and "completed domain theory"]
- Interpretable machine learning models
- Ontology based data access
- Recognising textual entailment
Supporting plausible inference using entity embeddings
Representing lexical information
Key problem: the meaning of many words cannot be captured by necessary and sufficient conditions (e.g. “game”)
Similarity plays a key role in modelling meaning:
- Wittgenstein: concepts as family resemblances
- Prototype and exemplar theories
This leads to the use of geometric models of meaning:
- Neural network embeddings
- Information retrieval: probabilistic topic models
- Conceptual spaces (Gärdenfors)
Conceptual spaces
[Figure: a conceptual space containing restaurant, fine dining restaurant, bar, hotel bar, pub, gastropub, bistro, wine bar, cafe, sports bar and sandwich shop, with the labels "formal" and "focused on food"]
The same figure is shown on successive slides under the following titles:
- is-a relations
- Similarity
- Representativeness
- Conceptual betweenness
- Conceptual neighbourhood
- Interpretable directions
Inducing a conceptual space of films
- POS tagger, chunker (Open NLP project)
- PPMI weighted term co-occurrence vectors
- Classical multidimensional scaling
- ...
- 100-dimensional Euclidean space
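The PPMI weighting step can be illustrated with a minimal sketch. It assumes each film comes with a bag of terms extracted from its associated text; the function name `ppmi_vectors`, the film identifiers and the toy term lists are hypothetical and not taken from the slides. The multidimensional scaling step is not shown here.

```python
import math
from collections import Counter


def ppmi_vectors(entity_terms):
    """Map each entity to a sparse PPMI-weighted term vector.

    entity_terms: dict from entity name to the list of terms (with
    repetitions) found in the text associated with that entity.
    """
    counts = {e: Counter(terms) for e, terms in entity_terms.items()}
    entity_totals = {e: sum(c.values()) for e, c in counts.items()}
    total = sum(entity_totals.values())
    term_totals = Counter()
    for c in counts.values():
        term_totals.update(c)

    vectors = {}
    for e, c in counts.items():
        vec = {}
        for term, n in c.items():
            # PMI = log( p(e, t) / (p(e) * p(t)) ), estimated from counts
            pmi = math.log(n * total / (entity_totals[e] * term_totals[term]))
            if pmi > 0:  # "positive" PMI: keep only positive associations
                vec[term] = pmi
        vectors[e] = vec
    return vectors


# Toy data (made up for illustration only)
films = {
    "film_a": ["violent", "gangsters", "mob", "violent"],
    "film_b": ["scary", "witch", "scary", "creepy"],
    "film_c": ["violent", "scary", "epic", "battle"],
}
vectors = ppmi_vectors(films)
```

Terms that occur in a film's text no more often than chance would predict get no entry in that film's vector, which keeps the vectors sparse before dimensionality reduction.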
Betweenness
- aliens between star trek and cloverfield
- cast away between titanic and into the wild
- lord of the rings between harry potter and troy
- mission impossible between the rock and skyfall
- star wars between lord of the rings and star trek
- troy between braveheart and thor
- wall-e between monsters inc and 2001: a space odyssey
- good will hunting between dead poets society and rain man
- unbreakable between sin city and the sixth sense
- scarface between sin city and the godfather
- forrest gump between million dollar baby and stand by me
- shrek 2 between wedding crashers and the lion king
Betweenness
- abbey between castle and chapel
- bistro between restaurant and tea room
- butcher shop between marketplace and slaughterhouse
- conservatory between greenhouse and playhouse
- duplex between detached house and triplex
- flower shop between garden center and gift shop
- grocery store between convenience store and farmers market
- manor between castle and mansion house
- rice paddy between bamboo forest and cropland
- sushi restaurant between Japanese restaurant and tapas restaurant
- veterinarian between animal shelter and emergency room
- wine shop between gourmet shop and liquor store
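Triples like the ones above could be found by scoring how close an entity lies to the line segment between two others. The slides do not say which betweenness measure was used, so the following is only a sketch under one simple assumption: the score d(a,c) / (d(a,b) + d(b,c)), which equals 1 exactly when b lies on the segment from a to c. The coordinates are made up for illustration.

```python
import math


def dist(x, y):
    """Euclidean distance between two points given as sequences of floats."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def betweenness(a, b, c):
    """Score in [0, 1]; equals 1 exactly when b lies on the segment a-c."""
    denom = dist(a, b) + dist(b, c)
    if denom == 0:  # a, b and c coincide
        return 1.0
    return dist(a, c) / denom


def most_between(space, a, c):
    """Rank all other entities by how well they fit between a and c."""
    candidates = [e for e in space if e not in (a, c)]
    return sorted(
        candidates,
        key=lambda e: betweenness(space[a], space[e], space[c]),
        reverse=True,
    )


# Hypothetical 2-dimensional coordinates, not taken from the slides
space = {
    "restaurant": (0.0, 0.0),
    "tea room": (4.0, 0.0),
    "bistro": (2.0, 0.5),
    "sports bar": (2.0, 5.0),
}
ranking = most_between(space, "restaurant", "tea room")
```

With these toy coordinates, "bistro" ranks first because it lies almost on the segment between "restaurant" and "tea room", while "sports bar" lies far off it.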
Learning interpretable directions
Direction towards more “violent” films, learned from two groups:
- films whose associated text contains the word “violent”
- films whose associated text does not contain the word “violent”
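One simple way to turn the two groups of films into a direction is sketched below. The slide does not state how the direction is computed, so this sketch substitutes the normalised difference between the mean vectors of the two groups; the weight vector of a linear classifier would be another option. All names, coordinates and term sets are hypothetical.

```python
import math


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def direction(space, texts, word):
    """Unit vector pointing from films without `word` towards films with it."""
    pos = [space[e] for e in space if word in texts[e]]
    neg = [space[e] for e in space if word not in texts[e]]
    if not pos or not neg:
        raise ValueError("need at least one film with and one without the word")
    diff = [p - q for p, q in zip(mean(pos), mean(neg))]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0:
        raise ValueError("the two groups have the same mean")
    return [d / norm for d in diff]


def rank_along(space, d):
    """Order entities from least to most of the property encoded by d."""
    return sorted(space, key=lambda e: sum(x * y for x, y in zip(space[e], d)))


# Toy 2-dimensional space and term sets (made up for illustration)
space = {
    "film_a": [0.9, 0.1],
    "film_b": [0.7, 0.3],
    "film_c": [0.2, 0.8],
    "film_d": [0.1, 0.4],
}
texts = {
    "film_a": {"violent", "mob"},
    "film_b": {"violent", "battle"},
    "film_c": {"witch"},
    "film_d": {"comedy"},
}
violent_dir = direction(space, texts, "violent")
ranking = rank_along(space, violent_dir)
```

Projecting every film onto the direction gives an ordering, which is the kind of information that a fortiori style inferences rely on.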
Learning interpretable directions
The Godfather
- ADJ: italian, corrupt, immoral, unsurpassed, absolutely wonderful
- NOUNS: organised crime, the gangsters, mob, gangsters, the assassination, loyalty

Blair Witch Project
- ADJ: spooky, scary, scarier, not scary, creepy, pretty creepy
- NOUNS: the witch, scary movies, spooky, a horror flick, a horror movie, scares

Gladiator
- ADJ: epic, historically accurate, historical, lavish, magnificent
- NOUNS: epics, the battle scenes, battle scenes, an epic, the epic

Fight Club
- ADJ: insightful, provocative, disturbing, depressed, depressing
- NOUNS: conformity, society, voyeurism, our society, a dark comedy
Commonsense classifiers

                                    Foursquare      GeoNames        OpenCYC
Group                   Algorithm   Acc.    F1      Acc.    F1      Acc.    F1
interpolation           Col         0.947   0.717   0.881   0.401   0.956   0.383
interpolation           Btw A       0.949   0.717   0.883   0.395   0.956   0.373
interpolation           Btw B       0.943   0.617   0.881   0.349   0.954   0.295
analogical classifiers  Analog A    0.921   0.636   0.822   0.330   0.933   0.375
analogical classifiers  Analog B    0.940   0.707   0.853   0.347   0.945   0.382
analogical classifiers  Analog C    0.925   0.686   0.859   0.411   0.942   0.391
a fortiori inference    FOIL 0      0.926   0.564   0.876   0.201   0.950   0.267
a fortiori inference    FOIL 1      0.925   0.596   0.860   0.272   0.943   0.329
a fortiori inference    FOIL 2      0.926   0.627   0.861   0.285   0.946   0.335
a fortiori inference    FOIL 3      0.928   0.594   0.876   0.300   0.949   0.268
                        1-NN        0.939   0.710   0.853   0.357   0.945   0.380
                        C4.5 MDS    0.925   0.534   0.849   0.178   0.941   0.245
                        C4.5 dir    0.918   0.382   0.849   0.374   0.939   0.262
                        SVM MDS     0.932   0.656   0.859   0.343   0.912   0.328
                        SVM BoW     0.913   0.358   0.874   0.172   0.946   0.205
Open-domain semantic spaces
Idea: learn a semantic space representation for every entity that has a Wikipedia page
Problems:
- Each semantic space should only contain entities of the same type
- The number of dimensions of each space needs to be carefully selected
- Relations between entities of different types can provide valuable information (e.g. films directed by similar directors tend to be similar)