Measuring Semantic Coherence of a Conversation
Svitlana Vakulenko, Maarten de Rijke, Michael Cochez, Vadim Savenkov, Axel Polleres

Semantic Coherence?
I think Monterey is a great conference location!
Oh yes, it has Florida’s most beautiful coastline…
…I am looking forward to see the Eiffel Tower!
Svitlana Vakulenko et al. Measuring Semantic Coherence of a Conversation. ISWC2018 @svakulenk0

Semantic Coherence ~ Contextual Glue
I think Monterey is a great conference location!
Oh yes, it has Florida’s most beautiful coastline…
…I am looking forward to see the Eiffel Tower!
[Figure annotations: locatedIn California; locatedIn Aquarium]

Semantic Coherence IsA Classification Task
[Figure labels: Coherence score; Sense-making line; Background Knowledge; Nonsense]

Applications
▪ Conversational analysis
  ▪ reconstructing dialogs in a public chat
  ▪ detecting topic shifts for segmentation
▪ Conversational agents
  ▪ interpreting context
  ▪ generating response

Contributions
1. Task of measuring semantic coherence of a conversation
2. Benchmark for the semantic coherence task
3. Approaches and their evaluation:
  3.1. Subgraph induction approach
  3.2. Graph embeddings approach
  3.3. Word embeddings approach

Benchmark
▪ Conversational dataset
  ▪ Ubuntu Dialogue Corpus (IRC logs) ~2M
▪ Knowledge representation models
  ▪ KGs: DBpedia+Wikidata HDT
  ▪ KG embeddings: RDF2Vec, KGlove
  ▪ Word embeddings: Word2Vec, Glove
▪ Entity Linking: DBpedia Spotlight API

Subgraph induction: Top-k Shortest Paths
mdg: gksudo gedit /etc/apt/source.list (type from command line)
crunchbang666: the text editor has opened the file source.list but there is no content
i typed source instead of sources ... ok so i have it open
mdg: see the line # deb http://gb.archive. ubuntu
all you have to do is delete the "#" character
crunchbang666: just the deb or the deb-src line too?
[Figure: graph with nodes w1 to w5, u1 to u4, p1 and p2 (crunchbang666, mdg), c1 to c4 and c*; entities dbr:Gedit, dbr:GNOME, dbr:Text editor, dbr:Deb(file format), dbr:Ubuntu(OS); edge labels genre, wikiPageWikiLink]

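The idea of connecting the entities mentioned in a dialogue through their top-k shortest paths can be sketched on a toy graph. This is only an illustration under stated assumptions: the edges in TOY_EDGES are invented for the example (they are not taken from DBpedia or from the slide's figure), the graph is treated as undirected, and the function names and the max_hops cut-off are hypothetical.

```python
from collections import deque

# Toy undirected graph over the entities named on the slide.
# The edges are illustrative assumptions, not real DBpedia triples.
TOY_EDGES = [
    ("dbr:Gedit", "dbr:Text editor"),
    ("dbr:Gedit", "dbr:GNOME"),
    ("dbr:GNOME", "dbr:Ubuntu(OS)"),
    ("dbr:Text editor", "dbr:Ubuntu(OS)"),
    ("dbr:Deb(file format)", "dbr:Ubuntu(OS)"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (node, node) pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def top_k_shortest_paths(adjacency, source, target, k, max_hops=4):
    """Return up to k simple paths from source to target, shortest first.

    Breadth-first search over partial paths: the queue is FIFO, so paths
    are completed in order of non-decreasing hop count.
    """
    paths = []
    queue = deque([[source]])
    while queue and len(paths) < k:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            paths.append(path)
            continue
        if len(path) - 1 >= max_hops:
            continue  # do not extend paths beyond the hop limit
        for neighbour in sorted(adjacency.get(node, ())):
            if neighbour not in path:  # keep paths simple (no revisits)
                queue.append(path + [neighbour])
    return paths


adjacency = build_adjacency(TOY_EDGES)
for path in top_k_shortest_paths(adjacency, "dbr:Gedit", "dbr:Deb(file format)", k=2):
    print(len(path) - 1, "hops:", " -> ".join(path))
```

The union of the returned paths over all entity pairs in a dialogue would form the induced subgraph. On a full knowledge graph this enumeration grows quickly with the hop limit, so a real implementation would need a bounded search.
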
Benchmark: Incoherent Dialogues
1. Vocabulary sampling
  1.1. Random uniform
  1.2. Vocabulary distribution
2. Dialogue permutations
  2.1. Sequence disorder
  2.2. Horizontal split
  2.3. Vertical split
[Figure label: Nonsense]

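The five ways of producing incoherent dialogues listed above can be sketched as small functions. The slide only names the strategies, so the details here are assumptions: in particular, the horizontal split is read as "first half of one dialogue followed by the second half of another" and the vertical split as "one participant's turns from one dialogue interleaved with another participant's turns from a different dialogue". All function names are hypothetical.

```python
import random
from collections import Counter


def sample_uniform(vocabulary, length, rng):
    """1.1 Random uniform: every distinct vocabulary item is equally likely."""
    vocab = sorted(set(vocabulary))
    return [rng.choice(vocab) for _ in range(length)]


def sample_by_distribution(corpus_tokens, length, rng):
    """1.2 Vocabulary distribution: draw items in proportion to corpus frequency."""
    counts = Counter(corpus_tokens)
    items = sorted(counts)
    weights = [counts[item] for item in items]
    return rng.choices(items, weights=weights, k=length)


def sequence_disorder(dialogue, rng):
    """2.1 Sequence disorder: same items, shuffled order."""
    shuffled = list(dialogue)
    rng.shuffle(shuffled)
    return shuffled


def horizontal_split(dialogue_a, dialogue_b):
    """2.2 Horizontal split (assumed reading): first half of A, second half of B."""
    return dialogue_a[: len(dialogue_a) // 2] + dialogue_b[len(dialogue_b) // 2:]


def vertical_split(turns_a, turns_b, speaker_a, speaker_b):
    """2.3 Vertical split (assumed reading): speaker_a's turns from dialogue A
    interleaved with speaker_b's turns from dialogue B.

    Each turn is a (speaker, item) pair.
    """
    side_a = [turn for turn in turns_a if turn[0] == speaker_a]
    side_b = [turn for turn in turns_b if turn[0] == speaker_b]
    mixed = []
    for i in range(max(len(side_a), len(side_b))):
        mixed.extend(side_a[i:i + 1])
        mixed.extend(side_b[i:i + 1])
    return mixed


rng = random.Random(42)  # fixed seed so the negative samples are reproducible
print(sequence_disorder(["dbr:Gedit", "dbr:GNOME", "dbr:Ubuntu(OS)"], rng))
print(horizontal_split([1, 2, 3, 4], [5, 6, 7, 8]))
```

Sampling destroys the original vocabulary of a dialogue, while the permutations keep real dialogue material and only disturb its arrangement.
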
Subgraph Induction: Performance Bottleneck
[Figure axes: % entities; min #hops]

Embeddings Classification Approach
▪ Convolutional Neural Network (CNN)
▪ Input: sequence of words/entities
▪ Output: coherence score [0;1]
▪ Layers: Input, Embeddings, Convolutional, Max pool, Hidden, Output (e.g. 0.8)
▪ Activations: ReLU, ReLU, Sigmoid
▪ Convolution: 250 filters, size 3, step 1
▪ Pre-trained embeddings
  ▪ Entities: RDF2Vec, KGlove
  ▪ Words: Word2Vec, Glove

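To make the layer stack concrete, here is a minimal forward-pass sketch in plain Python. It is not the authors' implementation: the weights are random rather than trained, biases are omitted, max pooling is assumed to be global over positions, ReLU is assumed to follow the convolution and the hidden layer, and the hidden layer width (hidden_units) is a hypothetical value that the slide does not give. The class name TinyCoherenceCNN is also hypothetical.

```python
import math
import random


def relu(x):
    return x if x > 0.0 else 0.0


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class TinyCoherenceCNN:
    """Embeddings -> Conv1D (ReLU) -> global max pool -> hidden (ReLU) -> sigmoid."""

    def __init__(self, embedding_dim, n_filters=250, kernel_size=3,
                 hidden_units=32, seed=0):
        rng = random.Random(seed)
        self.kernel_size = kernel_size
        # Each filter spans kernel_size consecutive embeddings, flattened.
        self.filters = random_matrix(n_filters, kernel_size * embedding_dim, rng)
        self.hidden = random_matrix(hidden_units, n_filters, rng)
        self.output = [rng.uniform(-0.1, 0.1) for _ in range(hidden_units)]

    def score(self, embedded_sequence):
        """Map a sequence of embedding vectors to a score between 0 and 1."""
        k = self.kernel_size
        if len(embedded_sequence) < k:
            raise ValueError("sequence is shorter than the filter size")
        # Sliding windows with step 1, each flattened into one vector.
        windows = [
            [value for vector in embedded_sequence[i:i + k] for value in vector]
            for i in range(len(embedded_sequence) - k + 1)
        ]
        # Convolution with ReLU, then the maximum over all positions per filter.
        pooled = [max(relu(dot(f, w)) for w in windows) for f in self.filters]
        hidden = [relu(dot(row, pooled)) for row in self.hidden]
        return sigmoid(dot(self.output, hidden))


# Usage with made-up 4-dimensional "embeddings" for a 6-item dialogue.
rng = random.Random(1)
sequence = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(6)]
model = TinyCoherenceCNN(embedding_dim=4, n_filters=8, hidden_units=5)
print(model.score(sequence))
```

In practice the embedding vectors would come from the pre-trained models named on the slide, and the weights would be learned from coherent and incoherent dialogues rather than drawn at random.
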
Results: Word Embeddings perform Best
[Figure labels: Word; KG]

Results: KG Embeddings Classification
[Figure axes: % entities; min cosine distance]

Results: Word Embeddings Classification
[Figure axes: % entities; min cosine distance]

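The "min cosine distance" axis in the two result plots suggests measuring how close each item of a dialogue is to its nearest neighbour in embedding space. The sketch below shows one way to compute that quantity. Treating it as "distance from each vector to the closest other vector in the same dialogue" is an assumption about the plots, and the function names are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Cosine distance: 1 minus the cosine similarity of two vectors."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def min_cosine_distances(embeddings):
    """For each vector, the distance to its nearest other vector in the list."""
    if len(embeddings) < 2:
        raise ValueError("need at least two vectors")
    return [
        min(cosine_distance(u, v) for j, v in enumerate(embeddings) if j != i)
        for i, u in enumerate(embeddings)
    ]


# Made-up 2-dimensional vectors standing in for word or entity embeddings.
print(min_cosine_distances([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]))
```

A histogram of these values over all items (the "% entities" axis) would then show how tightly the vocabulary of a dialogue clusters.
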
Results: Permutation IsA Difficult Task
[Figure labels: 1. sampling; 2. permutations; Word; KG]

Coherence Patterns
Horizontal Split
Coherence Patterns

Future Work
▪ Extend coherence measure beyond KG entities
▪ Integration of KG and word embeddings
▪ End-to-end training including entity linking layer
Open source: https://github.com/svakulenk0/semantic_coherence

Acknowledgements
This work was supported by the following projects:
▪ EU H2020 programme under the MSCA-RISE agreement 645751 (RISE_BPM)
▪ Project Open Data for Local Communities, funded by the Austrian Federal Ministry of Transport, Innovation and Technology (BMVIT) under the program "ICT of the Future", between November 2016 and April 2019. More information: https://iktderzukunft.at/en/