Knowlywood: Mining Activity Knowledge From Hollywood Narratives Date: 2016/08/30 Author: Niket Tandon, Gerard de Melo, Abir De, Gerhard Weikum Source: CIKM’15 Advisor: Jia-Ling Koh Speaker: Pei-Hao Wu 1
Outline • Introduction • Method • Experiment • Conclusion 2
Introduction • Motivation • Major knowledge graphs focus on factual knowledge • Ex: songs and awards of an artist, CEOs and products of companies, etc. • With groundbreaking new products like Google Now and Apple’s Siri, there is a need for commonsense knowledge • Such knowledge enables smart interpretation of queries relating to everyday human activities 3
Introduction • Goal • Automatically compiling large amounts of knowledge about human activities from narrative text 4
Introduction • Flow chart 5
Outline • Introduction • Method • Experiment • Conclusion 6
Method • Input sentence: The man began to shoot a video in the moving bus • ClausIE (clause extraction): • (“the man”, “began to shoot”, “a video”), (“the man”, “began to shoot”, “in the moving bus”), etc. • OpenNLP (chunking): • (“the man”), (“began to shoot”), (“a video”), (“in”), (“the moving bus”) 7
Method • Sense Analysis - VerbNet • Provides the syntactic frame • Ex: Agent.animate V Patient.animate PP Instrument.solid • Selectional restriction • Ex: Patient.animate requires the patient to be a living being • Sense Analysis - WordNet • Maps each word to disambiguated WordNet senses • Ex: shoot -> shoot#1, shoot#2, etc. 8
Method • WSD system It-Makes-Sense (IMS) • Used for an initial disambiguation of each word • Notation • Sw: the set of candidate WordNet senses • Sv: the set of candidate VerbNet senses • i: word • j: sense 9
Method • Most-frequent-sense rank • Additional feature used in the ILP • Notation • Sw: the set of candidate WordNet senses • Sv: the set of candidate VerbNet senses • i: word • j: sense 10
Method • $\mathit{syn}_{ij}$ • Frame match score for word i and VerbNet sense j • $\mathit{sem}_{ij}$ • Selectional restriction score of the roles in a VerbNet frame j for word i 11
Method • ILP Model • $X_{ij} = 1$ if word i is mapped to sense j • At most one sense is chosen for each word 12
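The sense-selection step can be illustrated with a minimal sketch. It assumes the objective is a weighted sum of the four features named on the slides (IMS prior, most-frequent-sense rank, frame match, selectional restriction) and keeps only the "at most one sense per word" constraint; under that single constraint the problem decomposes into a per-word argmax, so no solver is needed. The sense labels, scores, weights, and the function name `select_senses` are all hypothetical, and the paper's full ILP may contain further constraints not shown here.

```python
from typing import Dict, Optional

# Hypothetical feature scores per (word, candidate sense); all values are made up.
SCORES: Dict[str, Dict[str, Dict[str, float]]] = {
    "shoot": {
        "shoot#fire": {"ims": 0.5, "mfs": 1.0, "syn": 0.2, "sem": 0.1},
        "shoot#film": {"ims": 0.4, "mfs": 0.3, "syn": 0.9, "sem": 0.9},
    },
    "video": {
        "video#recording": {"ims": 0.8, "mfs": 1.0, "syn": 0.0, "sem": 0.5},
    },
}

# Hypothetical feature weights for the linear objective.
WEIGHTS: Dict[str, float] = {"ims": 1.0, "mfs": 0.5, "syn": 1.0, "sem": 1.0}


def select_senses(
    scores: Dict[str, Dict[str, Dict[str, float]]],
    weights: Dict[str, float],
) -> Dict[str, Optional[str]]:
    """Pick at most one sense per word, maximizing the weighted feature sum."""
    chosen: Dict[str, Optional[str]] = {}
    for word, candidates in scores.items():
        best_sense, best_value = None, 0.0
        for sense, features in candidates.items():
            value = sum(weights[name] * features[name] for name in weights)
            # Only select a sense if it contributes positively to the objective.
            if value > best_value:
                best_sense, best_value = sense, value
        chosen[word] = best_sense  # None means no sense was selected
    return chosen


result = select_senses(SCORES, WEIGHTS)
print(result)
```

With these toy numbers the frame-match and selectional-restriction scores outweigh the most-frequent-sense prior, so "shoot" resolves to the filming sense.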
Method • Graph inference • Connections between different activity frames • parent type (T), semantic similarity edges (S), temporal order (P) • Using the PSL (Probabilistic Soft Logic) framework for relational learning and inference • Edge Priors • An activity is represented as a (verb-sense, noun-sense) pair • Using WordNet’s taxonomic hierarchy to estimate T and S • Using GSP (Generalized Sequential Pattern mining) to find P edges 13
Method • T edge • The prior between two pairs (v1,n1), (v2,n2) is calculated as the score t(v1,v2)*t(n1,n2) • For noun senses: using WordNet hypernymy • For verb senses: using WordNet hypernymy and the VerbNet verb hierarchy • The score is 1 if parent and child are connected and 0 otherwise • Ex: “go up an elevation” is the parent type of “hike up a hill” 14
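The T-edge prior can be sketched as follows. The two child-to-parent tables are toy stand-ins for WordNet and VerbNet (hypothetical sense names), and "connected" is read here as "reachable by following parent links upward", which is an assumption; the slides do not say whether only direct links count.

```python
from typing import Dict, Tuple

Activity = Tuple[str, str]  # (verb sense, noun sense)

# Toy child -> parent tables; hypothetical stand-ins for WordNet / VerbNet.
VERB_PARENT: Dict[str, str] = {"hike": "go_up", "climb": "go_up"}
NOUN_PARENT: Dict[str, str] = {"hill": "elevation", "mountain": "elevation"}


def connected(child: str, parent: str, table: Dict[str, str]) -> float:
    """Return 1.0 if `parent` is reachable from `child` via parent links, else 0.0."""
    seen = set()
    node = child
    while node in table and node not in seen:  # `seen` guards against cycles
        seen.add(node)
        node = table[node]
        if node == parent:
            return 1.0
    return 0.0


def t_prior(child: Activity, parent: Activity) -> float:
    """Prior t(v1, v2) * t(n1, n2) for a candidate parent-type edge."""
    return (connected(child[0], parent[0], VERB_PARENT)
            * connected(child[1], parent[1], NOUN_PARENT))


print(t_prior(("hike", "hill"), ("go_up", "elevation")))
print(t_prior(("hike", "hill"), ("climb", "mountain")))
```

Because the prior is a product of two 0/1 scores, both the verb and the noun must be in a parent relation for the edge to receive any prior mass.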
Method • S edge • The prior between two pairs (v1,n1), (v2,n2) is calculated as the score sim(v1,v2)*sim(n1,n2) • For noun senses: using the WordNet path similarity measure • For verb senses: using WordNet groups and VerbNet class membership • Ex: “climb up a mountain” is semantically similar to “hike up a hill” 15
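A small sketch of the S-edge prior, under two assumptions. Noun similarity uses the usual path-similarity definition, 1 / (1 + shortest path length), computed over a toy taxonomy rather than real WordNet. Verb similarity is taken to be 1 when both verbs share a class and 0 otherwise; the slides only say that groups and class membership are used, so this binary form is hypothetical, as are all names in the tables.

```python
from typing import Dict, Optional, Tuple

Activity = Tuple[str, str]  # (verb sense, noun sense)

# Toy data; hypothetical stand-ins for WordNet hypernymy and VerbNet classes.
NOUN_PARENT: Dict[str, str] = {
    "hill": "elevation", "mountain": "elevation", "elevation": "object",
}
VERB_CLASS: Dict[str, str] = {"hike": "motion_up", "climb": "motion_up", "eat": "ingest"}


def path_length(a: str, b: str, parent: Dict[str, str]) -> Optional[int]:
    """Number of edges on the shortest path between a and b in a tree taxonomy."""
    dist_a: Dict[str, int] = {}
    node, d = a, 0
    while node is not None and node not in dist_a:
        dist_a[node] = d
        node, d = parent.get(node), d + 1
    node, d, seen = b, 0, set()
    while node is not None and node not in seen:
        if node in dist_a:  # first common ancestor found
            return dist_a[node] + d
        seen.add(node)
        node, d = parent.get(node), d + 1
    return None  # no common ancestor


def noun_sim(n1: str, n2: str) -> float:
    length = path_length(n1, n2, NOUN_PARENT)
    return 0.0 if length is None else 1.0 / (1.0 + length)


def verb_sim(v1: str, v2: str) -> float:
    c1, c2 = VERB_CLASS.get(v1), VERB_CLASS.get(v2)
    return 1.0 if c1 is not None and c1 == c2 else 0.0


def s_prior(a1: Activity, a2: Activity) -> float:
    """Prior sim(v1, v2) * sim(n1, n2) for a candidate similarity edge."""
    return verb_sim(a1[0], a2[0]) * noun_sim(a1[1], a2[1])


print(s_prior(("climb", "mountain"), ("hike", "hill")))
```

In this toy taxonomy "mountain" and "hill" are two edges apart (via "elevation"), so the pair from the slide scores 1 * 1/3.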
Method • P edge • Using GSP to efficiently determine P edges • support = $\frac{\mathit{freq}(a_1\ \mathit{prev}\ a_2)}{\mathit{freq}(a_1)\,\mathit{freq}(a_2)}$ • maximum gap = 4, minimum support = 3 • Ex • Input: “wake up” -> “drink water” -> “brush teeth” -> “eat breakfast” -> “go out” • Output: “wake up” -> “go out” 16
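The support computation can be sketched with plain counting instead of a full GSP implementation. The sketch assumes that frequencies are counted once per sequence, that "maximum gap = 4" means the two activities are at most four positions apart, and that the minimum support applies to the raw pair count; the function name `p_edge_scores` is hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def p_edge_scores(
    sequences: List[List[str]], max_gap: int = 4, min_support: int = 3
) -> Dict[Tuple[str, str], float]:
    """Score ordered pairs (a1, a2) by freq(a1 prev a2) / (freq(a1) * freq(a2))."""
    single: Counter = Counter()  # number of sequences containing each activity
    pair: Counter = Counter()    # number of sequences where a1 precedes a2 within max_gap
    for seq in sequences:
        single.update(set(seq))
        found = set()
        for i, a1 in enumerate(seq):
            last = min(i + max_gap, len(seq) - 1)
            for j in range(i + 1, last + 1):
                if seq[j] != a1:
                    found.add((a1, seq[j]))
        pair.update(found)  # each pair counts once per sequence
    return {
        (a1, a2): count / (single[a1] * single[a2])
        for (a1, a2), count in pair.items()
        if count >= min_support
    }


morning = ["wake up", "drink water", "brush teeth", "eat breakfast", "go out"]
scores = p_edge_scores([morning, morning, morning])
print(scores[("wake up", "go out")])
```

"wake up" and "go out" are exactly four positions apart, so the pair survives with the default gap; lowering `max_gap` to 3 drops it.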
Method • PSL • Computing a cleaner graph of T, S, and P edges with scores • PSL model with the following soft first-order logic rules • Parents often inherit prev (P) edges from their children • P(a,b) ∧ T(a,a’) ∧ T(b,b’) => P(a’,b’) • Similar activities are likely to share parent types • S(a,b) ∧ T(b,b’) => T(a,b’) 17
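To make the "soft" in soft first-order logic concrete, the sketch below evaluates one grounding of the first rule using the Lukasiewicz relaxation that PSL is built on: a conjunction is max(0, sum of truth values minus (n - 1)), and a rule body => head is violated by max(0, body - head). The truth values are made up, and this only scores a single grounded rule; it is not the inference procedure, which minimizes the weighted violations over all groundings.

```python
def lukasiewicz_and(*values: float) -> float:
    """Lukasiewicz t-norm for a conjunction of soft truth values in [0, 1]."""
    return max(0.0, sum(values) - (len(values) - 1))


def distance_to_satisfaction(body: float, head: float) -> float:
    """How far the soft rule body => head is from being satisfied."""
    return max(0.0, body - head)


# Hypothetical truth values for one grounding of
# P(a,b) AND T(a,a') AND T(b,b') => P(a',b')
p_ab, t_aa, t_bb, p_parents = 0.9, 1.0, 0.8, 0.3

body = lukasiewicz_and(p_ab, t_aa, t_bb)
violation = distance_to_satisfaction(body, p_parents)
print(body, violation)
```

A nonzero violation means inference is pushed to raise P(a’,b’) or lower one of the body atoms, which is how the children's temporal edge propagates to the parents.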
Method • Taxonomy construction • Synsets • The previous steps may produce overly specific activities • Ex: “embrace spouse”, “hug wife”, “hug partner”, etc. • Grouping similar activities together into a single frame • Pruning S from the previous step for activity merging • WordNet path similarity as a measure of semantic distance 18
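One simple way to picture the merging step is connected components over the retained S edges. The sketch below assumes that pruning means dropping S edges below a similarity threshold and that activities linked by the remaining edges are merged into one frame; the threshold, the scores, and the function name `group_activities` are hypothetical, and the actual system may merge more conservatively.

```python
from typing import Dict, List, Tuple


def group_activities(
    activities: List[str],
    s_edges: Dict[Tuple[str, str], float],
    threshold: float = 0.5,
) -> List[List[str]]:
    """Merge activities connected by S edges with score >= threshold (union-find)."""
    parent = {a: a for a in activities}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), score in s_edges.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for a in activities:
        groups.setdefault(find(a), []).append(a)
    return sorted(sorted(members) for members in groups.values())


acts = ["embrace spouse", "hug wife", "hug partner", "eat breakfast"]
edges = {
    ("embrace spouse", "hug wife"): 0.8,
    ("hug wife", "hug partner"): 0.9,
    ("hug partner", "eat breakfast"): 0.1,
}
frames = group_activities(acts, edges)
print(frames)
```

Connected components are transitive, so one weak-but-above-threshold edge can chain unrelated activities together; a stricter grouping criterion would avoid that at the cost of smaller frames.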
Method • Taxonomy construction • Hierarchy • Some activities may subsume others • Ex: “divorce husband” is subsumed by “break up with a partner” • “break up with a partner” is more general than “divorce husband” • Pruning T from the previous step for activity hierarchy induction • An activity taxonomy is a directed acyclic graph (DAG) • Using WordNet path similarity, but considering only hypernyms 19
Outline • Introduction • Method • Experiment • Conclusion 20
Experiment • System components • Data processing • 1.89 million scenes from several sources • 560 movie scripts, scripts of 290 TV series, and scripts of 179 sitcoms from wikia.com and dailyscript.com • 103 novels from Project Gutenberg • Textual descriptions of videos about cooking 21
Experiment • System components • Had human judges annotate at least 250 random samples 22
Experiment • Knowlywood KB evaluation • Compiled a random sample of 119 activities from the KB • Expert human annotators judged each attribute, and precision is computed as $\frac{c}{c+i}$ • c: count of correct, i: count of incorrect • Comparison with ConceptNet • Mapped CN’s relations to our notion of activity attributes 23
Experiment • Knowlywood KB evaluation • Comparison with ReVerb • ReVerb: a system that aims at mining all possible subject-predicate-object triples from text • Using Movieclips tags to map words to tags • Two datasets as input to ReVerb • ReVerbMCS: the script data that we used for our system • ReVerbClue: ClueWeb09 24
Experiment • Knowlywood KB evaluation • Movie scene tagging • Selecting 1000 clips from Movieclips.com as gold data • Given [participant, location, time], assessing the top-k activity recommendations 25
Outline • Introduction • Method • Experiment • Conclusion 26
Conclusion • This paper presented Knowlywood, the first comprehensive KB of human activities • Its one million activity frames are an important asset for a variety of applications such as image and video understanding 27