A 2-Phase Frame-based Knowledge Extraction Framework
Francesco Corcoglioniti, Marco Rospocher, Alessio Palmero Aprosio
francesco@corcoglioniti.name
Fondazione Bruno Kessler – IRST, Trento, Italy
SAC 2016, Pisa, 06 April 2016
http://pikes.fbk.eu/

Problem

Knowledge Extraction from Text
– English text only
– ABox (instances and facts) only → Ontology Population
– focus on extracting events and their participants
  → represented as semantic frames, i.e., event instances (e.g. ‘sell’ event) linked to participant instances via role properties (e.g. ‘seller’)

Example: “G. W. Bush and Bono are very strong supporters of the fight of HIV in Africa.”

[Figure: frame graph extracted from the example sentence; node types in parentheses]
  :support_event_1 (frb:fe-taking_sides)
    frb:fe-taking_sides-cognizer → dbpedia:Bush (dbyago:Person10007846), dbpedia:Bono (dbyago:Person10007846)
    frb:fe-taking_sides-degree → attr:very-1r_strong-1a (ks:Attribute)
    frb:fe-taking_sides-issue → :fight_event_1
  :fight_event_1 (frb:fe-hostile_encounter)
    frb:fe-hostile_encounter-side_2 → dbpedia:HIV (owl:Thing)
    frb:fe-hostile_encounter-place → dbpedia:Africa (dbyago:Location100027167)
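To make the frame representation concrete, the following minimal sketch stores the example graph as plain (subject, role, participant) tuples and groups the participants of an event by role. The triples are read off the example figure, prefixed names are kept as plain strings, and the helper name `participants` is hypothetical; none of this is part of PIKES itself.

```python
from collections import defaultdict

# Triples read off the example graph; prefixed names are kept as plain strings.
TRIPLES = [
    (":support_event_1", "frb:fe-taking_sides-cognizer", "dbpedia:Bush"),
    (":support_event_1", "frb:fe-taking_sides-cognizer", "dbpedia:Bono"),
    (":support_event_1", "frb:fe-taking_sides-degree", "attr:very-1r_strong-1a"),
    (":support_event_1", "frb:fe-taking_sides-issue", ":fight_event_1"),
    (":fight_event_1", "frb:fe-hostile_encounter-side_2", "dbpedia:HIV"),
    (":fight_event_1", "frb:fe-hostile_encounter-place", "dbpedia:Africa"),
]


def participants(frame, triples):
    """Group the participants of one event instance by role property.

    Hypothetical helper: a role is any predicate starting with 'frb:fe-'.
    """
    roles = defaultdict(list)
    for subject, role, participant in triples:
        if subject == frame and role.startswith("frb:fe-"):
            roles[role].append(participant)
    return dict(roles)


support = participants(":support_event_1", TRIPLES)
fight = participants(":fight_event_1", TRIPLES)
```

Because the issue of the 'support' event is itself an event instance (`:fight_event_1`), frames can nest: calling `participants` on that value yields the participants of the embedded event.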

Contribution

PIKES*
– a tool for Knowledge Extraction from English text
– extracting semantic frames
  ● aligned to predicate models → PropBank (PB), NomBank (NB), VerbNet (VN), FrameNet (FN)
  ● new: aligned to FrameBase
– extracting instances
  ● typed w.r.t. YAGO and SUMO
  ● disambiguated w.r.t. DBpedia
– representing all contents in RDF + named graphs
– based on a 2-phase approach
– open source – http://pikes.fbk.eu/

(*) PIKES Is a Knowledge Extraction Suite

Data Model

Three layers:
– Resource layer: ks:Resource (text, ...metadata...)
– Mention layer: ks:Mention (offset in text, ...attributes...), a linguistically annotated piece of text about something of interest; specialized by mention subclasses (M2, M3, ...) with their own attributes
– Instance layer: ks:Instance, an entity (person...), semantic frame (event…) or attribute, described by the triples on instances in an Assertion (graph)

Links between layers:
– ks:mentionOf: mention → resource
– ks:denotes / ks:implies: mention → instance
– ks:expresses: mention → assertion graph

Example:

  Resource layer
    :resource1 a ks:Resource;
      dct:created "2016-04-06";
      nif:isString "G. W. Bush and Bono are very strong supporters of the fight of HIV in Africa.".

  Mention layer
    :mention1 a ks:FrameMention;
      ks:mentionOf :resource1;
      nif:beginIndex 36; nif:endIndex 46;
      nif:anchorOf "supporters";
      ks:synset wn30:n-10677713;
      ks:predicate pmo:nb10_support.01;
      ks:role pmo:nb10_support.01_arg01;
      ks:expresses :graph1;
      ks:denotes :supporters_entity;
      ks:implies :support_event_1.

  Instance layer
    :graph1 {
      :supporters_entity a dbyago:Supporter110677713.
      :support_event_1 a frb:frame-taking_sides;
        frb:fe-taking_sides-cognizer :supporters_entity.
    }

based on: Corcoglioniti et al. KnowledgeStore: a storage framework for interlinking unstructured and structured knowledge. IJSWIS 2015
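The link between the Mention and Resource layers can be checked mechanically: the substring of the resource text between nif:beginIndex and nif:endIndex should equal nif:anchorOf. The sketch below illustrates that check with plain dictionaries; the dictionary layout and the function name `anchor_matches` are assumptions made for illustration, not the KnowledgeStore API.

```python
TEXT = "G. W. Bush and Bono are very strong supporters of the fight of HIV in Africa."

# Plain-dict stand-ins for the :resource1 and :mention1 descriptions in the example.
resource = {"id": ":resource1", "nif:isString": TEXT}
mention = {
    "id": ":mention1",
    "ks:mentionOf": ":resource1",
    "nif:beginIndex": 36,
    "nif:endIndex": 46,
    "nif:anchorOf": "supporters",
}


def anchor_matches(mention, resource):
    """Return True if the mention's offsets select its anchor text in the resource.

    Hypothetical helper; offsets are 0-based with an exclusive end index,
    which is the convention the example values follow.
    """
    if mention["ks:mentionOf"] != resource["id"]:
        return False
    begin, end = mention["nif:beginIndex"], mention["nif:endIndex"]
    return resource["nif:isString"][begin:end] == mention["nif:anchorOf"]


is_consistent = anchor_matches(mention, resource)
```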

Data Model (2)

Complete model:

Namespaces
  nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#>
  ks: <http://dkm.fbk.eu/ontologies/knowledgestore#>
  foaf: <http://xmlns.com/foaf/0.1/>

[Figure: class diagram of the complete model]

Resource layer
  class: ks:Resource
  properties: dct:title, dct:creator, dct:created

Mention layer
  classes: ks:Mention, ks:RelationMention, ks:ParticipationMention, ks:CoreferenceMention, ks:InstanceMention, ks:FrameMention, ks:NameMention, ks:TimeMention, ks:AttributeMention
  properties: ks:mentionOf, nif:beginIndex, nif:endIndex, nif:anchorOf, ks:synset, ks:linkedTo, ks:role, ks:argument, ks:frame, ks:target, ks:coreferential, ks:coreferentialConjunct, ks:predicate, ks:nercType, ks:normalizedValue

Instance layer
  classes: ks:Instance, ks:Entity, ks:Frame, ks:Attribute, ks:Time
  properties: rdf:type, owl:sameAs, rdfs:label, rdfs:seeAlso, foaf:name, ks:include, frame/arg rel., OWL time props.

Links between layers
  ks:mentionOf: ks:Mention → ks:Resource
  ks:denotes / ks:implies: ks:Mention → ks:Instance
  ks:expresses: ks:Mention → Assertion (graph), which contains triples about instances

2-Phase Approach

Resource layer
  G. W. Bush and Bono are very strong supporters of the fight of HIV in Africa.

Phase 1: Linguistic Feature Extraction

Mention layer
  instance mentions: G. W. Bush, Bono, very strong, supporters, fight, HIV, Africa
  participation mentions (ks:frame, ks:arg.): very strong supporters; supporters of [...] fight; fight of HIV; fight [...] in Africa
  coreference mention (ks:coreferential, ks:coref.Conjunct): G. W. Bush and Bono [...] supporters

Phase 2: Knowledge Distillation

Instance layer
  :support
    frb:fe-taking_sides-cognizer → dbpedia:Bush, dbpedia:Bono
    frb:fe-taking_sides-degree → attr:very-1r_strong-1a
    frb:fe-taking_sides-issue → :fight
  :fight
    frb:fe-hostile_encounter-side_2 → dbpedia:HIV
    frb:fe-hostile_encounter-place → dbpedia:Africa

Linguistic Feature Extraction

① apply several NLP tasks to input text
② map their outputs to mentions

Types of mention: Participation, Coreference, Attribute, Instance, Frame, Name, Time

NLP task                                          Abbrev.  Mention types (√)
part-of-speech tagging                            POS      √ √ √
named entity recognition & classification         NERC     √ √
temporal expression recognition & normalization   TERN     √
entity linking                                    EL       √ √
word sense disambiguation                         WSD      √ √
semantic role labeling                            SRL      √ √
coreference resolution                            COREF    √
dependency parsing                                DP       √ √ √

Linguistic Feature Extraction (2)

Example: “fight of HIV”

Extracted RDF mention graph:

  via NERC, EL
    <..#char=63,66> a :NameMention ;
      nif:anchorOf "HIV" ;
      :nercType :MISC ;
      :linkedTo dbpedia:HIV .

  via SRL
    <..#char=54,59> a :FrameMention ;
      nif:anchorOf "fight" ;
      :predicate pm:nb10-fight.01 .

  via SRL, DP
    <..#char=54,66> a :ParticipationMention ;
      nif:anchorOf "fight […] HIV" ;
      :frame <..#char=54,59> ;
      :argument <..#char=63,66> ;
      :role pmo:nb10-fight.01-arg1 .
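The mention identifiers in this example are character-offset fragments of the resource URI, and the participation mention spans from the start of the frame mention to the end of its argument. A minimal sketch of how such identifiers could be derived from the sentence follows; the helpers `span` and `char_uri` are hypothetical, and the base URI is left as the elided `..` shown in the example.

```python
TEXT = "G. W. Bush and Bono are very strong supporters of the fight of HIV in Africa."


def span(text, anchor):
    """Return (begin, end) offsets of the first occurrence of anchor (end exclusive)."""
    begin = text.index(anchor)
    return begin, begin + len(anchor)


def char_uri(begin, end, base=".."):
    """Build a '#char=begin,end' identifier; the base stays elided as in the example."""
    return f"<{base}#char={begin},{end}>"


frame_begin, frame_end = span(TEXT, "fight")
arg_begin, arg_end = span(TEXT, "HIV")

frame_mention = char_uri(frame_begin, frame_end)
name_mention = char_uri(arg_begin, arg_end)
# The participation mention covers both the frame and its argument.
participation_mention = char_uri(min(frame_begin, arg_begin), max(frame_end, arg_end))

mention_graph = [
    (name_mention, "rdf:type", ":NameMention"),
    (frame_mention, "rdf:type", ":FrameMention"),
    (participation_mention, "rdf:type", ":ParticipationMention"),
    (participation_mention, ":frame", frame_mention),
    (participation_mention, ":argument", name_mention),
]
```

Using `text.index` picks the first occurrence only, which is enough for this sentence; a real pipeline would take the offsets directly from the NLP tools' token positions.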

Knowledge Distillation

① Rule-based conversion from Mention to Instance data
  – deal with phenomena such as argument nominalization and group entities
  – use background knowledge → e.g., mappings to ontologies, characterization of predicates
② Post-processing: OWL2RL inference, reduce the number of named graphs

Mention layer
  :mention1 a ks:FrameMention;
    nif:anchorOf "supporters";
    ks:synset wn30:n-10677713;
    ks:predicate pmo:nb10_support.01;
    ks:role pmo:nb10_support.01_arg01.

Instance layer
  :g1 { :e1 a dbyago:Supporter110677713.
        :ev1 a frb:frame-taking_sides;
          frb:fe-taking_sides-cognizer :e1. }
  :mention1 ks:expresses :g1;
    ks:denotes :e1;
    ks:implies :ev1.

Background knowledge
  pmo:nb10_support.01 a ks:ArgumentNominalization.

Conversion rule
  INSERT { ?m ks:denotes ?i; ks:implies ?if; ks:expresses ?g.
           GRAPH ?g { ?i a ks:Instance. ?if a ks:Frame } }
  WHERE  { ?m a ks:FrameMention; nif:anchorOf ?a; ks:predicate ?s.
           ?s a ks:ArgumentNominalization.
           BIND (ks:mint(?m) AS ?g)
           BIND (ks:mint(?a, ?m) AS ?i)
           BIND (ks:mint(concat(?a, "_pred"), ?m) AS ?if) }
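The argument nominalization rule can be paraphrased procedurally: for a frame mention whose predicate is marked as an argument nominalization, mint one identifier for the denoted instance, one for the implied frame, and one for the named graph that holds their triples. The sketch below mirrors the rule in plain Python. The `mint` function is only a hypothetical stand-in for `ks:mint`, whose actual behaviour the slide does not specify, and the graph identifier is given a `graph` prefix here so that it differs from the mention identifier.

```python
# Background knowledge: predicates characterized as argument nominalizations.
ARGUMENT_NOMINALIZATIONS = {"pmo:nb10_support.01"}


def mint(*parts):
    """Hypothetical stand-in for ks:mint: derive an identifier from its arguments."""
    return ":" + "_".join(part.strip(":").replace(" ", "_") for part in parts)


def distill_frame_mention(mention_id, mention):
    """Apply the argument nominalization rule to one mention description.

    Returns None when the rule does not match; otherwise the mention-level
    triples and the content of the minted named graph.
    """
    if mention.get("rdf:type") != "ks:FrameMention":
        return None
    if mention.get("ks:predicate") not in ARGUMENT_NOMINALIZATIONS:
        return None
    anchor = mention["nif:anchorOf"]
    graph = mint("graph", mention_id)
    instance = mint(anchor, mention_id)
    frame = mint(anchor + "_pred", mention_id)
    return {
        "mention": [
            (mention_id, "ks:denotes", instance),
            (mention_id, "ks:implies", frame),
            (mention_id, "ks:expresses", graph),
        ],
        "graphs": {
            graph: [
                (instance, "rdf:type", "ks:Instance"),
                (frame, "rdf:type", "ks:Frame"),
            ]
        },
    }


result = distill_frame_mention(
    ":mention1",
    {
        "rdf:type": "ks:FrameMention",
        "nif:anchorOf": "supporters",
        "ks:predicate": "pmo:nb10_support.01",
    },
)
```

The single mention "supporters" thus yields two instances, the supporter entity and the support event, which is what distinguishes an argument nominalization from an ordinary frame mention.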

Knowledge Distillation (2)

Longer example:

Mention layer
  :m1 a ks:NameMention;
    nif:anchorOf "G. W. Bush";
    ks:nercType ks:bbn_person.
  :m2 a ks:NameMention;
    nif:anchorOf "Bono";
    ks:nercType ks:bbn_person.
  :m3 a ks:FrameMention;
    nif:anchorOf "supporters";
    ks:predicate pmo:nb10_support.01;
    ks:role pmo:nb10_support.01_arg01.
  :m4 a ks:CoreferenceMention;
    ks:coreferential :m3;
    ks:coreferentialConjunct :m1, :m2.

Background knowledge
  pmo:nb10_support.01 a ks:ArgumentNominalization.

Instance layer
  :m1 ks:expresses :g1; ks:denotes :e1 .
  :g1 { :e1 a dbyago:PersonXYZ;
          owl:sameAs dbpedia:Bush;
          foaf:name "G. W. Bush". }
  :m2 ks:expresses :g2; ks:denotes :e2 .
  :g2 { :e2 a dbyago:PersonXYZ;
          owl:sameAs dbpedia:Bono;
          foaf:name "Bono". }
  :m3 ks:expresses :g3; ks:denotes :e3;
    ks:implies :ev1.
  :g3 { :e3 a dbyago:SupporterXYZ.
        :ev1 a frb:frame-taking_sides;
          frb:fe-taking_sides-cognizer :e3. }
  :m4 ks:expresses :g4.
  :g4 { :e3 ks:include :e1, :e2 }
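The group entity step in this example can be sketched as follows: the instance denoted by the coreferential mention (:m3, "supporters") becomes a group that includes the instances denoted by the conjunct mentions (:m1, :m2). The dictionary layout and the function name `group_entity_triples` are assumptions made for illustration; the slide does not show the actual rule used for this step.

```python
def group_entity_triples(coref_mention, denotes):
    """Derive ks:include triples from a coreference mention.

    coref_mention: dict with 'ks:coreferential' (one mention id) and
                   'ks:coreferentialConjunct' (list of mention ids).
    denotes:       mapping from mention id to the instance it denotes.
    """
    group = denotes[coref_mention["ks:coreferential"]]
    members = [denotes[m] for m in coref_mention["ks:coreferentialConjunct"]]
    return [(group, "ks:include", member) for member in members]


# Mention-to-instance links taken from the example (ks:denotes).
denotes = {":m1": ":e1", ":m2": ":e2", ":m3": ":e3"}
m4 = {"ks:coreferential": ":m3", "ks:coreferentialConjunct": [":m1", ":m2"]}

# Content of the named graph expressed by :m4.
g4 = group_entity_triples(m4, denotes)
```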
