FrameNet CNL: A Knowledge Representation and Information Extraction Language


  1. FrameNet CNL: A Knowledge Representation and Information Extraction Language
     Guntis Barzdins, Institute of Mathematics and Computer Science, University of Latvia
     CNL-2014, 20-22 August 2014, Galway
     This research was partially supported by the Project Nr.2DP/2.1.1.1.0/13/APIA/VIAA/014 (ERAF) “Identification of relations in newswire texts and graph visualization of the extracted relation database” under contract Nr. 1/5 -2013, LU MII Nr. 3-27.3-5-2013.

  2. Outline
     • FrameNet frames of interest and automatic SRL
       – We use 26 frames
     • Frame Element filler disambiguation towards the real-world entities
       – We disambiguate only Persons and Organizations
       – Named Entity Linking (NEL) or Cross-Document Coreference (CDC)
     • IE & KR system for the Latvian News Agency
       – Entire digitized Latvian news archive (~12M articles) processed
       – The method is part of Horizon-2020 ICT-15 proposals
     • From MicroEvents to CompositeEvents, n-ary relation extraction from text
     • Multilingual Information Extraction, Knowledge visualization

  3. Information Extraction and CNL
     [Diagram. Labels: Source documents in NL; information extraction; abstract knowledge representation (AKR); FN-CNL generation; Text in FN-CNL (FN-CNL text); FN-CNL parsing]

  5. Commercial Transaction Frame
     • The event known as Commercial Transaction is a small bit of history where a Buyer and a Seller exchange Money and Goods
     • The frame brings with it a checklist of the frame elements that have to be part of the event
       – Buyer, Seller, Goods, Money
     • Frames are language independent (multilinguality)
     From C.J. Fillmore’s slides presented at the FrameNet MasterClass during TLT8 (2009)

  7. Commercial Transaction Frame
     • Various target words may evoke the same frame
       1. They sold me the laptop for $1100.
       2. I bought the laptop for $1100.
       3. They only charged me $1100 for the laptop.
       4. My laptop cost me $1100.
       5. I got the laptop for a mere $1100.
     • Frame Elements (in various syntactic realizations): Buyer, Seller, Goods, Money
     From C.J. Fillmore’s slides presented at the FrameNet MasterClass during TLT8 (2009)
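The "checklist" view of a frame and the idea that several target words evoke the same frame can be sketched in a few lines of Python. This is only an illustration: the frame name `Commercial_transaction`, the `TARGET_LEXICON` mapping, and the helper `evoke` are assumptions made for this sketch, not identifiers taken from FrameNet or from the system described in the slides.

```python
from dataclasses import dataclass

# Frame elements the slide lists for the Commercial Transaction frame.
CORE_ELEMENTS = ("Buyer", "Seller", "Goods", "Money")

# Hypothetical lexicon: target lemmas from the example sentences,
# all mapped to one illustrative frame name.
TARGET_LEXICON = {
    "sell": "Commercial_transaction",
    "buy": "Commercial_transaction",
    "charge": "Commercial_transaction",
    "cost": "Commercial_transaction",
    "get": "Commercial_transaction",
}


@dataclass
class FrameInstance:
    frame: str
    target: str
    elements: dict

    def missing_elements(self):
        """Return checklist items that the sentence left unexpressed."""
        return [fe for fe in CORE_ELEMENTS if fe not in self.elements]


def evoke(target_lemma, **elements):
    """Build a frame instance from a target lemma and its fillers."""
    return FrameInstance(TARGET_LEXICON[target_lemma], target_lemma, dict(elements))


# "They sold me the laptop for $1100."
sold = evoke("sell", Seller="They", Buyer="me", Goods="the laptop", Money="$1100")
# "My laptop cost me $1100." (the Seller is not expressed)
cost = evoke("cost", Buyer="me", Goods="My laptop", Money="$1100")

print(sold.frame == cost.frame)   # True
print(cost.missing_elements())    # ['Seller']
```

Different verbs end up as instances of the same frame, and the checklist makes it easy to see which frame elements a given sentence leaves implicit.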

  8. FrameNet Labeling Example
     • Phrase head-words are labelled in the dependency tree
     • A complete machine-learning pipeline was developed for FN labeling (POS, NER, Syntax, Coreference, FrameNet SRL)

  9. FrameNets
                                                        FrameNet 1.3   FrameNet 1.5   FrameNet LV
                                                        (English)      (English)      (Latvian)
     Frame types                                        665 (795)      877 (1019)     26
     FrameElement types                                 720            1068           80
     Training sentences with full annotation            2198           3256           4079
     Training sentences with single frame annotation    139439         154607         –
     Test sentences with full annotation                120            2420           844
     Slide label: SemEval-2007, Task19 dataset

  10. Accuracy of Automated FrameNet SRL
     • C6.0 FrameNet SRL demo http://c60.ailab.lv

     English FrameNet 1.3, SemEval-2007 dataset
                          Frame Target identification    Frame Element identification
                          Precision   Recall   F1        Precision   Recall   F1
     LTH 1)               68.9        53.6     60.3      51.6        35.4     42.0
     SEMAFOR/Google 2)    69.7        54.9     61.4      58.1        38.8     46.5
     C6.0 RuleSet EN      77.1        53.7     63.3      47.3        47.0     47.1
     C6.0 RuleSet LV      63.5        62.7     63.1      65.9        76.8     70.9

     1) Johansson, R., Nugues, P. (2007). LTH: semantic structure extraction using nonprojective dependency trees. In Proceedings of SemEval-2007: 4th International Workshop on Semantic Evaluations. Prague, pp. 227-230.
     2) Das, D., Chen, D., Martins, A.F.T., Schneider, N., Smith, N.A. (2014). Frame-Semantic Parsing. Computational Linguistics, 40(1), pp. 9-56.
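The F1 columns are the harmonic mean of precision and recall. As a quick consistency check, the sketch below recomputes the frame target identification F1 from the precision and recall values on the slide; the function name `f1_score` and the `rows` dictionary are just illustrative names.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Frame target identification (precision, recall), copied from the slide.
rows = {
    "LTH": (68.9, 53.6),
    "SEMAFOR/Google": (69.7, 54.9),
    "C6.0 RuleSet EN": (77.1, 53.7),
    "C6.0 RuleSet LV": (63.5, 62.7),
}

for system, (p, r) in rows.items():
    print(f"{system}: F1 = {f1_score(p, r):.1f}")
```

The recomputed values (60.3, 61.4, 63.3, 63.1) agree with the F1 column reported on the slide.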

  11. Accuracy of Automated FrameNet SRL
     • C6.0 builds the entire Latvian FrameNet RuleSet in 5 minutes (26 frames, 5000 annotated sentences)
       – enables incremental learning
     [Figure: Frame target identification F1 score for some English FrameNet frames]
     [Figure: Frame target identification F1 score as a function of sentences in the training set]

  12. Named Entity Recognition (NER) and Anaphora resolution
     • Named Entity Linking (NEL)
       – Assisted by alias-name lists (multilingual) and cosine-similarity of context bag-of-words
       – DBpedia often used as an entity KnowledgeBase
       – Typically ignore ontological entity relations (part-of, previous-name)
     • Cross-Document Coreference (CDC)
       – There is no a priori manually created entity KnowledgeBase
       – Entities need to be identified and linked «on the fly»
       – The key problem: persons with the same name
     • «John Brown» problem
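The cosine-similarity step can be illustrated with a minimal sketch. Everything concrete here is an assumption made for the example: the two «John Brown» profiles, the whitespace tokenizer, the `threshold` value, and the function names are hypothetical and are not taken from the system described in the slides.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Very naive tokenizer: lowercase and split on whitespace."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two bag-of-words count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical context profiles for two persons sharing one name.
profiles = {
    "John Brown (politician)": bag_of_words("minister parliament election party vote"),
    "John Brown (footballer)": bag_of_words("football club goal match season"),
}


def link(mention_context, profiles, threshold=0.1):
    """Return the best-matching known entity, or None if nothing is similar enough."""
    ctx = bag_of_words(mention_context)
    best, score = max(
        ((name, cosine_similarity(ctx, prof)) for name, prof in profiles.items()),
        key=lambda pair: pair[1],
    )
    return best if score >= threshold else None


print(link("John Brown scored a goal in the match", profiles))
# John Brown (footballer)
print(link("John Brown opened a bakery", profiles))
# None
```

A `None` result is the case where, in a CDC setting without a prior knowledge base, a new entity would have to be created on the fly.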

  13. Abstract Knowledge Representation (with 26 frames)
     [Diagram: OWL ontology visualised with OWLGrEd: http://owlgred.lumii.lv]
     Frame classes legible in the extracted labels: Earnings_and_losses, Lending, Personal_relationship, Public_procurement, Possession, Giving, Participation, Win_prize, Residence, Trial, Education_teaching, Being_employed, People_by_origin, People_by_vocation, People_by_age, Product_line, Membership, Statement
     Superclasses: TimeFrames (Time:dateTime), TimePlaceFrames (Place:string)
     Entity classes: PersonOrOrganization (PrimaryName:string, Alias:string), Person, Organization
     Role links from frames to entities include: Employer, Employee, Lender, Borrower, Donor, Recipient, Owner, Earner, Winner, Candidates, Competitor, Organizer, Resident, Defendant, Prosecutor, Court, Student, Institution, Member, Group, Speaker, Medium, Partner_1, Partner_2
     Frame attributes are typed as string or dateTime (for example Position:string, Employment_start:dateTime, Employment_end:dateTime). Two labels are in Latvian: Laiks (Time) and Advokāts (Attorney).

  14. DBpedia 3.9 – a Non-Linguistic Knowledge Representation example

  15. Information Extraction System User Interface
     [Screenshot. Callouts:]
     • Statement manual verification status
     • Selected Person or Organization
     • The selected source document (natural language)
     • CNL verbalization of the extracted frames
     • Documents supporting the selected statement

  16. Information Extraction and CNL
     [Diagram. Labels: Source documents in NL; information extraction; abstract knowledge representation (AKR); FN-CNL generation; Text in FN-CNL (FN-CNL text); FN-CNL parsing]
     Example verbalizations (Latvian FN-CNL, its gloss in parentheses, then English FN-CNL):
     • Ieva Akuratere bija solista amatā [23] (Ieva Akuratere had a soloist position)
       Ieva Akuratere held a soloist position
     • Ieva Akuratere bija Puķu burves amatā [8] (Ieva Akuratere had a Flower fairy position)
       Ieva Akuratere held a Flower fairy position
     • Ieva Akuratere bija mūziķes un aktrises amatā [5] (… had a musician and actress position)
       Ieva Akuratere held a musician and actress position
     • Ieva Akuratere bija deputātes amatā Rīgas domē [ (… had a member position in Riga city council)
       Ieva Akuratere held a member position in Riga city council
     • Ieva Akuratere bija solista amatā Koncertuzvedumā [4] (… had a soloist position in a Concert)
       Ieva Akuratere held a soloist position in a Concert
     • Ieva Akuratere bija dziedātājas amatā [3] (… had a singer position)
       Ieva Akuratere held a singer position
     • Ieva Akuratere bija triju Zvaigžņu ordeņa virsnieka amatā Latvijā [3] (… had an Honor position in Latvia)
       Ieva Akuratere held an Honorary position in Latvia
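The English verbalizations above follow a fixed sentence pattern, which suggests simple template-based generation from an extracted frame. The following is a minimal sketch under that assumption; the dictionary keys (`Employee`, `Position`, `Employer`) are borrowed from the role names on the ontology slide, while the function name and the naive article rule are hypothetical and do not describe how the actual system generates FN-CNL.

```python
def verbalize_being_employed(frame):
    """Render a Being_employed frame instance as one controlled-language sentence.

    `frame` is a plain dict of frame-element fillers (an assumed format).
    """
    employee = frame["Employee"]
    position = frame["Position"]
    # Naive article choice based on the first letter only; it does not
    # handle cases such as a silent "h".
    article = "an" if position[0].lower() in "aeiou" else "a"
    sentence = f"{employee} held {article} {position} position"
    if frame.get("Employer"):
        sentence += f" in {frame['Employer']}"
    return sentence


print(verbalize_being_employed(
    {"Employee": "Ieva Akuratere", "Position": "soloist"}
))
# Ieva Akuratere held a soloist position

print(verbalize_being_employed(
    {"Employee": "Ieva Akuratere", "Position": "member", "Employer": "Riga city council"}
))
# Ieva Akuratere held a member position in Riga city council
```

Because the template is deterministic, the same pattern could in principle be read in the reverse direction to parse such sentences back into frame instances, which is the round trip the diagram labels FN-CNL generation and FN-CNL parsing.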

  17. Open Challenges
     • Transitioning to AMR (Abstract Meaning Representation)
     • User interface: CNL + Graphic
     • Time representation

  18. Extension (1): AMR
     Example sentence: Pascale was charged with public intoxication and resisting arrest.
     [Graph with nodes c (charge-05), p (person), n (name, «Pascale»), a (and), i (intoxicate-01), p2 (public), r (resist-01), a2 (arrest-01)]
     AMR notation:
       (c / charge-05
          :ARG1 (p / person
                   :name (n / name :op1 "Pascale"))
          :ARG2 (a / and
                   :op1 (i / intoxicate-01
                           :ARG1 p
                           :location (p2 / public))
                   :op2 (r / resist-01
                           :ARG0 p
                           :ARG1 (a2 / arrest-01
                                     :ARG1 p))))
     http://amr.isi.edu/

  19. Extension (2): Graphic UI

  20. Extension (3): Time Representation
     [Diagram. Labels:]
     • Background knowledge: OWL T-Box (terminology), Ontology (DB schema)
     • Sequential snapshots of OWL A-Box (assertions): RDF NamedGraph1, RDF NamedGraph2, RDF NamedGraph3, RDF NamedGraph4 along a timeline, with a SPARQL/PROLOG update between consecutive snapshots
     • Example text: Little Red Riding Hood lived in a wood with her mother. She baked tasty bread and brought it to her grandmother.
     • Frames evoked: Residence, Cooking_Creation, Bringing
     • FrameNet (Frames implemented as SPARQL or PROLOG update procedures)
     PAO-CNL reported at CNL-2009 workshop (Marettimo): http://www.semti-kamols.lv/doc_upl/LRRH.mov
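The idea of frames as update procedures over successive A-Box snapshots can be sketched without any RDF tooling. In the Python sketch below each snapshot is a frozen set of (subject, predicate, object) triples standing in for one named graph, and each frame is a function that maps one snapshot to the next. The predicate names (`livesIn`, `has`), the entity labels, and the three update functions are assumptions made for illustration; the slides describe SPARQL or PROLOG updates, which this sketch only imitates.

```python
def residence(abox, resident, place):
    """Residence frame: assert where the resident lives."""
    return abox | {(resident, "livesIn", place)}


def cooking_creation(abox, cook, food):
    """Cooking_Creation frame: the produced food comes into existence."""
    return abox | {(food, "rdf:type", "Food"), (cook, "has", food)}


def bringing(abox, agent, theme, goal):
    """Bringing frame: the theme moves from the agent to the goal."""
    return (abox - {(agent, "has", theme)}) | {(goal, "has", theme)}


# One update per frame evoked by the Little Red Riding Hood example.
events = [
    (residence, ("LittleRedRidingHood", "wood")),
    (cooking_creation, ("LittleRedRidingHood", "bread")),
    (bringing, ("LittleRedRidingHood", "bread", "grandmother")),
]

# snapshots[0] is the empty initial state; each update appends a new snapshot.
snapshots = [frozenset()]
for update, args in events:
    snapshots.append(frozenset(update(snapshots[-1], *args)))

for index, graph in enumerate(snapshots, start=1):
    print(f"NamedGraph{index}: {sorted(graph)}")
```

Keeping every intermediate snapshot means a question such as "who had the bread before it was brought?" can be answered from the third graph, while the fourth reflects the state after the Bringing update.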

  21. Questions? C6.0 FrameNet SRL demo http://c60.ailab.lv

  22. FrameNet Manual SRL Interface

  23. FrameNet SRL Review Interface
