event extraction

Event Extraction - PowerPoint PPT Presentation


  1. Event Extraction
     Identifying descriptions of complex events and extracting the role fillers associated with each incident.

     EVENT                ROLES
     Terrorist act        perpetrator, victims, target
     Natural disaster     natural force, victims, damage
     Plane crash          vehicle, victims, cause
     Management changes   person leaving, position, successor, organization
     Disease outbreaks    disease, victims, symptoms, containment measures

     Event Roles vs. Named Entity Recognition
     Named Entity Recognition = identifying types of entities.
     Event Roles = identifying entities that play a specific role with respect to an event.

     Paul Nelson killed John Smith.
     Paul Nelson was killed by John Smith.
     IBM purchased Microsoft.
     IBM was purchased by Microsoft.
     IBM was purchased on Tuesday by Microsoft.

     Event Template for Terrorist Acts
     Date                       <date>
     Location                   <location>
     Event type                 <set fill>
     Weapon                     <string list>
     Perpetrator individual     <string list>
     Perpetrator organization   <string list>
     Physical target            <string list>
     Physical target effect     <set fill>
     Human target               <string list>
     Human target effect        <set fill>

     Event Extraction
     INPUT: document
     December 29, Pakistan - The U.S. embassy in Islamabad was damaged this morning by a car bomb. Three diplomats were injured in the explosion. Al Qaeda has claimed responsibility for the attack.

     OUTPUT: filled event template
     EVENT: bombing
     TARGET: U.S. embassy
     LOCATION: Islamabad
     DATE: December 29
     WEAPON: car bomb
     VICTIMS: three diplomats
     PERPETRATOR: Al Qaeda
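The terrorist-act template can be sketched as a small data structure. This is a minimal illustration, not the deck's own code: the class name `TerroristEventTemplate`, the Python types, and the mapping of the output labels TARGET, VICTIMS and PERPETRATOR onto the physical target, human target and perpetrator organization slots are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TerroristEventTemplate:
    """Slot names follow the slide; the Python types are assumptions."""
    date: Optional[str] = None
    location: Optional[str] = None
    event_type: Optional[str] = None                # set fill
    weapon: List[str] = field(default_factory=list)                    # string list
    perpetrator_individual: List[str] = field(default_factory=list)    # string list
    perpetrator_organization: List[str] = field(default_factory=list)  # string list
    physical_target: List[str] = field(default_factory=list)           # string list
    physical_target_effect: Optional[str] = None    # set fill
    human_target: List[str] = field(default_factory=list)              # string list
    human_target_effect: Optional[str] = None       # set fill

    def filled_slots(self) -> Dict[str, object]:
        """Return only the slots that received a value."""
        return {name: value for name, value in vars(self).items() if value}


# Filled by hand from the embassy story; the label-to-slot mapping is assumed.
embassy_event = TerroristEventTemplate(
    date="December 29",
    location="Islamabad",
    event_type="bombing",
    weapon=["car bomb"],
    perpetrator_organization=["Al Qaeda"],
    physical_target=["U.S. embassy"],
    human_target=["three diplomats"],
)

for slot, value in embassy_event.filled_slots().items():
    print(f"{slot}: {value}")
```

Slots the story says nothing about (for example the perpetrator individual) stay empty, which mirrors how an extraction system leaves unfilled roles blank.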

  2. Event Template for Disease Outbreaks
     Story:         <document id>
     ID:            <template id>
     Date:          <date>
     Event:         OUTBREAK
     Status:        <set fill>
     Containment:   <set fill>
     Country:       <set fill>
     Victims:       <string list>
     Disease:       <string>

     Filled Event Template for Terrorist Acts
     Date                       10 January 1990
     Location                   El Salvador: San Salvador (city)
     Event type                 BOMBING
     Weapon                     "highpower bombs"
     Perpetrator individual     "guerrilla urban commandos"
     Perpetrator organization   -
     Physical target            "car dealership"
     Physical target effect     some damage
     Human target               -
     Human target effect        no injury or death

     Filled Event Template for Disease Outbreaks
     Story:         20020714.4756
     ID:            1
     Date:          August 14, 2002
     Event:         OUTBREAK
     Status:        confirmed
     Containment:   none
     Country:       Switzerland
     Victims:       the 27 reported cases
     Disease:       Creutzfeldt-Jakob Disease / [sporadic] Creutzfeldt-Jakob disease (CJD) / CJD / Sporadic CJD / hereditary dominant CJD / Swiss CJD / sporadic Creutzfeldt-Jakob disease

     Unstructured vs. Semi-structured Text
     Unstructured text depends 100% on language understanding.
     Semi-structured text has some visual structure (layout) that can aid in understanding.

     Unstructured Text
     Professor John Skvoretz, U. of South Carolina, Columbia, will present a seminar entitled "Embedded Commitment," on Thursday, May 4th from 4-5:30 in PH 223D.

     Semi-Structured Text
     Laura Petitte
     Department of Psychology
     McGill University
     Thursday, May 4, 1995
     12:00 pm
     Baker Hall 355
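The difference between set-fill and string slots in the disease outbreak template can be made concrete with a small validator. This is a sketch under stated assumptions: the slides do not list the permitted set-fill values, so the vocabularies in `OUTBREAK_SCHEMA` are hypothetical placeholders, as are the names `validate` and `Schema`.

```python
from typing import Dict, List, Optional, Set, Tuple

# slot name -> (slot kind, allowed values for set fills)
Schema = Dict[str, Tuple[str, Optional[Set[str]]]]

# Hypothetical vocabularies: the slides only show "confirmed" and "none".
OUTBREAK_SCHEMA: Schema = {
    "status": ("set_fill", {"confirmed", "possible", "suspected"}),
    "containment": ("set_fill", {"none", "quarantine", "vaccination"}),
    "victims": ("string_list", None),
    "disease": ("string", None),
}


def validate(template: Dict[str, object], schema: Schema) -> List[str]:
    """Return a list of problems; an empty list means the template conforms."""
    problems: List[str] = []
    for slot, value in template.items():
        if slot not in schema:
            problems.append(f"unknown slot: {slot}")
            continue
        kind, allowed = schema[slot]
        if kind == "set_fill":
            # A set fill must be one of a fixed list of values.
            if not isinstance(value, str) or value not in (allowed or set()):
                problems.append(f"{slot}: {value!r} is not an allowed value")
        elif kind == "string_list":
            # A string list holds any number of free-text phrases from the document.
            if not (isinstance(value, list) and all(isinstance(v, str) for v in value)):
                problems.append(f"{slot}: expected a list of strings")
        elif kind == "string":
            if not isinstance(value, str):
                problems.append(f"{slot}: expected a string")
    return problems


swiss_cjd = {
    "status": "confirmed",
    "containment": "none",
    "victims": ["the 27 reported cases"],
    "disease": "Creutzfeldt-Jakob Disease",
}
bad_template = dict(swiss_cjd, status="probably")

print(validate(swiss_cjd, OUTBREAK_SCHEMA))
print(validate(bad_template, OUTBREAK_SCHEMA))
```

String slots accept any phrase copied from the text, while set-fill slots force the system to normalize what it read into a closed vocabulary.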

  3. IE in the Wild: ProMED Disease Outbreak Reports
     EBOLA HEMORRHAGIC FEVER - UGANDA (09)
     *************************************
     A ProMED-mail post
     ProMED-mail, a program of ISID <http://www.promedmail.org>
     [see also: Ebola hemorrhagic fever - Uganda 20001016.1769 Ebola hemorrhagic fever - Uganda (08) 20001022.1826]

     [1]
     Date: Sun, 22 Oct 2000 22:18:31 -0200
     From: ProMED-mail <promed@promedmail.org>
     Source: WHO Disease Outbreaks Report, Sun 21 Oct 2000 [edited]
     <http://www.who.int/disease-outbreak-news/>
     [HEADLINE : 1 line]
     ---------------------------------------------
     [TEXT : 11 lines]
     ******
     [2]
     Date: Sun, 22 Oct 2000 22:18:31 -0200
     From: ProMED-mail <promed@promedmail.org>
     Source: WHO Disease Outbreaks Report, Sun 21 Oct 2000 [edited]
     <http://www.who.int/disease-outbreak-news/>
     [HEADLINE : 1 line]
     ---------------------------------------------
     [TEXT : 3 lines]
     --
     [PROMED DISCLAIMER : 22 lines]

     Headline and Text Portions:

     Ebola Haemorrhagic Fever In Uganda - Update 5
     As of Sat 21 Oct 2000, the Ugandan Ministry of Health has reported 139 cases including 51 deaths. The increase of 17 cases in the last 24 hours reflects the intensified active surveillance.
     A team from the WHO Collaborating Centre at the US Centers for Disease Control and Prevention (CDC), United States is establishing a field diagnostic laboratory in Gulu district. The last laboratory equipment arrived Sat 20 Oct 2000 and the laboratory is expected to be operational shortly. A WHO information officer from Geneva arrived in Uganda on Wed 18 Oct 2000 and is based in Gulu district. He is working with the Ugandan Ministry of Health as media focal point.

     Ebola Haemorrhagic Fever In Uganda - Update 6
     As of Sun 22 Oct 2000, the Ugandan Ministry of Health has reported 149 cases, including 54 deaths. [This represents an increase of 10 cases and 3 deaths in the last 24 hours. - Mod.CP]

     Another Semi-Structured Seminar Announcement
     Name: Dr. Jeffrey D. Hermes
     Affiliation: Department of AutoImmune Diseases Research & Biophysical Chemistry Merch Research Laboratories
     Title: "MHC Class II: A Target for Specific Immunomodulation of the Immune Response"
     Host/e-mail: Robert Murphy
     Date: Wednesday, May 3, 1995
     Time: 3:30 p.m.
     Place: Mellon Institute Conference Room
     Sponsor: MERCK RESEARCH LABORATORIES
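To give a feel for what extraction from such a report involves, here is a minimal sketch that pulls the case and death counts out of the two Ebola updates with a regular expression. The pattern and the function name `extract_counts` are assumptions written for these two sentences only; real ProMED text varies far more than this pattern allows.

```python
import re
from typing import Dict, Optional

# Matches "reported 139 cases including 51 deaths" with or without the comma.
COUNT_PATTERN = re.compile(r"reported (\d+)\s+cases,?\s+including (\d+) deaths")


def extract_counts(text: str) -> Optional[Dict[str, int]]:
    """Return the reported case and death counts, or None if the pattern fails."""
    match = COUNT_PATTERN.search(text)
    if match is None:
        return None
    return {"cases": int(match.group(1)), "deaths": int(match.group(2))}


update_5 = ("As of Sat 21 Oct 2000, the Ugandan Ministry of Health has "
            "reported 139 cases including 51 deaths.")
update_6 = ("As of Sun 22 Oct 2000, the Ugandan Ministry of Health has "
            "reported 149 cases, including 54 deaths.")

counts_5 = extract_counts(update_5)
counts_6 = extract_counts(update_6)
print(counts_5, counts_6)
```

The differences between the two results (10 cases, 3 deaths) agree with the moderator's note in Update 6, which is a handy sanity check on the extracted values.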

  4. Text Annotations for Event Extraction
     Annotated roles, in order of appearance: perpetrator, weapon, target, location, date, damage, injury.
     Alleged guerrilla urban commandos launched highpower bombs against a car dealership in downtown San Salvador this morning. A police report said that the attack set the building on fire, but did not result in any casualties.

     Patterns/Rules vs. Sequence Tagging
     Two general approaches to event extraction:
     Pattern-based systems use patterns or rules which identify phrases that should be extracted for each event role.
     Machine learning classifiers label individual tokens indicating whether they should be extracted, and if so, what role they play.

     Template-Filling Pipeline
     Syntactic Analysis:   IBM fired its CEO. (Subj VP Dobj)
     Phrase Extraction:    Event: FIRING; Firer: IBM; Employee: its CEO
     Coreference:          IBM fired its CEO. John Smith was let go on Monday.
     Template Creation:    Event: FIRING; Firer: IBM; Employee: John Smith, CEO; Date: Monday

     Example of Patterns
     [Alleged guerrilla urban commandos] launched highpower bombs against a car dealership in downtown San Salvador this morning.
     Subject = perpetrator
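The pattern-based approach can be sketched as a lookup from syntactic positions to event roles. This is an illustration under assumptions: the clause is parsed by hand rather than by a real syntactic analyzer, the patterns are keyed on the verb "launched" (the slides state only the slot-to-role mappings Subject = perpetrator, DirectObj = instrument, PP(against) = target), and the names `Clause`, `PATTERNS` and `apply_patterns` are hypothetical.

```python
from typing import Dict, NamedTuple


class Clause(NamedTuple):
    """A hand-built stand-in for the output of syntactic analysis."""
    subject: str
    verb: str
    direct_object: str
    prep_phrases: Dict[str, str]  # preposition -> noun phrase


# (verb, syntactic slot) -> event role
PATTERNS = {
    ("launched", "subject"): "perpetrator",
    ("launched", "direct_object"): "instrument",
    ("launched", "pp:against"): "target",
}


def apply_patterns(clause: Clause) -> Dict[str, str]:
    """Extract every phrase whose syntactic slot matches a pattern."""
    slots = {"subject": clause.subject, "direct_object": clause.direct_object}
    for preposition, noun_phrase in clause.prep_phrases.items():
        slots[f"pp:{preposition}"] = noun_phrase

    extracted: Dict[str, str] = {}
    for slot, phrase in slots.items():
        role = PATTERNS.get((clause.verb, slot))
        if role is not None:
            extracted[role] = phrase
    return extracted


bombing_clause = Clause(
    subject="Alleged guerrilla urban commandos",
    verb="launched",
    direct_object="highpower bombs",
    prep_phrases={"against": "a car dealership", "in": "downtown San Salvador"},
)
roles = apply_patterns(bombing_clause)
print(roles)
```

The phrase under "in" is ignored because no pattern covers it; a pattern-based system extracts only what its rules name, which is why rule coverage is the main engineering cost of this approach.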

  5. Example of Patterns
     Alleged guerrilla urban commandos launched [highpower bombs] against a car dealership in downtown San Salvador this morning.
     DirectObj = instrument

     Example of Patterns
     Alleged guerrilla urban commandos launched highpower bombs against [a car dealership] in downtown San Salvador this morning.
     PP(against) = target

     IE as Sequence Tagging
     • Event extraction can be modeled as a sequential tagging problem. A supervised sequential learner (e.g., MEMMs or CRFs) can be trained with manually annotated texts.
     • Each document is processed sequentially and each token is labeled with respect to event roles. B (beginning) and I (inside) tags are needed for each role. For example: B-PERP, I-PERP, B-VICTIM, I-VICTIM, B-WEAPON, I-WEAPON
     • Common features: words, POS tags, dependency relations, semantic types, and a small context window of preceding/following words.

     Sequence Tagging Example
     Alleged/B-PERP guerrilla/I-PERP urban/I-PERP commandos/I-PERP launched/O two/B-WEAPON highpower/I-WEAPON bombs/I-WEAPON against/O a/B-TGT car/I-TGT dealership/I-TGT in/O downtown/B-LOC San/I-LOC Salvador/I-LOC this/B-DATE morning/I-DATE .
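Once a tagger has labeled each token, the B and I tags have to be turned back into role fillers. The following is a minimal sketch of that decoding step, using the example sentence and its tags. The hyphenated tag spelling (`B-PERP`), the `O` tag on the final period, and the function name `decode_bio` are assumptions; the tags are written in by hand here, whereas in practice they would come from a trained MEMM or CRF.

```python
from typing import List, Tuple


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group B-/I- tagged tokens into (role, phrase) spans.

    An I- tag that does not continue a span of the same role is treated
    as outside (O) in this sketch.
    """
    spans: List[Tuple[str, str]] = []
    current_role = None
    current_tokens: List[str] = []

    def close_span() -> None:
        if current_role is not None:
            spans.append((current_role, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            close_span()
            current_role, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_role == tag[2:]:
            current_tokens.append(token)
        else:
            close_span()
            current_role, current_tokens = None, []
    close_span()
    return spans


tokens = ("Alleged guerrilla urban commandos launched two highpower bombs "
          "against a car dealership in downtown San Salvador this morning .").split()
tags = ["B-PERP", "I-PERP", "I-PERP", "I-PERP", "O",
        "B-WEAPON", "I-WEAPON", "I-WEAPON", "O",
        "B-TGT", "I-TGT", "I-TGT", "O",
        "B-LOC", "I-LOC", "I-LOC",
        "B-DATE", "I-DATE", "O"]

role_fillers = decode_bio(tokens, tags)
for role, phrase in role_fillers:
    print(f"{role}: {phrase}")
```

Separate B and I tags matter when two fillers of the same role are adjacent: without the B tag marking a new beginning, they would merge into a single span.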
