Cross-lingual Predicate Cluster Acquisition to Improve Bilingual Event Extraction by Inductive Learning

Heng Ji
Computer Science Department
Queens College and The Graduate Center
The City University of New York
hengji@cs.qc.cuny.edu

Proceedings of the NAACL HLT Workshop on Unsupervised and Minimally Supervised Learning of Lexical Semantics, pages 27–35, Boulder, Colorado, June 2009. © 2009 Association for Computational Linguistics

Abstract

In this paper we present two approaches to automatically extract cross-lingual predicate clusters, based on bilingual parallel corpora and cross-lingual information extraction. We demonstrate how these clusters can be used to improve the NIST Automatic Content Extraction (ACE) event extraction task.¹ We propose a new inductive learning framework to automatically augment background data for low-confidence events and then conduct global inference. Without using any additional data or accessing the baseline algorithms, this approach obtained significant improvement over a state-of-the-art bilingual (English and Chinese) event extraction system.

¹ http://www.nist.gov/speech/tests/ace/

1 Introduction

Event extraction, the ‘classical’ information extraction (IE) task, has progressed from Message Understanding Conference (MUC)-style single template extraction to the more comprehensive multi-lingual Automatic Content Extraction (ACE) extraction, which includes more fine-grained types. This extension has made event extraction more widely applicable in many NLP tasks, including cross-lingual document retrieval (Hakkani-Tur et al., 2007) and question answering (Schiffman et al., 2007). Various supervised learning approaches have been explored for ACE multi-lingual event extraction (e.g. Grishman et al., 2005; Ahn, 2006; Hardy et al., 2006; Tan et al., 2008; Chen and Ji, 2009). All of these previous studies showed that one main bottleneck of event extraction lies in low recall. It is a challenging task to recognize the different forms in which an event may be expressed, given the limited amount of training data. The goal of this paper is to improve the performance of a bilingual (English and Chinese) state-of-the-art event extraction system without accessing its internal algorithms or annotating additional data.

As a separate research theme, extensive techniques have been used to produce word clusters or paraphrases from large unlabeled corpora (Brown et al., 1990; Pereira et al., 1993; Lee and Pereira, 1999; Barzilay and McKeown, 2001; Lin and Pantel, 2001; Ibrahim et al., 2003; Pang et al., 2003). For example, Bannard and Callison-Burch (2005) and Callison-Burch (2008) described a method to extract paraphrases from widely available bilingual corpora. The resulting clusters contain words with similar semantic information and can therefore be useful to augment a small amount of annotated data. We will automatically extract cross-lingual predicate clusters using two different approaches, based on bilingual parallel corpora and cross-lingual IE respectively, and then use the derived clusters to improve event extraction.

We propose a new learning method called inductive learning to exploit the derived predicate clusters. For each test document, a background document is constructed by gradually replacing the low-confidence events with the predicates in the same cluster. We then apply the cross-document inference technique described in Ji and Grishman (2008) to improve the performance of event extraction. This inductive learning approach matches the procedure of human knowledge acquisition and foreign language education: analyze information from specific examples and then discover a pattern or draw a conclusion; attempt synonyms to convey or learn the meaning of an intricate word.

The rest of this paper is structured as follows. Section 2 describes the terminology used in this paper. Section 3 presents the overall system architecture and the baseline system. Section 4 then describes in detail the approaches to extracting cross-lingual predicate clusters. Section 5 describes the motivations for using cross-lingual clusters to improve event extraction. Section 6 presents an overview of the inductive learning algorithm. Section 7 presents the experimental results. Section 8 compares our approach with related work, and Section 9 concludes the paper and sketches our future work.

2 Terminology

The event extraction task we are addressing is that of the ACE evaluations. ACE defines the following terminology:

entity: an object or a set of objects in one of the semantic categories of interest
mention: a reference to an entity (typically, a noun phrase)
event trigger: the main word which most clearly expresses an event occurrence
event arguments: the mentions that are involved in an event (participants)
event mention: a phrase or sentence within which an event is described, including trigger and arguments

The 2005 ACE evaluation had 8 types of events, with 33 subtypes; for the purpose of this paper, we will treat these simply as 33 distinct event types. For example, for the sentence “Barry Diller on Wednesday quit as chief of Vivendi Universal Entertainment”, the event extractor should detect all of the following information: a “Personnel_End-Position” event mention, with “quit” as the trigger word, “chief” as an argument with the role of “position”, “Barry Diller” as the person who quit the position, “Vivendi Universal Entertainment” as the organization, and “Wednesday” as the time during which the event happened.

3 Approach Overview

3.1 System Pipeline

Figure 1 depicts the general procedure of our approach. The set of test event mentions is improved by exploiting cross-lingual predicate clusters.

[Figure 1. System Overview. Box labels: Cross-lingual Predicate Cluster Acquisition (Unlabeled Corpora, Parallel Corpora, Alignment Based Clustering, Cross-lingual IE, Predicate Clusters); Test Document; Baseline Event Extraction; Inductive Learning (Low-confidence Test Events, Event Replacement, Background Document, Baseline Event Extraction, Background Events, Cross-document Inference); Improved Test Events.]

Section 3.2 gives more details about the baseline bilingual event tagger. We then present the predicate cluster acquisition algorithm in Section 4 and the method of exploiting clusters for event extraction in Section 6.

3.2 A Baseline Bilingual Event Extraction System

We use a state-of-the-art bilingual event extraction system (Grishman et al., 2005; Chen and Ji, 2009) as our baseline. The system combines pattern matching with a set of Maximum Entropy classifiers: to distinguish events from non-events;
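As an illustration only, the terminology of Section 2 and the trigger-replacement idea outlined in Section 1 can be sketched in Python. The class and function names, the confidence value, the 0.5 threshold, and the example predicate cluster are all assumptions made for this sketch; they are not taken from the ACE specification or from the system described in this paper. The replacement is a naive token substitution that ignores inflection and multi-word triggers.

```python
from dataclasses import dataclass


@dataclass
class EventMention:
    """One ACE-style event mention: type, trigger word, and role -> argument."""
    event_type: str
    trigger: str
    arguments: dict
    confidence: float  # hypothetical score assigned by a baseline tagger


SENTENCE = ("Barry Diller on Wednesday quit as chief of "
            "Vivendi Universal Entertainment")

# The example from Section 2; the confidence value is invented.
EXAMPLE = EventMention(
    event_type="Personnel_End-Position",
    trigger="quit",
    arguments={
        "person": "Barry Diller",
        "position": "chief",
        "organization": "Vivendi Universal Entertainment",
        "time": "Wednesday",
    },
    confidence=0.3,
)

# Hypothetical predicate cluster, invented for this sketch.
CLUSTERS = [{"quit", "resigned", "retired"}]


def build_background_sentences(sentence, event, clusters, threshold=0.5):
    """Return variants of `sentence` in which a low-confidence trigger is
    replaced by each other predicate from the same cluster.

    Events at or above `threshold` are left alone (empty list).
    """
    if event.confidence >= threshold:
        return []
    alternatives = sorted(
        predicate
        for cluster in clusters
        if event.trigger in cluster
        for predicate in cluster
        if predicate != event.trigger
    )
    tokens = sentence.split()
    return [
        " ".join(alt if token == event.trigger else token for token in tokens)
        for alt in alternatives
    ]


BACKGROUND = build_background_sentences(SENTENCE, EXAMPLE, CLUSTERS)
```

Under these assumptions, `BACKGROUND` holds one rewritten sentence per alternative predicate; such sentences would play the role of the background document on which the baseline tagger is run again.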