temporal and event analysis of natural language texts

Temporal and Event Analysis of Natural Language Texts, Siim Orasmaa


  1. Temporal and Event Analysis of Natural Language Texts Siim Orasmaa

  2. Data
     ● Estonian Reference Corpus of the University of Tartu
       – Variety of text genres (news, popular science, legal texts, parliamentary transcripts)
       – Automatically processed:
         ● Sentence and clause boundaries detected
         ● Morphological analysis provided
         ● Robust temporal expression annotation
           – Based on the TimeML annotation language
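As a rough illustration of what working with TimeML-style temporal annotations might look like, the sketch below pulls `TIMEX3` elements out of an annotated sentence. The `tid`, `type` and `value` attributes are standard TimeML; the inline-XML layout, the sample sentence and the function name `extract_timexes` are assumptions for illustration, since the slides do not show the corpus's actual storage format.

```python
import xml.etree.ElementTree as ET

def extract_timexes(annotated_sentence):
    """Return one dict per TIMEX3 element found in an inline-annotated sentence.

    Assumption: annotations are inline XML tags; the real corpus format may differ.
    """
    # Wrap the fragment in a dummy root so it parses as a well-formed document.
    root = ET.fromstring(f"<s>{annotated_sentence}</s>")
    return [
        {
            "tid": el.get("tid"),
            "type": el.get("type"),
            "value": el.get("value"),
            "text": "".join(el.itertext()),
        }
        for el in root.iter("TIMEX3")
    ]

# Hypothetical example sentence (English used for readability).
sentence = (
    'The meeting took place on '
    '<TIMEX3 tid="t1" type="DATE" value="1999-03-05">5 March 1999</TIMEX3>.'
)
timexes = extract_timexes(sentence)
```

The normalized `value` attribute is what the later tasks would build on, since it maps a surface expression to a calendar date or period.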

  3. An example of annotations http://www.keeleveeb.ee

  4. I. Comparing documents by temporal similarity
     ● Given a newspaper article, find temporally similar newspaper articles, i.e. articles that refer to overlapping or similar time periods.
     ● Task:
       – Preprocess and index the document collection
       – Implement a temporal similarity measure, e.g. "Temporal Analysis of Document Collections: Framework and Applications", Alonso et al., 2010.
       – Add a text similarity measure, e.g. "Exploiting Temporal References in Text Retrieval", Arikan, 2009.
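One possible starting point for the task above, as a minimal sketch: treat each document as a set of date intervals (taken from its normalized temporal expressions) and score two documents by the Jaccard overlap of the days they cover, then blend that with a bag-of-words cosine score. This is not the measure from Alonso et al. or Arikan; the function names, the day-level granularity and the weight `alpha` are all assumptions.

```python
import math
from collections import Counter
from datetime import date

def days_covered(intervals):
    """Expand inclusive (start, end) date intervals into a set of day ordinals."""
    covered = set()
    for start, end in intervals:
        covered.update(range(start.toordinal(), end.toordinal() + 1))
    return covered

def temporal_similarity(intervals_a, intervals_b):
    """Jaccard overlap of the days referred to by two documents."""
    a, b = days_covered(intervals_a), days_covered(intervals_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def text_similarity(tokens_a, tokens_b):
    """Cosine similarity over raw term counts."""
    ca, cb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0

def combined_similarity(doc_a, doc_b, alpha=0.5):
    """Weighted blend; alpha is a hypothetical tuning parameter."""
    return (alpha * temporal_similarity(doc_a["intervals"], doc_b["intervals"])
            + (1 - alpha) * text_similarity(doc_a["tokens"], doc_b["tokens"]))

# Hypothetical documents: 1-10 March 1999 versus 6-15 March 1999.
doc_a = {"intervals": [(date(1999, 3, 1), date(1999, 3, 10))],
         "tokens": ["election", "results", "tartu"]}
doc_b = {"intervals": [(date(1999, 3, 6), date(1999, 3, 15))],
         "tokens": ["election", "campaign", "tallinn"]}
t_sim = temporal_similarity(doc_a["intervals"], doc_b["intervals"])
c_sim = combined_similarity(doc_a, doc_b)
```

Expanding intervals to individual days is simple but only practical for short periods; a coarser granularity (weeks, months) or interval arithmetic would be needed for expressions spanning years.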

  5. I. Comparing documents by temporal similarity
     ● Evaluation:
       – Use a roughly temporally parallel corpus (newspaper articles from Eesti Päevaleht 1999 and Postimees 1999)
       – Prepare some test data
     ● How well can you detect documents discussing the same events?
     ● How much do the results depend on the newspaper article's category (News, Opinions, Sports, Economy, etc.)?
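The evaluation questions above could be approached with a standard retrieval metric. The sketch below computes precision at k for each query article and averages it per category; the choice of metric, the function names and the toy data are assumptions, as the slides do not prescribe an evaluation procedure.

```python
from collections import defaultdict

def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k retrieved documents that are marked relevant."""
    top = ranked_ids[:k]
    if not top:
        return 0.0
    return sum(1 for doc_id in top if doc_id in relevant_ids) / len(top)

def mean_precision_by_category(queries, k):
    """queries: iterable of (category, ranked_ids, relevant_ids) triples."""
    scores = defaultdict(list)
    for category, ranked_ids, relevant_ids in queries:
        scores[category].append(precision_at_k(ranked_ids, relevant_ids, k))
    return {cat: sum(vals) / len(vals) for cat, vals in scores.items()}

# Hypothetical hand-labelled test data: for each query article, the system's
# ranking and the set of articles judged to discuss the same events.
queries = [
    ("Sports", ["a", "b"], {"a"}),
    ("Sports", ["c", "d"], {"c", "d"}),
    ("News", ["e", "f"], set()),
]
per_category = mean_precision_by_category(queries, k=2)
```

Reporting the score per category makes it visible whether, say, sports reports (dense with dates) behave differently from opinion pieces.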

  6. II. Clustering temporal expression contexts
     ● A more fine-grained approach: an event mention (e.g. a verb, noun or some phrase) should be located somewhere near the temporal expression.
     ● Task:
       – Use an unsupervised algorithm to cluster temporal expression contexts, similar to Word Sense Induction, e.g. "Unsupervised corpus-based methods for WSD", Pedersen, 2006.
       – Can you detect some broad event classes?
       – Test the algorithm on different text genres.
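To make the clustering task concrete, here is a minimal sketch under simple assumptions: the context of a temporal expression is a bag of the few tokens on either side of it, and contexts are grouped with a k-means-style loop that uses cosine similarity. This is not Pedersen's method; the window size, the seeding by the first k contexts and all function names are illustrative choices.

```python
import math
from collections import Counter

def context_window(tokens, timex_index, size=3):
    """Bag of up to `size` tokens on each side of the temporal expression."""
    left = tokens[max(0, timex_index - size):timex_index]
    right = tokens[timex_index + 1:timex_index + 1 + size]
    return Counter(left + right)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def cluster_contexts(contexts, k, iterations=10):
    """k-means-style clustering with cosine similarity.

    Seeds are the first k contexts (an arbitrary, deterministic choice).
    Centroids are summed counts, which is enough because cosine ignores scale.
    """
    centroids = [Counter(c) for c in contexts[:k]]
    labels = [0] * len(contexts)
    for _ in range(iterations):
        # Assign each context to its most similar centroid.
        labels = [
            max(range(len(centroids)), key=lambda j: cosine(c, centroids[j]))
            for c in contexts
        ]
        # Recompute each centroid from its current members.
        for j in range(len(centroids)):
            total = Counter()
            for c, lab in zip(contexts, labels):
                if lab == j:
                    total.update(c)
            if total:
                centroids[j] = total
    return labels

# Hypothetical contexts (lemmas near four temporal expressions).
contexts = [
    Counter({"won": 1, "match": 1}),
    Counter({"signed": 1, "law": 1}),
    Counter({"won": 1, "game": 1}),
    Counter({"passed": 1, "law": 1}),
]
labels = cluster_contexts(contexts, k=2)
window = context_window(["the", "team", "won", "yesterday", "in", "Tartu"], 3, size=2)
```

In practice the context features would more likely be lemmas or morphological tags from the corpus annotations than raw tokens, and the number of clusters would need to be tuned per genre.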

  7. II. Clustering temporal expression contexts
     ● Discussion:
       – Can you propose a meaningful labeling for the clusters found?
       – Can you draw parallels between the clusters found and proposed event classifications (e.g. the one in TimeML)?
       – Does the clustering help to organize temporal expressions for information retrieval?
