

Knowledge-Based Linguistic Annotation of Digital Cultural Heritage Collection. Tuukka Ruotsalo, Lora Aroyo and Guus Schreiber. Speaker: Chenhua. Date: 24th Feb 2010.


  1. Knowledge-Based Linguistic Annotation of Digital Cultural Heritage Collection. Tuukka Ruotsalo, Lora Aroyo and Guus Schreiber. Speaker: Chenhua. Date: 24th Feb 2010

  2. Outline • Introduction • Motivation • Methodology • Experimental Results • Conclusion 2/24/2010 Text Mining Seminar 2

  3. Introduction • Paris was painted in 1888. • In Paris, Van Gogh painted the work in 1888.

  4. Motivation Better run …

  5. Research Question: Is there a smart way to annotate such a massive collection?

  6. Methodology
     • Background knowledge
       – Structured vocabulary
       – Enhance performance of retrieval
     • Automatic annotation
       – Concept identification, e.g. Paris as a city
       – Role identification, e.g. Paris as a subject matter

  7. System Architecture
     • Phase 1: Linguistic analysis
       – Named entity tagging
       – Part of speech tagging
       – Morphological analysis
       – Dependency structure analysis
     • Phase 2: Concept identification (uses the ontology knowledge base)
     • Phase 3: Role identification (uses the feature knowledge base)
     • Output: annotation
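The three-phase flow on this slide can be sketched as a small pipeline. This is only an illustrative sketch, not the authors' implementation: the `Token` and `Annotation` classes, the `TOY_ONTOLOGY` and `TOY_ROLE_RULES` tables, the dependency labels and the role names are all hypothetical. Phase 1 output is supplied by hand, and Phase 3 uses a lookup rule where the slides describe an SVM.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """A word with a Phase 1 feature; here it is supplied by hand."""
    text: str
    dep: str = ""  # dependency relation to the main verb (hypothetical labels)


@dataclass
class Annotation:
    text: str
    concept: str
    role: str


# Hypothetical stand-ins for the ontology and feature knowledge bases.
TOY_ONTOLOGY = {"Paris": "city", "Van Gogh": "person"}
TOY_ROLE_RULES = {
    "passive_subject": "subject matter",
    "subject": "creator",
    "prep_object": "creation place",
}


def identify_concepts(tokens):
    """Phase 2: pair each token found in the toy ontology with its concept."""
    return [(t, TOY_ONTOLOGY[t.text]) for t in tokens if t.text in TOY_ONTOLOGY]


def identify_roles(pairs):
    """Phase 3: pick a role from the dependency relation (toy rule, not an SVM)."""
    return [
        Annotation(t.text, concept, TOY_ROLE_RULES.get(t.dep, "unknown"))
        for t, concept in pairs
    ]


def annotate(tokens):
    """Run Phase 2 and Phase 3 over tokens that already carry Phase 1 features."""
    return identify_roles(identify_concepts(tokens))


# "Paris was painted in 1888."
passive = [Token("Paris", "passive_subject"), Token("painted")]
# "In Paris, Van Gogh painted the work in 1888."
active = [
    Token("Paris", "prep_object"),
    Token("Van Gogh", "subject"),
    Token("painted"),
]
for annotation in annotate(passive) + annotate(active):
    print(annotation)
```

In this toy run "Paris" keeps the same concept in both sentences but receives a different role, which is the distinction the two introductory example sentences illustrate.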

  8. Knowledge Base • Art and Architecture Thesaurus (AAT) • Getty Thesaurus of Geographic Names (TGN) • Union List of Artist Names (ULAN) • WordNet • etc.

  9. Linguistic Analysis
     • Named entity tagging: persons, organizations, locations, miscellaneous NEs
     • Part of speech tagging: verbs, adjectives and nouns
     • Morphological analysis: number (singular or plural)
     • Dependency structure analysis: internal dependency structure; subject, direct object
     • Output: syntactic features

  10. Concept Identification
      • Define meaningful units (chunking) and map them to concepts in structured vocabularies
      • Performed differently for nouns, verbs and NEs
        – Mapping chunks, NEs and bi-words to the KB
        – Example for matching NEs: NEs tagged as persons → ULAN; others → WordNet
      • Phase 2 (concept identification) takes the syntactic features as input
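The NE matching example on this slide (persons to ULAN, others to WordNet) amounts to routing a lookup by named-entity type. The following is a minimal sketch under stated assumptions: `TOY_ULAN`, `TOY_WORDNET`, their entries and the identifier strings are invented for illustration and do not reflect the real vocabularies or their APIs.

```python
# Toy excerpts standing in for the real vocabularies; all entries and
# identifiers below are invented for illustration.
TOY_ULAN = {"rembrandt": "ulan:person/rembrandt"}
TOY_WORDNET = {"paris": "wordnet:paris", "portrait": "wordnet:portrait"}


def lookup_concept(chunk, ne_type=None):
    """Route a chunk to a vocabulary based on its named-entity type."""
    key = chunk.strip().lower()
    # Person NEs are matched against ULAN; everything else falls back to WordNet.
    vocabulary = TOY_ULAN if ne_type == "person" else TOY_WORDNET
    # Returns None when the chosen vocabulary has no entry for the chunk.
    return vocabulary.get(key)


print(lookup_concept("Rembrandt", ne_type="person"))
print(lookup_concept("Paris", ne_type="location"))
print(lookup_concept("Rembrandt"))
```

The last call returns `None`: without the person tag the chunk is looked up in the toy WordNet excerpt, which has no matching entry, so the routing decision depends on the NE tag produced in Phase 1.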

  11. Role Identification
      • Difference between concept and role identification
        – “Rembrandt” is an instance of the concept “person”, independent of context
        – “Rembrandt” can take various roles, e.g. creator or subject of artworks, depending on context
      • How to do the role identification task?
        – SVM
        – Based on syntactic and semantic features, e.g. PoS tag, voice of the sentence verb, PoS path from the parsed constituent to the verb or predicate
      • Phase 2 (concept identification) feeds Phase 3 (role identification), which also draws on the syntactic features and the feature knowledge base
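As a rough illustration of the kind of input such a classifier might consume, the sketch below turns the three example features named on this slide (PoS tag, verb voice, PoS path) into one-hot vectors. The function names, feature encoding and tag values are assumptions made for this example, and the SVM training step itself is omitted.

```python
def role_features(pos_tag, verb_voice, pos_path):
    """Build a sparse feature dict for one candidate constituent."""
    return {
        f"pos={pos_tag}": 1,
        f"voice={verb_voice}": 1,
        f"path={'>'.join(pos_path)}": 1,
    }


def vectorize(feature_dicts):
    """One-hot encode feature dicts against a sorted shared vocabulary."""
    vocab = sorted({name for d in feature_dicts for name in d})
    vectors = [[d.get(name, 0) for name in vocab] for d in feature_dicts]
    return vocab, vectors


# "Paris" in "Paris was painted in 1888." (passive verb)
paris = role_features("NNP", "passive", ["NNP", "VBN"])
# "Van Gogh" in "In Paris, Van Gogh painted the work in 1888." (active verb)
van_gogh = role_features("NNP", "active", ["NNP", "VBD"])

vocab, vectors = vectorize([paris, van_gogh])
print(vocab)
print(vectors)
```

Both constituents share the same PoS tag, so in this toy encoding only the voice and path features separate them, which is why context-dependent features matter for assigning roles.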

  12. Evaluation
      • Using a collection of natural language descriptions of artworks
        – ARIA collection from the Rijksmuseum Amsterdam
        – 250 artworks randomly selected
        – Typical descriptions cover “what, who, where, when” and which people or culture are related to the artworks
      • Using 3 structured vocabularies plus WordNet as the knowledge base
        – AAT, TGN, ULAN and WordNet
      • Using an artwork annotation schema
        – Visual Resources Association (VRA) schema, specialized for artworks

  13. Evaluation (Cont.)

  14. Experimental Results
      • Accuracy
        – Proposed method: 61.2%
        – Baseline method: 57.8%
        – Human annotator: 65.1%
      • Discussion
        – Performance is 3.4 percentage points above the baseline method
        – Performance is within 3.9 percentage points of the human annotator
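The slides do not say exactly how accuracy was scored, so the following is only a generic sketch of the metric: the share of items whose predicted label matches a gold-standard label. The `accuracy` helper and the toy label lists are invented for illustration and are not data from the evaluation.

```python
def accuracy(predicted, gold):
    """Percentage of items whose predicted label equals the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return 100.0 * correct / len(gold)


# Invented toy labels: three of the four predictions match the gold roles.
gold = ["creator", "subject matter", "creation place", "creator"]
predicted = ["creator", "subject matter", "creator", "creator"]
print(f"{accuracy(predicted, gold):.1f}%")  # prints 75.0%
```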

  15. Further Discussions & Future Work
      • Co-reference resolution
      • Improved performance w.r.t. NEs
      • Advanced classification strategies
      • More extensive context
      • Knowledge base and natural language processing techniques

  16. Summary
      • Given a set of objects each accompanied by a text description, a set of structured vocabularies, a metadata schema, and a training set of annotations of the text descriptions, the method automatically produces annotations for the objects, and its performance is close to the level of a human annotator.
      • Knowledge base + natural language techniques → better performance on annotation

  17. Thanks!

  18. Appendix

  19. Metadata

  20. Feature knowledge base
