
Towards Cross-Media Information Extraction

  1. Towards Cross-Media Information Extraction
     Thierry Declerck, DFKI Language Technology Lab
     Presenting a collection of slides with contributions by Paul Buitelaar, Michael Sintek, Malte Kiesel (all DFKI), Manuel Alcantara (UAM), Jan Nemrava (VSE), David Sadlier (DCU) and others

  2. K-Space (2006-2008)
     • K-Space overview: a Network of Excellence in the 6th Framework Programme (see http://www.k-space.eu/)
     • K-Space: Knowledge Space of Shared Technology and Integrative Research to Bridge the Semantic Gap
       • K-Space was a network of research teams from academia and industry conducting integrative research and dissemination activities in semantic inference for (semi-)automatic annotation and retrieval of multimedia content.
       • K-Space aimed at closing the gap between low-level content descriptions that can be computed automatically by machines and the richness and subjectivity of semantics in high-level human interpretations of audiovisual media.
     • Our work in K-Space
       • Adding Semantic Metadata to Audio-Video Material by Automatic Analysis of Complementary Sources
       • Cross-Media Ontologies
       • Cross-Media Knowledge Extraction
     Slide 2 FEAST Talk; 28.01.2009

  3. Adding Semantic Metadata to Audio-Video Material by Automatic Analysis of Complementary Sources
     Data we deal with in K-Space and other projects

  4. Concepts from visual analysis => Sky, Sea, Sand, Person
     Contribution of possibly associated text?
     • Named Entities
     • Event (walking, swimming)
     • Location (Greece?)
     • Date/Time
     • Background knowledge
     • Etc.
     Picture: Courtesy of Yiannis Kompatsiaris and Stamatia Dasiopoulou

  5. Still images and Text in Webpages (Esperonto)

  6. Relevant Text Regions
     • Title of the document
     • Caption text: "Click on the image to enlarge" (a non-relevant item, to be filtered out, also on the basis of the lexical properties of the words)
     • Content of the HTML "alt" attribute: 'VEGETABLE GARDEN WITH DONKEY'
     • Content of the HTML "src" attribute: http://www.spanisharts.com/reinasofia/miro/burro_lt.jpg
     • Abstract text
     • Running text
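
The collection of these text regions can be sketched in a few lines of Python. This is only an illustration using the standard-library `html.parser` module, not the Esperonto tooling; the class name `ImageRegionParser`, the sample page, its title and the image path are all hypothetical, and only the alt text and the caption are taken from the slide.

```python
from html.parser import HTMLParser


class ImageRegionParser(HTMLParser):
    """Collect the page title and the alt/src attributes of each image."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.images = []  # one dict per <img>: {"alt": ..., "src": ...}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "img":
            attr = dict(attrs)
            # Attributes without a value are reported as None; normalise to "".
            self.images.append(
                {"alt": attr.get("alt") or "", "src": attr.get("src") or ""}
            )

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data.strip()


# Hypothetical page fragment; only the alt text and caption come from the slide.
SAMPLE_HTML = """
<html><head><title>Example page about a Miro painting</title></head>
<body>
<img src="images/example.jpg" alt="'VEGETABLE GARDEN WITH DONKEY'">
<p>Click on the image to enlarge</p>
</body></html>
"""

parser = ImageRegionParser()
parser.feed(SAMPLE_HTML)
print(parser.title)
print(parser.images)
```

Filtering non-relevant captions such as "Click on the image to enlarge" would be a separate step on top of this, for instance with a small list of known boilerplate phrases.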

  7. Linguistic Analysis of the Text Regions
     • "Alt" text: 'VEGETABLE GARDEN WITH DONKEY'
       <NP HEAD="garden" PRE_MOD="vegetable">
         <POST_MOD CAT="PP" HEAD="with" NP_COMP_HEAD="donkey">
         </POST_MOD>
       </NP>
     • Abstract/running text: "… This picture depicts the rural landscape of Montroig …"
       <SENT SUBJ="This picture" PRED="depicts" OBJ="the rural landscape of Montroig">
       </SENT>
     • Detailed annotation of the direct_object:
       <NP HEAD="landscape" PRE_MOD="rural">
         <POST_MOD CAT="PP" HEAD="of" NP_COMP_HEAD="Montroig">
         </POST_MOD>
       </NP>

  8. The Semantic Annotation (1)
     • The Toy Artwork Ontology (schematized)
       • Object > Artwork > Painting [has_creator, has_name, has_subject, has_dimension, has_material, has_genre, has_date, ...]
       • Person > Artist > Painter [has_name, has_birth_date, part_of_artistic_movement, …]

  9. The Semantic Annotation (2)
     • The Instantiation
       • Title: Vegetable garden with donkey
       • Creator: Miro
       • Date: 1918
       • Genre: naïve (if correctly extracted by some reasoning on the linguistically and semantically annotated text)
       • Subject: rural landscape of Montroig + garden and donkey (if the title can be associated with the explanation given by the art expert)
       • Dimension: 65x71
       • Material: Oil on canvas
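
As a rough illustration of what such an instantiation looks like as data, the toy ontology can be mirrored with two Python dataclasses. The property names follow the slides; modelling the classes as dataclasses, the string types and the optional defaults are assumptions made for this sketch, not the representation used in the project.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Painter:
    """Person > Artist > Painter in the toy ontology."""
    has_name: str
    has_birth_date: Optional[str] = None
    part_of_artistic_movement: Optional[str] = None


@dataclass
class Painting:
    """Object > Artwork > Painting in the toy ontology."""
    has_name: str
    has_creator: Painter
    has_date: Optional[str] = None
    has_genre: Optional[str] = None
    has_subject: Optional[str] = None
    has_dimension: Optional[str] = None
    has_material: Optional[str] = None


# Values as listed on the instantiation slide.
painting = Painting(
    has_name="Vegetable garden with donkey",
    has_creator=Painter(has_name="Miro"),
    has_date="1918",
    has_genre="naïve",
    has_subject="rural landscape of Montroig + garden and donkey",
    has_dimension="65x71",
    has_material="Oil on canvas",
)
print(painting)
```

Properties that the text analysis cannot fill, such as the painter's birth date here, simply stay `None`.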

  10. TRECVid: Linguistic Analysis of Transcripts Attached to Video
      • Language analysis can, to a certain extent, provide a structured analysis of the transcripts delivered with the TRECVid shots
        • Identify the part of speech (POS) of the words contained in the "FreeTextAnnotation" MPEG-7 tags of the shots
        • Identify Named Entities in those annotations and structure them
        • Identify phrasal structures in those annotations

  11. Example of Transcript Attached to Video
      <VideoSegment id="shot6_68">
        <MediaTime>
          <MediaTimePoint>T00:10:41:5216F30000</MediaTimePoint>
          <MediaDuration>PT13S26416N30000F</MediaDuration>
        </MediaTime>
        <TextAnnotation confidence="0.500000">
          <FreeTextAnnotation>FACILITIES OFTEN BEYOND THE REACH OF THE AVERAGE FABRICATE ARE ARE MADE AVAILABLE THROUGH THE SERVICE INCLUDING PRESS EQUIPMENT CAPABLE OF HANDLING THE LARGEST ALUMINUM DRAWS EVER MADE PLUS A WORK FORCE OF SKILLED </FreeTextAnnotation>
        </TextAnnotation>
      </VideoSegment>
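
Pulling the transcript out of such a segment is straightforward with Python's standard `xml.etree.ElementTree` module. The sketch below assumes the fragment is free of XML namespaces, as it appears on the slide; complete MPEG-7 files normally declare namespaces, which would require qualified tag names. The function name `read_segment` is hypothetical and the transcript is shortened.

```python
import xml.etree.ElementTree as ET

# Fragment modelled on the slide; the transcript text is shortened here.
SEGMENT_XML = """
<VideoSegment id="shot6_68">
  <MediaTime>
    <MediaTimePoint>T00:10:41:5216F30000</MediaTimePoint>
    <MediaDuration>PT13S26416N30000F</MediaDuration>
  </MediaTime>
  <TextAnnotation confidence="0.500000">
    <FreeTextAnnotation>FACILITIES OFTEN BEYOND THE REACH OF THE AVERAGE FABRICATE </FreeTextAnnotation>
  </TextAnnotation>
</VideoSegment>
"""


def read_segment(xml_text):
    """Return (shot id, confidence, transcript) for one VideoSegment."""
    segment = ET.fromstring(xml_text.strip())
    annotation = segment.find("TextAnnotation")
    transcript = annotation.findtext("FreeTextAnnotation", default="")
    # Collapse runs of whitespace and drop leading/trailing blanks.
    transcript = " ".join(transcript.split())
    return segment.get("id"), float(annotation.get("confidence")), transcript


shot_id, confidence, transcript = read_segment(SEGMENT_XML)
print(shot_id, confidence, transcript)
```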

  12. Extracted Transcripts
      shot6_28 THIS MOST REMARKABLE METAL USE ITS TREMENDOUS ABUNDANCE AS A RAW MATERIAL IN AMERICA THE RICHEST OF THESE COMMERCIAL GRADE ALUMINUM DEPOSITS ARE LOCATED IN THE CENTRAL REGION OF ARKANSAS ALTHOUGH TRACES OF ALUMINUM MAY BE FOUND IN ALMOST ANY SOIL ONLY THOSE PLAYS CONTAINING FIFTY OR SIXTY PERCENT ALUMINUM WAR AND KNOWN AS BLOCK SITE ARE MINED FOR COMMERCIAL FORGOT AND HERE.
      shot6_45 INCREASING AMOUNT OF THIS A LUMINA IS BEING USED IN CHEMICAL PROCESSING IN SOIL CONDITIONERS AND ABRASIVE US AND MANY OTHER IN ORDER TO REDUCE THE LUMINA TO SOLID ALUMINUM IT'S TRANSFERRED TO.
      shot6_80 THE USE OF ALUMINUM IS EQUALLY EFFECTIVE INSIDE AS WELL AS OUT FROM TABLE LAMP TO A COAST TO COAST CEILING ALUMINUM CONTRIBUTES TO A DECORATIVE MODERN TOUCH TO OFFICE AND HOME BECAUSE.
      shot6_81 ALUMINUM REFLECTS UP TO NINETY FIVE PERCENT OF ALL RADIANT HEAT AND EFFECTIVELY STOPS MOISTURE IT FINDS EXTENSIVE YOU'LL SEE IN ALL TYPES OF INSULATION THESE SAME REFLECTIVE.
      shot7_1 IN THE FIELD OF HOME APPLIANCES FOR DECORATIVE AS WELL AS FUNCTIONAL USES ALUMINUM HAS NO EQUAL.
      …

  13. POS Tagging of Transcripts
      shot6_28 THIS MOST REMARKABLE METAL USE ITS TREMENDOUS ABUNDANCE AS A RAW MATERIAL IN AMERICA THE RICHEST OF THESE COMMERCIAL GRADE ALUMINUM DEPOSITS ARE LOCATED IN THE CENTRAL REGION OF ARKANSAS ALTHOUGH TRACES OF ALUMINUM MAY BE FOUND IN ALMOST ANY SOIL ONLY THOSE PLAYS CONTAINING FIFTY OR SIXTY PERCENT ALUMINUM WAR AND KNOWN AS BLOCK SITE ARE MINED FOR COMMERCIAL FORGOT AND HERE.
      <text>
        <token id="1" pos="CARD">shot6_28</token>
        <token id="2" pos="DT" lemma="this">THIS</token>
        <token id="3" pos="JJS" lemma="most">MOST</token>
        <token id="4" pos="JJ" lemma="remarkable" morph="3">REMARKABLE</token>
        <token id="5" pos="NN" lemma="metal" morph="1">METAL</token>
        <token id="6" pos="VB" lemma="use" morph="10">USE</token>
        <token id="7" pos="PRP$" lemma="its">ITS</token>
        <token id="8" pos="JJ" lemma="tremendous" morph="3">TREMENDOUS</token>
        <token id="9" pos="NN" lemma="abundance" morph="1">ABUNDANCE</token>
        <token id="10" pos="NNP" lemma="as" morph="1">AS</token>
        <token id="11" pos="DT" lemma="a" morph="24">A</token>
        <token id="12" pos="JJ" lemma="raw" morph="3">RAW</token>
        <token id="13" pos="NN" lemma="material">MATERIAL</token>
        <token id="14" pos="IN" lemma="in">IN</token>
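
Tagger output in this token format can be read back for further processing, for example to select nouns as candidate terms. The following is a minimal sketch with the standard `xml.etree.ElementTree` module; it assumes a complete `<text>` element (the slide shows only the first tokens), and the helper name `read_tokens` is hypothetical.

```python
import xml.etree.ElementTree as ET

# First five tokens from the slide, wrapped in a closed <text> element.
TAGGED = """
<text>
  <token id="1" pos="CARD">shot6_28</token>
  <token id="2" pos="DT" lemma="this">THIS</token>
  <token id="3" pos="JJS" lemma="most">MOST</token>
  <token id="4" pos="JJ" lemma="remarkable" morph="3">REMARKABLE</token>
  <token id="5" pos="NN" lemma="metal" morph="1">METAL</token>
</text>
"""


def read_tokens(xml_text):
    """Return a list of (word, pos, lemma) tuples; lemma is None if absent."""
    root = ET.fromstring(xml_text.strip())
    return [(tok.text, tok.get("pos"), tok.get("lemma")) for tok in root.iter("token")]


tokens = read_tokens(TAGGED)
# Penn-Treebank-style noun tags (NN, NNS, NNP, ...) all start with "NN".
nouns = [word for word, pos, _ in tokens if pos.startswith("NN")]
print(tokens)
print(nouns)
```

Note that the shot identifier is itself tagged as a token (`CARD`), so downstream steps may want to skip the first token.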

  14. Problems with Transcripts and Possible Remedies
      • Problems
        • Noise (context of recording)
        • Quality of the underlying ASR tools
        • Transcripts all in capital letters
        • Lack of punctuation marks
      • Possible remedies
        • Use of manually annotated speech corpora for improving POS tagging
        • Use of related textual sources for improving lexical coverage and syntactic boundaries
        • Use of related domain knowledge bases and metadata for improving lexical coverage, syntactic boundaries and semantic annotation
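
One very rough reading of the second remedy is to compare transcript tokens against a lexicon collected from related text and to propose the closest lexicon entry for unknown tokens. The sketch below uses `difflib.get_close_matches` from the Python standard library; it is an illustration of the idea only, not the approach used in the projects, and the function name, the lexicon and the cutoff value are assumptions.

```python
import difflib


def suggest_corrections(transcript, lexicon, cutoff=0.8):
    """Map each out-of-lexicon token to its closest lexicon entry, if any."""
    suggestions = {}
    for token in transcript.lower().split():
        if token in lexicon:
            continue
        close = difflib.get_close_matches(token, lexicon, n=1, cutoff=cutoff)
        if close:
            suggestions[token] = close[0]
    return suggestions


# Hypothetical lexicon, as might be gathered from related texts on aluminum.
LEXICON = [
    "alumina", "aluminum", "bauxite", "increasing", "amount",
    "of", "this", "a", "is", "being", "used",
]

suggestions = suggest_corrections(
    "INCREASING AMOUNT OF THIS A LUMINA IS BEING USED", LEXICON
)
print(suggestions)
```

String similarity alone cannot repair errors such as "BLOCK SITE" for "bauxite" or the split "A LUMINA"; those would need phonetic matching or multi-token comparison on top of the lexicon.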

  15. Metadata of a Broadcaster (MESH Project)
      • In the MESH project, Deutsche Welle provides data consisting of audio/video material and textual metadata. This is a very valuable data set, since the textual metadata also includes manually annotated scene descriptions.
      • This data set can be used for building a training corpus for the automated alignment of video, audio and text data.

  16. The Metadata Labels
      <DOC filename="0324000-3_Journal_ ENG_F4001C_26122003_2000">
        <TYPE>Earthquake Iran</TYPE>
        <SERIES>Journal F: 4001 C</SERIES>
        <SEG sid="integer">
          <TITLE></TITLE>
          <DESCRIPTION></DESCRIPTION>
          <SCENES> </SCENES>
          <KEYWORDS></KEYWORDS>
        </SEG>
      </DOC>
