CLiMB ToolKit: A Case Study of Iterative Evaluation in a Multidisciplinary Project
Rebecca Passonneau, Roberta Blitz, David Elson, Angela Giral
Columbia University
Judith Klavans
University of Maryland
Motivation
• Fast-growing collections of digital images
• Image search using keywords
• High cost of manual indexing/cataloging
• Potential for automated mining from texts about images
May 2006 CLiMB -- Iterative Evaluation
Types of Resources Available
Art or Library Knowledge Sources
• Getty Art & Architecture Thesaurus (AAT)
• Library of Congress name and subject list
• Library of Congress Thesaurus of Graphic Materials
NLP Tools/Knowledge
• POS taggers
• Chunkers
• Named Entity recognizers
• WordNet
• ML toolkits
Iterative Evaluation Process
1. Formative Evaluation: How to optimize use of NLP/Thesaural resources
  • Conducted after creating a development environment to extract potential terms from texts
  • Participants: heterogeneous users
2. User Study: How to investigate a proposed work process before it exists
  • Conducted after creating CLiMB ToolKit (Image cataloger’s workbench)
  • Participants: catalogers and image professionals
Text Collection Sets (TCS): Criteria
• Image Collection
  • Substantial collection of related images in digital form
  • Authoritative list of images (e.g., database UIDs, referred to as Target Object Identifiers - TOIs)
• Text(s)
  • Associated electronic texts
  • Discussion of many items depicted in the images
  • Authoritative discussion of image content
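The TCS criteria above can be pictured as a small record type. The following is a minimal sketch only, not the CLiMB data model: the class name, field names, the `meets_basic_criteria` helper, and the `min_images` threshold are all hypothetical, and the "authoritative" criteria still require human judgment that code cannot check.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TextCollectionSet:
    """Hypothetical record pairing an image collection with its texts."""
    name: str
    toi_list: List[str]  # Target Object Identifiers, e.g. database UIDs
    texts: Dict[str, str] = field(default_factory=dict)  # title -> electronic text

    def meets_basic_criteria(self, min_images: int = 100) -> bool:
        # "Substantial" is not quantified on the slide; min_images is an
        # illustrative assumption, not a CLiMB threshold.
        has_images = len(set(self.toi_list)) >= min_images
        # At least one non-empty associated electronic text must be present.
        has_texts = any(body.strip() for body in self.texts.values())
        return has_images and has_texts


# Placeholder data for illustration only.
tcs = TextCollectionSet(
    name="Example TCS",
    toi_list=[f"img-{i:04d}" for i in range(150)],
    texts={"Example monograph": "Placeholder text discussing the images."},
)
print(tcs.meets_basic_criteria())
```

Keeping the TOI list separate from the texts mirrors the two halves of the criteria: one authoritative list of images, and one or more texts that discuss them.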
Formative Evaluation: TCS1: Chinese Paper Gods
• Image Collection: Anne S. Goodrich Collection of Chinese Paper Gods, C.V. Starr East Asian Library
• Texts: Goodrich, Anne S. Peking Paper Gods: A Look at Home Worship. Nettetal: Steyler Verlag, 1991.
Figure 1.2: Pan-hu chih shen. (Anne S. Goodrich Collection, C.V. Starr East Asian Library, Columbia University).
Formative Evaluation: TCS2: Greene & Greene
• Image Collection: Greene & Greene Collection of Architectural Records and Papers, Avery Architectural and Fine Arts Library (G&G)
• Text Collection:
  1) Bosley, Edward R. Greene & Greene. London: Phaidon, 2000.
  2) Makinson, Randell L. Greene & Greene. Salt Lake City: Peregrine Smith, c1977-1979.
  3) Current, William R. Greene & Greene: Architects in the Residential Style. Fort Worth: Amon Carter Museum of Western Art [1974].
Formative Evaluation: Design
• Two-part survey using two of four conditions
  • User Scenario
  • Image
  • Free Text
  • CLiMB Checklist
• Thirteen participants completed the survey
  • Librarians, art historians, computer scientists, computational linguists
• Partly crossed design
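To make the design above concrete, here is a minimal sketch of one way thirteen participants could each receive two of the four conditions. The slide does not say how conditions were paired or assigned, so the round-robin scheme, the `assign_conditions` function, and the participant labels are assumptions for illustration only.

```python
from collections import Counter
from itertools import combinations, cycle

CONDITIONS = ["User Scenario", "Image", "Free Text", "CLiMB Checklist"]


def assign_conditions(participants):
    """Give each participant one unordered pair of conditions, round-robin.

    Assumption: any pair of the four conditions is allowed; the actual
    study may have restricted which conditions could co-occur.
    """
    pairs = cycle(combinations(CONDITIONS, 2))  # 6 possible pairs, repeated
    return {p: next(pairs) for p in participants}


# Hypothetical participant labels P01..P13.
assignment = assign_conditions([f"P{i:02d}" for i in range(1, 14)])

# How many participants saw each condition under this scheme.
coverage = Counter(c for pair in assignment.values() for c in pair)
for condition in CONDITIONS:
    print(condition, coverage[condition])
```

Because no participant sees all four conditions, the conditions are only partly crossed with participants, which is why the coverage counts matter when comparing results across conditions.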
Two Non-Text Conditions
• User Scenario: In this task, the survey item contained one of two hypothetical user scenarios. Respondents were asked to list keywords and phrases that could be used “to search for relevant images in an image database.”
  1. I am writing a paper on domestic architecture in Southern California in the early part of the 20th century. I was told that there are homes with exteriors clad in a type of concrete or cement. How can I locate images?
• Image: This survey item contained an image. Respondents were given the following instructions: “Please write keywords and phrases that you would use to find this image in a database. You may write as many as you wish.”
Two Text Conditions
• Free Text: This task contained a passage from one of the texts associated with TCS1 or TCS2. Respondents were given these instructions: “Suppose there is a collection of related images that needs metadata keywords and phrases. Please select the words and phrases in this text that you feel would be good metadata for the images.
  1. Please circle 10 words or phrases as your top choices.
  2. Please underline 10 as your second tier choices.”
• CLiMB Checklist: Respondents were given a long list of words and phrases (117 TCS1 entries; 194 TCS2 entries) that had been extracted by CLiMB tools from the same texts presented in Task 3. Instructions were: “Please check off the words and phrases that you feel would be suitable metadata for the images in the collection.”
  _____ garden pergola
  _____ dark green tile
  _____ ridge beams
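Checklist responses of this kind lend themselves to a simple tally of how many respondents checked each candidate term. The sketch below is illustrative only: the `tally_checklist` function is hypothetical, the three terms are the sample entries shown on the slide, and the responses are invented placeholders, not study data.

```python
from collections import Counter


def tally_checklist(responses):
    """Count, for each term, how many respondents checked it.

    Each response is the list of terms one respondent checked; a term
    counts at most once per respondent.
    """
    counts = Counter()
    for checked in responses:
        counts.update(set(checked))
    return counts


# Invented placeholder responses from three hypothetical respondents.
responses = [
    ["garden pergola", "ridge beams"],
    ["ridge beams"],
    ["dark green tile", "ridge beams", "garden pergola"],
]

counts = tally_checklist(responses)
for term, n in counts.most_common():
    print(f"{n} of {len(responses)}: {term}")
```

A per-term count like this gives one rough signal of which extracted phrases respondents agree are suitable metadata; how the study actually scored agreement is not stated on the slide.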