Data & Knowledge Engineering Group
How to Evaluate Exploratory User Interfaces?
SIGIR 2011 Workshop on "entertain me": Supporting Complex Search Tasks
Tatiana Gossen, Stefan Haun, Andreas Nürnberger
Email: tatiana.gossen@ovgu.de
Agenda
Introduction & Background
Evaluation challenges
Methodological shortcomings
Benchmark evaluation
Conclusion
Introduction & Background
Complex Information Needs (CIN)
Creative discovery of information, i.e. relations between concepts in data sets
Simple example: build an association chain between amino acids and Gerardus Johannes Mulder
Introduction & Background
Complex Information Needs (CIN)
Creative discovery of information, i.e. relations between concepts in data sets
Simple example: build an association chain between amino acids and Gerardus Johannes Mulder
Using Wikipedia as a document collection:
Doc 1: "Amino acids are critical to life, and have many functions in metabolism. One particularly important function is to serve as the building blocks of proteins, which are linear chains of amino acids. Amino acids can be linked together in varying sequences to form a vast variety of proteins."
Doc 2: "Proteins were first described by the Dutch chemist Gerardus Johannes Mulder and named by the Swedish chemist Jöns Jacob Berzelius in 1838."
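To make the chain concrete: a minimal sketch (not the authors' CET system) that links the two slide documents through co-occurring concepts and searches the resulting graph for a path from "amino acids" to "Gerardus Johannes Mulder". The concept list is an illustrative assumption.

```python
# Minimal sketch: discover an association chain by linking concepts that
# co-occur within the same document, then searching the graph breadth-first.
# Documents are abridged from the slide; the concept list is assumed.
from collections import deque

docs = {
    "Doc 1": "Amino acids serve as the building blocks of proteins, "
             "which are linear chains of amino acids.",
    "Doc 2": "Proteins were first described by the Dutch chemist "
             "Gerardus Johannes Mulder in 1838.",
}
concepts = ["amino acids", "proteins", "gerardus johannes mulder"]

# Undirected graph: two concepts are linked if they co-occur in a document.
edges = {c: set() for c in concepts}
for text in docs.values():
    present = [c for c in concepts if c in text.lower()]
    for a in present:
        for b in present:
            if a != b:
                edges[a].add(b)

def association_chain(start, goal):
    """Breadth-first search for the shortest concept chain from start to goal."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in edges[path[-1]] - seen:
            seen.add(nxt)
            queue.append(path + [nxt])
    return None

print(association_chain("amino acids", "gerardus johannes mulder"))
# -> ['amino acids', 'proteins', 'gerardus johannes mulder']
```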
Introduction & Background
Complex Information Needs (CIN)
Creative discovery of information, i.e. relations between concepts in data sets
Undirected search for relevant information within the data
Scenario: analysts explore collections of text documents to help investigators uncover embedded stories, plots, and threats.
Introduction & Background
Tool example
Screenshot of the Creative Exploration Toolkit (CET) [Haun, 2010]
Evaluation challenges
Research question: how to evaluate such systems?
Requires collaboration with domain experts, both for creating scenarios and for participation
CINs are usually vaguely defined and require much user time to solve
Methodological shortcomings
Comparative evaluation
Automated IR evaluation of ranking algorithms requires:
A set of test queries
Document collections labelled with relevance judgements (e.g. TREC)
Available measures (e.g. Average Precision)
User evaluation of CIN exploration systems requires:
A standardized evaluation methodology (?)
Benchmark data sets
Benchmark tasks and standard solutions
Evaluation measures
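For contrast with the CIN case, a minimal sketch of the kind of measure that makes automated IR evaluation possible: Average Precision over a ranked list and a set of relevance-labelled documents. The example data is illustrative, not from the slides.

```python
# Average Precision: mean of the precision values at the ranks where
# relevant documents are retrieved, divided by the number of relevant docs.
def average_precision(ranked_doc_ids, relevant_ids):
    hits, precision_sum = 0, 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids) if relevant_ids else 0.0

# Example: relevant documents d1 and d4 retrieved at ranks 1 and 3.
print(average_precision(["d1", "d2", "d4", "d3"], {"d1", "d4"}))
# -> (1/1 + 2/3) / 2 = 0.833...
```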
Benchmark evaluation
Two parts:
"Small" controlled experiment: qualitative data, i.e. feedback; no explicit task
Large-scale study: quantitative data, i.e. time, success rate, interaction logs, feedback
Use the VAST (Visual Analytics Science and Technology) benchmark data with an investigative task as the benchmark data set, task and solution
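A minimal sketch of how the quantitative measures above (time and success rate) could be derived from interaction logs of the large-scale study. The log format and event names are hypothetical, not defined in the slides.

```python
# Derive success rate and mean time-to-solve from per-participant logs.
# Log schema (assumed): (participant_id, timestamp, event).
from datetime import datetime

log = [
    ("p1", "2011-08-01 10:00:00", "task_start"),
    ("p1", "2011-08-01 10:12:30", "task_solved"),
    ("p2", "2011-08-01 10:00:00", "task_start"),
    ("p2", "2011-08-01 10:25:00", "task_abandoned"),
]

def parse(ts):
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")

# Group events by participant.
sessions = {}
for pid, ts, event in log:
    sessions.setdefault(pid, {})[event] = parse(ts)

solved = [s for s in sessions.values() if "task_solved" in s]
success_rate = len(solved) / len(sessions)
mean_seconds = sum(
    (s["task_solved"] - s["task_start"]).total_seconds() for s in solved
) / len(solved)

print(f"success rate: {success_rate:.0%}, mean time to solve: {mean_seconds/60:.1f} min")
# -> success rate: 50%, mean time to solve: 12.5 min
```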
Benchmark evaluation
Evaluation measures: still an open question
How to judge creativity?
How to judge partially correct answers?
Can we evaluate exploration systems for CIN automatically and so reduce the cost for participants?
Can we model the user's creative process?
Conclusion
Evaluate CIN exploration tools with a standardized evaluation methodology, in combination with benchmark data sets, tasks & solutions, and measures
Only then can designers of discovery tools evaluate their tools more efficiently
Q&A