Robustness? - PDF document

Thomas Mandl
Information Science, Universität Hildesheim
mandl@uni-hildesheim.de

Robust Task - Result Overview and Lessons Learned from Robustness Evaluation

Thomas Mandl: Robust CLEF 2007 - Overview

Robustness?
• Robust … means … capable of functioning correctly (or at the very minimum, not failing catastrophically) under a great many conditions. (http://www.reference.com/)
• Robust IR means the capability of an IR system to work well (and reach at least a minimal performance) under a variety of conditions (topics, difficulty, collections, users, languages …)

Variety of conditions … / System Variance
[Two figures: charts on a scale from 0 to 1 for the tasks Mono FR, Mono EN, Mono PT and Bi -> FR; one label reads "Variance between topics"]

Robust Task 2007
• Again …
  – Use topics and relevance assessments from previous CLEF campaigns
  – Take a different perspective and use a robust evaluation measure (GMAP)
  – Emphasize the difficult (= low performing) topics

History of Robust IR Evaluation
• TREC
  – Mono-lingual Retrieval
  – 2003 - 2005
• CLEF
  – Mono-, bi- and Multilingual Retrieval
  – 2006 six languages
  – 2007 three languages

Training and Test
• CLEF 2001, 2002 and 2003 for training
• CLEF 2004, 2005 and 2006 for testing

Which system is better?
[Figure: bar chart on a scale from 0 to 1 showing topics I, II and III for Result A and Result B]

geoAve = \sqrt[n]{\prod_{i=1}^{n} x_i}   (n = number of topics)

Topic   System A  System B
1       0.1       0.2
2       0.1       0.2
3       0.9       0.6
GeoAve  0.21      0.29
MAP     0.37      0.33

Collections
Language    Target Collection                    Training Topics  Test Topics
English     Los Angeles Times 1994               41-200           251-350
French      Le Monde 1994; Swiss News Agency 94  41-140           251-350
Portuguese  Público 1995                         -                201-350

Robust Task 2007
[Four bullet points; text not recoverable]

Participation
• 63 runs submitted by 7 groups
• 2006: 133 runs by 8 groups

Results

Mono English
Rank  Participant  MAP     GMAP    Experiment
1st   reina        38.97%  18.50%  10.2415/AH-ROBUST-MONO-EN-TEST-CLEF2007.REINA.REINAENTDNT
2nd   daedalus     37.78%  17.72%  10.2415/AH-ROBUST-MONO-EN-TEST-CLEF2007.DAEDALUS.ENFSEN22S
3rd   hildesheim   5.88%   0.32%   10.2415/AH-ROBUST-MONO-EN-TEST-CLEF2007.HILDESHEIM.HIMOENBRFNE

Mono Portuguese
Rank  Participant  MAP     GMAP    Experiment
1st   reina        41.40%  12.87%  10.2415/AH-ROBUST-MONO-PT-TEST-CLEF2007.REINA.REINAPTTDNT
2nd   jaen         24.74%  0.58%   10.2415/AH-ROBUST-MONO-PT-TEST-CLEF2007.JAEN.UJARTPT1
3rd   daedalus     23.75%  0.50%   10.2415/AH-ROBUST-MONO-PT-TEST-CLEF2007.DAEDALUS.PTFSPT2S
4th   xldb         1.21%   0.071%  10.2415/AH-ROBUST-MONO-PT-TEST-CLEF2007.XLDB.XLDBROB16
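The GeoAve and MAP rows of the three-topic example for systems A and B (the "Which system is better?" example) can be recomputed with a few lines of Python. This is a minimal sketch and not part of the original slides: the function names are illustrative assumptions, and only the per-topic values are taken from the example.

```python
from math import prod


def arithmetic_mean(scores):
    """Arithmetic mean of per-topic scores (the averaging used for MAP)."""
    return sum(scores) / len(scores)


def geometric_mean(scores):
    """n-th root of the product of per-topic scores (the averaging used for GMAP)."""
    return prod(scores) ** (1.0 / len(scores))


# Per-topic results from the example (topics 1, 2, 3).
results = {
    "A": [0.1, 0.1, 0.9],
    "B": [0.2, 0.2, 0.6],
}

for system, scores in results.items():
    print(f"{system}: MAP={arithmetic_mean(scores):.2f} "
          f"GeoAve={geometric_mean(scores):.2f}")
# A: MAP=0.37 GeoAve=0.21
# B: MAP=0.33 GeoAve=0.29
```

System A wins on the arithmetic mean because of one very good topic, while system B wins on the geometric mean because it does better on the two difficult topics. This is the sense in which a geometric average emphasizes low-performing topics.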

Results Mono English
[Figure: Ad-Hoc Robust Monolingual English Test Task, Top 5 Participants - Standard Recall Levels vs Mean Interpolated Precision. Axes: Recall (0% to 100%) and Precision (0% to 100%). Curves, all Not Pooled: reina (REINAENTDNT, MAP 38.97%), daedalus (ENFSEN22S, MAP 37.78%), hildesheim (HIMOENBRFNE, MAP 5.88%)]

Results Mono Portuguese
[Figure: Ad-Hoc Robust Monolingual Portuguese Test Task, Top 5 Participants - Standard Recall Levels vs Mean Interpolated Precision. Axes: Recall (0% to 100%) and Precision (0% to 100%). Curves, all Not Pooled: reina (REINAPTTDNT, MAP 41.40%), jaen (UJARTPT1, MAP 24.74%), daedalus (PTFSPT2S, MAP 23.75%), xldb (XLDBROB16_10, MAP 1.21%)]

Results Mono French
[Figure: Ad-Hoc Robust Monolingual French Test Task, Top 5 Participants - Standard Recall Levels vs Mean Interpolated Precision. Axes: Recall (0% to 100%) and Precision (0% to 100%). Curves, all Not Pooled: unine (UNINEFR1, MAP 42.13%), reina (REINAFRTDET, MAP 38.04%), jaen (UJARTFR1, MAP 34.76%), daedalus (FRFSFR22S, MAP 29.91%), hildesheim (HIMOFRBRF2, MAP 27.31%)]

Results

Mono French
Rank  Participant  MAP     GMAP    Experiment
1st   unine        42.13%  14.24%  10.2415/AH-ROBUST-MONO-FR-TEST-CLEF2007.UNINE.UNINEFR1
2nd   reina        38.04%  12.17%  10.2415/AH-ROBUST-MONO-FR-TEST-CLEF2007.REINA.REINAFRTDET
3rd   jaen         34.76%  10.69%  10.2415/AH-ROBUST-MONO-FR-TEST-CLEF2007.JAEN.UJARTFR1
4th   daedalus     29.91%  7.43%   10.2415/AH-ROBUST-MONO-FR-TEST-CLEF2007.DAEDALUS.FRFSFR22S
5th   hildesheim   27.31%  5.47%   10.2415/AH-ROBUST-MONO-FR-TEST-CLEF2007.HILDESHEIM.HIMOFRBRF2

Bi -> French
Rank  Participant  MAP     GMAP    Experiment
1st   reina        35.83%  12.28%  10.2415/AH-ROBUST-BILI-X2FR-TEST-CLEF2007.REINA.REINAE2FTDNT
2nd   unine        33.50%  5.01%   10.2415/AH-ROBUST-BILI-X2FR-TEST-CLEF2007.UNINE.UNINEBILFR1
3rd   colesun      22.87%  3.57%   10.2415/AH-ROBUST-BILI-X2FR-TEST-CLEF2007.COLESUN.EN2FRTST4GRINTLOGLU001

Results Bi-lingual X -> French
[Figure: Ad-Hoc Robust Bilingual Test Task, French target collection(s), Top 5 Participants - Standard Recall Levels vs Mean Interpolated Precision. Axes: Recall (0% to 100%) and Precision (0% to 100%). Curves, all Not Pooled: reina (REINAE2FTDNT, MAP 35.83%), unine (UNINEBILFR1, MAP 33.50%), colesun (EN2FRTST4GRINTLOGLU001, MAP 22.87%)]

Approaches
• Adoption of traditional and “advanced” CLIR methods
  – BM25 (Miracle)
  – N-gram translation (CoLesIR)
  – Weighting, stemming (UniNE)
• Adoption of “robust” heuristics
  – Expansion with an external resource (SINAI)

