IO IR4; November 2002

Course Work
The trec_eval tool
IR4 Course
Iadh Ounis
Winter 2002

About trec_eval
• This evaluation tool works only on a Solaris machine
  – Location: /local/ir/ir4tools
• Please *read* the README file!
  – Instructions + useful information about the tool

IR Assessed Exercise

The trec_eval Syntax
• Syntax
    trec_eval [-q] [-a] MED.REL YOUR_TOP_REL
  – MED.REL: the MEDLINE relevant docs file (provided in /local/ir/ir4tools)
  – YOUR_TOP_REL: the relevant docs given by YOUR system

MED.REL (given)
    qid  iter  docno  rel
    1    0     13     1
    1    0     14     1
    1    0     15     1
    1    0     72     1
    ...
    2    0     80     1
    2    0     90     1
    ...
i.e. tuples of the form (qid, iter, docno, rel)
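
To make the MED.REL layout concrete, here is a minimal sketch of reading the (qid, iter, docno, rel) tuples into a per-query set of relevant documents. The function name `parse_qrels` is hypothetical, and the sketch assumes whitespace-separated fields exactly as shown on the slide; it is not part of trec_eval itself.

```python
from collections import defaultdict

def parse_qrels(lines):
    """Map each qid to the set of docnos marked relevant (rel > 0)."""
    relevant = defaultdict(set)
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        qid, _iteration, docno, rel = parts
        try:
            rel_value = int(rel)
        except ValueError:
            continue  # skip a column-label line such as "qid iter docno rel"
        if rel_value > 0:
            relevant[qid].add(docno)
    return dict(relevant)

# The sample rows shown on the MED.REL slide.
sample = """\
qid iter docno rel
1 0 13 1
1 0 14 1
1 0 15 1
1 0 72 1
2 0 80 1
2 0 90 1
""".splitlines()

qrels = parse_qrels(sample)
for qid in sorted(qrels, key=int):
    print(qid, sorted(qrels[qid], key=int))
```

Keeping qid and docno as strings avoids guessing at their type; they are converted to integers only for sorting.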

YOUR_TOP_REL
    qid  iter  docno  rank  sim       run_id
    1    0     18     0     2.789045  Bingo!
    1    0     19     0     2.129078  Bingo!
    1    0     31     0     2.000091  Bingo!
    1    0     45     0     1.889005  Bingo!
    ...
    2    0     58     0     4.567980  Bingo!
    2    0     99     0     3.210000  Bingo!
    ...
i.e. tuples of the form (qid, iter, docno, rank, sim, run_id)
Bingo! is the name of our system (please feel free to use any other name)

Ensure that ...
• Your 2 input files are sorted numerically by qid
• YOUR_TOP_REL is also sorted so that, within each query, documents with higher similarity (sim) come first
• You *read* the (short) README file!
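
The two sorting requirements above can be sketched as follows. This is only an illustration: `format_run` and the `results` dictionary are hypothetical names, the iter and rank columns are written as 0 simply because the slide's example does so, and the sim values are taken from the slide.

```python
def format_run(results, run_id="Bingo!"):
    """Return YOUR_TOP_REL lines: qids in numeric order, highest sim first."""
    lines = []
    for qid in sorted(results, key=int):
        # Within one query, order documents by decreasing similarity.
        ranked = sorted(results[qid], key=lambda pair: pair[1], reverse=True)
        for docno, sim in ranked:
            lines.append(f"{qid} 0 {docno} 0 {sim:.6f} {run_id}")
    return lines

# Deliberately unsorted input: qid -> list of (docno, sim).
results = {
    "2": [("99", 3.21), ("58", 4.56798)],
    "1": [("31", 2.000091), ("18", 2.789045), ("19", 2.129078)],
}

run_lines = format_run(results)
print("\n".join(run_lines))
```

To produce the actual file, the returned lines could be written out with `open("YOUR_TOP_REL", "w")`, one line per tuple.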

At the end ...
• Once your YOUR_TOP_REL file is ready (MED.REL is given in /local/ir/ir4tools!), all you have to do is type at your console:
    trec_eval MED.REL YOUR_TOP_REL
  (ensure that trec_eval, MED.REL and YOUR_TOP_REL are copied to your local directory)
• Look at the -q option (but do not print its output)

As a result you should get ...
• A lot of tables (these tables should be included in your report and/or floppy disk according to the instructions of the README file!), things like:
    Interpolated Recall - Precision Averages:
        at 0.00   0.49
        at 0.10   0.36
        at 0.20   0.32
        at 0.30   0.26
        etc.
        at 1.00   0.09
• Use these values to draw your precision/recall graphs
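
Before drawing the graph, the recall/precision pairs have to be pulled out of the trec_eval output. The sketch below assumes the output lines look like the "at 0.00 0.49" rows shown on the slide (the real layout may differ, so check it against your own output); `parse_interpolated` is a hypothetical helper, and the sample text contains only the values given on the slide.

```python
import re

# Matches lines such as "    at 0.10       0.36" (layout assumed from the slide).
ROW = re.compile(r"^\s*at\s+([01]\.\d+)\s+([01]\.\d+)\s*$")

def parse_interpolated(output):
    """Return a list of (recall, precision) pairs found in the output text."""
    points = []
    for line in output.splitlines():
        match = ROW.match(line)
        if match:
            points.append((float(match.group(1)), float(match.group(2))))
    return points

sample_output = """\
Interpolated Recall - Precision Averages:
    at 0.00       0.49
    at 0.10       0.36
    at 0.20       0.32
    at 0.30       0.26
    at 1.00       0.09
"""

points = parse_interpolated(sample_output)
for recall, precision in points:
    print(recall, precision)
```

The resulting pairs can be fed to any plotting tool, with recall on the x-axis and precision on the y-axis.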

    Query 1        Query 2
    R     P        R     P
    0.1   1        0.1   0.8
    0.2   0.8      0.3   0.6
    0.4   0.6      0.5   0.5
    0.6   0.4      0.7   0.4
    0.8   0.3      0.9   0.4

[Figure: PR Curve; x-axis: Recall (0 to 1), y-axis: Precision (0 to 1)]

About Matching ...
• You should first process the MED.QRY file.
  – Hint: Open up the query file. Take the first query. Compute the result list. Write the result list into the file YOUR_TOP_REL using the right trec_eval output format! Take the second query, do the same, etc.
• For the implementation of the similarity function, we suggest that you use the similarity matching of the Best-match Model
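
The Query 1 and Query 2 tables above can be combined into a single averaged precision/recall curve. The sketch below uses the usual interpolation rule (the precision at a recall level is the best precision observed at any recall greater than or equal to that level) on 11 standard recall levels. Whether the slide's PR curve was drawn exactly this way is an assumption, and the names `interpolate`, `curve1`, `curve2` and `average` are illustrative only.

```python
def interpolate(points, levels):
    """Interpolated precision: best precision at any recall >= the level."""
    curve = []
    for level in levels:
        # Small tolerance so that e.g. recall 0.3 counts for level 0.3.
        candidates = [p for r, p in points if r >= level - 1e-9]
        curve.append(max(candidates) if candidates else 0.0)
    return curve

levels = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0

# (recall, precision) points copied from the two tables.
query1 = [(0.1, 1.0), (0.2, 0.8), (0.4, 0.6), (0.6, 0.4), (0.8, 0.3)]
query2 = [(0.1, 0.8), (0.3, 0.6), (0.5, 0.5), (0.7, 0.4), (0.9, 0.4)]

curve1 = interpolate(query1, levels)
curve2 = interpolate(query2, levels)

# Average the two interpolated curves level by level.
average = [(a + b) / 2 for a, b in zip(curve1, curve2)]

for level, precision in zip(levels, average):
    print(f"at {level:.2f}  {precision:.2f}")
```

Interpolating first matters here because the two queries were measured at different recall values, so their raw points cannot be averaged directly.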

Submission Guidelines
• Final system and reports are due on or before 20/12/2002
  – A short design report (4-5 pages):
      • Inverted file structure
      • Details of matching
      • Details of building your inverted index
  – Input and output files of trec_eval (YOUR_TOP_REL and the -q flag output should be given in electronic format only)
  – A printout of your code
  – Precision-Recall graph of the trec_eval output
• For more details: see /local/ir/ir4tools/README