Native script documents, transliterated queries; transliterated documents, native script queries
5 teams, 25 runs
FIRE 2014 Shared Task on Transliterated Search Overview
Contributors: Yesha Shah, Swati Jhawar, Ria Gupta (DA-IICT), Kalika Bali (MSR India), all participants, task coordinators
Team (runs): system
BIT (2): Word bi-gram query in both scripts using Google Transliterate; query expansion using pseudo-relevance feedback.
BITS-Lipyantaran (2): Back-transliterated the queries and documents to Devanagari using the Google Transliteration engine; removed vowels as part of the normalising step and indexed character n-grams as tokens.
DCU (2): Dictionary of cross-script equivalents built from the documents in the corpus that contained the song in both scripts; transliteration engine for OOV terms; edit-distance-based term matching; word bigram index.
IIITH (1): Roman as the operating script; normalisation rules, such as replacing repetitions of the same character with a single occurrence; edit-distance-based matching.
Total: 7
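The system descriptions above mention collapsing repeated characters, indexing character n-grams, and edit-distance-based term matching. A minimal Python sketch of those three ideas follows. It is illustrative only and not any team's actual implementation: the function names, the trigram size, and the `max_distance=1` threshold are assumptions made for this example.

```python
import re


def collapse_repeats(word):
    """Replace every run of the same character with a single occurrence."""
    return re.sub(r"(.)\1+", r"\1", word)


def char_ngrams(word, n=3):
    """Return overlapping character n-grams; a word shorter than n is kept whole."""
    grams = [word[i:i + n] for i in range(len(word) - n + 1)]
    return grams or [word]


def edit_distance(a, b):
    """Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def match_term(query_term, vocabulary, max_distance=1):
    """Return vocabulary terms within max_distance edits after normalisation.

    max_distance is a hypothetical threshold chosen for illustration.
    """
    q = collapse_repeats(query_term.lower())
    return [t for t in vocabulary
            if edit_distance(q, collapse_repeats(t.lower())) <= max_distance]
```

For example, `match_term("pyaar", ["pyar", "piyar", "dil"])` keeps the two spelling variants and drops the unrelated term.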
[Figure: bar chart comparing BITS-L, IIITH, DCU and BIT on NDCG@5, NDCG@10, MAP and CSR@10; y-axis from 0 to 0.9]
Type | Example | nDCG@5
join/split | koi ek taara ek taara (0.14); tune diithi iye anguthi (0.15) | 0.286
Other Roman | bin tere (0.67); din dhal jaye (0.69) | 0.617
Devanagari | तेरे मेरे बिच में कैसा है ये बंधन (0.91); तेरे मेरे सपने अब एक रंग (0.93) | 0.722
2013 best: 0.805
2014 best: 0.757
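The table above reports nDCG@5. As a reminder of how such a score can be computed, here is a minimal Python sketch. It assumes the common formulation with linear gain and a log2 rank discount; the slides do not state which variant the track used, and the function and parameter names are hypothetical.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the first k graded relevance values."""
    return sum(g / math.log2(rank + 1)
               for rank, g in enumerate(gains[:k], start=1))


def ndcg_at_k(ranked_gains, judged_gains, k):
    """nDCG@k: DCG of the system ranking divided by DCG of the ideal ranking.

    ranked_gains: relevance grades of the returned documents, in rank order.
    judged_gains: relevance grades of all judged documents for the query
                  (assumed available; used to build the ideal ranking).
    """
    ideal = dcg_at_k(sorted(judged_gains, reverse=True), k)
    return dcg_at_k(ranked_gains, k) / ideal if ideal > 0 else 0.0
```

A ranking that returns the judged documents in ideal order scores 1.0; placing a less relevant document above a more relevant one lowers the score.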
Total runs received: 54
Runs accepted: 39
Rejected runs: Transliterate-Kgp (3×3=9); 1 more team, 6 more runs.
Training Data:
• FIRE 2013 (query): 100
• Facebook forum: 700 (no transliteration)
• #tokens: 20.6k (364)
Test Data:
• FIRE 2013 (query): 100
• Facebook forum: 639 (no transliteration)
• #tokens: 17.3k (397)
Sub-track Winner:
• JU-NLP-Lab
Contributors:
• Amitava Das
[Figure: bar chart of LA, TF, ETPM and Score for BMS-Brainz, IIITH, IITP-TS, ISI and JU-NLP; y-axis from 0 to 1]
Training Data:
• FIRE 2013 (query): 150
• #tokens: 937 (890)
Test Data:
• FIRE 2013 (query): 150
• #tokens: 1078 (1064)
FIRE 2013 best: 0.976
Sub-track Winner:
• DA-IR
Contributors:
• DA-IICT
[Figure: bar chart of LA, TF, ETPM and Score for BMS-Brainz, DA-IR and IIITH; y-axis from 0 to 1]
Training Data:
• FIRE 2013 (query): 500
• Facebook forum: 700 (no transliteration)
• Facebook forum: 30
• #tokens: 27.6k (2420)
Test Data:
• FIRE 2013 (query): 500
• Facebook forum: 708 (no transliteration)
• Facebook forum: 63
• #tokens: 32.1k (2512)
Sub-track Winner:
• IITP-TS
Contributors:
• Amitava Das
• MSR India
[Figure: bar chart of LA, TF and Score per team; y-axis from 0 to 1]
Training Data:
• None
Test Data:
• Blogs: 119
• #tokens: 1271 (815)
Sub-track Winner:
• BMS-Brainz
Contributors:
• Dr. Shambhavi B. R. (BMS)
• Dr. B. M. Sagar (RVCE)
• Sandesh (BMS)
• Shweta Kulkarni (BMS)
• Abhishek J. (BMS)
[Figure: bar chart of LA, TF, ETPM and Score for BMS-Brainz, I1 and IIITH; y-axis from 0 to 1]
Training Data:
• Blogs: 150
• #tokens: 1914 (0)
Test Data:
• Blogs: 120
• #tokens: 1473 (885)
Sub-track Winner:
• IIITH
Contributors:
• Rekha Vaidyanathan (NIT Bhopal, TCS)
[Figure: bar chart of LA, TF, ETPM and Score for BMS-Brainz and IIITH; y-axis from 0 to 1]
Training Data:
• None
Test Data:
• Blogs: 49
• #tokens: 974 (0)
Sub-track Winner:
• IIITH
Contributors:
• Dr. Dinesh Jayagopi (IIIT Bangalore)
• Arun Prasad (IIITB)
• Kumaresh Krishnan (IIITB)
• P. S. Srinivasan (IIITB)
[Figure: bar chart of LA, TF and Score for BMS-Brainz and IIITH; y-axis from 0 to 1]