
Introduction to NTCIR-7 - Noriko Kando



  1. Introduction to NTCIR-7
     Noriko Kando, National Institute of Informatics, Japan
     http://research.nii.ac.jp/ntcir/
     kando (at) nii.ac.jp
     Noriko Kando, NTC intro, 2008-12-16

  2. Road map
     • What is NTCIR
     • Lessons learned from past NTCIRs
     • Brief introduction to NTCIR-7
     • Conclusion

  3. NTCIR: NII Test Collection for Information Retrieval
     Research Infrastructure for Evaluating IA
     A series of evaluation workshops designed to enhance research in information-access technologies by providing an infrastructure for large-scale evaluations.
     ■ Data sets, evaluation methodologies, and forum
     • Project started in late 1997; held once every 18 months
     • Data sets (test collections or TCs)
       – Scientific, news, patents, and web
       – Chinese, Korean, Japanese, and English
     • Tasks
       – IR: cross-lingual tasks, patents, web
       – QA: monolingual tasks, cross-lingual tasks
       – Summarization, trend info., patent maps
       – Opinion analysis, text mining
     • Community-based research activities
       – NTCIR-7 participants: 82 groups from 15 countries

  4. NTCIR provides:
     A scientific basis for understanding the effectiveness of automated search systems
     • Large-scale "Test Collections" or TCs: a document set, a set of topics, and a list of relevant documents for each topic
       – Organizers provide a data set
       – Participants use the same data set to compare the effectiveness of their systems
       – TCs are available for research purposes
     • Forum of researcher groups
     • Showcase of state-of-the-art technologies
     • Investigations into evaluation methodologies and metrics
     NTCIR enables:
     • Cross-system comparison on a common infrastructure
     • Faster R&D and technology transfer

  5. Information retrieval
     • Retrieve RELEVANT information from a vast collection to meet users' information needs
     • Using computers since the 1950s
     • First CS field to use human assessments as success criteria
       – Judgments vary
       – Comparative evaluations on the same infrastructure

  6. Information access (IA)
     • The whole process of making information from a vast collection of documents usable by users
     • For example: IR, text summarization, QA, text mining, and clustering
     • Uses human assessments as success criteria

  7. Focus of NTCIR
     • Lab-type IR test
       – Asian languages/cross-language
       – Variety of genres
       – Parallel/comparable corpus
     • New challenges
       – Intersection of IR + NLP
       – To make information in the documents more usable for users!
       – Realistic eval/user task
     • Forum for researchers
       – Idea exchange
       – Discussion/investigation on evaluation methods/metrics

  8. History
     Project starts late 1997
     • NTCIR-1: Nov ’98 – Sep ’99
     • NTCIR-2: Jun ’00 – Mar ’01
     • NTCIR-3: Sep ’01 – Oct ’02
     • NTCIR-4: Apr ’03 – Jun ’04
     • NTCIR-5: Oct ’04 – Dec ’05
     • NTCIR-6: Apr ’06 – May ’07
     • NTCIR-7: Oct ’07 – Dec ’08
     NTCIR-7 Workshop Meeting: Dec 16-19

  9. Tasks at past NTCIRs
     [Table: Tasks (Research Areas) of NTCIR Workshops; rows are tasks, columns are the 1st through 6th workshops]
     • Japanese IR (news, sci)
     • Cross-lingual IR
     • Patent Retrieval (map/classif)
     • Web Retrieval (Navigational, Geo, Result Classification)
     • Term Extraction
     • Question Answering (Info Access Dialog)
     • Text Summarization (Summ, metrics, Cross-Lingual)
     • Trend Information
     • Opinion Analysis

  10. NTCIR-7 Clusters
      • Cluster 1. Advanced CLIA:
        – Complex CLQA (Chinese, Japanese, English)
        – IR for QA (Chinese, Japanese, English)
      • Cluster 2. User-Generated:
        – Multilingual Opinion Analysis
      • Cluster 3. Focused Domain: Patent
        – Patent Translation; English -> Japanese
        – Patent Mining; paper -> IPC
      • Cluster 4. MuST:
        – Multi-modal Summarization of Trends
      • MuST; Visualization Challenge

  11. Number of Participants by Tasks
      [Figure: number of participating groups (y-axis, 0 to 120) per workshop, 1st through 7th (x-axis), broken down by task. Series: Japanese IR, CLIR, Non-Japanese IR, Patent Retrieval, Patent Mining, Patent MT, Web Retrieval, Term Extraction, Summarization, Trend Info, QA, CLQA, ACLIA CCLQA, ACLIA IR4QA, Opinion.]

  12. NTCIR-7 PC Meeting@NTCIR-6
      Mark Sanderson, Doug Oard, Atsushi Fujii, Tatsunori Mori, Fred Gey, Noriko Kando (and others)

  13. NTCIR-7: Advanced CLIA
      • Teruko Mitamura (CMU)
      • Eric Nyberg (CMU)
      • Ruihua Chen (MSRA)
      • Fred Gey (UCB)
      • Donghong Ji (Wuhan Univ)
      • Noriko Kando (NII)
      • Chin-Yew Lin (MSRA)
      • Chuan-Jie Lin (Nat Taiwan Ocean Univ)
      • Tsuneaki Kato (Tokyo Univ)
      • Tatsunori Mori (Yokohama N Univ)
      • Tetsuya Sakai (NewsWatch)
      • Advisor: K.L. Kwok (Queens College)
      CLEF2008, 2008-09-18, Noriko Kando

  14. NTCIR-7: UGC (Blog)
      • David K Evans (NII -> Amazon Japan)
      • Yohei Seki (Toyohashi U Tech -> Columbia U)
      • LunWei Ku (National Taiwan Univ)
      • Le Sun (Chinese Academy of Science)
      • Hsin-Hsi Chen (National Taiwan Univ)
      • Noriko Kando (NII)

  15. NTCIR-7: Focused Domain (Patent)
      • Atsushi Fujii (Univ Tsukuba)
      • Taiichi Hashimoto (Tokyo Inst Tech)
      • Makoto Iwayama (Tokyo Inst Tech / Hitachi)
      • Hidetsugu Nanba (Hiroshima City Univ)
      • Masao Utiyama (NICT)
      • Mikio Yamamoto (U Tsukuba)
      • Takehito Utsuro (U Tsukuba)

  16. MuST: Multimodal Summarization for Trend Information
      • Tsuneaki Kato (Tokyo Univ)
      • Mitsunori Matsushita (NTT Comm Sci Lab -> Kansei Univ)
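
The test-collection setup described on slide 4 (a document set, a set of topics, and a list of relevant documents for each topic, shared by all participants) can be illustrated with a small sketch. Everything below is hypothetical: the topic and document IDs, the two system names, and the choice of mean average precision as the metric are assumptions made for illustration, since the slides do not name a specific metric or data format.

```python
from typing import Dict, List, Set

# Hypothetical relevance judgments: topic id -> set of relevant document ids.
qrels: Dict[str, Set[str]] = {
    "T1": {"d1", "d4"},
    "T2": {"d2"},
}

# Hypothetical ranked results from two systems run on the same topics.
runs: Dict[str, Dict[str, List[str]]] = {
    "system_A": {"T1": ["d1", "d2", "d4"], "T2": ["d2", "d3"]},
    "system_B": {"T1": ["d2", "d1", "d3"], "T2": ["d3", "d2"]},
}


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Average of the precision values at each rank where a relevant document appears."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return total / len(relevant)


def mean_average_precision(run: Dict[str, List[str]],
                           judgments: Dict[str, Set[str]]) -> float:
    """Mean of average precision over all judged topics."""
    per_topic = [average_precision(run.get(topic, []), relevant)
                 for topic, relevant in judgments.items()]
    return sum(per_topic) / len(per_topic)


# Both systems are scored against the same judgments, so the scores are comparable.
scores = {name: mean_average_precision(run, qrels) for name, run in runs.items()}
for name, score in sorted(scores.items()):
    print(f"{name}: MAP = {score:.3f}")
```

Because both runs are scored against the same topics and judgments, the difference between the two scores reflects the systems rather than the data, which is the point of evaluating on a common infrastructure.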
