
How do users formulate their queries? A morpho-syntactic analysis

11th European Conference of Medical and Health Libraries. How do users formulate their queries? A morpho-syntactic analysis. Nicolas Ariste Fairon, Life sciences library, University of Liege, 4000 Liège, Belgium <nicolas.fairon@ulg.ac.be>


  1. 11th European Conference of Medical and Health Libraries. How do users formulate their queries? A morpho-syntactic analysis. Nicolas Ariste Fairon, Life sciences library, University of Liege, 4000 Liège, Belgium <nicolas.fairon@ulg.ac.be>. 24th of June, Nicolas Fairon

  2. [Diagram labels: Queries formulated in French natural language; French; Medline search strategy; MeSH]
     - Natural Language Processing
     - Automatic extraction of concepts

  3. Introduction – Material & Methods – Results – Conclusions
     The Facts
     - Despite the efforts made, many users remain unable to perform an efficient Medline search. Why?
       - Bad query formulation
       - Poor knowledge of MeSH terms
       - Not enough practice
       - Problems with Boolean operators

  4. What exists
     - Medline interfaces, with interesting features:
       - Query expansion
       - Searching MeSH and keywords
       - Automatic explosion...
       - Permuted index
       - MeSH translations
     - Elementary tools for natural language searching

  5. Natural Language Approach
     Analyzing the query to find relevant concepts
     [Diagram labels: Medline interfaces complexity; Efficiency]

     Natural language           Precision   Recall   Controlled language
     Torticollis                83.7%       100%     Torticollis [MeSH]
     Congenital torticollis     40.0%       90.0%    Torticollis/cn [MeSH]
     Smoking adverse effects    4.2%        44.1%    Smoking/ae [MeSH]

  6. What we want to do

  7. Materials & Methods
     - Manual: query submitted by user, then corrected, then semantically tagged, feeding the CORPUS (all queries)
     - Automatic: descriptive analysis and concept extraction
     - Approaches: dictionary, local grammar, hybrid

  8. Collecting the queries
     - Query submitted by user: Je cherche des articles sur le trétement du canser du sein.
     - Correcting: Je cherche des articles sur le traitement du cancer du sein. (I am looking for articles on the treatment of breast cancer.)
     - Tagging: Je cherche des articles sur le {w11s* traitement *} du {w21* cancer du sein *} .
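The tagged form shown on this slide can be read back mechanically. The following is a minimal sketch, assuming every tag has the shape seen in the single example (an opening brace, a flag such as `w11s`, an asterisk, the concept text, an asterisk, a closing brace). The slides do not explain what the flags encode, so the code treats them as opaque labels; the function names are illustrative.

```python
import re

# Assumed tag shape, inferred from the one example on the slide:
# "{" + flag + "* " + concept text + " *}". The flag is kept as an opaque string.
TAG_PATTERN = re.compile(r"\{(\w+)\*\s*(.+?)\s*\*\}")


def extract_tagged_concepts(tagged_query):
    """Return (flag, concept) pairs found in a tagged query."""
    return TAG_PATTERN.findall(tagged_query)


def strip_tags(tagged_query):
    """Remove the tags and give back the plain corrected query."""
    plain = TAG_PATTERN.sub(lambda m: m.group(2), tagged_query)
    # Remove the whitespace left before punctuation, e.g. "sein ." -> "sein."
    return re.sub(r"\s+([.,;:!?])", r"\1", plain)


tagged = "Je cherche des articles sur le {w11s* traitement *} du {w21* cancer du sein *} ."
concepts = extract_tagged_concepts(tagged)
for flag, concept in concepts:
    print(flag, "->", concept)
print(strip_tags(tagged))
```

Keeping the flag and the concept text together makes it possible to build the reference list of concepts directly from the tagged corpus.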

  9. Manual tagging
     - To append semantic flags to useful concepts
     - To identify and keep track of every concept
     - To evaluate the efficiency of our application

  10. The Corpus
      - A web application to store, for each query:
        - Raw, corrected, and tagged versions
        - Medline search history done by a scientific librarian
      - 195 queries formulated by 68 different users
      - 6 985 words

  11. Extracting concepts
      - Descriptive analysis and concept extraction with UNITEX
      - Approaches: dictionary, local grammar, hybrid
      - Dictionaries: French MeSH
      - Local grammars: hand-made

  12. Evaluation of automatic extraction
      - List A: concepts extracted automatically from the untagged queries
      - List B: concepts from the tagged corpus (reference)
      - Comparison of List A vs. the corpus yields recall and precision
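The comparison of the two lists can be sketched as follows. This is an illustration only: it assumes concepts are compared as case-insensitive exact strings, which the slides do not specify, and the two example lists are made up rather than taken from the corpus.

```python
def precision_recall(extracted, reference):
    """Compare extracted concepts (List A) with tagged reference concepts (List B).

    Assumption: two concepts match when their lowercased strings are identical.
    """
    extracted_set = {concept.lower() for concept in extracted}
    reference_set = {concept.lower() for concept in reference}
    true_positives = len(extracted_set & reference_set)
    # Precision: share of extracted concepts that are in the reference.
    precision = true_positives / len(extracted_set) if extracted_set else 0.0
    # Recall: share of reference concepts that were extracted.
    recall = true_positives / len(reference_set) if reference_set else 0.0
    return precision, recall


# Illustrative data, not results from the study.
list_a = ["traitement", "cancer du sein", "articles"]
list_b = ["traitement", "cancer du sein"]
precision, recall = precision_recall(list_a, list_b)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case one spurious concept ("articles") lowers precision while recall stays complete, which is the kind of trade-off the following charts report per concept type.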

  13. Descriptive analysis: 464 concepts have been identified

  14. Concept extraction: dictionary approach
      - Applying the MeSH dictionary to the queries in order to identify concepts.
      [Bar chart: recall and precision (%) for MeSH terms, subheadings, and keywords]
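A dictionary approach of this kind can be illustrated with a longest-match lookup. The sketch below is not the tool used in the study: the three-entry dictionary and its category labels are hypothetical stand-ins for the French MeSH, and the greedy longest-match strategy is an assumption.

```python
import re

# Hypothetical mini-dictionary (entry -> category); NOT the real French MeSH.
TOY_DICTIONARY = {
    "cancer du sein": "MeSH term",
    "cancer": "MeSH term",
    "traitement": "Subheading",
}
MAX_ENTRY_WORDS = max(len(entry.split()) for entry in TOY_DICTIONARY)


def dictionary_lookup(query):
    """Scan the query left to right, keeping the longest dictionary entry at each position."""
    tokens = re.findall(r"\w+", query.lower())
    found = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first so "cancer du sein" wins over "cancer".
        for size in range(min(MAX_ENTRY_WORDS, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + size])
            if candidate in TOY_DICTIONARY:
                found.append((candidate, TOY_DICTIONARY[candidate]))
                i += size
                break
        else:
            # No entry starts at this token; move on.
            i += 1
    return found


matches = dictionary_lookup("Je cherche des articles sur le traitement du cancer du sein.")
print(matches)
```

A lookup like this can only find what the dictionary contains, which is consistent with the later conclusion that dictionary quality affects performance.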

  15. Concept extraction: local grammar approach
      - Use recognition patterns relying on the morphology and syntax of the queries.
      [Bar chart: recall and precision (%) for MeSH terms, subheadings, and keywords]
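To give a feel for a recognition pattern based on query syntax, here is a rough sketch in which a plain regular expression stands in for a real local grammar. The pattern, the list of request openers, and the function name are all assumptions made for illustration; they are not the hand-made grammars of the study.

```python
import re

# Hypothetical pattern: a request opener ("je cherche des articles sur ..."),
# then a French article, then the topic up to the end of the sentence.
REQUEST_PATTERN = re.compile(
    r"\b(?:je cherche|je recherche|je voudrais)\s+des\s+"
    r"(?:articles|documents|informations)\s+sur\s+"
    r"(?:(?:les|le|la)\s+|l')"
    r"(?P<topic>[^.?!]+)",
    re.IGNORECASE,
)


def extract_topic(query):
    """Return the phrase following the request opener, or None if the pattern does not apply."""
    match = REQUEST_PATTERN.search(query)
    return match.group("topic").strip() if match else None


topic = extract_topic("Je cherche des articles sur le traitement du cancer du sein.")
print(topic)
```

Unlike the dictionary lookup, this kind of pattern can isolate a topic phrase even when its words are absent from any term list, but it only fires on phrasings the pattern anticipates.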

  16. Concept extraction: hybrid approach
      - Using local grammars combined with dictionaries
      [Bar chart: recall and precision (%) for MeSH terms, subheadings, and keywords]

  17. Conclusions
      - Creating a new interface based on natural language processing involves:
        - Concept mapping
        - Concept combination
      - The hybrid approach shows the best results:
        - Dictionaries
        - Local grammar
      - Dictionary quality influences performance

  18. What's next?
      - Disambiguation of fuzzy MeSH concepts
      - Combination of the concepts with adequate Boolean operators
      - Making the tool available to users as a web application

  19. Thank you for your attention. nicolas.fairon@ulg.ac.be
      Open source tools used for the work and the presentation:
