Academic Recommendation using Citation Analysis with the advisor

  1. Academic Recommendation using Citation Analysis with the advisor
     Erik Saule (esaule@bmi.osu.edu), joint work with Onur Küçüktunç, Kamer Kaya, and Ümit V. Çatalyürek
     Department of Biomedical Informatics, The Ohio State University
     CSTA 2013
     Ohio State University, Biomedical Informatics; the advisor: http://theadvisor.osu.edu/; Erik Saule, HPC Lab: http://bmi.osu.edu/hpc

  2. Table of Contents
     1 Introduction: Why?; Overview
     2 Citation Analysis for Document Recommendation: Previous Approaches; Direction Aware Recommendation
     3 A High Performance Computing Problem: A specialization of SpMV; Ordering and Partitioning
     4 Result Diversification
     5 Other Features
     6 Final Thoughts: Conclusion; Future Works

  6. Once upon a time: a survey paper
     The Jimmy John's scheduling problem
     Keyword combinations (×): linear, pipeline, scheduling, chain, workflow, partitioning, sequences, data flow, mapping, (tree) task graph, (serial parallel)
     But also...
     "Scheduling problems in parallel query optimization"
     "Bringing skeletons out of the closet: A pragmatic manifesto for skeletal parallel programming"
     After 6 months, unknown papers were still being uncovered.
     Develop software to make the search easier!

  7. Design Goals
     Personalized: The user should be able to make a query that describes precisely what she is looking for.
     Conceptual: The system should be free of linguistic problems. Ambiguity and synonymy should be taken into account.
     Exploratory: Different perspectives should be available. The system should enhance the user's search.
     Easy to use: The user should not need to know anything about data mining or algorithms.

  15. The Academic Web Service Ecosystem
      DBLP: List of CS papers with clean references and disambiguated names.
      Citeseer, {Ref,Ack,Collab}Seer: Automatically crawled papers in CS. Gives PDFs. Contains citation information and full text. Computes similarity.
      CiteULike: Social paper tagging application. Finds papers from researchers with similar interests.
      ArnetMiner: Academic network analysis.
      Mendeley: Application for managing references. Database of references.
      Google Scholar: Keyword-based search engine (with citation information).
      Microsoft Academic Search: Keyword-based search engine and academic network analysis.
      IEEE, ACM, Elsevier, JSTOR, ...: Publishers or digital libraries with complete text and references. Some suggestions.

  16. A Use Case

  17. System Overview
      Architecture: A web server as the front end. A cluster in the back end. New instances are dynamically created as the load varies.
      [Figure: system diagram. Components: Mapper, Paper Recommendation Engine, Venue Rec., Reviewer Rec., Diversification Engine, Visualization, Relevance Feedback. Labels: .bib, .ris, .xml, paper IDs, π, venues, reviewers, papers; functional parameters {k, d, κ}.]

