WASP: Web Archiving and Search Personalized
Johannes Kiesel, Arjen P. de Vries, Matthias Hagen, Benno Stein, and Martin Potthast
@KieselJohannes, @arjenpdevries, @matthias_hagen, @bennostein, @martinpotthast
DESIRES, August 29th, 2018


  1. WASP: Web Archiving and Search Personalized. Johannes Kiesel, Arjen P. de Vries, Matthias Hagen, Benno Stein, and Martin Potthast (@KieselJohannes, @arjenpdevries, @matthias_hagen, @bennostein, @martinpotthast). DESIRES, August 29th, 2018

  2. The Personal Search Engine: Motivation

  3. The Personal Search Engine: Motivation

  4. The Personal Search Engine: Motivation

  5. The Personal Search Engine: Motivation

  6. The Personal Search Engine: Motivation

  7. The Personal Search Engine: Motivation

  8. The Personal Search Engine: Motivation

  9. The Personal Search Engine: Motivation

  10. The Personal Search Engine: Motivation

  11. The Personal Search Engine: Motivation

  12. The Personal Search Engine: Motivation

  13. The Personal Search Engine: Inspiration

  14. The Personal Search Engine: Inspiration
      Personal search engine!

  15. WASP

  16. WASP

  17. WASP

  18. WASP

  19. WASP

  20. WASP
      [Architecture diagram; labels: Search Index, Interface, World Wide Web, pywb, WARCs, Browser, warcprox]

  21. WASP
      [Architecture diagram; labels: Search Index, Interface, World Wide Web, pywb, WARCs, Browser, proxy, warcprox]
      ❑ All requests and responses while browsing are stored and indexed
      ❑ A page on localhost allows searching. The result page links to the archive where...
      ❑ Visited pages are reproduced for the corresponding time

  22. WASP
      [Architecture diagram; labels: /search, Search Index, Interface, World Wide Web, pywb, WARCs, Browser, warcprox]
      ❑ All requests and responses while browsing are stored and indexed
      ❑ A page on localhost allows searching. The result page links to the archive where...
      ❑ Visited pages are reproduced for the corresponding time

  23. WASP
      [Architecture diagram; labels: Search Index, Interface, /archive/<time>/<url>, World Wide Web, pywb, WARCs, Browser, warcprox]
      ❑ All requests and responses while browsing are stored and indexed
      ❑ A page on localhost allows searching. The result page links to the archive where...
      ❑ Visited pages are reproduced for the corresponding time
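The archive path `/archive/<time>/<url>` shown on slide 23 can be illustrated with a small sketch. It assumes that `<time>` is a 14-digit UTC timestamp (`YYYYMMDDhhmmss`), the format pywb uses in its replay URLs. The base address `http://localhost:8001` and the helper name `archive_url` are hypothetical; the slides only say that the page is served on localhost.

```python
from datetime import datetime, timezone

# Hypothetical base address; the slides only state "localhost".
BASE = "http://localhost:8001"


def archive_url(visited_at: datetime, url: str, base: str = BASE) -> str:
    """Build a link of the form /archive/<time>/<url> for one visit.

    Assumption: <time> is a 14-digit UTC timestamp (YYYYMMDDhhmmss).
    """
    stamp = visited_at.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"{base}/archive/{stamp}/{url}"


# Example visit time and page, chosen for illustration only.
visit = datetime(2018, 8, 29, 9, 30, 0, tzinfo=timezone.utc)
print(archive_url(visit, "https://example.org/article"))
```

A search result for a page visited at a given time would then link to the archived copy for that time rather than to the live page.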

  24. WASP
      Personal search engine!

  25. WASP

  26. Insight 1: Not Indexing Near-duplicates
      ❑ What changes warrant re-archiving?
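One possible way to make the question on slide 26 concrete is to compare the text of a newly visited page with the last archived version and re-archive only when the two differ enough. The sketch below uses word shingles and Jaccard similarity. This is a common near-duplicate technique chosen here for illustration, not a method the slides attribute to WASP. The function names and the threshold of 0.9 are hypothetical.

```python
def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles of a page's text."""
    words = text.lower().split()
    if len(words) < k:
        # Short texts yield a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def needs_rearchiving(old_text: str, new_text: str, threshold: float = 0.9) -> bool:
    """Re-archive only if similarity drops below the (hypothetical) threshold."""
    return jaccard(shingles(old_text), shingles(new_text)) < threshold


old = "the quick brown fox jumps over the lazy dog"
print(needs_rearchiving(old, old))
print(needs_rearchiving(old, "completely different content about web archives today"))
```

The threshold trades index size against the risk of missing a meaningful change; where to set it is exactly the open question the slide raises.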

  27. Insight 2: Browsable and Deletable History

  28. Insight 3: Easy (De-)activation of Archiving
      Easy activation, deactivation, and status-check
      Patterns

  29. Insight 4: Combined Indexing of Sub-pages
      ❑ Should visited sub-pages of a single article be indexed as one?
      ❑ If so, which sub-page should the result list link to?
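The two questions on slide 29 can be made concrete with a minimal sketch, under strong assumptions: sub-pages of an article differ only in a `page` query parameter, and the result list links to the first sub-page the user visited. Both the grouping rule and the linking policy are hypothetical choices for illustration; the slides leave them open.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def article_key(url: str, page_param: str = "page") -> str:
    """Drop the (assumed) pagination parameter and fragment from a URL."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != page_param]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(query), ""))


def group_subpages(visited: list) -> dict:
    """Group visited URLs by article key, preserving visit order."""
    groups = {}
    for url in visited:
        groups.setdefault(article_key(url), []).append(url)
    return groups


visited = [
    "https://example.org/story?page=2",
    "https://example.org/story?page=1",
    "https://example.org/other",
]
groups = group_subpages(visited)
for key, urls in groups.items():
    # Hypothetical policy: link to the first sub-page that was visited.
    print(key, "->", urls[0])
```

Other linking policies, such as the sub-page that best matches the query, would fit the same grouping.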

  30. Insight 5: Personalized Search

  31. Insight 5: Personalized Search

  32. WASP: Web Archiving and Search Personalized
      Insights overview
      ❑ Not Indexing Near-duplicates
      ❑ Browsable and Deletable History
      ❑ Easy (De-)activation of Archiving
      ❑ Combined Indexing of Sub-pages
      ❑ Personalized Search
      Code and Instructions on GitHub: github.com/webis-de/wasp
      Thank you for your attention!
