WASP: Web Archiving and Search Personalized
Johannes Kiesel, Arjen P. de Vries, Matthias Hagen, Benno Stein and Martin Potthast
@KieselJohannes, @arjenpdevries, @matthias_hagen, @bennostein, @martinpotthast
DESIRES, August 29th 2018
@KieselJohannes 2018
The Personal Search Engine: Motivation
The Personal Search Engine: Inspiration
Personal search engine!
WASP
[Architecture diagram. Components: Browser, proxy (warcprox), World Wide Web, WARCs, Search Index, pywb, Interface. Interface paths: /search and /archive/<time>/<url>]
❑ All requests and responses while browsing are stored and indexed
❑ A page on localhost allows searching. The result page links to the archive, where...
❑ Visited pages are reproduced for the corresponding time
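The /archive/<time>/<url> path above can be illustrated with a small sketch of how a result-page link for one visit might be built. This is not taken from the WASP code: the base address and port are hypothetical (the slides only say "localhost"), the helper name `archive_link` is made up for illustration, and the 14-digit UTC timestamp is an assumption based on the convention of Wayback-style replay tools such as pywb.

```python
from datetime import datetime, timezone

# Hypothetical base address; the slides only state that the page is on localhost.
BASE = "http://localhost:8001"


def archive_link(visited_at: datetime, url: str, base: str = BASE) -> str:
    """Build a /archive/<time>/<url> link for one visit.

    Assumption: <time> is a 14-digit UTC timestamp (YYYYMMDDhhmmss),
    as commonly used by Wayback-style replay tools.
    """
    stamp = visited_at.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"{base}/archive/{stamp}/{url}"


if __name__ == "__main__":
    visit = datetime(2018, 8, 29, 12, 0, 0, tzinfo=timezone.utc)
    print(archive_link(visit, "https://example.org/"))
```

Because the timestamp is part of the link, two visits to the same URL at different times lead to two different archived versions, which matches the third bullet above.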
WASP
Personal search engine!
Insight 1: Not Indexing Near-duplicates
❑ What changes warrant re-archiving?
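The slide leaves the question open. One common way to approach it, sketched here purely as an illustration and not as WASP's method, is to compare the word shingles of the previously archived text with those of the new text and to re-archive only when their Jaccard similarity falls below a threshold. The function names, the shingle size, and the threshold of 0.9 are all assumptions.

```python
def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles of a text (lower-cased)."""
    words = text.lower().split()
    if len(words) < k:
        # Short texts yield a single shingle, empty texts none.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def warrants_rearchiving(old_text: str, new_text: str,
                         threshold: float = 0.9) -> bool:
    """True if the new text differs enough from the archived one.

    The threshold is a hypothetical tuning parameter.
    """
    return jaccard(shingles(old_text), shingles(new_text)) < threshold


if __name__ == "__main__":
    old = "the quick brown fox jumps over the lazy dog"
    print(warrants_rearchiving(old, old))
    print(warrants_rearchiving(old, "an entirely different article about web archiving"))
```

A lower threshold stores fewer versions but risks missing small yet meaningful edits, so the right value depends on what kind of change the user cares about.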
Insight 2: Browsable and Deletable History
Insight 3: Easy (De-)activation of Archiving
Easy activation, deactivation, and status check
Patterns
Insight 4: Combined Indexing of Sub-pages
❑ Should visited sub-pages of a single article be indexed as one?
❑ If so, which sub-page should the result list link to?
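Both questions are open on the slide. As a minimal sketch of one possible answer, and not a description of what WASP does, visited sub-pages could be grouped under a key obtained by dropping pagination parameters from the URL, with the combined document linking to the first sub-page the user visited. The parameter names in `PAGINATION_PARAMS` and both helper functions are hypothetical.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical pagination parameter names; real sites vary widely.
PAGINATION_PARAMS = {"page", "p"}


def article_key(url: str) -> str:
    """Drop pagination parameters and the fragment to get one key per article."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k not in PAGINATION_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(query), ""))


def group_subpages(visits):
    """Combine (url, text) visits, given in visit order, into one document per article.

    Design choice (an assumption): link to the first visited sub-page.
    """
    groups = defaultdict(list)
    for url, text in visits:
        groups[article_key(url)].append((url, text))
    return {
        key: {"link": pages[0][0], "text": "\n".join(t for _, t in pages)}
        for key, pages in groups.items()
    }


if __name__ == "__main__":
    docs = group_subpages([
        ("https://example.org/story?page=1", "part one"),
        ("https://example.org/story?page=2", "part two"),
        ("https://example.org/other", "other"),
    ])
    for key, doc in docs.items():
        print(key, "->", doc["link"])
```

Linking to the sub-page that best matches the query instead of the first one would be an equally plausible choice; the slide does not settle this.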
Insight 5: Personalized Search
WASP: Web Archiving and Search Personalized
Insights overview
❑ Not Indexing Near-duplicates
❑ Browsable and Deletable History
❑ Easy (De-)activation of Archiving
❑ Combined Indexing of Sub-pages
❑ Personalized Search
Code and instructions on GitHub: github.com/webis-de/wasp
Thank you for your attention!