Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia


  1. Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia. Olaf Janssen, National Library of the Netherlands & Wikipedia; Gerard Kuys, DBpedia & Wikimedia Nederland. olaf.janssen@kb.nl - @ookgezellig - slideshare.net/OlafJanssenNL. SWIB 2016, Bonn, 29-11-2016

  2. http://www.4en5meiamsterdam.nl/attachment/47454

  3. http://www.4en5meiamsterdam.nl/attachment/47454 During WW2 the Dutch resistance issued many underground newspapers. In every shape & form…

  4. From well-organized, ‘professional’ big titles… (among others Parool, Vrij Nederland, Trouw, de Waarheid) http://resolver.kb.nl/resolve?urn=ddd:010436323 http://resolver.kb.nl/resolve?urn=ddd:010442948 http://resolver.kb.nl/resolve?urn=ddd:010450508 http://resolver.kb.nl/resolve?urn=ddd:010447825

  5. …to very small, amateur, home-made, pamphlet-like issues

  6. After the war, 1,300 newspaper titles were (physically) preserved at the NIOD in Amsterdam … the national Institute for War, Holocaust and Genocide Studies. https://commons.wikimedia.org/wiki/File:Verzetskrant_in_archiefdozen_bij_het_NIOD.jpg – CC-BY-SA - OlafJanssen

  7. … and were described in formal library catalogues (1,300 titles). Example: bibliographic metadata for an underground students’ newspaper from The Hague http://opac-gonext.oclc.org:8180/DB=8/XMLPRS=Y/PPN?PPN=107123223

  8. In 2010 these WW2 newspapers were digitized…

  9. …into full-texts in Delpher … (1,300 titles). Delpher is the Dutch national aggregator for historic full-texts • Newspapers • Books • Magazines www.delpher.nl/kranten

  10. In Delpher you can read and search these newspapers… • Scans • Full-text OCR • ALTO
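
ALTO is an XML format that stores OCR text together with the position of each word on the page. As a minimal sketch of how plain text could be pulled from such a file, the Python below reads a tiny hand-written ALTO-like fragment. The sample words and coordinates are invented for illustration and are not real Delpher data; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written ALTO-like fragment (invented sample, not real Delpher data).
SAMPLE_ALTO = """<alto>
  <Layout>
    <Page>
      <PrintSpace>
        <TextBlock>
          <TextLine>
            <String CONTENT="Vrij" HPOS="10" VPOS="20"/>
            <SP/>
            <String CONTENT="Nederland" HPOS="60" VPOS="20"/>
          </TextLine>
          <TextLine>
            <String CONTENT="1943" HPOS="10" VPOS="50"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""


def local_name(tag: str) -> str:
    """Drop an XML namespace prefix such as '{http://...}String'."""
    return tag.rsplit("}", 1)[-1]


def alto_lines(xml_text: str) -> list[str]:
    """Return one plain-text string per ALTO TextLine element."""
    root = ET.fromstring(xml_text)
    lines = []
    for element in root.iter():
        if local_name(element.tag) != "TextLine":
            continue
        # Each word is a String element; its text lives in the CONTENT attribute.
        words = [
            child.get("CONTENT", "")
            for child in element
            if local_name(child.tag) == "String"
        ]
        lines.append(" ".join(words))
    return lines


print(alto_lines(SAMPLE_ALTO))
```

Matching on the local tag name keeps the sketch independent of which ALTO schema version (and therefore which namespace) a given file declares.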

  11. But say, I want to know more about this newspaper • What sort of illegal newspaper was it? • What is the history of this newspaper? • Who wrote it? • Where was this newspaper printed? • How was it distributed? • Were there any relations with other underground newspapers? • Etc…

  12. But say, I want to know more about this newspaper • What sort of illegal newspaper was it? • What is the history of this newspaper? • Who wrote it? • Where was this newspaper printed? • How was it distributed? • Were there any relations with other underground newspapers or resistance groups? • Etc…

  13. But say, I want to know more about this newspaper • What sort of illegal newspaper was it? • What is the history of this newspaper? • Who wrote it? • Where was this newspaper printed? • How was it distributed? • Were there any relations with other underground newspapers? • Etc… You can’t answer these questions from Delpher

  14. Big drawback of Delpher: No contextual information about WW2 underground newspapers https://thejungleisneutral.files.wordpress.com/2013/11/lost.jpg

  15. Where would many people go to find contextual information about historic newspapers? Probably Wikipedia (via Google) http://nl.wikipedia.org/wiki/De_Geus_onder_studenten_(verzetsblad)

  16. http://2.bp.blogspot.com/_BWzuYwiS6-I/TMgeRsFd3mI/AAAAAAAAElw/3cvgbZSPWcs/s1600/doctor+macro+judy+scared.jpg

  17. http://2.bp.blogspot.com/_BWzuYwiS6-I/TMgeRsFd3mI/AAAAAAAAElw/3cvgbZSPWcs/s1600/doctor+macro+judy+scared.jpg

  18. Information on underground newspapers is distributed across multiple, unconnected sources: 1. Descriptions (metadata in library catalogue, 1,300 titles) 2. Content (full-text in Delpher, 1,300 titles) 3. Context (in Wikipedia… at least…) http://2.bp.blogspot.com/_BWzuYwiS6-I/TMgeRsFd3mI/AAAAAAAAElw/3cvgbZSPWcs/s1600/doctor+macro+judy+scared.jpg

  19. This Wikipedia article is a carefully chosen exception

  20. 1. There are very few illegal newspapers with their own WP articles. 2. The inventory of these newspapers on WP is far from complete. (<<< 1,300 titles)

  21. We can tackle both problems!

  22. Wikiproject: Systematically and uniformly describe & interlink all 1,300 Dutch underground newspapers from WW2 on Wikipedia. tinyurl.com/verzetskranten

  23. Wikiproject: Systematically and uniformly describe & interlink all 1,300 Dutch underground newspapers from WW2 on Wikipedia. 1) Reach big audiences 2) Automatically make data available for other open purposes (Wikidata -- DBpedia -- Dataviz). tinyurl.com/verzetskranten

  24. We badly need contextual information about the newspapers. Where do we get it? De Ondergrondse Pers 1940-1945, Lydia E. Winkel, H. de Vries, 1989, ISBN 9021837463, Veen Uitgevers. This paper book contains entries about all 1,300 illegal newspapers. https://thejungleisneutral.files.wordpress.com/2013/11/lost.jpg

  25. Entry 199 – De Geus; (onder studenten) Unique ID (within the book)

  26. Entry 199 – De Geus; (onder studenten) Place of publication: Newspaper → Place name

  27. Entry 199 – De Geus; (onder studenten) Context Raw material for Wikipedia article!

  28. Entry 199 – De Geus; (onder studenten) Person names: Newspaper → Persons

  29. Entry 199 – De Geus; (onder studenten) IDs of related students’ newspapers: This newspaper → Other newspapers
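
Slides 25 to 29 pick out five parts of a book entry: a unique ID, a place of publication, free-text context, person names, and the IDs of related newspapers. One way such an entry could be held in memory is sketched below. The class name, field names and relation labels are hypothetical, and every value except the entry number and title is a placeholder rather than data taken from the book.

```python
from dataclasses import dataclass, field


@dataclass
class BookEntry:
    """One entry from the book, split into the parts shown on the slides."""
    entry_id: int                 # unique ID within the book
    title: str
    place: str = ""               # place of publication
    context: str = ""             # free text, raw material for an article
    persons: list = field(default_factory=list)
    related_ids: list = field(default_factory=list)

    def links(self):
        """Yield (subject, relation, object) tuples for this entry."""
        subject = f"entry:{self.entry_id}"
        if self.place:
            yield (subject, "publishedIn", self.place)
        for person in self.persons:
            yield (subject, "involves", person)
        for other in self.related_ids:
            yield (subject, "relatedTo", f"entry:{other}")


entry = BookEntry(
    entry_id=199,
    title="De Geus (onder studenten)",
    place="PLACE_PLACEHOLDER",          # placeholder, not the real place
    persons=["PERSON_A", "PERSON_B"],   # placeholders, not real names
    related_ids=[1, 2],                 # placeholder IDs, not real cross-references
)
for link in entry.links():
    print(link)
```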

  30. We OCRed this book into PDF (CC-BY-SA) http://www.niod.nl/nl/de-ondergrondse-pers-1940-1945 (PDF)

  31. We OCRed this book into PDF (CC-BY-SA) http://www.niod.nl/nl/de-ondergrondse-pers-1940-1945 (PDF). Available online (PDF, flat file). Open license (CC-BY-SA). Convert PDF into structured database. Link: titles → places, persons, other titles. Link: titles → library catalogue (metadata) and Delpher (full-text). Link: titles, persons and places → external sources

  32. Convert PDF into structured database. Link: titles → places, persons, other titles. Link: titles → library catalogue (metadata) and Delpher (full-text). Link: titles, persons and places → external sources. My co-author Gerard Kuys

  33. Convert PDF into structured database. Link: titles → places, persons, other titles. Link: titles → library catalogue (metadata) and Delpher (full-text). Link: titles, persons and places → external sources. VIAF
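
The linking steps above amount to producing subject-predicate-object statements. As a rough sketch, the Python below writes such statements in N-Triples syntax using only the standard library. The `example.org` namespace, the predicate names and the placeholder targets are all assumptions made for illustration; they are not the vocabulary or identifiers used in the actual project.

```python
BASE = "http://example.org/verzetskranten/"  # hypothetical namespace


def uri(local: str) -> str:
    """Wrap a local name in the hypothetical namespace as an N-Triples IRI."""
    return f"<{BASE}{local}>"


def literal(text: str) -> str:
    """Quote a plain string literal, escaping backslash, quote and newline."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return f'"{escaped}"'


def ntriple(subject: str, predicate: str, obj: str) -> str:
    """Format one statement as a single N-Triples line."""
    return f"{subject} {predicate} {obj} ."


newspaper = uri("entry/199")
triples = [
    # title -> literal value
    ntriple(newspaper, uri("vocab/title"), literal("De Geus (onder studenten)")),
    # title -> place (placeholder target)
    ntriple(newspaper, uri("vocab/placeOfPublication"), uri("place/PLACEHOLDER")),
    # title -> other title (placeholder target)
    ntriple(newspaper, uri("vocab/relatedTo"), uri("entry/PLACEHOLDER")),
]
print("\n".join(triples))
```

Statements in this line-based form can be bulk-loaded by most triple stores, which is one reason it is a convenient intermediate format.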

  34. Technical appendix from slide 48 onwards

  35. We OCRed this book into PDF (CC-BY-SA) http://www.niod.nl/nl/de-ondergrondse-pers-1940-1945 (PDF). Available online (PDF, flat file). Open license (CC-BY-SA). Convert PDF into structured database. Link: titles → places, persons, other titles. Link: titles → library catalogue (metadata) and Delpher (full-text). Link: titles, persons and places → external sources

  36. Summer 2016: This LOD triple store (Virtuoso) is unique in the Netherlands. First time data about underground newspapers is systematically collected and linked online! 1) For Wikipedia 2) For other open reuse purposes (Wikidata -- DBpedia -- Dataviz). https://www.pinterest.com/freethewronged/world-war-ii/
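
A triple store such as Virtuoso is typically queried with SPARQL over HTTP, with the query text passed in a `query` parameter. The sketch below only builds such a request URL and does not send it. The endpoint address, the namespace and the predicate are hypothetical and stand in for whatever the project's store actually exposes.

```python
from urllib.parse import urlencode

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint address

# Hypothetical vocabulary: list newspapers with their place of publication.
QUERY = """
PREFIX ex: <http://example.org/verzetskranten/vocab/>
SELECT ?newspaper ?place
WHERE {
  ?newspaper ex:placeOfPublication ?place .
}
LIMIT 10
""".strip()


def build_request_url(endpoint: str, query: str) -> str:
    """Return a GET URL carrying the SPARQL query in the 'query' parameter."""
    return f"{endpoint}?{urlencode({'query': query})}"


url = build_request_url(ENDPOINT, QUERY)
print(url)
```

The result format (for example JSON) would normally be requested through an HTTP `Accept` header when the URL is actually fetched.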

  37. Wikiproject: Systematically and uniformly describe & interlink all 1,300 Dutch underground newspapers from WW2 on Wikipedia

  38. We have: LOD-database. Using an article template we generated 1,300 uniform and interlinked Wikipedia stubs. https://c1.staticflickr.com/9/8281/7699231918_11a7356c38_b.jpg

  39. https://nl.wikipedia.org/wiki/De_Geus_onder_studenten_(verzetsblad) Non-grey = Wikipedia article stub Automatically generated from database using a template
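
Generating uniform stubs from a database is essentially a mail merge: one template, many records. A minimal sketch with Python's `string.Template` follows. The template wording, the field names and the placeholder record are hypothetical; the actual stub template used on Dutch Wikipedia is not reproduced here.

```python
from string import Template

# Hypothetical stub template in wikitext; the wording is invented for illustration.
STUB_TEMPLATE = Template(
    "'''$title''' was an underground newspaper published in $place "
    "during World War II. It is entry $entry_id in "
    "''De Ondergrondse Pers 1940-1945''.\n"
    "\n"
    "== Related newspapers ==\n"
    "$related\n"
)


def render_stub(record: dict) -> str:
    """Fill the template from one database record."""
    # Related titles become a bulleted list of wiki links, which interlinks the stubs.
    related = "\n".join(f"* [[{name}]]" for name in record["related_titles"])
    return STUB_TEMPLATE.substitute(
        title=record["title"],
        place=record["place"],
        entry_id=record["entry_id"],
        related=related,
    )


record = {
    "entry_id": 199,
    "title": "De Geus (onder studenten)",
    "place": "PLACE_PLACEHOLDER",                              # placeholder
    "related_titles": ["RELATED_TITLE_A", "RELATED_TITLE_B"],  # placeholders
}
print(render_stub(record))
```

Because `substitute` raises `KeyError` when a field is missing, incomplete records surface immediately instead of producing half-filled stubs.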

  40. https://nl.wikipedia.org/wiki/De_Geus_onder_studenten_(verzetsblad) This bit was added manually to expand stub into full article → Crowdsourcing by Dutch Wikipedia community

  41. A group of Wikipedia volunteers is currently working to expand the 1,300 stubs … gradually creating more and more full articles. By Sebastiaan ter Burg [CC BY 2.0 (http://creativecommons.org/licenses/by/2.0)], via Wikimedia Commons

  42. Before the project

  43. The number of articles is growing steadily …

  44. … making many Dutch people happy! http://www.formerdays.com/2011/05/dutch-liberation.html

  45. Thanks! olaf.janssen@kb.nl - @ookgezellig tinyurl.com/verzetskranten

  46. Technical appendix Slides by Gerard Kuys http://www.ilord.com/vintage.html - http://www.ilord.com/images/enigma-8-rotors-1000px.jpg

  47. Transforming Descriptive Data into Linked Open Data - Locations

  48. Transforming Descriptive Data into Linked Open Data - Persons

  49. Transforming Descriptive Data into Linked Open Data - interlinking
