Introduction to Historical Text Reuse Detection



  1. Introduction to Historical Text Reuse Detection. Marco Büchler, Emily Franzini, Greta Franzini, Maria Moritz. eTRAP Research Group, Göttingen Centre for Digital Humanities, Institute of Computer Science, Georg August University Göttingen, Germany. KITAB DH Hackathon 2015, 20 October 2015

  2. Overview
     • What is text reuse?
     • Aspects of text reuse
     • ACID for the Digital Humanities
     • Big (Humanities) Data
     • Language Model
     2015 DH Estonia – Text Reuse Hackathon, 20 October 2015

  3. My interests :)

  4. What do you associate with text reuse/intertextuality?

  5. Typical expectation of a computer scientist: oversimplification

  6. Expectations of a humanist: oversimplification

  7. Text Reuse for Humanities and Computer Science
     • Question: Why is text reuse so relevant for the humanities and computer science?
     • Premise: The amount of digitally available data is growing exponentially (Big Data)
     • Humanities:
       – Lines of transmission and textual criticism
       – Transmission of ideas and thoughts under different circumstances and conditions
     • Computer Science:
       – Text decontamination for stylometry and authorship attribution, dating of texts
       – General text mining, corpus linguistics

  8. Temperature Map

  9. Respect for the topic
     • ACID for the Digital Humanities:
       – Acceptance
       – Complexity
       – Interoperability
       – Diversity

  10. ACID for the Digital Humanities – Acceptance I

  11. ACID for the Digital Humanities – Acceptance II How to be accepted by humanists if text mining is a black box we can't look into?

  12. ACID for the Digital Humanities – Acceptance III Transparency: How to provide user-friendly insights into complex mining techniques and machine learning?

  13. Current approach

  14. ACID for the Digital Humanities – Acceptance IV

  15. ACID for the Digital Humanities – Acceptance V

  16. ACID for the Digital Humanities – Acceptance VI

  17. ACID for the Digital Humanities – Acceptance VII

  18. ACID for the Digital Humanities – Acceptance VII

  19. ACID for the Digital Humanities – Complexity

  20. ACID for the Digital Humanities – Interoperability

  21. ACID for the Digital Humanities – Diversity (Reuse Types)
     • Stability (yellow)
     • Purpose (green)
     • Size of text reuse (blue)
     • Classification (light blue)
     • Degree of distribution (purple)
     • Written and oral transmission

  22. ACID for the Digital Humanities – Diversity (Reuse Styles)

  23. Key problem. Basic question: The distributions of reuse types and reuse styles are often unknown. Which model(s) should be chosen?
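The model-choice question on this slide can be made concrete with a minimal sketch. It is not the eTRAP group's method; it only assumes two generic similarity models (Jaccard overlap over word trigrams and over a bag of words) and three invented sample sentences. All function names here (`tokens`, `shingles`, `jaccard`, `trigram_score`, `bag_of_words_score`) are hypothetical and chosen for illustration.

```python
import re


def tokens(text):
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"\w+", text.lower())


def shingles(text, n=3):
    """Set of word n-grams, an order-sensitive feature set."""
    toks = tokens(text)
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}


def jaccard(a, b):
    """Jaccard overlap of two feature sets; 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def trigram_score(x, y):
    """Order-sensitive model: overlap of word trigrams."""
    return jaccard(shingles(x, 3), shingles(y, 3))


def bag_of_words_score(x, y):
    """Order-insensitive model: overlap of word types."""
    return jaccard(set(tokens(x)), set(tokens(y)))


# Invented sample data (assumption, for illustration only).
source = "in the beginning was the word and the word was with god"
verbatim = "in the beginning was the word and the word was with god"
reordered = "the word was in the beginning and with god was the word"

if __name__ == "__main__":
    for label, candidate in [("verbatim", verbatim), ("reordered", reordered)]:
        print(label,
              "trigram:", round(trigram_score(source, candidate), 3),
              "bag-of-words:", round(bag_of_words_score(source, candidate), 3))
```

In this sketch both models agree on the verbatim copy, but only the bag-of-words model still scores the reordered sentence highly. If the mix of reuse styles in a corpus is unknown, neither model can be assumed to be adequate in advance.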

  24. Outline

  25. Thank you! "Stealing from one is plagiarism, stealing from many is research" (Wilson Mizner, 1876-1933). Visit us at http://etrap.gcdh.de DH Hackathon 2015: "Don't leave your data problems at home!"
