ELECTRONIC TEXT REUSE ACQUISITION PROJECT
INTRODUCTION & MOTIVATION
Marco Büchler, Greta Franzini, Emily Franzini
TABLE OF CONTENTS
1. Who are we?
2. Motivation
WHO ARE WE?
WHO AM I?
• 2001-2002: Head of the Quality Assurance department in a software company;
• 2006: Diploma in Computer Science on large-scale co-occurrence analysis;
• 2007: Consultant for several SMEs in the IT sector;
• 2008: Technical project management of the eAQUA project;
• 2011: PI and project manager of the eTRACES project;
• 2013: PhD in Digital Humanities on Text Reuse;
• 2014: Head of the Early Career Research Group eTRAP at the University of Göttingen.
ABOUT ME
Education
• Humanities & Further Maths Diploma (IT)
• Classics BA Honours (UK)
• Digital Humanities MA (UK)
• Part-time PhD student (UCLDH, UK): Digital Editions
• Catalogue of Digital Editions (now a collaboration with ACDH)
• Digital edition of an ancient Latin manuscript
Work
• Full-time post-doctoral researcher for eTRAP Early Career Research Group (DE): Automatic Text Reuse Detection and Analysis
WHO AM I?
• 2008-2011: BA Latin & Ancient Greek at University College London
• 2011-2012: MSc Management Science & Innovation at University College London
• 2012-2013: Liaison Officer and Administration for the preservation of cultural assets at FAI
• 2013-2014: Research Associate at University of Leipzig (Chair for Digital Humanities)
• 2014-2016: Research Associate at University of Göttingen (Digital Humanities in Dept. Computer Science)
MOTIVATION
VENICE 2016 - TRACER TUTORIAL
WHO IS THIS PERSON?
“REUSE FROM SAME SOURCE”: COMMONALITIES & DIFFERENCES
WITTGENSTEIN’S “FAMILY RESEMBLANCE”
Family resemblance is an equivalence relation that clusters together objects with similar, but not identical, characteristics.
Family resemblance is hierarchical, as in the preceding examples: “Greta”, “Franzinis”, “human”, “creature”.
FORENSIC VIEW
Evaluation of the reuse detection process by forensic criteria (standard in biometry):
• Universality: How universal is a characteristic? (Example: no fingerprint can be taken for about 2% of all humans.)
• Uniqueness: Different and independent “instances” should not share a common characteristic.
• Permanence: How resistant is a characteristic to change over time?
• Collectability: Characteristics should be easy and simple to detect.
• Performance: Covers the precision, speed and robustness of the measuring technique.
• Acceptability: Acceptance of the technique in (academic) usage.
• Circumvention: It should be as difficult as possible to cheat a detection system.
ETRAP’S OBJECTIVE
Title: eTRAP - electronic Text Reuse Acquisition Project
Premise: Language is a changing system. Compared to biometry, its volatility is much higher.
• Research on the characteristics
  • What are good characteristics?
  • Which characteristics are stable, and which are volatile and therefore unhelpful in the detection process?
• Research on the reuse process
  • Begins with: Why do we quote what we quote?
  • Passes by: If changes happen in the reuse process, why do they happen, and what is the model behind them (if one exists)?
  • Ends with: Understanding paraphrases and allusions
ABOUT ETRAP
Electronic Text Reuse Acquisition Project (eTRAP)
Interdisciplinary Early Career Research Group funded by the German Ministry of Education & Research (BMBF).
Budget: €1.6M.
Duration: March 2015 - February 2019. Research since October 2015.
Team: 4 core staff; 5-9 research & student assistants; Bachelor's, Master's and PhD thesis students.
• Interdisciplinary: Classics, Computer Science, German Literature, Mathematics, Philosophy, Cognitive Psychology and Literature Studies.
• International: The team currently spans eight nationalities.
CONTACT
Visit us: http://www.etrap.eu
contact@etrap.eu
“Stealing from one is plagiarism, stealing from many is research” (Wilson Mizner, 1876-1933)
LICENCE
The theme this presentation is based on is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Changes to the theme are the work of eTRAP.