

It Takes a Village: Co-developing VedaWeb, a Digital Research Platform for Old Indo-Aryan Texts. Börge Kiss (IDH), Daniel Kölligan (HVS), Francisco Mondaca (CCeH), Claes Neuefeind (IDH), Uta Reinöhl (ASW), Patrick Sahle (CCeH). 05.03.2019


  1. It Takes a Village: Co-developing VedaWeb, a Digital Research Platform for Old Indo-Aryan Texts. Börge Kiss (IDH), Daniel Kölligan (HVS), Francisco Mondaca (CCeH), Claes Neuefeind (IDH), Uta Reinöhl (ASW), Patrick Sahle (CCeH). 05.03.2019

  2. Research Goals
     - Traditional research with large corpora:
       - concordances / word indexes, lexica: make usage patterns and frequencies visible
       - determination of meanings, functions, syntactic patterns based on researchers' individual assessments and their "reading experience"
       - problems: rather intuitive, subjective; the more texts, the more intractable

  3. Research Goals
     - online platform allowing combined searches of (1) lexical, (2) morphological, (3) metrical and (4) syntactic information, e.g.
       - (1): lexical fields: differences between words for x, e.g. 'man/woman' [Kazzazi 2001]; 'light' [Roesler 1997] etc.
       - (2): use/distribution/functional difference of allomorphs, e.g. áśv-a- 'horse', nom.pl. áśv-ās / áśv-āsas 'horses' (http://ifl.phil-fak.uni-koeln.de/36486.html?&L=1)
       - (3): position of forms in verse; word-shapes
       - (4): information structure (topic/focus)
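
A combined search of this kind can be pictured as a filter over annotated tokens, where every criterion that is set must match. The following is only a minimal sketch: the `Token` fields, the `search` function, and the sample records (including their positions and location labels) are invented for illustration and do not reflect the actual VedaWeb data model or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """One annotated word-form (all field names are hypothetical)."""
    form: str      # surface form as it appears in the text
    lemma: str     # dictionary headword (lexical layer)
    case: str      # morphological layer
    number: str    # morphological layer
    position: int  # 1-based position in the verse line (metrical layer)
    location: str  # placeholder label, not a real corpus reference


def search(tokens, **criteria):
    """Return the tokens that match every criterion given as a keyword.

    Criteria set to None are ignored, so lexical, morphological and
    positional conditions can be combined freely.
    """
    active = {key: value for key, value in criteria.items() if value is not None}
    return [
        token
        for token in tokens
        if all(getattr(token, key) == value for key, value in active.items())
    ]


# Toy records, invented for this example only.
TOKENS = [
    Token("áśvās", "áśva-", "nom", "pl", 1, "sample-1"),
    Token("áśvāsas", "áśva-", "nom", "pl", 3, "sample-2"),
    Token("áśvam", "áśva-", "acc", "sg", 1, "sample-3"),
]

# Lexical + morphological: both nom.pl. allomorphs of 'horse'.
allomorphs = search(TOKENS, lemma="áśva-", case="nom", number="pl")

# Lexical + positional: forms of 'horse' in line-initial position.
line_initial = search(TOKENS, lemma="áśva-", position=1)
```

Comparing the `position` values of the two result sets is the kind of question slide item (2) raises about the distribution of allomorphs.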

  4. Background: Rigveda
     - oldest text of Indo-Aryan, part of the Indo-European language family, ca. 1300/1000 BC
     - ca. 160,000 words (in 1028 hymns grouped into 10 books = "mandalas"); cf. Homer's Iliad + Odyssey = ca. 190,000 words
     - hymns to gods (Indra, Soma, Varuna, Mitra, …), recited mostly during the Soma sacrifice (juice of an intoxicating plant)
     - further texts to be integrated: Atharvaveda (c. 170,000 words), Yajurveda; Vedic prose: Aitareya Brahmana (c. 100,000 words), Maitrayani Samhita (c. 120,000 words)

  5. Background: Data
     - morphology: annotation provided by Prof. G. Dunkel, Prof. P. Widmer et al., University of Zurich
     - metre: Prof. K. Ryan, Harvard University
     - syntax: Prof. H. Hettrich (University of Würzburg), Dr. O. Hellwig (University of Düsseldorf); Dr. U. Reinöhl (University of Cologne/Mainz) using GRAID (Grammatical Relations and Animacy in Discourse)

  6. Team ASW/HVS IDH - Spinfo PD Dr. Daniel Kölligan, P.I. Dr. Claes Neuefeind, P.I. Dr. Uta Reinöhl, P.I. Börge Kiss, M.A. Jakob Halfmann Natalie Korobzow CCeH/DCH Felix Rau, M.A. Apl. Prof. Dr. Patrick Sahle, P.I. Francisco Mondaca, M.A. Jonathan Blumtritt, M.A. Martina Gödel, M.A.

  7. Co-operation partners
     - Prof. Dr. Paul Widmer, Universität Zürich
     - Dr. Salvatore Scarlata, Universität Zürich
     - Prof. Dr. Kevin Ryan, Harvard University
     - Dr. Dieter Gunkel, University of Richmond
     - Prof. Dr. Laurent Romary, Inria/HU Berlin, TEI
     - Prof. Dr. Nikolaus P. Himmelmann, Universität zu Köln

  8. VedaWeb: A digital platform for working with Old Indic texts
     - make available RV + translations + morphological glossings for view & export
     - connecting all word-forms of the annotated RV with the corresponding lexical entries in Grassmann, Böhtlingk/Roth, Monier Williams and vice versa
     - allowing combinatorial searches of lemmas, word-forms, morphological and metrical information via cascading search index
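
The two-way link between word-forms and dictionary entries ("and vice versa") amounts to keeping a forward and a reverse index over the same annotations. The sketch below assumes that each annotated form carries a lemma and that the dictionary is keyed by lemma; the function name, the entry identifiers and the data layout are hypothetical, and the lemma citation forms are illustrative rather than taken from the project's annotation.

```python
from collections import defaultdict


def build_indexes(annotated_forms, dictionary):
    """Link annotated word-forms to dictionary entries in both directions.

    annotated_forms: iterable of (location, form, lemma) triples
    dictionary:      mapping from lemma to a dictionary entry id
    Returns (form_to_entry, entry_to_locations).
    """
    form_to_entry = {}
    entry_to_locations = defaultdict(list)
    for location, form, lemma in annotated_forms:
        entry_id = dictionary.get(lemma)
        if entry_id is None:
            continue  # no dictionary entry known for this lemma
        form_to_entry[(location, form)] = entry_id
        entry_to_locations[entry_id].append(location)
    return form_to_entry, dict(entry_to_locations)


# Illustrative input: forms from the opening verses of the Rigveda;
# the entry ids are made up for this example.
FORMS = [
    ("1.1.1", "agním", "agní-"),
    ("1.1.1", "īḷe", "īḍ-"),
    ("1.1.2", "agníḥ", "agní-"),
]
DICTIONARY = {"agní-": "dict:agni"}

form_to_entry, entry_to_locations = build_indexes(FORMS, DICTIONARY)
```

With both mappings in place, a reader can jump from a word in the text to its entry, and from an entry to every attested location of that lemma.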

  9. State of the Art
     - revisions & additions of Zurich glossings
     - development of data model and APIs for dictionaries (Francisco Mondaca)
     - development of web application (Börge Kiss)
     - integration of further resources

  10. Morphological Glossings (Zurich)

  11. Translations: German, English, French, Latin, Russian …

  12. Workflow

  13. TEI Modelling
      - An appropriate data model is of central importance for consistency, transfer, persistence and presentation
      - TEI (Text Encoding Initiative) offers the best way for textual data to persist over time, thanks to its active community of scholars and its detailed documentation. It is the de facto standard in Digital Humanities projects.
      - modelling of texts (RV, translations) and dictionaries (Grassmann; Vedic Index of Names and Subjects)
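
To give an impression of what such a model can look like, the sketch below builds a TEI-style fragment for one verse line with Python's standard library. It uses the TEI elements `lg` (line group), `l` (verse line) and `w` (word) with the `lemma` attribute; how the project actually structures its TEI files is not stated in the slides, so the element choice, the `type` value, the helper name and the lemma forms are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def verse_to_tei(verse_id, lines):
    """Build a TEI-style <lg> element for one verse.

    lines: list of verse lines, each a list of (form, lemma) pairs.
    """
    lg = ET.Element("lg", {"type": "verse", "n": verse_id})
    for line_number, words in enumerate(lines, start=1):
        line = ET.SubElement(lg, "l", {"n": str(line_number)})
        for form, lemma in words:
            word = ET.SubElement(line, "w", {"lemma": lemma})
            word.text = form
    return lg


# First line of RV 1.1.1; the lemma citation forms are illustrative.
verse = verse_to_tei(
    "1.1.1",
    [[("agním", "agní-"), ("īḷe", "īḍ-"), ("puróhitaṃ", "puróhita-")]],
)
serialized = ET.tostring(verse, encoding="unicode")
```

Keeping form and lemma on the same `w` element is one simple way to let the text view and the dictionary lookup share a single source.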

  14. Software Architecture

  15. VedaWeb App  http://vedaweb.uni-koeln.de

  16. Cooperation within the project
      - not the traditional "chasm" between IT and humanities people, but rather different ranges of competences and overlapping responsibilities: "family constellation"

  17. Cooperation within the project
      - overlap of competence areas makes project feasible
      - regular communication
      - close feedback loops
      - GitLab, issue tracking system
      - regular team meetings (once a month)

  18. simple and challenging issues
      - different expectations of what is easy and difficult to implement, e.g.
        - multiple, combinable full-text search
        - search functions over diversely structured sets of data
      - complex structure of the base text:
        - books, hymns, verses, half-verses
        - different counting systems (by books, by hymns)
        - different text versions (editions; lemmas and annotations; "padapatha")
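
One reading of the "different counting systems" point is that a hymn can be addressed either by book and hymn number or by a running hymn number across the whole text. The sketch below converts between the two under that assumption. It uses the conventional hymn counts per book, which sum to the 1028 hymns mentioned on slide 4 (book 8 counted with its supplementary hymns); the function names and the `book.hymn.verse` locator format are hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate

# Conventional number of hymns in each of the ten books (sums to 1028).
HYMNS_PER_BOOK = (191, 43, 62, 58, 87, 75, 104, 103, 114, 191)

# OFFSETS[i] = number of hymns in the first i books.
OFFSETS = (0, *accumulate(HYMNS_PER_BOOK))
TOTAL_HYMNS = OFFSETS[-1]


def parse_locator(locator):
    """Split a 'book.hymn.verse' string into a tuple of integers."""
    book, hymn, verse = (int(part) for part in locator.split("."))
    return book, hymn, verse


def to_absolute_hymn(book, hymn):
    """Convert (book, hymn) into a running hymn number from 1 to 1028."""
    if not 1 <= book <= len(HYMNS_PER_BOOK):
        raise ValueError(f"no such book: {book}")
    if not 1 <= hymn <= HYMNS_PER_BOOK[book - 1]:
        raise ValueError(f"book {book} has no hymn {hymn}")
    return OFFSETS[book - 1] + hymn


def from_absolute_hymn(number):
    """Convert a running hymn number back into (book, hymn)."""
    if not 1 <= number <= TOTAL_HYMNS:
        raise ValueError(f"no such hymn: {number}")
    book = bisect_left(OFFSETS, number)  # first book whose end covers the hymn
    return book, number - OFFSETS[book - 1]
```

For example, `to_absolute_hymn(2, 1)` gives 192, because the first book has 191 hymns, and `from_absolute_hymn(192)` gives back `(2, 1)`.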

  19. learning from each other
      - for linguists:
        - insights into opportunities provided by digital research platforms
        - getting to know the affordances of data for building an online platform and ensuring data longevity (TEI)
      - for technical researchers:
        - complexity of ancient texts (internal structure, variation, different layers of form and meaning)
        - interests of linguists and other humanities scholars in the data
      - both:
        - make one's terminology explicit and clear
        - make the data consistent

  20. improved collaboration
      - general understanding
        - for DH researchers: of the objects studied in various humanities disciplines and the relevant research questions and methods
        - for humanities scholars: of the different fields and methods in DH (e.g. building a web platform vs. data modelling in TEI)

  21. Future plans: next version
      - metrical data (D. Gunkel/K. Ryan)
      - audio & video:
        - some recordings of A. Daniélou available
        - complete recording of RV in Copenhagen, not really available: http://www.kb.dk/en/nb/samling/os/Sydasien/veda.html
      - texts: Atharvaveda, Maitrayani Samhita
      - annotation layers / user accounts: GRAID etc.
      - semantic search … (Semantic Web)

  22. C-SALT: Cologne South Asian Languages and Texts (http://c-salt.uni-koeln.de/)
      - overview of projects and digital resources related to South Asian languages, texts, and culture at the University of Cologne (TEI Sanskrit dictionaries, Pali dictionary, …)
      - C-SALT coordinates the activity of these projects and facilitates sustainable development of the diverse resources
      - further plans:
        - Iranian (Avestan corpus + annotation; digital version of Bartholomae's dictionary; Middle Persian texts)
        - Nuristani (A. Degener [Mainz]: Kalasha-Ala, Prasun)

  23. धन्यवाद Thank you!
