Introduction to Dialectometry III


Wilbert Heeringa. German Academic Exchange Service (DAAD); University of Bielefeld, Faculty of Linguistics and Literary Studies; Frisian Academy. Abidjan, December 19–23, 2016.


  1. Introduction to Dialectometry III
     Wilbert Heeringa
     German Academic Exchange Service – DAAD
     University of Bielefeld, Faculty of Linguistics and Literary Studies
     Frisian Academy
     Abidjan, December 19–23, 2016

  2. Topics
     • Gabmap
     • Literature

  3. Gabmap

  4. What is Gabmap?
     • A web application that visualizes dialect variation: doing dialect analysis on the web.
     • Developed by Peter Kleiweg under the supervision of John Nerbonne.
     • Based on functions in the RuG/L04 package, which has existed since 2001 and has been freely distributed since 2004.
     • Gabmap has been under development since the end of 2010 and was first published on GitHub on June 4, 2011.

  5. What is Gabmap?
     • Original version available at: http://www.let.rug.nl/~kleiweg/L04/webapp
     • Version forked and maintained by Çağrı Çöltekin, available at http://www.gabmap.nl/ and also maintained by Martijn Wieling.

  7. Gabmap running on USB stick
     • Peter Kleiweg developed a Docker image of Gabmap, available at https://github.com/pebbe/Gabmap-docker, which enables us to run Gabmap without an internet connection.
     • The ‘Docker version’ of Gabmap is installed in Lubuntu 16.04, an operating system based on the Linux kernel.
     • GpsPrune and LibreOffice are also installed. The ‘Calc’ spreadsheet component of LibreOffice enables you to create and view dialect data tables.

  8. How to boot from the USB stick
     • Turn on your computer. If you have a
       ◦ PC, then right after this press F9 or F12 or ESC or ...;
       ◦ Mac, then hold the Alt/Option key as soon as you hear the Mac's startup chime.
     • A list of devices will appear. Choose something with ‘Lexar JumpDrive S75 USB 3.0’.
     • After a while, another ‘boot menu’ will appear. Just press ENTER.
     • Log in with username: guest, password: guest.
     • After some time, Gabmap opens in Firefox.

  9. How to boot from the USB stick
     • If booting from the USB stick does not succeed, check your UEFI settings.
     • Right after turning on your computer, press ESC or ...
     • Disable Secure Boot (enable it again when you are done).
     • Change the boot mode to CSM or Legacy (legacy BIOS compatibility mode, legacy USB support, ...).

  12. Input
      • Gabmap needs three input files:
        ◦ map
        ◦ dialect data
        ◦ feature definition file

  13. Input: map
      • A map consists of at least:
        ◦ an outline of the area;
        ◦ placemarks for the locations where the data was collected. NB: place names should be spelled exactly as in your data file!
      • Optionally, more details can be added to the map, for example internal borders and rivers.
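The requirement that place names on the map match those in the data file can be checked mechanically before uploading. The following is a minimal sketch, not part of Gabmap. It assumes the map is a plain .kml file (not a zipped .kmz) in the KML 2.2 namespace, and that the first column of the tab-separated data table holds the place names, with item names in the first row. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

KML_NS = "{http://www.opengis.net/kml/2.2}"  # assumption: KML 2.2 namespace


def placemark_names(kml_text):
    """Return the names of all placemarks that contain a Point."""
    root = ET.fromstring(kml_text)
    names = set()
    for placemark in root.iter(KML_NS + "Placemark"):
        if placemark.find(".//" + KML_NS + "Point") is not None:
            names.add(placemark.findtext(KML_NS + "name", default="").strip())
    return names


def data_names(tsv_text):
    """Return the row labels (first column) of a tab-separated table.

    Assumption: the first row is a header with the item names.
    """
    rows = [line.split("\t") for line in tsv_text.splitlines() if line.strip()]
    return {row[0].strip() for row in rows[1:]}


def mismatches(kml_text, tsv_text):
    """Return (names only on the map, names only in the data), both sorted."""
    on_map = placemark_names(kml_text)
    in_data = data_names(tsv_text)
    return sorted(on_map - in_data), sorted(in_data - on_map)
```

If both returned lists are empty, every location in the data has a placemark with exactly the same spelling, and vice versa.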

  14. Input: map
      • The maps can be created with Google Earth or Google Maps.
      • For a manual about creating maps with Google Earth, see http://www.let.rug.nl/~kleiweg/L04/kml/manual.html; for Google Maps, see http://coltekin.net/cagri/courses/leuven/
      • The two manuals are also found in the folder Gabmap/manuals on the USB stick.
      • Save the map as a .kml or .kmz file.

  15. Input: map
      • When at least an outline is available as a .kml file, we can edit the file with GpsPrune.
      • Locations can be added when their coordinates are known. Coordinates can be obtained via Google Maps.

  22. Input: map
      • GpsPrune changes polygons into lines. However, Gabmap requires polygons!
      • Load and edit your .kml file in Leafpad.
      • Use Search, Replace and check ‘Replace all at once’ in order to perform the replacements throughout the whole document.
      • Replace:
        ◦ ‘<LineString>’ by ‘<Polygon><outerBoundaryIs><LinearRing>’
        ◦ ‘</LineString>’ by ‘</LinearRing></outerBoundaryIs></Polygon>’
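The same two replacements can also be scripted instead of done by hand in Leafpad. This is a minimal sketch under the assumption that the .kml file is encoded as UTF-8. The function names are hypothetical. Like ‘Replace all at once’, it converts every LineString in the file, including lines that were meant to stay lines (for example rivers).

```python
from pathlib import Path


def lines_to_polygons(kml_text):
    """Apply the two replacements from the slide to the whole document."""
    kml_text = kml_text.replace(
        "<LineString>", "<Polygon><outerBoundaryIs><LinearRing>")
    kml_text = kml_text.replace(
        "</LineString>", "</LinearRing></outerBoundaryIs></Polygon>")
    return kml_text


def convert_file(src, dst):
    """Read a .kml file, convert it, and write the result to a new file."""
    text = Path(src).read_text(encoding="utf-8")  # assumption: UTF-8 input
    Path(dst).write_text(lines_to_polygons(text), encoding="utf-8")
```

Writing to a new file rather than overwriting the original keeps the GpsPrune output available in case some lines have to be restored.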

  23. Input: dialect data
      • The dialect data should be in a table where:
        ◦ the rows represent the locations where the data was collected;
        ◦ the columns represent the data items.
      • Prepare the data file using LibreOffice Calc (on the USB stick) or Microsoft Excel.
      • Use the IPA chart Unicode keyboard at https://westonruter.github.io/ipa-chart/keyboard/ to find the Unicode characters.
      • The chart covers the International Phonetic Alphabet, revised to 2005.

  26. Input: dialect data
      • To be uploaded to Gabmap, the data file has to be a tab-separated plain text file encoded as Unicode (UTF-8 or UTF-16).
      • To save the file in this format, choose ‘Save As’ in the ‘File’ menu, and choose ‘Text CSV (.csv)’ in the lower right corner of the ‘Save’ window.
      • In the ‘Export Text File’ window, choose ‘Tab’ as field delimiter.
      • The resulting file with the extension .csv can be uploaded to Gabmap.
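A file in this format can also be produced without a spreadsheet program. The sketch below writes a tab-separated UTF-8 file with one row per location and one column per data item. The function name, the empty top-left cell and the sample values in the usage note are assumptions for illustration, not data from the course.

```python
import csv


def write_gabmap_table(path, items, rows):
    """Write a tab-separated, UTF-8 encoded table.

    items: list of item names (column headers)
    rows:  dict mapping each place name to its list of values, one per item
    Assumption: the top-left cell of the header row is left empty.
    """
    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        writer.writerow([""] + list(items))
        for place, values in rows.items():
            writer.writerow([place] + list(values))
```

For example, `write_gabmap_table("table.csv", ["hand", "water"], {"Village A": ["hɑnt", "ʋaːtər"]})` writes a header line followed by one line for the invented location ‘Village A’.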

  29. Input: dialect data
      • When loading an existing file in LibreOffice Calc, select Unicode (UTF-8 or UTF-16) as the character set and Tab as the separator.
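The same choice between UTF-8 and UTF-16 has to be made when a data file is read by a script. The sketch below decides by looking for a byte order mark (BOM) at the start of the file. It assumes that UTF-16 files start with a BOM, so a UTF-16 file without one is not recognized. The function name is hypothetical.

```python
import codecs


def read_table(path):
    """Read a tab-separated table encoded as UTF-8 or UTF-16.

    Assumption: UTF-16 files start with a byte order mark; everything
    else is decoded as UTF-8 (an optional UTF-8 BOM is dropped).
    """
    with open(path, "rb") as f:
        raw = f.read()
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        text = raw.decode("utf-16")
    else:
        text = raw.decode("utf-8-sig")
    # One list of cells per non-empty line; the tab is the separator.
    return [line.split("\t") for line in text.splitlines() if line]
```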

  34. Input: dialect data
      • Data types other than transcriptions can be analyzed in Gabmap too, especially categorical data.

  36. Input: dialect data
      • See also the manual about preparing dialect data for Gabmap, which is found:
        ◦ under Help in Gabmap;
        ◦ in the folder Gabmap/manuals on the USB stick.

  37. Input: feature definition file
      • The file IPA.def is found in /Gabmap/datasets.
      • It covers the Unicode characters of the IPA, revised to 2005.
      • Using this file ensures that in an alignment of two pronunciations:
        ◦ a vowel matches with a vowel
        ◦ a consonant matches with a consonant
        and allows that:
        ◦ the [j] or [w] matches with a vowel
        ◦ the [i] or [u] matches with a consonant
        ◦ the schwa matches with a sonorant
      • Substitutions, insertions and deletions (indels) have a weight of 1.
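The effect of these matching rules on a Levenshtein distance can be illustrated with a small sketch. It is not Gabmap's implementation and does not read IPA.def. It assumes that a forbidden vowel/consonant match simply has infinite cost, so the alignment falls back on one deletion plus one insertion. The vowel and sonorant sets are tiny illustrative inventories, every character counts as one segment, and anything not listed as a vowel is treated as a consonant.

```python
VOWELS = set("aeiouyəɛɔɑɪʊ")   # assumption: small illustrative inventory
SONORANTS = set("mnŋlrjw")     # assumption: small illustrative inventory
INF = float("inf")


def may_match(a, b):
    """Whether two segments may be aligned with each other."""
    a_is_vowel, b_is_vowel = a in VOWELS, b in VOWELS
    if a_is_vowel == b_is_vowel:
        return True                      # vowel/vowel or consonant/consonant
    vowel, consonant = (a, b) if a_is_vowel else (b, a)
    if consonant in {"j", "w"}:
        return True                      # [j] or [w] may match a vowel
    if vowel in {"i", "u"}:
        return True                      # [i] or [u] may match a consonant
    return vowel == "ə" and consonant in SONORANTS  # schwa with a sonorant


def distance(s, t):
    """Levenshtein distance with weight 1 for substitutions and indels."""
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, 1):
        cur = [i]
        for j, b in enumerate(t, 1):
            if a == b:
                sub = 0
            elif may_match(a, b):
                sub = 1
            else:
                sub = INF                # forbidden match, use indels instead
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + sub))   # match or substitution
        prev = cur
    return prev[-1]
```

With these assumptions, `distance("kat", "kɑt")` costs one substitution, whereas `distance("kat", "kst")` costs two operations, because [a] may not be aligned with [s].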

  38. Input: feature definition file
      • If two segments are the same but differ in suprasegmentals or diacritics, the weight is 0.3.
      • Not processed are: primary stress, secondary stress, minor (foot) group, major (intonation) group, syllable break, linking (absence of a break).
      • NB: language-specific adjustments may be necessary! However, be careful when changing IPA.def.

  39. Running Gabmap
      • Now that we have a map, a table and a feature definition file, we can run Gabmap.

  42. Literature

  43. Literature (1)
      Goebl, H. (1982). Dialektometrie; Prinzipien und Methoden des Einsatzes der numerischen Taxonomie im Bereich der Dialektgeographie. Wien: Verlag der Öst. Akademie der Wissenschaften.
      Goebl, H. (1984). Dialektometrische Studien anhand italoromanischer, rätoromanischer und galloromanischer Sprachmaterialien aus AIS und ALF. Volume 1. (Volumes 2 and 3 contain maps and tables). Tübingen: Max Niemeyer.
      Goebl, H. (2010a). Dialectometry and quantitative mapping. In Language and Space. An International Handbook of Linguistic Variation. Volume 2: Language Mapping. Handbücher zur Sprach- und Kommunikationswissenschaft [HSK], edited by Alfred Lameli, Roland Kehrein and Stefan Rabanus, 30.2, 433–457, 2201–2212. Berlin: de Gruyter Mouton.
      Goebl, H. (2010b). Dialectometry: Theoretical prerequisites, practical problems, and concrete applications (mainly with examples drawn from the “Atlas Linguistique de La France”, 1902–1910). Dialectologia. Special Issue, I(2010): 63–77.
      Goebl, H. (2006). Recent Advances in Salzburg Dialectometry. Literary and Linguistic Computing, 21(4), 411–435.

  44. Literature (2)
      Gooskens, Ch., Beijering, K. and Heeringa, W. (2008). Phonetic and lexical predictors of intelligibility. International Journal of Humanities and Arts Computing, 2(1–2): 63–81.
      Gooskens, Ch. and Heeringa, W. (2004). Perceptive Evaluation of Levenshtein Dialect Distance Measurements using Norwegian Dialect Data. Language Variation and Change, 16(3), 189–207.
      Heeringa, W. (2004). Measuring dialect pronunciation differences using Levenshtein distance. PhD thesis, University of Groningen.
      Heeringa, W., Kleiweg, P., Gooskens, Ch. and Nerbonne, J. (2006). Evaluation of String Distance Algorithms for Dialectology. In Linguistic Distances Workshop at the joint conference of International Committee on Computational Linguistics and the Association for Computational Linguistics, Sydney, July, 2006, edited by John Nerbonne and Erhard Hinrichs, 51–62. Stroudsburg PA: The Association for Computational Linguistics (ACL).
      Kessler, B. (1995). Computational dialectology in Irish Gaelic. In Proceedings of the 7th Conference of the European Chapter of the Association for Computational Linguistics, 60–67, Dublin. EACL.
