Applications in finite state automata
The lexc language
Kurt Eberle
kurt.eberle@uni-tuebingen.de
(includes material from Karttunen, Beesley, Butt and others)
November 29, 2016

Outline
◮ lexc files
◮ Continuation classes
◮ Defining lexical transducers
◮ Include definitions in lexc
◮ Useful strategies
◮ Integration and testing

Topics of this session
Understand lexc
◮ lexc files
◮ Continuation classes
◮ Defining lexical transducers
◮ Include definitions in lexc
◮ Useful strategies
◮ Integration and testing

lexc files
◮ used to define one or several lexicons
◮ structure
  ◮ Multichar symbols declaration
  ◮ Declarations (definitions)
  ◮ Named lexicons, one of which is the start lexicon, named Root

lexc file: structure
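
The structure listed above can be sketched as a small lexc file. This is a minimal sketch under the usual lexc layout; the lexicon names `Noun` and `Number`, the tags, and the entries are illustrative assumptions, not the content of the original slide.

```lexc
! Comments start with an exclamation mark.
! Illustrative sketch: lexicon names, tags and entries are assumptions.

Multichar_Symbols +N +Sg +Pl    ! declaration of multicharacter symbols

LEXICON Root                    ! the start lexicon
Noun ;                          ! entry consisting of a continuation class only

LEXICON Noun
cat   Number ;                  ! form, then continuation class, then semicolon
dog   Number ;

LEXICON Number
+N+Sg:0   # ;                   ! upper:lower pair; 0 is the empty string
+N+Pl:s   # ;                   ! # marks the end of the word
```

Read this way, the file pairs the upper-side string `cat+N+Pl` with the lower-side string `cats`, and `dog+N+Sg` with `dog`.
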
lexc file: The start lexicon
lexc file: The start lexicon: notational variants
lexc file: end of lexc files
lexc file: special symbols

lexc files: compile file
◮ read lexc < ex1-lex.txt
  → the compiled network is placed on the stack (1 stack element)
◮ save stack ex1-lex.fst

lexc file: compiled lexicon
lexc file: a kind of abbreviation

Continuation classes: stems and continuations
◮ analyze lexical entries into
  ◮ stem (root)
  ◮ ending(s) (continuations)

Continuation classes: example
Continuation classes: Syntax
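
To make the entry syntax concrete, here is a minimal sketch with English verb stems. The lexicon names `Verb` and `Infl` and the entries are assumptions chosen for illustration; the original slide's example is not recoverable.

```lexc
! Illustrative sketch: lexicon names and entries are assumptions.

LEXICON Root
Verb ;

LEXICON Verb
walk   Infl ;      ! stem, followed by the name of its continuation class
talk   Infl ;

LEXICON Infl
ed     # ;         ! ending, followed by # (end of word)
ing    # ;
s      # ;
       # ;         ! empty form: the bare stem is itself a word
```

Every entry ends with a semicolon, and the continuation class named in an entry says which lexicon supplies the next piece of the word. Here the result is the twelve strings formed from the three stems... in this sketch, two stems and four endings, giving eight words such as `walk`, `walked` and `talking`.
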
Continuation classes: continuation classes and morphotactics
◮ continuation classes are inherited from Koskenniemi's two-level morphology
◮ more difficult to model:
  ◮ separated dependencies
  ◮ interdigitation
  ◮ infixation
  ◮ reduplication

Continuation classes: Multiple classes of words
◮ style: stem + affix

Continuation classes: Multiple classes of words
◮ PoS classified stems
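
One way to classify stems by part of speech is to give each class its own stem lexicon and its own suffix lexicon. The following is a sketch under that assumption; all lexicon names, tags, and entries are illustrative.

```lexc
! Illustrative sketch: lexicon names, tags and entries are assumptions.

Multichar_Symbols +N +V +Sg +Pl +Inf +Past

LEXICON Root
Nouns ;
Verbs ;

LEXICON Nouns
cat    N ;
dog    N ;

LEXICON Verbs
walk   V ;
talk   V ;

LEXICON N
+N+Sg:0      # ;
+N+Pl:s      # ;

LEXICON V
+V+Inf:0     # ;
+V+Past:ed   # ;
```

Because noun stems only continue to `N` and verb stems only to `V`, a pair such as `cat+V+Past` : `cated` is not part of the resulting network.
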
Continuation classes: Optionality
◮ style: aspect morpheme as an option

Continuation classes: Optionality
◮ variant: aspect morpheme as an option
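
An optional morpheme slot can be sketched with an empty entry that skips the slot. The example below uses Esperanto-like forms purely for illustration; the lexicon names and entries are assumptions and need not match the two styles shown on the original slides.

```lexc
! Illustrative sketch: lexicon names and entries are assumptions.

LEXICON Root
Verbs ;

LEXICON Verbs
kant   Aspect ;    ! the stem continues to the optional aspect slot

LEXICON Aspect
ad     Tense ;     ! aspect morpheme present
       Tense ;     ! empty entry: aspect morpheme skipped

LEXICON Tense
as     # ;
is     # ;
```

This yields `kantas`, `kantis`, `kantadas` and `kantadis`. Since an entry names exactly one continuation class, an alternative way to express the same option would be to list the stem twice, once continuing to `Aspect` and once directly to `Tense`.
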
Continuation classes: Loops
◮ several instances of a class in a string
◮ (N.B. 'no procedural reading')
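
A loop arises when a continuation class leads back to a lexicon that has already been passed. The following is a minimal sketch using noun compounding as the assumed use case; the lexicon names and entries are illustrative.

```lexc
! Illustrative sketch: lexicon names and entries are assumptions.

LEXICON Root
Noun ;

LEXICON Noun
cat     Next ;
dog     Next ;
house   Next ;

LEXICON Next
#    ;             ! stop: the word ends here
Noun ;             ! or loop back and attach another noun stem
```

The description is declarative: it denotes a network with a cycle, accepting `dog`, `doghouse`, `cathousedog`, and so on, rather than a procedure that is executed step by step. The resulting language is infinite.
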
Defining lexical transducers: Transitions in lexc
Lexical transducers: regular expressions
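
In lexc, an entry can be written as a regular expression enclosed in angle brackets instead of a plain string. The following sketch assumes xfst-style regular expression syntax inside the brackets; the lexicon name and the entry are illustrative.

```lexc
! Illustrative sketch: lexicon name and entry are assumptions.

LEXICON Root
Laugh ;

LEXICON Laugh
< h a [ h a ]* >   # ;    ! ha, haha, hahaha, ...
```

Inside the angle brackets, symbols are separated by spaces and the usual operators (concatenation, `*`, `+`, `|`) apply, so a single entry can stand for a whole set of forms.
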
Lexical transducers: Multichars
◮ declare them in the introductory section!

Include definitions in lexc: Definitions section
◮ Defined variables
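
A definitions section binds a name to a regular expression, and the name can then be used inside angle-bracket entries. This is a sketch assuming the Xerox-style `Definitions` keyword placed before the first lexicon; support and details may differ between lexc implementations, and the variable name `Vowel` is an assumption.

```lexc
! Illustrative sketch: the variable name and the entry are assumptions.

Definitions
  Vowel = [ a | e | i | o | u ] ;

LEXICON Root
< b Vowel t >   # ;       ! bat, bet, bit, bot, but
```

Defining `Vowel` once and reusing it keeps the regular-expression entries short and avoids repeating the same alternation in several places.
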
Useful strategies
◮ visualize the morphotactic templates before coding (informal sketch)
◮ a lexicon should comprise a coherent collection of morphemes
◮ a lexicon is a potential target for continuations from other morphemes
◮ prefer the most general solution where possible (large classes)
◮ avoid multiple copies of a morpheme if possible
◮ separated dependencies are difficult to model with continuation classes: use filters instead, if possible
◮ read error messages carefully
◮ be aware that loops and filters can increase complexity considerably

Useful strategies: Visualization
Useful strategies: Organization into coherent lexicons
Useful strategies: Using sublexicons
Useful strategies: Classes and subclasses ...
Useful strategies: Classes and subclasses: add irregularities 1
Useful strategies: Classes and subclasses: add irregularities 2 (better solution)

Useful strategies: Tag names
◮ prime directive: use appropriate and established tags (if possible)
◮ secondary directive: use the same name for the same phenomenon
◮ tertiary directive: adopt the naming used in grammars of related languages, if possible