Quantitative evidence for paradigm structure Olivier Bonami Université Paris Diderot 18th International Morphology Meeting Budapest, June 10, 2018
An inflectional morphologist’s view on derivational paradigms I ▶ The idea of a paradigmatic view of derivational morphology is certainly not new ▶ See among many others: van Marle (1984); Becker (1993); Bochner (1993); Booij (1997); Pounder (2000); Roché et al. (2011) ▶ Yet this idea has been met with skepticism by many, in particular by many inflectional morphologists. ▶ I see three immediate causes for this: 1. Lack of clarity as to what the term ‘paradigm’ designates. 2. Purported properties setting derivation apart from inflection. 3. The fact that our conceptualizations of inflectional paradigms and derivational families seem incompatible. ▶ The present talk reflects my own point of view on the issue: I will present arguments meant to convince the skeptical inflectional morphologist.
An inflectional morphologist’s view on derivational paradigms II ▶ I will make 3 points: 1. As we learn more about inflection systems, we have fewer reasons to believe that inflection and derivation differ in the relevant ways. 2. A common conceptualization encompassing both inflectional paradigms and structured derivational families is possible. 3. Arguments for paradigmatic organization in inflection can be redeployed fruitfully in the context of derivation. ▶ Abstractive point of view (Blevins, 2006): focus on relations between surface words, as they can be inferred from direct observations of usage. ▶ Instrumented approach: ▶ Generalizations are extracted from large lexica and/or corpora ▶ Computational implementation provides an operational, fully explicit formulation of linguistic hypotheses. ▶ I focus mainly on French.
Renouncing skepticism
Classical arguments against derivational paradigms ▶ Derivational families cannot be structured into paradigms because… 1. Lexical gaps: Paradigms are supposed to be exhaustive, but derivational families are full of gaps. 2. Variation: Paradigms are supposed to have a unique form in each of their cells, but derivational families contain lots of doublets. 3. Semantic irregularity: Paradigms are supposed to encode reliable contrasts, but derived forms differ in unpredictable ways from their bases. ▶ In each case, I will argue that what we have learned about inflection in the past two decades changes the picture.
1. Renouncing skepticism 1.1. Gaps
Gaps ▶ The skeptic’s argument: ▶ Postulating paradigms supposes that we have words to fill the cells in these paradigms. ▶ In inflection this is fine, because inflection is “fully productive”. ▶ This has to be so, otherwise the demands of syntax could not be met (“inflection is obligatory”). ▶ On the other hand, derivation is usually less than fully productive: there are lots of gaps. ▶ This has to be so, because new lexemes are coined only as the need for them arises. ▶ So, paradigms in derivation do not make sense because they would be hollow.
Problem 1: the requirements of syntax ▶ Paradigm cells exhibit a Zipfian distribution (Blevins et al., 2017). [Figure: frequency in the FTB (y-axis, 0 to 20000) by paradigm cell (x-axis, from p3s to yfp).] Frequency of verbs by paradigm cell in the French Treebank (Abeillé et al., 2003)
Problem 1: the requirements of syntax ▶ As a result, even at very large corpus sizes, inflectional paradigms do not “fill up” on average (Bonami and Beniamine, 2016). [Figure:] Average number of distinct orthographic forms for verbs from the Lefff lexicon (Sagot, 2010) when progressing through the FrWaC corpus (Baroni et al., 2009)
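Why a Zipfian distribution over cells keeps paradigms from filling up can be illustrated with a back-of-the-envelope calculation. The sketch below is not the method of Bonami and Beniamine (2016). It assumes, purely for illustration, that cell probabilities are exactly proportional to 1/rank and that tokens are independent draws. The function names and the paradigm size of 50 cells are hypothetical.

```python
def zipf_probabilities(n_cells):
    """Idealized cell probabilities proportional to 1/rank (an assumption)."""
    weights = [1.0 / rank for rank in range(1, n_cells + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def expected_filled_cells(n_tokens, n_cells):
    """Expected number of distinct cells seen after n_tokens independent draws."""
    probs = zipf_probabilities(n_cells)
    # A cell with probability p is still unseen after n draws
    # with probability (1 - p) ** n.
    return sum(1.0 - (1.0 - p) ** n_tokens for p in probs)


if __name__ == "__main__":
    # Hypothetical paradigm size; token counts are per lexeme.
    for n in (1, 10, 100, 1000):
        print(n, round(expected_filled_cells(n, 50), 2))
```

Under these assumptions the expected number of filled cells grows ever more slowly with the number of tokens of a lexeme, because the rare cells in the tail of the distribution are seldom sampled.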
Problem 2: “full productivity” ▶ Although syntax may require any form of any lexeme, most forms of most lexemes will never be required. ▶ Given this, it is unclear what “full productivity” means. ▶ Operational measures of productivity (Baayen, 2001; O’Donnell, 2015) are inherently gradient. ▶ As Gaeta (2007) shows, some inflectional processes are less productive than some derivational processes. ▶ This strongly suggests that, while inflectional relations may be more productive than derivational relations on average, this does not hold of every inflectional relation.
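To make “inherently gradient” concrete, here is a minimal sketch of one hapax-based ratio of the kind associated with Baayen's work: the number of types formed by a process that occur exactly once, divided by the total token count for that process. The slides do not say which measure is intended, so this choice is an assumption. The function name and the toy counts are hypothetical.

```python
from collections import Counter


def hapax_productivity(token_counts):
    """Return n1 / N: hapax types over total tokens for one process."""
    total_tokens = sum(token_counts.values())
    if total_tokens == 0:
        raise ValueError("no tokens observed for this process")
    hapaxes = sum(1 for count in token_counts.values() if count == 1)
    return hapaxes / total_tokens


# Hypothetical toy counts (word type -> corpus frequency), not real data.
process_a = Counter({"w1": 50, "w2": 30, "w3": 1, "w4": 1})  # 2 hapaxes, 82 tokens
process_b = Counter({"v1": 3, "v2": 1, "v3": 1, "v4": 1})    # 3 hapaxes, 6 tokens
```

The result is a real number between 0 and 1 for any process, inflectional or derivational, so nothing in the measure itself singles out a “fully productive” class.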
Problem 3: Defectiveness ▶ We are used to thinking of defectiveness as an anomaly, unlike lexical gaps. ▶ The notion of defectiveness itself is gradient (Sims, 2015): ▶ Defective forms are usually attested in large enough corpora. ▶ Note the contrast with the fact that many nondefective forms are not attested. ▶ Defectiveness is the failure of a form to reach an expected frequency of occurrence, given prior knowledge about the frequency distribution of inflected forms for comparable lexemes. ▶ Crucially, defectiveness is thus doubly gradient: ▶ The frequency may be more or less close to zero ▶ The unexpectedness of that frequency may be more or less large. ▶ No reason to think that the same does not hold for “lexical gaps”.
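The frequency-based definition above can be sketched as a ratio of observed to expected counts. This is a deliberately crude illustration, not the operationalization of Sims (2015). It assumes that the expected count of a cell is the lexeme's total frequency multiplied by that cell's share among comparable lexemes. The function name and all numbers are hypothetical; the cell labels are borrowed from the treebank figure.

```python
def defectiveness_scores(observed, lexeme_total, cell_shares):
    """Map each cell to observed / expected count.

    observed:     cell -> count for the lexeme under study
    lexeme_total: total token count of that lexeme
    cell_shares:  cell -> share of tokens in that cell among comparable lexemes
    Values near 0 flag a candidate gap; both the ratio and the size of the
    expected count are gradient.
    """
    scores = {}
    for cell, share in cell_shares.items():
        expected = lexeme_total * share
        if expected > 0:
            scores[cell] = observed.get(cell, 0) / expected
        else:
            scores[cell] = float("nan")  # no expectation, so no evidence either way
    return scores


# Hypothetical toy numbers, not real data.
cell_shares = {"p3s": 0.5, "p1p": 0.25, "i3s": 0.25}
observed = {"p3s": 60, "p1p": 0, "i3s": 20}
scores = defectiveness_scores(observed, 80, cell_shares)
```

In this toy case the cell p1p has an expected count of 20 and an observed count of 0. For a rarer lexeme with an expected count of 0.2, the same observed 0 would be far weaker evidence of defectiveness.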
1. Renouncing skepticism 1.2. Variation
Variation ▶ The skeptic’s argument: ▶ Postulating paradigms supposes that we can identify a unique word to fill each paradigm cell. ▶ In inflection this is fine, because doublets are vanishingly rare. Exceptions can and should be reduced. ▶ This has to be so, because inflection is a function (Stump, 2001; Bonami and Boyé, 2007). ▶ In derivation, more often than not, there are multiple lexemes for the same derivational category, which may or may not contrast semantically (Fradin, to appear). ▶ So, paradigms in derivation do not make sense because cells would be overpopulated.
Overabundance I ▶ Thornton (2011, 2012, forthcoming) was instrumental in demonstrating that overabundance is a real and widespread phenomenon, directly falsifying the claim that doublets do not occur in inflection. ▶ Hence, if there is a difference between inflection and derivation here, it is at most a difference of extent. ▶ So, what is the extent of the difference?
Overabundance II ▶ Guzman Naranjo and Bonami (2016) on Czech nominal declension:
      nom    gen    dat    acc    voc    loc    ins
sg    1.3%   2.8%   1.2%   2.1%   0.7%   10.0%  1.0%
pl    8.6%   2.5%   4.2%   1.6%   1.5%   4.9%   14.9%
Proportions of lexemes attested in more than one form for each paradigm cell, in the SYN v4 corpus (Hnátková et al., 2014, 4.3 billion tokens); forms validated in the MorfFlex lexicon (Hajič and Hlaváčová, 2013)
▶ Compare numbers for French derivational families documented in the Démonette database (Hathout and Namer, 2014).
Morphosemantic category   Proportion
Verb                      1.6%
Action noun               16.5%
Agent noun                0.7%
Proportions of categories attested in the form of more than one lexeme in the FrWaC corpus (Baroni et al., 2009, 1.6 billion tokens)
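Proportions of this kind can be computed from a list of corpus attestations. The sketch below is not the pipeline used in the cited studies. It assumes the input is a list of (lexeme, cell, form) triples already validated against a lexicon, and that the denominator for each cell is the number of lexemes attested in that cell. The function name and the placeholder data are hypothetical.

```python
from collections import defaultdict


def overabundance_by_cell(attestations):
    """Return cell -> proportion of lexemes attested in more than one form.

    attestations: iterable of (lexeme, cell, form) triples.
    Assumption: only lexemes attested in a cell count towards its denominator.
    """
    forms = defaultdict(set)  # (lexeme, cell) -> set of distinct forms
    for lexeme, cell, form in attestations:
        forms[(lexeme, cell)].add(form)

    attested = defaultdict(int)
    overabundant = defaultdict(int)
    for (lexeme, cell), cell_forms in forms.items():
        attested[cell] += 1
        if len(cell_forms) > 1:
            overabundant[cell] += 1
    return {cell: overabundant[cell] / attested[cell] for cell in attested}


# Placeholder data with abstract labels, not real Czech forms.
toy = [
    ("lex1", "loc.sg", "form_a"),
    ("lex1", "loc.sg", "form_b"),
    ("lex2", "loc.sg", "form_c"),
    ("lex1", "nom.sg", "form_d"),
    ("lex2", "nom.sg", "form_e"),
]
proportions = overabundance_by_cell(toy)
```

The same function applies to derivational families if the cell is read as a morphosemantic category and the form as a lexeme, which is what makes the two tables comparable in principle.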
Overabundance III ▶ Although a more principled comparison is in order, the evidence points to comparable amounts of overabundance in inflection and derivation.
1. Renouncing skepticism 1.3. Stability of contrast
Setting the stage ▶ The skeptic’s argument: ▶ The syntactic and semantic contrast between cells in an inflectional paradigm is stable across lexemes: e.g. the opposition between present and past is the same for all verbs. laugh : laughed = wash : washed = pay : paid ▶ On the other hand, the meaning and distribution of a derived lexeme are somewhat unpredictable, and hence the contrast between lexemes standing in the same derivational relation is somewhat unstable across lexemes. laugh : laughable ≠ wash : washable ≠ pay : payable ▶ As a result, derivational families can’t be structured in paradigms, because we can’t decide what counts as “filling the same cell”. ▶ Bonami and Paperno (submitted) explore the issue of stability of contrasts in inflection and derivation using a distributional approach.
Distributional semantics in a nutshell I ▶ The distributional hypothesis (see also Harris 1954; Firth 1957): The degree of semantic similarity between two linguistic expressions A and B is a function of the similarity of the linguistic contexts in which A and B can appear. (Lenci, 2008, 3) ▶ Contemporary computational linguistics operationalizes this idea to deduce semantic representations from large corpora. ▶ Toy example: we start with a cooccurrence table:
        ride   eats
dog     1      5
horse   3      4
car     5      0
Distributional semantics in a nutshell II ▶ Such cooccurrence counts are vectors: [Figure: the cooccurrence table (dog: ride 1, eats 5; horse: ride 3, eats 4; car: ride 5, eats 0) next to a plot of dog, horse, and car as vectors in a plane with axes ride (x, 0 to 5) and eats (y, 0 to 5).] ▶ In practice: ▶ Realistic representations rely on cooccurrences with very large lexica in large corpora ⇒ many more dimensions. ▶ For efficiency reasons, most current systems rely on prediction tasks rather than explicit cooccurrence counts to infer vector representations (see e.g. Mikolov et al., 2013). ▶ These technical aspects can be ignored here.
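Once words are vectors, similarity of contexts can be measured geometrically. The sketch below uses cosine similarity on the toy counts from the slide. Cosine is one common choice and is an assumption here, since the slides do not commit to a particular measure. The names `COUNTS` and `cosine` are illustrative.

```python
import math

# Toy cooccurrence counts from the slide: word -> (ride, eats)
COUNTS = {"dog": (1, 5), "horse": (3, 4), "car": (5, 0)}


def cosine(u, v):
    """Cosine of the angle between two count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    for w1, w2 in (("dog", "horse"), ("horse", "car"), ("dog", "car")):
        print(w1, w2, round(cosine(COUNTS[w1], COUNTS[w2]), 2))
    # dog horse 0.9
    # horse car 0.6
    # dog car 0.2
```

On these counts horse comes out closer to dog than to car, and dog and car are the least similar pair, which matches the intuition the toy table is meant to convey.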
Distributional semantics in a nutshell III ▶ One highly relevant application: proportional analogies through vector arithmetic (Mikolov et al., 2013) [Figure: vector diagram with the points king, man, woman, and queen; the prediction is the point reached from king via − man and + woman, shown alongside queen.]
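The arithmetic behind such analogies can be sketched in a few lines. The vectors below are hypothetical two-dimensional values chosen by hand so that the analogy works out exactly. Real systems use learned vectors with hundreds of dimensions, and the prediction only lands near the target. The names `VECTORS`, `predict`, and `solve_analogy` are illustrative, and picking the answer by cosine similarity is an assumption.

```python
import math

# Hypothetical 2-D vectors, hand-picked for illustration only.
VECTORS = {
    "man": (1.0, 1.0),
    "woman": (1.0, 3.0),
    "king": (4.0, 1.0),
    "queen": (4.0, 3.0),
    "distractor": (-1.0, 2.0),
}


def cosine(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def predict(a, b, c, vectors=VECTORS):
    """Return the vector c - a + b for the analogy a : b = c : ?."""
    va, vb, vc = vectors[a], vectors[b], vectors[c]
    return tuple(z - x + y for x, y, z in zip(va, vb, vc))


def solve_analogy(a, b, c, vectors=VECTORS):
    """Pick the word closest to the prediction, excluding the three inputs."""
    target = predict(a, b, c, vectors)
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))
```

With these toy values, `solve_analogy("man", "woman", "king")` returns `"queen"`. Excluding the input words from the candidates is a common practical choice, since the prediction often stays close to one of them.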