



  1. Modeling stress assignment in English noun-noun compounds: a quantitative perspective
Gero Kunter, Ingo Plag, Sabine Lappe & Maria Braun
Universität Siegen
Conference Quantitative Investigations in Theoretical Linguistics 2, 1-2 June 2005, Osnabrück

The problem
• compounds in English are stressed on the left-hand member (e.g. bláckboard, wátchmaker)
• nuclear stress rule vs. compound stress rule (Chomsky and Halle 1968:17)
• many unexplained exceptions, and cross-variety variation (e.g. BrE vs. AmE)
  Boston márathon, Penny Láne, summer níght, aluminum fóil, may flówers, silk tíe
In general:
• claims on compound stress are largely based on anecdotal evidence and introspection
• no systematic large-scale empirical evidence available yet

Three approaches
1. The structural hypothesis (e.g. Giegerich 2004, Bloomfield 1933, Lees 1963, Marchand 1969 or Payne/Huddleston 2002)
• modifier-head structures are regularly stressed on the RIGHT constituent (steel brídge)
• argument-head structures are always LEFT-stressed (ópera singer)
• left stress on modifier-head structures is due to lexicalization (ópera glasses)
2. The semantic hypothesis (e.g. Fudge 1984, Ladd 1984, Liberman and Sproat 1992, Olsen 2000, 2001)
stress assignment according to semantic categories
3. The analogical hypothesis (e.g. Schmerling 1971, Liberman and Sproat 1992, Plag 2006)
stress assignment in analogy to similar compounds in the lexicon

Testing the hypotheses
• Plag (2006, experimental study): all three types of factor interact in compound stress assignment in complex ways.
• this paper: corpus study testing the three hypotheses more thoroughly
  - many more different word types
  - many more tokens
  - many more semantic relations
  - computational modeling of analogical effects
• Data
  - Boston University Radio Speech Corpus (Ostendorf et al. 1996) (N = 4410, V = 2476, AmE)
  - CELEX lexical database (Baayen et al. 1995) (N = 4491, V = N, BrE)

Boston Corpus: Example
The device is attached to a plastic wristband. It looks like a watch. It functions like an electronic probation officer. When a computerized call is made to a former prisoner's home phone, that person answers by plugging in the device. The wristband can be removed only by breaking its clasp, and if that's done the inmate is immediately returned to jail. The description conjures up images of big brother watching. But Jay Ash, deputy superintendent of the Hampton County jail in Springfield, says the surveillance system is not that sinister.

Procedure (cf. Farnetani et al. 1988, Ingram et al. 2003, Plag 2006)
Step 1 Measure mean fundamental frequency (F0) of the main stressed vowels of the two members, respectively, and calculate the difference (left F0 minus right F0, logarithmically transformed into semitones (ST), 'pitch difference')
  wrístband: +5.39 ST
  home phóne: -0.97 ST
Step 2 Look for statistically significant pitch differences between distinct kinds of compound
Example: Left-headed compounds (such as attorney géneral) should have a significantly smaller pitch difference than right-headed compounds (e.g. wrístband)
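The pitch difference from Step 1 of the procedure can be sketched as follows. The slides only say that the difference is "logarithmically transformed into semitones"; the usual conversion of 12 times the base-2 logarithm of the frequency ratio is assumed here, and the function name and the F0 values are hypothetical, not taken from the corpus.

```python
import math


def pitch_difference_st(f0_left_hz: float, f0_right_hz: float) -> float:
    """Pitch difference in semitones (left F0 relative to right F0).

    Assumes the standard conversion 12 * log2(left / right); positive
    values mean the left member has the higher pitch.
    """
    if f0_left_hz <= 0 or f0_right_hz <= 0:
        raise ValueError("F0 values must be positive")
    return 12.0 * math.log2(f0_left_hz / f0_right_hz)


# Hypothetical mean F0 values in Hz for the two stressed vowels.
print(pitch_difference_st(200.0, 100.0))  # one octave higher on the left: 12.0
print(pitch_difference_st(150.0, 160.0))  # negative: right member is higher
```

On this scale a positive value points to left prominence and a value near or below zero to right prominence, which is how the wrístband and home phóne figures above are read.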

  2. Illustration of measurements
Right-headed vs. left-headed compounds in Boston Corpus
[Figure: mean pitch difference in semitones for right-headed (e.g. wrístband) and left-headed (e.g. attorney géneral) compounds]
mean pitch difference: right-headed 3.332 ST, left-headed 0.052 ST
t(4408) = 4.91, p < 0.01, Cohen's d = 0.80

Boston Corpus: Structural hypothesis
Argument-head vs. modifier-head compounds
[Figure: mean pitch difference in semitones for argument-head and modifier-head compounds]
mean pitch difference: argument-head 3.736 ST, modifier-head 3.250 ST
• significant difference, but large overlap between the two groups
• effect size is very small
t(4089) = 2.36, p < 0.05, Cohen's d = 0.01

Boston Corpus: Structural hypothesis
A closer look at argument-head vs. modifier-head compounds
morphology of head    argument-head     modifier-head
-er                   law makers        house speaker
-ing                  fundraising       spring training
-ion                  jury selection    health education
conversion            tax increase      litmus test
(also, with low frequency: -age, -al, -ance, …)

Boston Corpus: Structural hypothesis
Interaction of structure and morphology of head
[Figure: pitch difference for Argument-Head vs. Modifier-Head compounds by morphology of head; panel annotations include "not significant" and "con (N=572)"]
F(9, 4062) = 2.89, p < 0.01, R² = 0.015

Boston Corpus: Lexicalization effect?
Two ways of quantifying lexicalization
- Frequency
  Higher frequency should correlate with higher degree of lexicalization
- Spelling
  Lexicalized compounds are more prone to one-word spellings

Boston Corpus: Lexicalization effect?
Pitch difference by Google frequency
[Figure: pitch difference by Google frequency]
• only very small tendency for highly frequent compounds to be more left-stressed
• no difference between AH or MH compounds, F(1, 4069) < 1
• relation between pitch difference and Google frequency shows an S-shaped distribution
• typical of categorical changes
F(1, 4071) = 15.58, p < 0.001, R² = 0.004
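The group comparisons above report Cohen's d as the effect size. A minimal sketch of that measure is given here, assuming the common pooled-standard-deviation definition; the slides do not say which variant was used, and the function name and sample values are hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d: mean difference divided by the pooled sample SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical pitch differences in semitones, not corpus data.
right_headed = [2.0, 4.0, 6.0]
left_headed = [0.0, 2.0, 4.0]
print(cohens_d(right_headed, left_headed))  # means differ by one pooled SD
```

Because d is scaled by the spread within the groups, a difference can be statistically significant in a sample of several thousand items and still be very small in these terms, which is the pattern reported for argument-head vs. modifier-head compounds.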

  3. Boston Corpus: Lexicalization effect?
Spelling and lexicalization
Assumptions:
• one-word spellings are indicative of lexicalization
• high frequency is indicative of lexicalization
Prediction: compounds spelled as one word should have higher frequency than those spelled as two words
Results:
• expected effect
• large effect size
=> spelling is an indicator of lexicalization
t(3388) = 15.58, p < 0.001, Cohen's d = 0.89

Boston Corpus: Lexicalization effect?
Interaction between structure and spelling
[Figure: pitch difference by spelling for Argument-Head and Modifier-Head compounds]
Predictions:
• Modifier-Head compounds spelled as one word should be more left-stressed than Modifier-Head compounds spelled as two words
• no effect of that kind with Argument-Head compounds
Results:
• Modifier-Head compounds spelled as one word are indeed more left-stressed
• spelling of Argument-Head compounds does not interact with stress position
• only very weak effect
F(3, 4030) = 12.79, p < 0.001, R² = 0.009

Boston Corpus: Structural hypothesis
A summary
• significant effect of argument vs. modifier only with a subset of potential compounds (i.e. -er as right-hand head morphemes)
• a measurable lexicalization effect (based on frequency and spelling)
• effect sizes are all very small; a lot of the variation is unaccounted for under this hypothesis
The structural hypothesis is not well supported by the data

Boston Corpus: Semantic hypothesis
Methodological problems
• Semantic categories and semantic relations mentioned in the literature (such as 'N2 is a material', 'N2 is located at N1') are hard to test due to their being generally ill-defined
• Items are often ambiguous (i.e. show more than one relation)
• The number of potentially relevant semantic categories and relations is unclear
Our methodology
• We used a set of 18 semantic relations (based mainly on Levi 1978), also widely used in studies on compound interpretation (e.g. Gagné & Shoben 1997, Gagné 2001)
• Semantic classification was done by two independent raters; only those data are analyzed where the two ratings agreed

Boston Corpus: Semantic hypothesis
The literature on rightward stress makes use of either
  categories referring to constituents or the compound as a whole
or
  categories referring to semantic relation

Boston Corpus: Semantic hypothesis
Categories referring to constituents or the compound as a whole
Rightward stress is predicted if...
• N1 refers to a period or point in time (morning edition)
• N2 is a geographical term (Boston area)
• N2 is a type of thoroughfare (Sesame Street)
• N1 and N2 form a proper noun (Tufts University)
(e.g. Fudge 1984: 144ff, Liberman & Sproat 1992)
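The agreement filter described under "Our methodology" (two independent raters, with only agreed items analyzed) can be sketched as follows. This is only an illustration under assumptions: the data layout, the function name, and the relation labels are hypothetical and do not reproduce the 18 relations used in the study.

```python
def agreed_items(ratings_a, ratings_b):
    """Keep only compounds to which both raters assigned the same relation."""
    return {
        compound: relation
        for compound, relation in ratings_a.items()
        if ratings_b.get(compound) == relation
    }


# Hypothetical labels assigned by two raters to compounds from the slides.
rater_1 = {"morning edition": "TIME", "Boston area": "LOCATION", "silk tie": "MATERIAL"}
rater_2 = {"morning edition": "TIME", "Boston area": "NAME", "silk tie": "MATERIAL"}

print(agreed_items(rater_1, rater_2))  # "Boston area" is dropped: the raters disagree
```

Dropping the disagreements shrinks the data set but keeps ambiguous items, one of the methodological problems named above, out of the analysis.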
