Introduction Initial classification Experiments A and B: Testing the classification Experiment C: Integrating polysemy Conclusion I a la UPC?

Preamble: introduction
- profile:
  - degree in Spanish Philology, UAB, 2000
  - PhD, UPF (area: Computational Linguistics), 2007
    - Inter-university Doctorate in Cognitive Science and Language
    - GLiCom: http://glicom.upf.es/
  - Juan de la Cierva post-doc, UPC, mid-2008 to mid-2011 (?)
- plan of the talk:
  - thesis
  - a little about future intentions
Automatic acquisition of semantic classes for adjectives
Gemma Boleda Torrent
GLiCom, Universitat Pompeu Fabra / Barcelona Media Centre d’Innovació
NLP Seminar, UPC, November 14th 2007
Overview
- automatic acquisition of semantic classes for Catalan adjectives
- two main hypotheses:
  - adjective meanings can be assigned to a set of classes
  - semantic distinctions are mirrored at different linguistic levels
- Lexical Acquisition: infer properties of words from their linguistic behaviour in corpora
Approach (I)
- no general, well-established semantic classification → propose and test a classification
- iterative methodology
  - deductive phase: define a classification and apply it to a set of adjectives → manual annotation and machine learning experiments
  - inductive phase: use the evidence gathered to refine the classification proposal
Approach (II)
- three iterations

Experiment  Technique     Main goal
A           Unsupervised  refine classification
B           Unsupervised  validate refined classification
C           Supervised    integrate polysemy
Contents
1 Introduction
2 Initial classification
3 Experiments A and B: Testing the classification
4 Experiment C: Integrating polysemy
5 Conclusion
6 And at the UPC? (I a la UPC?)
Initial classification
- insights from descriptive grammar and formal semantics
  - Qualitative adjectives denote attributes or properties of objects.
    Examples: ample ‘wide’, autònom ‘autonomous’
  - Intensional adjectives denote second-order properties.
    Examples: presumpte ‘alleged’, antic ‘former’
  - Relational adjectives denote a relationship to an object.
    Examples: pulmonar ‘pulmonary’, botànic ‘botanical’
- semantic classification supported by distinctions at other levels of description
Criteria (I): position with respect to the head noun
- qualitative (1): pre- and post-nominal
- intensional (2): pre-nominal
- relational (3): post-nominal

(1) les avingudes amples / les amples avingudes ‘wide avenues’
(2) #l’assassí presumpte / el presumpte assassí ‘the alleged murderer’
(3) una malaltia pulmonar / #una pulmonar malaltia ‘a pulmonary disease’
Criteria (II): predicativity
- qualitative (1): predicative
- intensional (2): non-predicative
- relational (3): marginally predicative

(1) les avingudes són amples ‘avenues are wide’
(2) #l’assassí és presumpte ‘the murderer is alleged’
(3) ?la malaltia és pulmonar ‘the disease is pulmonary’
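The two criteria above could plausibly be turned into corpus features, for instance the share of an adjective's occurrences that are pre-nominal and the share that are predicative. The sketch below illustrates that idea only; the slides do not say how features were extracted, and the observation format, the context labels ("pre", "post", "pred") and the toy counts are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy observations (not corpus data): (adjective lemma, context).
# "pre" = pre-nominal, "post" = post-nominal, "pred" = predicative.
OBSERVATIONS = [
    ("ample", "post"), ("ample", "pre"), ("ample", "pred"), ("ample", "post"),
    ("presumpte", "pre"), ("presumpte", "pre"),
    ("pulmonar", "post"), ("pulmonar", "post"), ("pulmonar", "post"),
]


def context_proportions(observations):
    """Return, per lemma, the proportion of pre-nominal and predicative uses."""
    counts = defaultdict(Counter)
    for lemma, context in observations:
        counts[lemma][context] += 1

    features = {}
    for lemma, contexts in counts.items():
        total = sum(contexts.values())
        features[lemma] = {
            "pre_nominal": contexts["pre"] / total,
            "predicative": contexts["pred"] / total,
        }
    return features


FEATURES = context_proportions(OBSERVATIONS)
```

With these invented counts, presumpte comes out as always pre-nominal and never predicative, pulmonar as never pre-nominal, and ample as a mix, which is the pattern the criteria describe for the three classes.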
Polysemy
(1) edifici antic (qualitative) / antic president (intensional) ‘ancient building / former president’
(2) reunió familiar (relational) / cara familiar (qualitative) ‘family meeting / familiar face’
- in each sense, the adjective’s behaviour corresponds to that of the relevant class
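One way to picture the point about polysemy is a lookup in which each sense of an adjective carries its own class, and the expected behaviour follows from that class. This is only an illustrative sketch: the class behaviours are taken from the two Criteria slides, while the data layout, the English-gloss keys and the function names are hypothetical.

```python
# Behaviour per class, as stated on the two "Criteria" slides.
CLASS_BEHAVIOUR = {
    "qualitative": {"position": {"pre", "post"}, "predicative": "yes"},
    "intensional": {"position": {"pre"}, "predicative": "no"},
    "relational": {"position": {"post"}, "predicative": "marginal"},
}

# Hypothetical sense inventory: one class per sense, keyed by English gloss.
SENSES = {
    "antic": {"ancient": "qualitative", "former": "intensional"},
    "familiar": {"family": "relational", "familiar": "qualitative"},
}


def expected_behaviour(lemma, gloss):
    """Look up the class of one sense and return that class's behaviour."""
    return CLASS_BEHAVIOUR[SENSES[lemma][gloss]]


def is_polysemous(lemma):
    """Treat an adjective as polysemous if its senses span several classes."""
    return len(set(SENSES[lemma].values())) > 1
```

For example, `expected_behaviour("antic", "former")` returns the intensional profile (pre-nominal only, non-predicative), matching antic president.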
Motivation for unsupervised experiments
- classification based primarily on literature review
  - does it account for the semantics of a broad range of adjectives?
- empirical test: use information extracted from a corpus in machine learning experiments
- exploratory experiments → clustering (unsupervised)
  - no bias from previous annotation
  - insight into the actual structure of the data
- two sets of experiments (Exp. A, Exp. B)
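To make the idea of exploratory clustering concrete, here is a minimal sketch using plain k-means over two-dimensional adjective vectors. The slides only say "clustering"; k-means is an assumed stand-in, not necessarily the algorithm used in the experiments. The vectors (pre-nominal share, predicative share) and the initial centroids are invented so that the groups separate cleanly, and they say nothing about the actual results.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then re-average."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for each point.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Invented toy vectors: (pre-nominal share, predicative share) per adjective.
ADJECTIVES = ["ample", "autònom", "presumpte", "antic", "pulmonar", "botànic"]
VECTORS = [(0.25, 0.30), (0.10, 0.35), (0.95, 0.00),
           (0.80, 0.05), (0.00, 0.03), (0.01, 0.02)]

# Assumption: seed the three clusters with the 1st, 3rd and 5th vectors.
LABELS, CENTROIDS = kmeans(VECTORS, [VECTORS[0], VECTORS[2], VECTORS[4]])
CLUSTERS = dict(zip(ADJECTIVES, LABELS))
```

No class labels enter the procedure, which is the sense in which such experiments avoid bias from previous annotation; the resulting groups can then be compared against the proposed classification.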
Experiment A: Material and method (I) – resources
- resources also used in Experiments B and C
- CTILC corpus (Institut d’Estudis Catalans): 14.5 million words, written, formal texts
  - manually lemmatised and POS-tagged
  - automatically shallow-parsed (noise)
- adjective database [Sanromà, 2003]: almost 2,300 lemmata from the CTILC corpus
  - morphological information manually coded