
On Measures of Text Complexity - PowerPoint PPT Presentation



  1. On Measures of Text Complexity
     Sowmya V.B., Detmar Meurers
     Universität Tübingen
     Second Tübingen-Berlin Meeting on Analyzing Learner Language, Tübingen, December 5-6, 2011
     Outline: Background, What is it?, Why is it relevant?, Real-life challenge, Some measures, Traditional formulas and lexical measures, Some recent CL approaches, Language acquisition, Psycholinguistics, How to evaluate, Work in progress, References

  2. What do we mean by “measures of text complexity”?
     ◮ Measuring how difficult it is to read a text,
       ◮ given a purpose, e.g.,
         ◮ general comprehension of key ideas of text
         ◮ identification of specific information looked for
       ◮ based on properties of the text using criteria which are
         ◮ theory-driven (e.g., difficult syntactic constructions)
         ◮ data-induced (e.g., corpora with graded texts), and a lot
         ◮ in-between (e.g., derived frequency information for words)
       ◮ and information about the user (e.g., language ability, age, working memory)
         ◮ obtained directly (e.g., questionnaire), or
         ◮ indirectly (e.g., inferred from nature of a query)

  3. Why would anyone want to do this?
     ◮ To evaluate the quality of (manually written) texts, e.g.,
       ◮ for articles, manuals, books to be accessible to the intended readership
       ◮ for reading and writing assessment in language teaching
     ◮ To evaluate the quality of natural language generation systems (e.g., in text summarization)
     ◮ To track first and second language acquisition and language attrition
       ◮ Analysis of complexity in Kobalt-DaF network
       ◮ Criterial features for language development (MERLIN)

  4. A concrete “real-life” challenge
     ◮ Develop a search engine ranking web-search results based on complexity.
       ◮ support a range of complexity features
       ◮ first prototype of a Language-Aware Search Engine (Ott & Meurers 2010)
     ◮ Which measures of complexity should we use?
     ◮ Which gold-standards can the resulting approach be evaluated against?

  5. How do we measure text complexity? Traditional readability formulas and lexical measures
     ◮ Clearly different aspects of linguistic complexity play a role in determining the readability of a text.
     ◮ But traditional readability formulas use only shallow quantitative features (e.g., lengths of words, sentences).
       ◮ (e.g., Flesch 1948; Coleman & Liau 1975; Kincaid et al. 1975; DuBay 2004)
     ◮ Others are exclusively based on lexical measures, such as occurrence in specific word lists (Dale & Chall 1948; Chall & Dale 1995; Coxhead 2000; Bauer & Nation 1993).

  6. Some recent CL research on text complexity
     ◮ Language n-gram models (Collins-Thompson & Callan 2004; Si & Callan 2001)
     ◮ Machine learning approaches, with several lexical and syntactic features (Heilman et al. 2007; Petersen & Ostendorf 2009; Lijun Feng & Elhadad 2010)
     ◮ What kind of features are relevant here? Insights from
       ◮ Language Acquisition
       ◮ Psycholinguistics

  7. Complexity in language acquisition
     Automated L1 acquisition measures:
     ◮ Measures based on identifying specific syntactic patterns:
       ◮ Index of Productive Syntax (IPSyn, Scarborough 1990; Sagae et al. 2005)
       ◮ Developmental Level (D-Level, Rosenberg & Abbeduto 1987; Covington et al. 2006; Lu 2009)
     ◮ Some others include (cf. Cheung & Kemper 1992):
       ◮ Developmental Sentence Scoring (DSS)
       ◮ Directional Complexity (D-Complexity)
       ◮ Frazier’s node count, Yngve’s depth
     Automated Second-Language Acquisition measures:
     ◮ Lexical richness (Lu 2011b)
     ◮ Syntactic complexity in second language writing (Lu 2010, 2011a; Vyatkina 2012)
     ◮ Measures mostly based on general counts of phrases, T-units, clauses, ...

  8. Complexity and psycholinguistics
     ◮ long and colorful history, cf. Derivational Theory of Complexity (DTC, Fodor et al. 1974)
     ◮ meaning: propositional idea density (Kintsch 1974; Turner & Greene 1977; Brown et al. 2008)
     ◮ form: complexity in human sentence processing (e.g., surprisal, Boston et al. 2008, 2011)
     ◮ discourse: text coherence and cohesion (Coh-Metrix project, McNamara et al. 2002)
     ◮ Link to cognition also relevant for applications:
       ◮ Cognitively motivated readability assessment for adults with intellectual disabilities (Feng et al. 2009; Feng 2010)
       ◮ Using syntactic complexity measures to detect cognitive impairment (Roark et al. 2007)

  9. How do we evaluate complexity measures?
     ◮ Measures of readability typically are evaluated against a gold standard classification of graded readers, which are written with the traditional measures in mind.
     ◮ What can serve as independently motivated gold standard for evaluating complexity?
     ◮ Correlating complexity with cognitive measures
       ◮ online eye tracking measures identifying processing difficulty in human sentence processing
       ◮ working memory decrease in language attrition (Cheung & Kemper 1992)
       ◮ Analyzing complexity of the language produced at different times in first language acquisition
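
The slides note that traditional readability formulas (e.g., Flesch 1948; Kincaid et al. 1975) rely only on shallow quantitative features such as word and sentence length. As a rough illustration, the sketch below computes the Flesch Reading Ease score and the Flesch-Kincaid grade level from sentence, word, and syllable counts. The function names, the regex-based tokenization, and the vowel-group syllable heuristic are assumptions made for this example; they are not taken from the presentation, and real implementations usually use better sentence splitting and syllable counting.

```python
import re
from typing import Tuple


def count_syllables(word: str) -> int:
    """Approximate syllables as runs of consecutive vowels (a crude heuristic, assumed here)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def shallow_counts(text: str) -> Tuple[int, int, int]:
    """Return (sentences, words, syllables) using simple regex tokenization."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores indicate easier text."""
    n_sent, n_words, n_syll = shallow_counts(text)
    if n_sent == 0 or n_words == 0:
        raise ValueError("text must contain at least one sentence and one word")
    return 206.835 - 1.015 * (n_words / n_sent) - 84.6 * (n_syll / n_words)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level: higher values indicate harder text."""
    n_sent, n_words, n_syll = shallow_counts(text)
    if n_sent == 0 or n_words == 0:
        raise ValueError("text must contain at least one sentence and one word")
    return 0.39 * (n_words / n_sent) + 11.8 * (n_syll / n_words) - 15.59


if __name__ == "__main__":
    easy = "The cat sat. The dog ran."
    hard = "Psycholinguistic investigations demonstrate considerable variability."
    for sample in (easy, hard):
        print(round(flesch_reading_ease(sample), 2), round(flesch_kincaid_grade(sample), 2))
```

Both scores depend only on two ratios (words per sentence, syllables per word), which is exactly why the slides call such features shallow: syntactic structure, word frequency, and discourse coherence play no part in the result.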
