Jasmine Bennöhr, Tübingen, December 5th, 2011 - PowerPoint PPT Presentation

  1. COMPOST: Identification of indicators for competence assessment of students' essays: How do Textual Indicators Evolve? Jasmine Bennöhr, Tübingen, December 5th, 2011. http://ww.linguistik.hu-berlin.de/institut/professuren/korpuslinguistik/forschung/kompost/compost

  2. Overview: I Introduction (purpose, aims and examples); II Data; III Methodology; IV Results and implications; V Work in Progress; VI Future Work; VII Conclusion

  3. I Introduction – Aim: Find (new/interesting) indicators for language quality in essays. Measure how the indicators evolve over time.

  4. I Introduction: Purpose. Purpose: identify pupils with special needs in language training. Side effect: data for developing or improving competence models.

  5. II Data – Essay Corpus – Origin. The essay corpus was collected during the longitudinal study KESS (Kompetenzen und Einstellungen von Schülerinnen und Schülern: competences and attitudes of pupils), a programme for student assessment. KESS is a complete survey of one year group of pupils in Hamburg, in grades 4, 7, 8, 10 (and 12) in the years 2003, 2006, 2007, 2009 (and 2011).

  6. II Data – COMPOST Essay Corpus – Overview. Table columns: essays; N available; N digitalised [1]; N rated [2]; test results for validation. Values per cohort, in the order they appear on the slide:
     KESS4 – 2003 (1 topic): 839; ca. 8000; validation test: KFT.
     KESS7 – 2006 (2 topics): 126; 63 and 63 (of approx. 1500); validation test: reading comprehension.
     KESS8 – 2007 (13 topics): 1705; 1705; validation tests: C-test, grammar, vocabulary, spelling, reading comprehension.
     KESS10 – 2009 (6 topics): 1189; not yet rated; 1189; validation tests: C-test, spelling, reading comprehension.

  7. II Data: Extract from test booklet. Example: task from grade 4. Texts are digitised (typed manually). Interpretation begins as soon as texts are digitised; that is, decisions made at this point affect the results.

  8. III Methodology: Annotation and frequencies. Annotation: operationalise the features that are to be determined. Automatic annotation needs an operationalisation that can be applied automatically: how can the features be identified? Check the quality of the annotations. Determine frequencies.

  9. III Methodology: Annotation is interpretation. Only what is annotated can be counted. Interpretation continues during annotation. Errors can be introduced during annotation.

  10. IV Results: Word length. [Chart: word length; x: grade (KESS4, KESS7, KESS8, KESS10); y: letters per word]
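The slide does not say how letters per word were computed. A minimal sketch of one possible operationalisation, assuming that a word is a maximal run of letters (the function name and the tokenisation rule are illustrative, not taken from the study):

```python
import re


def mean_word_length(text: str) -> float:
    """Return the average number of letters per word in `text`.

    Assumption (not from the slides): a word is a maximal run of ASCII
    letters, German umlauts, or ß; digits and punctuation are ignored.
    """
    words = re.findall(r"[A-Za-zÄÖÜäöüß]+", text)
    if not words:
        return 0.0
    return sum(len(word) for word in words) / len(words)


if __name__ == "__main__":
    # Three words with 3, 4 and 5 letters.
    print(mean_word_length("Der Hund läuft."))
```

Applying the same function to each cohort's essays would give one value per grade, comparable to the points plotted on the slide.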

  11. IV Results: Commas. [Chart: commas; x: grade (KESS4, KESS7, KESS8, KESS10); y: commas per 100 words]

  12. IV Results: -heit, -keit, -ung. [Three charts, one each for -heit, -keit and -ung; x: grade (KESS4, KESS8, KESS10); y: -heit, -keit, -ung per 100 words]
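A rough sketch of how suffix frequencies per 100 words could be counted, assuming plain string matching on word endings. This is an illustration only: the slides do not describe the actual annotation procedure, and string matching will overcount words such as "Sprung" and miss inflected forms such as "Anleitungen".

```python
import re
from collections import Counter

SUFFIXES = ("heit", "keit", "ung")


def suffix_rates(text: str, suffixes=SUFFIXES) -> dict:
    """Return occurrences per 100 words for each suffix in `suffixes`.

    Assumption (not from the slides): a word carries a suffix if it ends
    in that string and has at least two letters before it, which excludes
    short words such as "jung".
    """
    words = re.findall(r"[A-Za-zÄÖÜäöüß]+", text)
    counts = Counter()
    for word in words:
        lowered = word.lower()
        for suffix in suffixes:
            if lowered.endswith(suffix) and len(lowered) > len(suffix) + 1:
                counts[suffix] += 1
                break  # count each word for at most one suffix
    total = len(words)
    return {s: (100.0 * counts[s] / total if total else 0.0) for s in suffixes}


if __name__ == "__main__":
    print(suffix_rates("Die Anleitung und die Freiheit sind jung"))
```

Normalising by the number of words makes essays of different lengths and cohorts of different sizes comparable, which is what the "per 100 words" axis suggests.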

  13. IV Results. Word length is one of the most reliable features. Some suffixes show a development across grades, but not all.

  14. IV Results: Implications. Good starting point, but we want to go further from there. Word length is only a number and cannot be interpreted in terms of content or structure. We want an approach that is motivated more by a linguistic point of view. Analysis of suffixes has two problems: choice of suffixes and data sparseness. Combine both: look at the structure of words and how it develops over time.

  15. V Work in Progress. Skim through tokens with high word length. Look at morphological structure and complexity. For simplicity we assume four morpheme types: prefixes, suffixes, lexemes and inflectional endings. We want to look at combinations, e.g. prefix + lexeme + suffix. Case study: prefix + lexeme + -ung, which has a high number of occurrences. Example: <Auf><frisch><ung>

  16. V Work in Progress: Preliminary Results. [Chart: <prefix><lexeme+|prefix*|suffix*><ung>; x: grade (KESS4, KESS7, KESS8, KESS10); y: forms per 100 words]
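The pattern notation on the slide can be read in more than one way. The sketch below assumes one reading: the word starts with a prefix, ends with the suffix -ung, and everything in between consists of lexemes, prefixes and suffixes with at least one lexeme. It also assumes the words have already been segmented into (morph, tag) pairs; the tag names and the function are hypothetical.

```python
ALLOWED_MIDDLE_TAGS = {"lexeme", "prefix", "suffix"}


def matches_prefix_ung(segments) -> bool:
    """Check a segmented word against <prefix><lexeme+|prefix*|suffix*><ung>.

    `segments` is a list of (morph, tag) pairs, e.g.
    [("Auf", "prefix"), ("frisch", "lexeme"), ("ung", "suffix")].
    Assumed reading of the pattern (not confirmed by the slides):
    first segment is a prefix, last segment is the suffix "ung", and the
    middle holds only lexemes/prefixes/suffixes with at least one lexeme.
    """
    if len(segments) < 3:
        return False
    _, first_tag = segments[0]
    last_morph, last_tag = segments[-1]
    if first_tag != "prefix":
        return False
    if last_tag != "suffix" or last_morph.lower() != "ung":
        return False
    middle_tags = [tag for _, tag in segments[1:-1]]
    return "lexeme" in middle_tags and all(
        tag in ALLOWED_MIDDLE_TAGS for tag in middle_tags
    )


if __name__ == "__main__":
    word = [("Auf", "prefix"), ("frisch", "lexeme"), ("ung", "suffix")]
    print(matches_prefix_ung(word))
```

Counting the matching tokens and dividing by the number of words (times 100) would give the "forms per 100 words" measure on the chart's y-axis.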

  17. V Work in Progress: Preliminary Results – KESS4. -ung: 216; <prefix><lexeme+|prefix*|suffix*><ung>: 109. Examples: <Ver><mut><ung>, <ent><vern><ung>, <An><leit><ung>

  18. V Work in Progress: Preliminary Results – KESS8. -ung: 1511; <prefix><lexeme+|prefix*|suffix*><ung>: 569. Examples: <An><leit><ung>, <ver><spät><ung>, <Ver><pflicht><ung>, <Vor><wahrn><ung>, <er><källt><ung>, <Um><satztsteiger><ung>. False positives: <Er><derwärm><ung>

  19. V Work in Progress: Preliminary Results – KESS10. -ung: 902; <prefix><lexeme+|prefix*|suffix*><ung>: 457. Examples: <Ab><mahn><ung>, <Ver><zweifl><ung>

  20. VI Future Work. Type/token ratio of -ung forms or of <prefix><lexeme+|prefix*|suffix*><ung> forms
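For reference, the type/token ratio is the number of distinct forms divided by the total number of occurrences. A minimal sketch, assuming the -ung tokens have already been extracted and that case differences are ignored (both assumptions are illustrative, not from the slides):

```python
def type_token_ratio(tokens) -> float:
    """Return distinct forms divided by total occurrences.

    Assumption: tokens are compared case-insensitively, so "Anleitung"
    and "anleitung" count as the same type.
    """
    normalised = [token.lower() for token in tokens]
    if not normalised:
        return 0.0
    return len(set(normalised)) / len(normalised)


if __name__ == "__main__":
    ung_tokens = ["Anleitung", "Anleitung", "Verspätung", "Abmahnung"]
    print(type_token_ratio(ung_tokens))  # 3 types / 4 tokens
```

A ratio close to 1 would mean that students use many different -ung words, while a low ratio would mean that a few words (for example ones taken from the task prompt) account for most occurrences.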

  21. VI Future Work. Focus: how do the word structures of students develop? Prefix chains: <un><ent><schied><en>. Suffix chains: <Tät><ig><keit>, <Pünkt><lich><keit>. Combinations of several prefixes and suffixes: <prefix><prefix><lexeme><suffix>, e.g. <un><be><greif><lich>, <un><ver><kenn><bar>; <prefix><lexeme><suffix><suffix>, e.g. <Über><pünkt><lich><keit>

  22. VII Conclusion: Summary. Development of indicators over time: from the surface indicator word length and individual affixes to a more linguistically motivated analysis. Word length is hard to interpret but tightly linked to morphological structure. Indicators considered: word length, individual affixes (suffixes), structure of words. Qualitative analysis is meaningful. Open question: how do students construct words, and how does that develop over time?
