
E-Learning Materials Development Based on Abstract Analysis Using Web Tools

Tomofumi NAKANO and Yukie KOYAMA
Nagoya Institute of Technology
Gokiso-cho, Showa-ku, Nagoya 466-8555 Japan
{nakano, koyama}@center.nitech.ac.jp

Abstract. This study is part of a series of E-Learning and English for Specific Purposes (ESP) research projects that draw on an original corpus of engineering journals. In this paper the results of a corpus study are presented, and a sample of the ESP e-learning materials being developed for graduate students in engineering is shown. Abstracts were chosen for the corpus this time because students are likely to read many for their research, and eventually to have to produce their own. We prepared a 40,000-word corpus that consists of 263 abstracts from mechanical and electrical engineering journals. The corpus is analyzed using Wmatrix, which assigns part-of-speech tags and semantic tags and compares the results with those of the BNC written corpus sampler. Special features found in the analysis include the frequencies of semantic tags and part-of-speech tags, differences in the use of verbal forms, and multi-word expressions. As an application of these features, we are developing web-based materials which include the original abstracts with target items hyperlinked to various pages containing exercises, concordances, grammar explanations, a bilingual dictionary, etc.

1 Introduction

In the field of English teaching, since Swales's epoch-making book, Genre Analysis [11], it has been widely accepted that ESP is one of the most efficient approaches in terms of content appropriateness and students' motivation. Since we teach at a university of technology, we first conducted a needs analysis and found that reading, especially reading academic papers, is the most important skill for engineering students.
After that, we started to compile an original corpus of engineering journal papers. This corpus is still growing both in its discipline coverage and in its size. In this study the results of a corpus study and a sample of the ESP e-learning material are shown. This material is developed for graduate students in engineering, because such students are likely to read many articles for their research and eventually to have to produce articles of their own. Needless to say, abstracts play an enormously important role in the academic world because, in many cases, readers decide by reading the abstract whether or not to continue to read the full paper [4]. Another reason is that, in English as a Foreign Language (EFL) situations, researchers often write their abstracts in English while the rest of the paper is written in their first language. This also raises the need for abstract analysis in EFL situations such as that in Japan.

While reading is the most important language skill for engineering students in Japan [6], they are hindered in reading academic papers by a lack of vocabulary (usually sub-technical or academic) [1] and by difficulty with the grammar of long, often complex sentences [5]. Therefore, this study focuses not only on word lists but also on parts of speech and semantic areas. An application introduced in this study also makes it possible to adjust the level of frequency and the degree of specificity of a word relative to a general corpus. As Morton points out, the problem for a student is not technical vocabulary but the difficult words of more general English [7].

Thanks to developments in ICT, E-Learning has become an ideal medium for language learning because of its flexibility and the autonomous learning opportunities it provides inside and outside the classroom. As a new application of the results of the relevant analysis, e-learning material for engineering graduate students is introduced in the rest of this paper.

2 Corpus Analysis

2.1 The method of analysis

The 40,000-word corpus used in this study consists of 263 abstracts from mechanical and electrical engineering journals. This corpus is taken from an originally compiled 1,120,000-word corpus of full papers from these journals. We use Wmatrix [9] for the abstract analysis, not only because this software is very easy to handle but also because it has a special function which can determine the characteristics of the corpus. Using Wmatrix, the corpus was automatically tagged both with part-of-speech tags by CLAWS7 [3] and with semantic tags by USAS (UCREL Semantic Analysis System [8]).
Moreover, Wmatrix provides frequency tables and log-likelihood tables for words and for these two kinds of tags. Log-likelihood is a measure of how strongly the frequency of an item differs between two corpora [2]. The information given by log-likelihood is therefore very important for ESP material development, because it captures the characteristics of the ESP corpus. The two corpora compared in this study are the abstract corpus and the BNC written corpus sampler, which is built into Wmatrix. In Table 1, the left word list is ordered by frequency and the right word list is ordered by log-likelihood. While almost all words in the left list are general, the words in the right list are specific to abstracts or engineering papers.

Table 1. Left: a word list ordered by the frequency. Right: a word list ordered by the log-likelihood.

                   |                     Abst.        BNC
rank  word  freq.  |  rank  word         freq.  rate  freq.  rate  log-likelihood
   1  the    2459  |     1  the           2459  8.38  37283  3.79  1158.11
   2  of     1205  |     2  of            1205  4.10  12817  1.30  1068.71
   3  and     945  |     3  flow           119  0.41     10  0.00   772.82
   4  a       683  |     4  model          103  0.35     20  0.00   621.26
   5  in      525  |     5  results         88  0.30     31  0.00   488.40
   6  is      522  |     6  energy          84  0.29     33  0.00   457.50
   7  to      521  |     7  presented       63  0.21      9  0.00   392.35
   8  for     396  |     8  method          59  0.20     16  0.00   340.94
   9  are     262  |     9  fuel            59  0.20     22  0.00   324.30
  10  with    260  |    10  paper           93  0.32    174  0.02   323.55
  11  this    213  |    11  power           71  0.24     67  0.01   315.47
  12  by      190  |    12  using           91  0.31    189  0.02   302.33
  13  that    183  |    13  analysis        50  0.17     10  0.00   300.55
  14  an      172  |    14  combustion      42  0.14      1  0.00   287.94
  15  on      156  |    15  by             190  0.65   1293  0.13   286.05
  16  be      149  |    16  performance     58  0.20     39  0.00   282.24
  17  at      128  |    17  based on        55  0.19     31  0.00   278.82
  18  from    127  |    18  experimental    43  0.15      4  0.00   277.34
  19  as      120  |    19  conditions      61  0.21     61  0.01   266.37
  20  flow    119  |    20  gas             70  0.24    106  0.01   265.30

2.2 Results of the analysis

The lists of part-of-speech tags and semantic tags are shown in Tables 2 and 3 respectively. Both lists show the tag name, its frequency and frequency rate in the abstract corpus, its frequency and frequency rate in the BNC written corpus sampler, and the log-likelihood. Both are sorted by log-likelihood.

Table 2. A part-of-speech tag list

POS    Abst.         BNC
tag    freq.   rate   freq.   rate  log-likelihood
NN1     7297  24.86  147395  15.22  1447.40
JJ      3481  11.86   74927   7.74   533.93
FO       222   0.76    2050   0.21   233.75
VVN     1205   4.10   24675   2.55   226.88
AT      2483   8.46   67521   6.97    84.13
VBZ      522   1.78   11171   1.15    82.10
IO      1204   4.10   30286   3.13    78.32
NN2     2064   7.03   55665   5.75    75.84
IF       398   1.36    8765   0.91    55.09
VVZ      350   1.19    7602   0.79    51.59
VBR      262   0.89    5435   0.56    46.88
...

Table 3. A semantic tag list

semantic  Abst.        BNC
tag       freq.  rate  freq.  rate  log-likelihood  meaning
X4.2        444  1.51   3108  0.32  640.06          Mental object: Means, method
O1.3        167  0.57    300  0.03  586.57          Substances and materials generally: Gas
O2          610  2.08   6100  0.63  577.74          Objects generally
O3          204  0.69    651  0.07  537.88          Electricity and electrical equipment
A1.5.1      308  1.05   1965  0.20  485.85          Using
N3.1        130  0.44    413  0.04  343.66          Measurement: General
O1          151  0.51    689  0.07  314.64          Substances and materials generally
M4          161  0.55    843  0.09  301.64          Shipping, swimming etc.
O4.6         78  0.27    110  0.01  301.46          Temperature
X2.4        252  0.86   2176  0.22  288.38          Investigate, examine, test, search
N2          143  0.49    760  0.08  264.68          Mathematics
...

Examining the results shown in the tables, the findings are as follows:

1. Semantic areas such as objects, mental objects (method and means), substances & materials (gas, solid and general), measurement (length & height, distance, size and volume), comparison, and evaluation occur much more frequently than in the BNC written sampler.
2. Parts of speech appearing more often are common nouns, the past participle, general adjectives, the definite article, 'of', 'for', 'is', and 'are'.
3. In the use of verbal forms, the frequency of past participles is significant, as was also found in the journal corpus [5], while the past tense of lexical verbs and infinitive forms occur much less often than in the BNC written sampler.
4. Multi-word expressions appearing more frequently are 'based on', 'due to', 'used to', 'such as', 'carried out', 'as well', 'in order to', 'in terms of', 'in addition' and 'according to'.

3 Material development

Through such analysis of corpus data, features of special importance to our students can be selected. Using automatic item generation allows learners to work with different authentic texts each time.
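To make this selection step concrete, the following is a minimal sketch of the log-likelihood comparison described in Section 2.1, written in the two-corpus form commonly used in corpus linguistics. It is not Wmatrix's own code. The word counts are copied from Table 1. The two corpus sizes are not stated in the tables and are assumptions, back-calculated roughly from the frequency and rate columns of Table 2, so the scores will only approximate the published values. All function and variable names are hypothetical.

```python
import math

# ASSUMPTION: token totals are not given in the tables; these are rough
# estimates derived from the NN1 row of Table 2 (frequency / rate).
ABSTRACT_TOKENS = 29_352
BNC_SAMPLER_TOKENS = 968_430


def log_likelihood(freq_a, freq_b, size_a, size_b):
    """Two-corpus log-likelihood for a single item.

    freq_a, freq_b: observed frequency of the item in corpus A and corpus B.
    size_a, size_b: total number of tokens in corpus A and corpus B.
    """
    combined = freq_a + freq_b
    expected_a = size_a * combined / (size_a + size_b)
    expected_b = size_b * combined / (size_a + size_b)
    total = 0.0
    # A zero observed frequency contributes nothing to the sum.
    if freq_a > 0:
        total += freq_a * math.log(freq_a / expected_a)
    if freq_b > 0:
        total += freq_b * math.log(freq_b / expected_b)
    return 2.0 * total


def rank_by_log_likelihood(counts):
    """Sort (word, freq_abstracts, freq_bnc) rows by descending log-likelihood."""
    scored = [
        (word, log_likelihood(fa, fb, ABSTRACT_TOKENS, BNC_SAMPLER_TOKENS))
        for word, fa, fb in counts
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Frequencies copied from Table 1 (abstract corpus, BNC written sampler).
table1_counts = [("by", 190, 1293), ("flow", 119, 10), ("model", 103, 20)]

for word, score in rank_by_log_likelihood(table1_counts):
    print(f"{word:<6} {score:8.2f}")
```

With these counts, 'flow' and 'model' rank above 'by' even though 'by' is more frequent in the abstracts, which is the contrast between the two halves of Table 1.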
Materials under development include the original abstracts with target items hyperlinked to various pages containing concordances, grammar explanations, a bilingual dictionary, etc. The outline of the material is as follows:

– An abstract is used as the base of this material, whose objective is to enhance abstract reading comprehension.
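As a rough illustration of how target items in an abstract could be hyperlinked, the sketch below wraps each occurrence of a target item in an HTML anchor. The paper does not describe its actual implementation, so the function name, the `/items/` URL scheme, and the sample sentence are all hypothetical.

```python
import html
import re
from urllib.parse import quote


def link_target_items(abstract, targets, base_url="/items/"):
    """Return the abstract as HTML with each target item wrapped in a link.

    base_url is a hypothetical path under which a page of exercises,
    concordances, etc. is assumed to exist for every target item.
    """
    if not targets:
        return html.escape(abstract)
    # Longest items first, so that 'based on' is preferred over 'based'.
    ordered = sorted(targets, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(item) for item in ordered) + r")\b",
        re.IGNORECASE,
    )
    parts = []
    last = 0
    for match in pattern.finditer(abstract):
        # Escape the untouched text between two matches.
        parts.append(html.escape(abstract[last:match.start()]))
        slug = quote(match.group(1).lower())
        label = html.escape(match.group(1))
        parts.append(f'<a href="{base_url}{slug}">{label}</a>')
        last = match.end()
    parts.append(html.escape(abstract[last:]))
    return "".join(parts)


# Hypothetical sample sentence; the target items are taken from Table 1.
print(link_target_items("The model is based on flow.", ["based on", "flow"]))
```

Sorting the items by length before building the pattern is a design choice that keeps multi-word items such as 'based on' intact when a shorter item overlaps with them.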
