
Computer applications of language technology



Getting Computers to Process Language II
Human Communication 1, Lecture 15
17/02/09, Susen Rabold

Computer applications of language technology (a)
• How can we apply models of the kind shown so far in automatically processing language?
• How is that related to current engineering practice?
• What can we learn from this about humans?

Computer applications of language technology (b)
Language-based computer applications are of growing importance both for
• improving the effective use of information
• broadening the base of computer literacy
Major problems will arise in exploiting background knowledge in the same way as humans do.

Database query (a)
• “Which road links Edinburgh to Penicuik?” might be represented as:
    x
    Edinburgh(y)
    Penicuik(z)
    road(x)
    Link(x, y, z)
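The representation above can be read as a set of conditions to check against a model. The following is a minimal sketch of that reading, assuming a tiny hand-made model: the road identifiers `r1` and `r2`, the `ROADS` and `LINKS` tables, the function name, and the decision to treat a link as symmetric are all illustrative assumptions, not part of the lecture.

```python
# Toy model: every name and fact here is made up for illustration.
ROADS = {"r1", "r2"}                       # entities satisfying road(x)
LINKS = {("r1", "Edinburgh", "Penicuik"),  # triples satisfying Link(x, y, z)
         ("r2", "Edinburgh", "Glasgow")}


def roads_linking(y, z):
    """Return every x such that road(x) and Link(x, y, z) hold in the model.

    Assumption: a link is symmetric, so Link(x, z, y) also counts.
    """
    return sorted(x for x in ROADS
                  if (x, y, z) in LINKS or (x, z, y) in LINKS)


# "Which road links Edinburgh to Penicuik?"
print(roads_linking("Edinburgh", "Penicuik"))
```

An empty result means the model contains no such x, which is one of the cases in which the system still has to work out an appropriate response.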

Database query (b)
• We can then investigate our model, which could be a database of UK roads, to see if there is such an x.
Possible problems:
• syntactic coverage
• semantic representations of e.g. plurals, times
• disambiguation
• working out what an appropriate response is.

Machine translation (a)
• One way of doing this is to have semantic rules which map into the same DRSs from different languages.
    N → Hund with symbol “dog”
    V0 → bellen with symbol “bark”
    Det → ein with the same semantic rule as for “a”
• So, ein Hund bellt will have the same semantic representation as “a dog barks”.

Machine translation (b)
• We can then define a routine to produce an English sentence on the basis of a DRS.
• Problems:
– determining a set of conditions to translate to
– different languages seem to carve up the space of words differently

Speech understanding and synthesis
• Add in rules for grouping sounds into words.
• Problems:
– phonetic ambiguity, which interacts multiplicatively with other forms of ambiguity
– modelling the complex effects of speech production
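The machine translation rules above (Hund, bellen, ein) can be sketched as two lexicons that share semantic symbols. This is a simplified illustration, not the lecture's grammar: the English lexicon, the use of the inflected forms "bellt" and "barks" as lexicon keys, the flat tuple encoding of a DRS, and the function names `to_drs` and `generate` are all assumptions, and only three-word "Det N V0" sentences are covered.

```python
# Lexicons map a word to (category, semantic symbol). The German entries
# follow the slide; the English entries and the DRS encoding are assumptions.
GERMAN = {"ein": ("Det", "a"), "Hund": ("N", "dog"), "bellt": ("V0", "bark")}
ENGLISH = {"a": ("Det", "a"), "dog": ("N", "dog"), "barks": ("V0", "bark")}


def to_drs(sentence, lexicon):
    """Map a 'Det N V0' sentence to a toy DRS: (referents, conditions)."""
    det, noun, verb = (lexicon[word] for word in sentence.split())
    if (det[0], noun[0], verb[0]) != ("Det", "N", "V0"):
        raise ValueError("only 'Det N V0' sentences are covered")
    # The indefinite determiner introduces one referent, x.
    return (("x",), (f"{noun[1]}(x)", f"{verb[1]}(x)"))


def generate(drs, lexicon):
    """Produce a sentence from a toy DRS by reversing the lexicon lookup."""
    by_symbol = {entry: word for word, entry in lexicon.items()}
    noun_sym, verb_sym = (cond.split("(")[0] for cond in drs[1])
    return " ".join([by_symbol[("Det", "a")],
                     by_symbol[("N", noun_sym)],
                     by_symbol[("V0", verb_sym)]])


drs = to_drs("ein Hund bellt", GERMAN)
print(drs == to_drs("a dog barks", ENGLISH))  # same representation?
print(generate(drs, ENGLISH))
```

Even in this toy form the listed problems are visible: the sketch fails on any word outside the lexicon (coverage) and assumes each word has exactly one symbol (no disambiguation).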

Limits (a)
• To date, computer applications in processing natural language work but . . .
• in limited domains: simplifying problems in interpretation, e.g. limiting ambiguity.
• with restrictions on the kind of language/speech used
– speech systems in which one must leave gaps between words
– what looks or sounds reasonable to you may be rejected by the system

Limits (b)
• Or to a limited extent:
– in document processing, one may try just to extract key information rather than understand the whole of a document.

An engineering solution? (a)
• Many approaches to “language engineering” adopt statistical methods. For example, to do machine translation:
• Get a bilingual collection of texts (e.g. the Canadian Hansard)
• Compute the frequency with which pairs of English and French words appear in similar positions in the text

An engineering solution? (b)
• To translate, examine the source text and find the target text that is the best fit. The system “learns” the correspondences between English and French words. Various techniques can be used to improve the quality of the output.
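The counting step described above can be sketched as follows. This is a deliberately naive illustration, assuming sentence-aligned text and treating "similar positions" as "the same word index"; the three sentence pairs are invented for the example and are not Hansard data, and the function names are hypothetical.

```python
from collections import Counter

# Made-up, sentence-aligned toy corpus (an assumption; not Hansard data).
PARALLEL = [
    ("the house", "la maison"),
    ("the car", "la voiture"),
    ("a house", "une maison"),
]


def position_counts(pairs):
    """Count how often each (English, French) word pair shares a position."""
    counts = Counter()
    for english, french in pairs:
        for e_word, f_word in zip(english.split(), french.split()):
            counts[(e_word, f_word)] += 1
    return counts


def best_translation(word, counts):
    """Pick the French word most often seen in the same position as `word`."""
    candidates = {f: n for (e, f), n in counts.items() if e == word}
    return max(candidates, key=candidates.get) if candidates else None


counts = position_counts(PARALLEL)
print(best_translation("house", counts))
```

Real systems cannot rely on matching word indices, since word order differs between languages; that is one place where the "various techniques" for improving output quality come in.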

Views on statistical methods (a)
• A statistical (or non-symbolic) approach means that we don’t have to characterize the different kinds of knowledge we’ve identified in humans. More and more sophisticated statistical techniques are being applied to problems in language processing.
• One drawback with the statistical method from the perspective of cognitive science is that once you’ve derived your set of statistics it’s difficult to extract general rules from them.

Views on statistical methods (b)
• “attendu correlates with expected with factor 85%”
• Machine translation uses statistical models
• http://babelfish.yahoo.com/?fr=bf-res
• The Hidden Markov Model, a statistical model used in NLP, can be considered the simplest dynamic Bayesian network.

Symbolic or non-symbolic? (a)
Possible responses:
• there are no general rules in this sense; everything is probabilistic
“Connectionism” and neural networks occupy this extreme, particularly if one disputes the claim that there are mental representations.

Symbolic or non-symbolic? (b)
• It’s all symbolic; what we see (or model) as statistical behaviour is the result of complex interactions between different sources of knowledge not yet understood
• Some aspects of the methodology of linguistics lead towards this position.
• or . . .

Symbolic or non-symbolic?
• It’s a mixture; some aspects of processing are statistically based, others symbolically. A wishy-washy view or a golden mean?
• It seems likely that computation at the level of neurons is non-discrete; neurons fire more rapidly as their inputs excite them more.
• On the other hand, aspects of linguistic processing seem more discrete: we either hear a sound as, say, a “b”, or we don’t.

Hybrid Systems (a)
• Using a combination of linguistic knowledge and statistics helps one to:
– acquire a statistical model with sparse training data (via more accurate smoothing)
– estimate which features will be most informative during the learning phase

Hybrid Systems (b)
From a cognitive science perspective, we have seen that we want these different levels of description. That is, we want both
• explicit rules that capture some aspects of humans’ knowledge of language, e.g. intuitions about meaning, and
• to be able to express information about the relative frequency with which people use certain words, or linguistic constructions (a grammar only says what’s possible, not what’s frequent).

Summary
Today we have seen:
• how to get computers to do part of the job of processing language
• difficulties that arise in this
• applications in language technology
• the debate between statistical and symbolic approaches.
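The hybrid idea under "Hybrid Systems" (a lexicon or grammar says what is possible, counts say what is frequent) can be sketched with add-one smoothing. This is a simple stand-in chosen for illustration, not necessarily the smoothing the lecture has in mind; the lexicon, the corpus, and the function name `smoothed_probs` are invented assumptions.

```python
from collections import Counter
from fractions import Fraction


def smoothed_probs(tokens, vocabulary, k=1):
    """Add-k estimate of P(word) over a fixed, linguistically given vocabulary."""
    # Count only tokens the lexicon knows about.
    counts = Counter(t for t in tokens if t in vocabulary)
    total = sum(counts.values()) + k * len(vocabulary)
    return {word: Fraction(counts[word] + k, total) for word in vocabulary}


# Hypothetical lexicon (symbolic knowledge) and corpus (statistics).
LEXICON = {"dog", "cat", "aardvark"}
CORPUS = "dog dog cat".split()

probs = smoothed_probs(CORPUS, LEXICON)
print(probs["aardvark"])  # non-zero although "aardvark" never occurs in CORPUS
```

The symbolic side contributes the set of possible words, so a word the corpus happens to miss still receives some probability; the statistical side ranks the possible words by how often people actually use them.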
