The linguistic relevance of MCFLs

Greg Kobele, University of Chicago



  1. The linguistic relevance of MCFLs. Greg Kobele, University of Chicago. MCFG+ 2, Nara, Japan. Sections: Introduction; Natural language goes beyond CFLs; The MCS hypothesis; Challenging the MCS hypothesis; Conclusion.

  2. Outline: (1) Introduction; (2) Natural language goes beyond CFLs; (3) The MCS hypothesis; (4) Challenging the MCS hypothesis; (5) Conclusion.


  4. Introduction. Chomsky 1956. The slide reproduces the first page of: THREE MODELS FOR THE DESCRIPTION OF LANGUAGE*, Noam Chomsky, Department of Modern Languages and Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts.
Abstract. We investigate several conceptions of linguistic structure to determine whether or not they can provide simple and “revealing” grammars that generate all of the sentences of English and only these. We find that no finite-state Markov process that produces symbols with transition from state to state can serve as an English grammar. Furthermore, the particular subclass of such processes that produce n-order statistical approximations to English do not come closer, with increasing n, to matching the output of an English grammar. We formalize the notions of “phrase structure” and show that this gives us a method for describing language which is essentially more powerful, though still representable as a rather elementary type of finite-state process. Nevertheless, it is successful only when limited to a small subset of simple sentences. We study the formal properties of a set of grammatical transformations that carry sentences with phrase structure into new sentences with derived phrase structure, showing that transformational grammars are processes of the same elementary type as phrase-structure grammars; that the grammar of English is materially simplified if phrase structure description is limited to a kernel of simple sentences from which all other sentences are constructed by repeated transformations; and that this view of linguistic structure gives a certain insight into the use and understanding of language.
1. Introduction. There are two central problems in the descriptive study of language. One primary concern of the linguist is to discover simple and “revealing” grammars for natural languages. At the same time, by studying the properties of such successful grammars and clarifying the basic conceptions that underlie them, he hopes to arrive at a general theory of linguistic structure. We shall examine certain features of these related inquiries.
The grammar of a language can be viewed as a theory of the structure of this language. Any scientific theory is based on a certain finite set of observations and, by establishing general laws stated in terms of certain hypothetical constructs, it attempts to account for these observations, to show how they are interrelated, and to predict an indefinite number of new phenomena. A mathematical theory has the additional property that predictions follow rigorously from the body of theory. Similarly, a grammar is based on a finite number of observed sentences (the linguist’s corpus) and it “projects” this set to an infinite set of grammatical sentences by establishing general “laws” (grammatical rules) framed in terms of such hypothetical constructs as the particular phonemes, words, phrases, and so on, of the language under analysis. A properly formulated grammar should determine unambiguously the set of grammatical sentences.
General linguistic theory can be viewed as a metatheory which is concerned with the problem of how to choose such a grammar in the case of each particular language on the basis of a finite corpus of sentences. In particular, it will consider and attempt to explicate the relation between the set of grammatical sentences and the set of observed sentences. In other words, linguistic theory attempts to explain the ability of a speaker to produce and understand new sentences, and to reject as ungrammatical other new sequences, on the basis of his limited linguistic experience.
Suppose that for many languages there are certain clear cases of grammatical sentences and certain clear cases of ungrammatical sequences, e.g., (1) and (2), respectively, in English. (1) John ate a sandwich. (2) Sandwich a ate John. In this case, we can test the adequacy of a proposed linguistic theory by determining, for each language, whether or not the clear cases are handled properly by the grammars constructed in accordance with this theory. For example, if a large corpus of English does not happen to contain either (1) or (2), we ask whether the grammar that is determined for this corpus will project the corpus to include (1) and exclude (2). Even though such clear cases may provide only a weak test of adequacy for the grammar of a given language taken in isolation, they provide a very strong test for any general linguistic theory and for the set of grammars to which it leads, since we insist that in the case of each language the clear cases be handled properly in a fixed and predetermined manner. We can take certain steps towards the construction of an operational characterization of “grammatical sentence” that will provide us with the clear cases required to set the task of linguistics significantly.
*This work was supported in part by the Army (Signal Corps), the Air Force (Office of Scientific Research, Air Research and Development Command), and the Navy (Office of Naval Research), and in part by a grant from Eastman Kodak Company.

  5. Introduction. The ‘canonical’ datum of linguistics is of the form w ∈ L or w ∉ L. A theory of a language is a description of some L which correctly classifies these data. A theory is good if it concisely describes the data (that is, if the cost of encoding the actual data-cum-theory is low). Sometimes using a grammar that generates a different language can provide a shorter description than could any other. 1024, 1048576, 59049 ∈ L: ⟨1024, 1048576, 59049⟩? As the amount of data grows, so does the benefit of treating it as a projection of an infinite set.

  6. Introduction. The ‘canonical’ datum of linguistics is of the form w ∈ L or w ∉ L. A theory of a language is a description of some L which correctly classifies these data. A theory is good if it concisely describes the data (that is, if the cost of encoding the actual data-cum-theory is low). Sometimes using a grammar that generates a different language can provide a shorter description than could any other. 1024, 1048576, 59049 ∈ L: ⟨1024, 1048576, 59049⟩? Or ⟨f(x) = x^10, 2, 4, 3⟩? As the amount of data grows, so does the benefit of treating it as a projection of an infinite set.
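
The two candidate descriptions on this slide can be compared with a small sketch. It is only an illustration: the string encodings and the use of character count as a stand-in for description length are assumptions made here, not the measure used in the talk, and the names `decode`, `verbatim` and `rule_based` are hypothetical.

```python
# Illustrative sketch (assumption): character count as a crude proxy
# for the cost of encoding a description of the data.

data = [1024, 1048576, 59049]

# Description 1: list the data verbatim, as in <1024, 1048576, 59049>.
verbatim = ",".join(str(n) for n in data)

# Description 2: a rule f(x) = x^10 plus one seed per datum,
# as in <f(x) = x^10, 2, 4, 3>.
exponent = 10
seeds = [2, 4, 3]
rule_based = f"x^{exponent};" + ",".join(str(s) for s in seeds)


def decode(exponent, seeds):
    """Regenerate the data by applying the rule to each seed."""
    return [s ** exponent for s in seeds]


regenerated = decode(exponent, seeds)

print(regenerated == data)             # the rule reproduces the data exactly
print(len(verbatim), len(rule_based))  # compare the two description sizes
```

The rule itself describes an infinite set (all tenth powers), yet together with the seeds it gives the shorter description of the finite data, which is the point the slide makes.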

  7. Introduction. We are actually presented with data from different languages (w ∈ L_1, u ∉ L_2, v ∈ L_3, ...). We can ask: what kinds of properties do these L share? We can then factor out these commonalities from the description of the individual Ls, stating them just once. As the number of different languages we consider grows, so does the benefit of treating them as a projection of an infinite set. ⟨⟨1024, 1048576, 59049⟩, ⟨9, 81⟩, ⟨1, 2, 1⟩⟩

  8. Introduction. We are actually presented with data from different languages (w ∈ L_1, u ∉ L_2, v ∈ L_3, ...). We can ask: what kinds of properties do these L share? We can then factor out these commonalities from the description of the individual Ls, stating them just once. As the number of different languages we consider grows, so does the benefit of treating them as a projection of an infinite set. ⟨⟨1024, 1048576, 59049⟩, ⟨9, 81⟩, ⟨1, 2, 1⟩⟩ becomes ⟨⟨x^10, 2, 4, 3⟩, ⟨x^2, 3, 9⟩, ⟨x^1, 1, 2, 1⟩⟩

  9. Introduction. We are actually presented with data from different languages (w ∈ L_1, u ∉ L_2, v ∈ L_3, ...). We can ask: what kinds of properties do these L share? We can then factor out these commonalities from the description of the individual Ls, stating them just once. As the number of different languages we consider grows, so does the benefit of treating them as a projection of an infinite set. ⟨⟨1024, 1048576, 59049⟩, ⟨9, 81⟩, ⟨1, 2, 1⟩⟩ becomes ⟨⟨x^10, 2, 4, 3⟩, ⟨x^2, 3, 9⟩, ⟨x^1, 1, 2, 1⟩⟩, which becomes ⟨x^y, ⟨10, 2, 4, 3⟩, ⟨2, 3, 9⟩, ⟨1, 1, 2, 1⟩⟩
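
The factoring step on this slide can be sketched as follows. This is a hedged illustration, not the author's formalism: the function names `power` and `project` are hypothetical, and the reading of each inner tuple as "exponent first, then seeds" is an assumption inferred from the numbers on the slide.

```python
# Illustrative sketch (assumption): the shared schema x^y is stated once,
# and each "language" contributes only its parameters.

# The three data sets from the slide.
corpora = [[1024, 1048576, 59049], [9, 81], [1, 2, 1]]

# Per-language parameters, read as (y, seed, seed, ...).
params = [(10, 2, 4, 3), (2, 3, 9), (1, 1, 2, 1)]


def power(x, y):
    """The commonality factored out of all three descriptions: x^y."""
    return x ** y


def project(param):
    """Regenerate one data set from its exponent and seeds."""
    y, *seeds = param
    return [power(x, y) for x in seeds]


regenerated = [project(p) for p in params]
print(regenerated == corpora)  # the factored description reproduces every data set
```

Each additional data set now costs only an exponent and its seeds, since the schema is not restated per language.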
