CSCI: 4500/6500 Programming Languages
Natural and Programming Languages
Syntactic Structures
Contributors: Portions of this lecture thanks to: Prof David Evans, U Virginia and Prof Spencer Rugaber, GTech
Maria Hybinette, UGA

Review Last Time: Programming Language History
• 50s, 60s: Exciting Time
  » Invention of: assemblers, compilers, interpreters, first high-level languages, structured programming, abstraction, formal syntax, object-oriented programming, LISP, program verification
• 70s, 80s, 90s: Boring Time
  » Refinement of earlier ideas, better implementations, making theory more practical
  » A few new/refined ideas: functional languages, data abstraction, concurrent languages, data flow, type theory, etc.
• 00+s: Party Time
  » A New Environment: Internet, large scale distributed computing, the grid, Java, C#, Maria at UGA
• Alan Kay: “The best way to predict the future is to invent it.”

This Week: Programming Language Implementation
• This week and next we will talk about the first two phases of compilation, namely:
  » Scanning and
  » Parsing.
• Today the basic concepts; next week we talk about parse trees & discuss other practicalities
[Figure: compiler phases. Source program; Scanner (Lexical Analyzer); lexical units, token stream; Parser (Syntax Analyzer); parse tree; Semantic Analyzer / Intermediate Code Generator; abstract syntax tree or other intermediate form; Optimizer (optional); Code Generator; machine language; Computer. A Symbol Table sits alongside the phases.]

Formal System & Language
Formal System:
• Set of symbols:
  » the primitives
• Set of rules for manipulating symbols
  » Rules of production
What is a Language (theoretically)?
• Formal System + (mapping between sequences of symbols and their meaning)

Linguist’s Language
• Description of pairs (S, M)
  » S is the “sound”, or any kind of surface form (pronunciation), and
  » M is the meaning.
• Language specifies properties of sound and meaning and how they relate (Aristotle characterized language as a system that links sound and meaning)
• Aristotle: 384-322 B.C. Greek philosopher, father of deductive logic, “Metaphysics”, “Physics”, teacher of Alexander the Great.

What are languages made of?
• Primitives
  » The smallest units of meaning, or the simplest ‘surface forms’
• Means of Combination (all languages have these)
  » Like Rules of Production for Formal Systems
  » Creates ‘new’ surface forms from the ones you have
• Means of Abstraction (all powerful languages have these)
  » Ways to use simple surface forms to represent more complicated ones
What is the longest word in the English language?
• Supercalifragilisticexpialidocious
  » Popularized by Mary Poppins
  » Oxford English Dictionary, 34 letters
  » Nonsense word meaning fantastic
• Pneumonoultramicroscopicsilicovolcanoconiosis
  » ‘a lung disease caused by the inhalation of very fine silica dust’, 45 letters (miner’s lungs).
  » 207,000+ letters: mitochondrial DNA
• Floccinaucinihilipilification
  » The estimation of something as worthless (usage dated since 1741): four ‘worthless’ words with a verb ending.
  » 29 letters, longest non-technical word according to the first edition of the Oxford English Dictionary (floccus - I don’t care, I don’t make wool; naucum - little value; nihilum - nothing; pilus - a hair, a bit or whit, something small and insignificant; facio, facere, feci, factus - make or do)

Creating longer words
• Floccinaucinihilipilification (previous slide)
  » The estimation of something as worthless, the act of estimating something as useless
• Anti-floccinaucinihilipilification
  » The estimation of something as not worthless
• Antifloccinaucinihilipilification-or
  » The one who does the act of not rendering useless
• Anti-antifloccinaucinihilipilification

Natural Languages
• Are there any non-recursive languages?
  » No, we would run out of things to say
• So, we only need to start with a few building blocks and from there we can create infinitely many things
  » This works for sentences, and on word parts too!
[Figure: “MU! MUU MU!”]

What are languages made of?
• Primitives
  » The smallest units of meaning, the “simplest” surface forms. Lexemes are the lowest level of meaning.
• Means of Combination (all languages have these)
  » Creates new surface forms from the ones you have
• Means of Abstraction (all powerful languages have these)
  » Ways to use simple surface forms to represent more complicated ones
  » Example: pronouns: “I” in English; Phom or Dichan, the polite ways of saying “I” in Thai, depending on gender (Dichan for females).

Primitives/Tokens
• Tokens: Described by regular expressions
  » The first phase of the compilation process converts strings/lexemes of the programming language to tokens (a representation of the lexeme in the computer)
    – Example: letter ( letter | digit )*
  » Can be generated from just three rules/operations:
    – Concatenation
    – Repetition (arbitrary number of times - Kleene closure)
    – Alternation (Choice from a finite set)
  » Corresponds to type-3 grammars in the Chomsky hierarchy, the most restrictive class: A -> a, and A -> aB or A -> Ba
• Many utilities exist that use regular expressions
  » grep (global regular expression print)
    – grep ^root /etc/passwd
  » Lex/flex turn regular expressions for tokens into a scanner, so they are scanner generators (next week)

Means of Combination
• Allow us to say infinitely many things with a finite set of primitives
• We can create sentences using primitives
  » But in English, “words” are not really the ‘primitives’, since we can create longer words
• How can we describe “means of combination” in the syntax of a language?
  » Computer Scientists:
    – Backus-Normal-Form -> Backus-Naur-Form (BNF)
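A minimal Python sketch of the Primitives/Tokens slide, assuming "letter" means an ASCII letter and "digit" means 0-9 (the slides do not say). It builds letter ( letter | digit )* from concatenation, alternation, and repetition, and imitates grep ^root on an in-memory list of lines. The names is_identifier, tokenize, and grep_lines, and the NUMBER token kind, are illustrative and not from the lecture.

```python
import re

# Assumption: ASCII letters and decimal digits only.
LETTER = "[A-Za-z]"
DIGIT = "[0-9]"

# letter ( letter | digit )*
#   concatenation: LETTER followed by the group
#   alternation:   LETTER | DIGIT inside the group
#   repetition:    * (Kleene closure) on the group
IDENTIFIER = re.compile(f"{LETTER}(?:{LETTER}|{DIGIT})*")


def is_identifier(lexeme):
    """Return True if the whole lexeme matches letter ( letter | digit )*."""
    return IDENTIFIER.fullmatch(lexeme) is not None


def tokenize(text):
    """Hypothetical mini-scanner: turn text into a list of (kind, lexeme) tokens."""
    token_spec = [
        ("IDENT", IDENTIFIER.pattern),
        ("NUMBER", f"{DIGIT}+"),
        ("SKIP", r"\s+"),  # whitespace separates tokens and is discarded
    ]
    master = re.compile("|".join(f"(?P<{kind}>{pat})" for kind, pat in token_spec))
    tokens = []
    pos = 0
    while pos < len(text):
        match = master.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def grep_lines(pattern, lines):
    """Keep the lines that contain a match, in the spirit of grep."""
    return [line for line in lines if re.search(pattern, line)]
```

For example, is_identifier("x1") is True and is_identifier("1x") is False, and grep_lines("^root", lines) keeps only the lines that start with root, as grep ^root /etc/passwd would.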
BNF Example
Sentence ::= Noun-Phrase Verb-Phrase
Noun-Phrase ::= Maria | Microsoft
Verb-Phrase ::= Rocks | Jumps
• What are the terminals?
  » Maria, Microsoft, Rocks, Jumps
• How many different things can we express with this language?
  » 4
  » … but only 1 is true

BNF Example
Sentence ::= Noun-Phrase Verb-Phrase Noun-Phrase
Noun-Phrase ::= Noun | Adjective Noun-Phrase
Noun ::= Maria | Microsoft | Home | Feet
Adjective ::= Yellow | Smelly
Verb-Phrase ::= Skips | Runs | Rocks
• Now we can express infinitely many things with this little language…

Definition of Languages
• Recognizers
  » Read an input string and accept it if the string is in the language, or reject it otherwise
  » Example: Parsers, the syntax analyzer of a compiler (yacc - yet another compiler compiler)
• Generators
  » Generate sentences of a language
  » Example: Grammars are language generators

BNF and Context Free Grammars
• Context Free Grammars
  » Developed by Noam Chomsky in the 1950s
  » Define a class of languages called context-free languages (type 2)
• Backus Naur Form (BNF)
  » A meta-language used to describe another language
  » Equivalent to context-free grammars

BNF Basics
A BNF grammar consists of four parts:
• Tokens: tokens of the language, the terminals
• Non-terminal symbols: BNF abstractions in <> brackets
• A start symbol
• Grammar: The set of productions or rules

BNF details
• The tokens are the smallest units of syntax
  » Strings of one or more characters of program text
  » They are atomic: not treated as being composed from smaller parts
• The non-terminal symbols stand for larger pieces of syntax
  » They are strings enclosed in angle brackets, as in <NP>
  » They are not strings that occur literally in program text
  » The grammar says how they can be expanded into strings of tokens
• The start symbol is the particular non-terminal that forms the root of any parse tree for the grammar
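A small Python sketch of the generator/recognizer distinction, using the first BNF example and the recursive Noun-Phrase rule from the second. The dictionary encoding and the names GRAMMAR, generate, terminals, and is_noun_phrase are assumptions made for illustration; they are not part of BNF or of the lecture. Note that generate only terminates for grammars without recursion, which is why the recursive rule is handled by a recognizer instead.

```python
from itertools import product

# First BNF example: each non-terminal maps to its list of productions.
GRAMMAR = {
    "Sentence": [["Noun-Phrase", "Verb-Phrase"]],
    "Noun-Phrase": [["Maria"], ["Microsoft"]],
    "Verb-Phrase": [["Rocks"], ["Jumps"]],
}


def terminals(grammar):
    """Terminals are the symbols that never appear on a left-hand side."""
    return {
        symbol
        for productions in grammar.values()
        for production in productions
        for symbol in production
        if symbol not in grammar
    }


def generate(symbol, grammar):
    """Generator: yield every token list derivable from symbol.

    Only safe for non-recursive grammars; a recursive rule would never finish.
    """
    if symbol not in grammar:  # a terminal expands to itself
        yield [symbol]
        return
    for production in grammar[symbol]:
        expansions = [list(generate(s, grammar)) for s in production]
        for combo in product(*expansions):
            yield [token for part in combo for token in part]


# Second example, recursive rule only:
#   Noun-Phrase ::= Noun | Adjective Noun-Phrase
NOUNS = {"Maria", "Microsoft", "Home", "Feet"}
ADJECTIVES = {"Yellow", "Smelly"}


def is_noun_phrase(tokens):
    """Recognizer: accept or reject a token list against the recursive rule."""
    if len(tokens) == 1:
        return tokens[0] in NOUNS
    return (
        len(tokens) > 1
        and tokens[0] in ADJECTIVES
        and is_noun_phrase(tokens[1:])
    )


sentences = [" ".join(tokens) for tokens in generate("Sentence", GRAMMAR)]
```

Here sentences holds the 4 sentences counted on the slide, and terminals(GRAMMAR) gives Maria, Microsoft, Rocks, Jumps. The recognizer accepts arbitrarily long inputs such as Smelly Yellow Feet, which is how a finite set of rules describes infinitely many phrases.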