
Plan for Lexical Analysis with JLex and One Pass Code Gen



  1. Plan for Lexical Analysis with JLex and One Pass Code Gen
       Overview of the MeggyJava Assignments
       PA2: Lexer/scanner in MJPA2.jar
       Expressing tokens with regular expressions
         – regular expression syntax for JLex
         – using JLex with JavaCup
       How do lexer generators work?
         – Convert regular expressions to NFA
         – Converting an NFA to DFA
         – Implementing the DFA
       PA2: Syntax-directed code generation (MJ.jar)
     CS453 Lecture Lexical Analysis with JLex

     Structure of the MeggyJava Compiler
       Analysis: character stream -> lexical analysis -> tokens ("words") -> syntactic analysis -> AST ("sentences") -> semantic analysis -> AST and symbol table
       Synthesis: AST and symbol table -> code gen -> Atmel assembly code
       PA1: Write test cases in MeggyJava, and AVR warmup
       PA2: MeggyJava scanner and setPixel
       PA3: add exps and control flow (AST)
       PA4: add methods (symbol table)
       PA5: add variables and objects
       PA6: add arrays and register allocation
     CS453 Lecture Introduction

     PA2 Scanner/Lexer
       Look at the assignment writeup and point out the tar ball.
       Look at the input files.
       Look at the output files.
       Look at MJPA2Driver.java.
       Look at mj.lex.
       Look at the Makefile.

     Specifying Tokens with JLex
       JLex example input file:
         package mjparser;
         import java_cup.runtime.Symbol;
         %%
         %line
         %char
         %cup
         %public
         %eofval{
           return new Symbol(sym.EOF, new TokenValue("EOF", yyline, yychar));
         %eofval}
         LETTER=[A-Za-z]
         DIGIT=[0-9]
         UNDERSCORE="_"
         LETT_DIG_UND={LETTER}|{DIGIT}|{UNDERSCORE}
         ID={LETTER}({LETT_DIG_UND})*
         ...
         %%
         "&&" {return new Symbol(sym.AND, new TokenValue(yytext(), yyline, yychar)); }
         "+" {return new Symbol(sym.PLUS, ...); }
         "if" {return new Symbol(sym.IF,...); }
         {ID} {return new Symbol(sym.ID, new ...
         {EOL} { /* reset yychar */ … }
         {WS} { /* ignore */ }
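
The macro definitions in the JLex example (LETTER, DIGIT, UNDERSCORE, LETT_DIG_UND, ID) compose into a single regular expression. As a rough illustration only, the same composition can be written with `java.util.regex`; this is not how JLex itself works internally, and the class and method names below are hypothetical.

```java
import java.util.regex.Pattern;

final class IdPattern {
    // Mirrors the JLex macros; each macro is plain string substitution.
    static final String LETTER = "[A-Za-z]";
    static final String DIGIT = "[0-9]";
    static final String UNDERSCORE = "_";
    static final String LETT_DIG_UND =
        "(?:" + LETTER + "|" + DIGIT + "|" + UNDERSCORE + ")";

    // ID={LETTER}({LETT_DIG_UND})*
    static final Pattern ID = Pattern.compile(LETTER + LETT_DIG_UND + "*");

    // True when the whole string is a single identifier lexeme.
    static boolean isId(String s) {
        return ID.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isId("x_1")); // true
        System.out.println(isId("_x"));  // false: must start with a letter
        System.out.println(isId("if"));  // true: keywords also match ID
    }
}
```

Note that "if" matches the ID pattern as well. JLex picks the longest match and, on a tie, the rule listed first, which is why the "if" rule appears before the {ID} rule in the specification.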

  2. Nondeterministic Finite Acceptor (NFA)
       Alphabet = { a }
       [Figure: NFA with states q0, q1, q2, q3; every transition is labeled a. On input aa there are two choices from q0.]
       First Choice [figure: first computation on input aa]

  3. First Choice
       All input is consumed: "accept"
       [Figure: same NFA, first computation on input aa]
     Second Choice
       [Figure: same NFA, second computation on input aa]

  4. Second Choice
       Input cannot be consumed.
       No transition: the automaton hangs.
       Should we reject aa?
       [Figure: same NFA, second computation on input aa]

     When To Accept a String
       An NFA accepts a string when there is a computation of the NFA that accepts the string AND all the input is consumed and the automaton is in a final state.

     Example
       aa is accepted by the NFA: "accept", because this computation accepts aa.
       "reject??": but this only tells us that choice didn't work.
       [Figure: same NFA with both computations on input aa]

  5. Rejection example
       First Choice: "reject??"
       [Figure: same NFA, first computation on input a]
     Second Choice
       [Figure: same NFA, second computation on input a]

  6. Second Choice
       "reject??"
       [Figure: same NFA, second computation on input a]

     An NFA rejects a string when there is NO computation of the NFA that accepts the string:
       • All the input is consumed and the automaton is in a non-final state
       OR
       • The input cannot be consumed

     Example
       a is rejected by the NFA: both choices end in "reject??".
       All possible computations lead to rejection.
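
The accept and reject rules above can be checked mechanically by tracking every state the NFA could be in after each input character. The following is a minimal sketch, assuming the transitions q0 -a-> q1, q0 -a-> q3, q1 -a-> q2 with q2 as the only final state; that structure is read off the slide diagrams, which did not survive in the transcript, and the class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class NfaSimulation {
    // Assumed transitions on 'a': state -> set of possible next states.
    private static final Map<Integer, Set<Integer>> ON_A = Map.of(
        0, Set.of(1, 3),
        1, Set.of(2));
    // Assumed final state: q2.
    private static final Set<Integer> FINAL_STATES = Set.of(2);

    // Accepts if at least one computation consumes all input and ends in a final state.
    static boolean accepts(String input) {
        Set<Integer> current = new HashSet<>(Set.of(0));
        for (char c : input.toCharArray()) {
            Set<Integer> next = new HashSet<>();
            if (c == 'a') {
                for (int state : current) {
                    next.addAll(ON_A.getOrDefault(state, Set.of()));
                }
            }
            current = next; // an empty set means every computation has hung
        }
        for (int state : current) {
            if (FINAL_STATES.contains(state)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(accepts("aa"));  // true
        System.out.println(accepts("a"));   // false
        System.out.println(accepts("aaa")); // false
    }
}
```

Tracking a set of states instead of guessing one choice at a time is also the idea behind converting an NFA to a DFA: each reachable set of NFA states becomes one DFA state.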

  7. Language accepted: L = { aa }
       [Figure: NFA with states q0, q1, q2, q3; every transition is labeled a]

     Specifying Tokens with JLex
       JLex example input file:
         package mjparser;
         import java_cup.runtime.Symbol;
         %%
         %line
         %char
         %cup
         %public
         %eofval{
           return new Symbol(sym.EOF, new TokenValue("EOF", yyline, yychar));
         %eofval}
         LETTER=[A-Za-z]
         DIGIT=[0-9]
         UNDERSCORE="_"
         EOL=(\n|\r|\r\n)
         LETT_DIG_UND={LETTER}|{DIGIT}|{UNDERSCORE}
         ID={LETTER}({LETT_DIG_UND})*
         %%
         "&&" {return new Symbol(sym.AND, new TokenValue(yytext(), yyline, yychar)); }
         "+" {return new Symbol(sym.PLUS, ...); }
         "if" {return new Symbol(sym.IF,...); }
         {ID} {return new Symbol(sym.ID, new ...
         {EOL} { /* reset yychar */ … }
         {WS} { /* ignore */ }

     Example NFA for Multiple Tokens
     DFA from IF and ID NFAs (Do in class)
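
For the IF and ID exercise, one possible combined DFA can be written directly as a transition function. This is a sketch under assumptions, not the diagram from the lecture: the state numbering is invented here, ID follows the LETTER/LETT_DIG_UND macros from the JLex example, and the class and method names are hypothetical.

```java
final class IfIdDfa {
    // States: 0 = start, 1 = read "i", 2 = read "if",
    //         3 = any other identifier, -1 = dead state.
    static int step(int state, char c) {
        boolean letter = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
        boolean idChar = letter || (c >= '0' && c <= '9') || c == '_';
        switch (state) {
            case 0: return c == 'i' ? 1 : (letter ? 3 : -1);
            case 1: return c == 'f' ? 2 : (idChar ? 3 : -1);
            case 2:
            case 3: return idChar ? 3 : -1;
            default: return -1; // once dead, always dead
        }
    }

    // Runs the DFA over a whole lexeme and names the token of the final state.
    static String classify(String lexeme) {
        int state = 0;
        for (char c : lexeme.toCharArray()) {
            state = step(state, c);
        }
        switch (state) {
            case 2: return "IF";
            case 1:
            case 3: return "ID";
            default: return "ERROR";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("if"));   // IF
        System.out.println(classify("iffy")); // ID
        System.out.println(classify("9a"));   // ERROR
    }
}
```

State 2 is accepting for both the IF and the ID NFA; labeling it IF reflects the rule order in the JLex specification, where "if" is listed before {ID}.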

  8. DFA from IF and ID NFAs (Answer)

     Implementing DFAs?

     PA2 Syntax Directed Code Generation
       Look at the assignment writeup and point out usage of MJ.jar.
       Input files are MeggyJava files that fit the PA2 grammar.
       Look at current output file. Will be a .s file that can go through the simulator.
       Look at MJDriver.java.
       Look at mj.cup.
       Look at the Makefile.

     Recall Doing Syntax-Directed Interpretation
       Grammar
         (1) exp --> exp * exp
         (2) exp --> exp + exp
         (3) exp --> NUM
       String
         42 + 7 * 6
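
To make syntax-directed interpretation concrete, the sketch below evaluates the example string by returning a value from the code that recognizes each production. It is a hand-written recursive-descent stand-in, not the JavaCup parser used in the course, and it assumes the usual convention that * binds tighter than + (the grammar as written is ambiguous). Tokens are assumed to be separated by whitespace, and all names are hypothetical.

```java
final class ExpInterpreter {
    private final String[] tokens;
    private int pos = 0;

    private ExpInterpreter(String input) {
        this.tokens = input.trim().split("\\s+");
    }

    // Evaluates a whitespace-separated expression over NUM, + and *.
    static int evaluate(String input) {
        ExpInterpreter p = new ExpInterpreter(input);
        int value = p.sum();
        if (p.pos != p.tokens.length) {
            throw new IllegalArgumentException("unexpected token: " + p.tokens[p.pos]);
        }
        return value;
    }

    // exp --> exp + exp : the value of the left side is the sum of the operands.
    private int sum() {
        int value = product();
        while (pos < tokens.length && tokens[pos].equals("+")) {
            pos++;
            value = value + product();
        }
        return value;
    }

    // exp --> exp * exp : the value of the left side is the product of the operands.
    private int product() {
        int value = num();
        while (pos < tokens.length && tokens[pos].equals("*")) {
            pos++;
            value = value * num();
        }
        return value;
    }

    // exp --> NUM : the value is the number itself.
    private int num() {
        if (pos >= tokens.length) {
            throw new IllegalArgumentException("expected a number");
        }
        return Integer.parseInt(tokens[pos++]);
    }

    public static void main(String[] args) {
        System.out.println(evaluate("42 + 7 * 6")); // 84
    }
}
```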

  9. Semantic Rules for Expression Example

     Code Generation versus Interpretation
       When interpreting . . .
         – Each action in the .cup file associates a value with the nonterminal on the left-hand side of the production.
         – Each nonterminal on the right-hand side has a value associated with it.
         – This approach will also be useful when we are building the Abstract Syntax Tree (AST) in PA3.
       When doing one pass compilation . . .
         – Actions output the target code (in this case AVR assembly)

     Parse Tree for An Empty MeggyJava Program
       [Figure: parse tree]
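
The contrast with one-pass compilation can be sketched for the grammar exp --> exp * exp | exp + exp | NUM: each action appends target code instead of computing a value. This is only an illustration under stated assumptions. It is a recursive-descent stand-in for the JavaCup actions, it assumes * binds tighter than +, and the register choices (r24, r18), the stack discipline, and the 8-bit arithmetic are invented here and are not the output of MJ.jar. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class OnePassCodeGen {
    private final String[] tokens;
    private int pos = 0;
    private final List<String> out = new ArrayList<>();

    private OnePassCodeGen(String input) {
        this.tokens = input.trim().split("\\s+");
    }

    // Returns AVR-style assembly lines that leave the 8-bit result on the stack.
    static List<String> generate(String input) {
        OnePassCodeGen g = new OnePassCodeGen(input);
        g.sum();
        return g.out;
    }

    // exp --> exp + exp : emit code for both operands, then the add.
    private void sum() {
        product();
        while (pos < tokens.length && tokens[pos].equals("+")) {
            pos++;
            product();
            emitBinary("add r24, r18");
        }
    }

    // exp --> exp * exp : on AVR, mul leaves its product in r1:r0.
    private void product() {
        num();
        while (pos < tokens.length && tokens[pos].equals("*")) {
            pos++;
            num();
            emitBinary("mul r24, r18", "mov r24, r0");
        }
    }

    // exp --> NUM : load the constant and push it.
    private void num() {
        int n = Integer.parseInt(tokens[pos++]);
        out.add("ldi r24, " + n);
        out.add("push r24");
    }

    // Pops the right then the left operand, applies ops, pushes the result.
    private void emitBinary(String... ops) {
        out.add("pop r18");
        out.add("pop r24");
        for (String op : ops) {
            out.add(op);
        }
        out.add("push r24");
    }

    public static void main(String[] args) {
        generate("42 + 7 * 6").forEach(System.out::println);
    }
}
```

Nothing is evaluated at compile time here: the generator never computes 84, it only emits the instructions that would compute it when the program runs.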
