Compilation 2014
Warm-up project
Aslan Askarov, aslan@cs.au.dk
Revised from slides by E. Ernst

Straight-line Programming Language
• Toy programming language: no branching, no loops
• Skip lexing and parsing issues
• Focus on the "meaning": interpretation
• Syntax:
    Stm     → Stm ; Stm            (CompoundStm)
    Stm     → id := Exp            (AssignStm)
    Stm     → print ( ExpList )    (PrintStm)
    Exp     → id                   (IdExp)
    Exp     → num                  (NumExp)
    Exp     → Exp Binop Exp        (OpExp)
    Exp     → ( Stm , Exp )        (EseqExp)
    ExpList → Exp , ExpList        (PairExpList)
    ExpList → Exp                  (LastExpList)
    Binop   → +                    (Plus)
    Binop   → -                    (Minus)
    Binop   → ×                    (Times)
    Binop   → /                    (Div)

Straight-line program
• Source:
    a := 5 + 3;
    b := (print (a, a - 1), 10 * a);
    print (b)
• Corresponding syntax tree:
    CompoundStm
      AssignStm
        a
        OpExp
          NumExp 5
          BinOp Plus
          NumExp 3
      CompoundStm
        AssignStm
          b
          EseqExp
            PrintStm
              PairExpList
                IdExp a
                LastExpList
                  OpExp
                    IdExp a
                    BinOp Minus
                    NumExp 1
            OpExp
              NumExp 10
              BinOp Times
              IdExp a
        PrintStm
          LastExpList
            IdExp b

SLP syntax representation datatype
• Grammar:
    Stm     → Stm ; Stm            (CompoundStm)
    Stm     → id := Exp            (AssignStm)
    Stm     → print ( ExpList )    (PrintStm)
    Exp     → id                   (IdExp)
    Exp     → num                  (NumExp)
    Exp     → Exp Binop Exp        (OpExp)
    Exp     → ( Stm , Exp )        (EseqExp)
    ExpList → Exp , ExpList        (PairExpList)
    ExpList → Exp                  (LastExpList)
    Binop   → +                    (Plus)
    Binop   → -                    (Minus)
    Binop   → ×                    (Times)
    Binop   → /                    (Div)
• SML declaration:
    type id = string

    datatype binop = Plus | Minus | Times | Div

    datatype stm = CompoundStm of stm * stm
                 | AssignStm of id * exp
                 | PrintStm of exp list
         and exp = IdExp of id
                 | NumExp of int
                 | OpExp of exp * binop * exp
                 | EseqExp of stm * exp
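To see the datatype in use, here is a minimal sketch of a printer that turns a syntax tree back into source text. The function names opStr, showStm and showExp are illustrative and not part of the slides or the book. Nested operator expressions are printed without extra parentheses, so grouping is not preserved in general.

```sml
type id = string

datatype binop = Plus | Minus | Times | Div

datatype stm = CompoundStm of stm * stm
             | AssignStm of id * exp
             | PrintStm of exp list
     and exp = IdExp of id
             | NumExp of int
             | OpExp of exp * binop * exp
             | EseqExp of stm * exp

(* Concrete symbol for each operator (illustrative helper). *)
fun opStr Plus = "+"
  | opStr Minus = "-"
  | opStr Times = "*"
  | opStr Div = "/"

(* One clause per constructor, mirroring the grammar productions. *)
fun showStm (CompoundStm (s1, s2)) = showStm s1 ^ "; " ^ showStm s2
  | showStm (AssignStm (x, e)) = x ^ " := " ^ showExp e
  | showStm (PrintStm es) =
      "print (" ^ String.concatWith ", " (map showExp es) ^ ")"
and showExp (IdExp x) = x
  | showExp (NumExp n) = Int.toString n
  | showExp (OpExp (l, b, r)) = showExp l ^ " " ^ opStr b ^ " " ^ showExp r
  | showExp (EseqExp (s, e)) = "(" ^ showStm s ^ ", " ^ showExp e ^ ")"
```

For example, showStm (AssignStm ("a", OpExp (NumExp 5, Plus, NumExp 3))) yields the string "a := 5 + 3".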

SLP syntax representation
• Source program:
    a := 5 + 3;
    b := (print (a, a - 1), 10 * a);
    print (b)
• SML value:
    val prog =
      CompoundStm (
        AssignStm ("a", OpExp (NumExp 5, Plus, NumExp 3)),
        CompoundStm (
          AssignStm ("b", EseqExp (
            PrintStm [IdExp "a", OpExp (…)],
            OpExp (NumExp 10, …))),
          PrintStm [IdExp "b"]))
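Interpretation of such a value can be sketched as a pair of mutually recursive functions over stm and exp. This is one possible shape under stated assumptions, not the reference solution: the environment is an association list, printed lines are collected in a list instead of being written to standard output, and the names env, lookup, applyOp, interpStm and interpExp are illustrative. The elided parts of the value are written out here from the source program text.

```sml
type id = string
datatype binop = Plus | Minus | Times | Div
datatype stm = CompoundStm of stm * stm
             | AssignStm of id * exp
             | PrintStm of exp list
     and exp = IdExp of id
             | NumExp of int
             | OpExp of exp * binop * exp
             | EseqExp of stm * exp

(* Assumption: environment as association list, newest binding first. *)
type env = (id * int) list
exception Unbound of id

fun lookup ([] : env, x) = raise Unbound x
  | lookup ((y, v) :: rest, x) = if x = y then v else lookup (rest, x)

fun applyOp (Plus, a, b) = a + b
  | applyOp (Minus, a, b) = a - b
  | applyOp (Times, a, b) = a * b
  | applyOp (Div, a, b) = a div b

(* Statements return the new environment and the output lines so far. *)
fun interpStm (CompoundStm (s1, s2), env, out) =
      let val (env1, out1) = interpStm (s1, env, out)
      in interpStm (s2, env1, out1) end
  | interpStm (AssignStm (x, e), env, out) =
      let val (v, env1, out1) = interpExp (e, env, out)
      in ((x, v) :: env1, out1) end
  | interpStm (PrintStm es, env, out) =
      let
        (* Evaluate the arguments left to right, threading env and out. *)
        fun go ([], env, out, vs) = (env, out, rev vs)
          | go (e :: rest, env, out, vs) =
              let val (v, env1, out1) = interpExp (e, env, out)
              in go (rest, env1, out1, v :: vs) end
        val (env1, out1, vs) = go (es, env, out, [])
        val line = String.concatWith " " (map Int.toString vs)
      in (env1, out1 @ [line]) end
(* Expressions return a value too, since EseqExp may assign or print. *)
and interpExp (IdExp x, env, out) = (lookup (env, x), env, out)
  | interpExp (NumExp n, env, out) = (n, env, out)
  | interpExp (OpExp (e1, oper, e2), env, out) =
      let
        val (v1, env1, out1) = interpExp (e1, env, out)
        val (v2, env2, out2) = interpExp (e2, env1, out1)
      in (applyOp (oper, v1, v2), env2, out2) end
  | interpExp (EseqExp (s, e), env, out) =
      let val (env1, out1) = interpStm (s, env, out)
      in interpExp (e, env1, out1) end

val prog =
  CompoundStm (
    AssignStm ("a", OpExp (NumExp 5, Plus, NumExp 3)),
    CompoundStm (
      AssignStm ("b", EseqExp (
        PrintStm [IdExp "a", OpExp (IdExp "a", Minus, NumExp 1)],
        OpExp (NumExp 10, Times, IdExp "a"))),
      PrintStm [IdExp "b"]))

(* output is ["8 7", "80"] *)
val (_, output) = interpStm (prog, [], [])
```

Threading the environment through expressions as well as statements is what makes EseqExp work: the statement inside it can change variables that the following expression reads.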

Project assignment
• Follow the descriptions on pp. 10-12 in MCIML
• "Modularity principles", pp. 9-10: discussed on Friday, may be ignored at first

Lexical analysis

Lexical analysis
• Compiler pipeline: high-level source code → Lexing → Parsing → Elaboration → … → low-level target code

Lexical analysis
• First phase in the compilation
• Input: stream of characters
    i f ( x > 0 ) \n \t t h e n 1 \n \t e l s e 0
• Output: stream of tokens in our language
    IF LPAREN ID ("x") GT INT (0) RPAREN THEN INT (1) ELSE INT (0)
• Discards comments, whitespace, newline, tab characters, preprocessor directives

Tokens
    Type      Examples
    ID        foo   n14   a'   my-fun
    INT       73   0   070
    REAL      0.0   .5   10.
    IF        if
    COMMA     ,
    LPAREN    (
    ASGMT     :=

Non-tokens
    Type                       Examples
    comments                   /* dead code */   // comment   (* nest (*ed*) *)
    preprocessor directives    #define N 10   #include <stdio.h>
    whitespace

Token data structure
• Many tokens need no associated data, e.g.: IF, COMMA, LPAREN, RPAREN, ASGMT
• Some tokens carry an associated string: ID ("my-fun")
• Some tokens carry associated data of other types: INT (73), INT (1), FLOAT (IEEE754, 1001111100…)
• Tokens may include useful additional information: start/end position in the input file (line number + column, or character position)
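One way to represent such tokens in SML is a datatype whose constructors all carry a start and end position, with an extra payload where needed. This is a sketch under assumptions: positions are character offsets, the real-valued token is called REAL as in the token table, and the helper names show and startPos are illustrative.

```sml
(* Assumption: a position is a 0-based character offset into the input. *)
type pos = int

datatype token
  = IF of pos * pos
  | COMMA of pos * pos
  | LPAREN of pos * pos
  | RPAREN of pos * pos
  | ASGMT of pos * pos
  | ID of string * pos * pos
  | INT of int * pos * pos
  | REAL of real * pos * pos

(* Render a token without its positions, e.g. for debugging output. *)
fun show (IF _) = "IF"
  | show (COMMA _) = "COMMA"
  | show (LPAREN _) = "LPAREN"
  | show (RPAREN _) = "RPAREN"
  | show (ASGMT _) = "ASGMT"
  | show (ID (s, _, _)) = "ID(\"" ^ s ^ "\")"
  | show (INT (n, _, _)) = "INT(" ^ Int.toString n ^ ")"
  | show (REAL (r, _, _)) = "REAL(" ^ Real.toString r ^ ")"

(* Start position of any token, e.g. for error messages. *)
fun startPos (IF (l, _)) = l
  | startPos (COMMA (l, _)) = l
  | startPos (LPAREN (l, _)) = l
  | startPos (RPAREN (l, _)) = l
  | startPos (ASGMT (l, _)) = l
  | startPos (ID (_, l, _)) = l
  | startPos (INT (_, l, _)) = l
  | startPos (REAL (_, l, _)) = l
```

Keeping the positions inside every constructor lets later phases report errors at the right place without consulting the lexer again.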

Q/A
• Consider the source program: var δ := 0.0
• Language: case sensitive, ASCII
• How to report the error of using δ?
    FileName:Line.Col: Illegal character δ
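If the lexer only tracks a character position, the Line.Col part of such a message has to be computed from it. A minimal sketch, assuming 0-based character offsets and 1-based lines and columns; the names lineCol and report are illustrative, and the file name and the ASCII stand-in character ? in the usage note are made up for the example.

```sml
(* 1-based (line, column) of the 0-based character offset pos in src. *)
fun lineCol (src : string, pos : int) : int * int =
  let
    fun go (i, line, col) =
      if i >= pos orelse i >= size src then (line, col)
      else if String.sub (src, i) = #"\n" then go (i + 1, line + 1, 1)
      else go (i + 1, line, col + 1)
  in go (0, 1, 1) end

(* Format a message in the FileName:Line.Col: Message style. *)
fun report (file, src, pos, msg) =
  let val (l, c) = lineCol (src, pos)
  in file ^ ":" ^ Int.toString l ^ "." ^ Int.toString c ^ ": " ^ msg end
```

For the two-line input "var x := 0\nvar ? := 0.0" the offending character sits at offset 15, so report ("test.tig", "var x := 0\nvar ? := 0.0", 15, "Illegal character") gives "test.tig:2.5: Illegal character". Rescanning from the start on every error is simple but slow; a lexer that reports many errors would more likely record the offsets of line starts as it goes.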

Regular expressions
• We can use regular expressions to specify programming language tokens
• Regular expressions:
    • Expected to be well-known
    • Syntax:
        • symbol    a
        • choice    x | y
        • concat    x y
        • empty     ε
        • repeat    x*
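The five syntax forms map directly onto an SML datatype, and a small backtracking matcher shows what each form means. This is a sketch for intuition only: it is not how a lexer generator works (those go through NFA/DFA), and the names regex, matches and accepts are illustrative.

```sml
datatype regex
  = Sym of char            (* symbol a *)
  | Alt of regex * regex   (* choice x | y *)
  | Seq of regex * regex   (* concat x y *)
  | Eps                    (* empty *)
  | Star of regex          (* repeat: zero or more x *)

(* matches r cs k: can r match a prefix of cs so that the
   continuation k accepts the remaining characters? *)
fun matches (Sym c) (x :: xs) k = x = c andalso k xs
  | matches (Sym _) [] _ = false
  | matches (Alt (r1, r2)) cs k = matches r1 cs k orelse matches r2 cs k
  | matches (Seq (r1, r2)) cs k = matches r1 cs (fn rest => matches r2 rest k)
  | matches Eps cs k = k cs
  | matches (Star r) cs k =
      k cs orelse
      (* Only iterate again if r consumed input; this guarantees termination. *)
      matches r cs (fn rest =>
        length rest < length cs andalso matches (Star r) rest k)

(* A string is accepted if the whole input is consumed. *)
fun accepts r s = matches r (String.explode s) null
```

For example, with val r = Seq (Sym #"a", Star (Alt (Sym #"a", Sym #"0"))), accepts r "a00a" is true and accepts r "0a" is false.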

Regular expressions used for scanning
• Examples:
    if                                      (IF);
    [a-z][a-z0-9]*                          (ID);
    [0-9]+                                  (NUM);
    ([0-9]+"."[0-9]*)|([0-9]*"."[0-9]+)     (REAL);
    ("--"[a-z]*"\n")|(" "|"\n"|"\t")+       (continue());
    .                                       (error (); continue());

Resolving ambiguities
• Rule: when a string can match multiple tokens, the longest matching token wins
    Rules: if (IF);   [a-z][a-z0-9]* (ID);
    Input: i f x > 0
    First token: ID ("ifx"), not IF followed by ID ("x")
• We also need to specify priorities when several tokens match with the same length
• Usual rule: earliest declaration wins
    Input: i f
    Candidates: ID ("if") and IF; IF wins because its rule is declared first
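Both rules can be sketched in a few lines: every rule reports the length of its longest match, and a candidate is replaced only by a strictly longer one, so on a tie the earlier rule stays. The hand-written matchers below stand in for the regular expressions; all names (rule, keyword, ident, number, best, nextToken) are illustrative.

```sml
(* A rule maps the remaining input to the length of its longest match. *)
type rule = string * (char list -> int option)

(* Number of leading characters satisfying p. *)
fun prefixLen p cs =
  let fun go ([], n) = n
        | go (c :: rest, n) = if p c then go (rest, n + 1) else n
  in go (cs, 0) end

fun isPrefix ([], _) = true
  | isPrefix (_, []) = false
  | isPrefix (k :: ks, c :: cs) = k = c andalso isPrefix (ks, cs)

(* Matches exactly the keyword kw. *)
fun keyword kw cs =
  if isPrefix (String.explode kw, cs) then SOME (size kw) else NONE

(* [a-z][a-z0-9]* *)
fun ident [] = NONE
  | ident (c :: cs) =
      if Char.isLower c
      then SOME (1 + prefixLen (fn d => Char.isLower d orelse Char.isDigit d) cs)
      else NONE

(* [0-9]+ *)
fun number cs =
  case prefixLen Char.isDigit cs of 0 => NONE | n => SOME n

(* Declaration order is the priority order. *)
val rules : rule list = [("IF", keyword "if"), ("ID", ident), ("NUM", number)]

(* Longest match wins; a tie keeps the earlier rule. *)
fun best cs =
  List.foldl
    (fn ((name, m), acc) =>
        case (m cs, acc) of
            (NONE, _) => acc
          | (SOME n, NONE) => SOME (name, n)
          | (SOME n, SOME (_, n0)) => if n > n0 then SOME (name, n) else acc)
    NONE rules

(* First token of s as (rule name, matched text), if any rule matches. *)
fun nextToken s =
  case best (String.explode s) of
      NONE => NONE
    | SOME (name, n) => SOME (name, String.substring (s, 0, n))
```

With these rules, nextToken "ifx > 0" is SOME ("ID", "ifx") and nextToken "if x" is SOME ("IF", "if").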

Lexical analysis
• Specification: tokens as regular expressions + longest-matching rule + priorities
• Formalism: NFA, DFA
• Implementation: simulate the NFA or simulate the DFA (linear complexity)
• Output: program that translates raw text into a stream of tokens
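Simulating a DFA takes one transition per input character, which is where the linear complexity comes from. The sketch below hand-codes a small DFA for the NUM and REAL tokens only. The state numbers are chosen for this example and do not correspond to any figure in the slides; the names class, classify, delta, accepting and run are illustrative.

```sml
(* Character classes that matter for NUM and REAL. *)
datatype class = Digit | Dot | Other

fun classify c =
  if Char.isDigit c then Digit
  else if c = #"." then Dot
  else Other

(* Transition function; NONE means the dead state.
   0: start
   1: one or more digits                      (accepts NUM)
   2: digits and a dot, in a valid REAL form  (accepts REAL)
   3: a leading dot with no digit yet *)
fun delta (0, Digit) = SOME 1
  | delta (0, Dot) = SOME 3
  | delta (1, Digit) = SOME 1
  | delta (1, Dot) = SOME 2
  | delta (2, Digit) = SOME 2
  | delta (3, Digit) = SOME 2
  | delta _ = NONE

fun accepting 1 = SOME "NUM"
  | accepting 2 = SOME "REAL"
  | accepting _ = NONE

(* Run the DFA over the whole string: one transition per character. *)
fun run s =
  let
    fun go (state, []) = accepting state
      | go (state, c :: cs) =
          (case delta (state, classify c) of
               NONE => NONE
             | SOME next => go (next, cs))
  in go (0, String.explode s) end
```

Here run "070" is SOME "NUM", run "10." and run ".5" are SOME "REAL", and run "." is NONE. A real scanner would additionally remember the last accepting state seen, so that it can return the longest match and continue from there.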

Total NFA for ID, IF, NUM, REAL
• [Figure: combined automaton with states 1-13; accepting states labelled ID, IF, NUM, REAL, whitespace, and error; edges labelled with character classes such as a-z, 0-9, ".", "-", "\n", and blank]

ML-Lex
• Lexer generator, "built-in" part of SML/NJ
• Accepts a lexical specification, produces a scanner
• Example specification:
    (* SML declarations *)
    type lexresult = Tokens.token
    fun eof() = Tokens.EOF(0,0)
    %%
    (* Lex definitions *)
    digits=[0-9]+
    %%
    (* Regular Expressions and Actions *)
    if                    => (Tokens.IF(yypos,yypos+2));
    [a-z][a-z0-9]*        => (Tokens.ID(yytext,yypos,yypos + size yytext));
    {digits}              => (Tokens.NUM(Int.fromString yytext,
                                         yypos, yypos + size yytext));
    ({digits}"."[0-9]*)|([0-9]*"."{digits})
                          => (Tokens.REAL(Real.fromString yytext,
                                          yypos, yypos + size yytext));
    ("--"[a-z]*"\n")|(" "|"\n"|"\t")+
                          => (continue());
    .                     => (ErrorMsg.error yypos "Illegal character";
                              continue());

Lexer states
• Helpful when handling different "kinds" of tokens
• For example, use state:
    • INITIAL in general lexing (automatic)
    • STRING when scanning the contents of a string
    • COMMENT when scanning a comment
• Point: keep different concerns apart, which is simpler!
• Syntax:
    ...
    (* Regular Expressions and Actions *)
    <INITIAL>if             => (Tokens.IF(yypos,yypos+2));
    <INITIAL>[a-z][a-z0-9]* => (Tokens.ID(yytext,yypos,yypos + size yytext));
    ...
    <INITIAL>"\""           => (YYBEGIN STRING; continue());
    ...
    <STRING>.               => (continue());
    ...
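The idea behind lexer states can be sketched by hand as a scanner whose behaviour depends on a current state. The function below switches between two states on a double quote and collects the contents of string literals. It only illustrates the concept and is not what ML-Lex generates; the function name strings is made up, and escape sequences and the COMMENT state are left out.

```sml
datatype state = INITIAL | STRING

(* Collect the contents of every string literal in s.
   buf holds the characters of the literal being scanned, reversed. *)
fun strings s =
  let
    (* End of input: an unterminated literal is silently dropped. *)
    fun go (_, [], _, acc) = rev acc
      (* INITIAL: an opening quote switches to STRING. *)
      | go (INITIAL, #"\"" :: cs, _, acc) = go (STRING, cs, [], acc)
      (* INITIAL: everything else is skipped in this sketch. *)
      | go (INITIAL, _ :: cs, buf, acc) = go (INITIAL, cs, buf, acc)
      (* STRING: a closing quote emits the literal and switches back. *)
      | go (STRING, #"\"" :: cs, buf, acc) =
          go (INITIAL, cs, [], String.implode (rev buf) :: acc)
      (* STRING: any other character belongs to the literal. *)
      | go (STRING, c :: cs, buf, acc) = go (STRING, cs, c :: buf, acc)
  in go (INITIAL, String.explode s, [], []) end
```

For example, strings "print(\"hi\", x, \"a b\")" gives ["hi", "a b"]. Each state has its own small set of rules, which is the separation of concerns the slide refers to.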

Summary
• Warm-up project: Program in SML!
    • Straight-line programming language, no lexing/parsing involved
    • Express programs: use abstract syntax tree datatype
    • Project specified on website, essentially as in the book
• Lexical analysis
    • Avoid complexity in the grammar: use a lexer
    • Based on regular expressions; implementation via NFA/DFA
    • Theory assumed known
• Tools: ML-Lex
    • Scanner generator, outputs SML code from a specification
    • Note lexer states
