Syntax Analysis Reinhard Wilhelm Universität des Saarlandes

Syntax Analysis
Reinhard Wilhelm, Universität des Saarlandes, wilhelm@cs.uni-sb.de
and
Mooly Sagiv, Tel Aviv University, sagiv@math.tau.ac.il
23 October 2009

Subjects
◮ Introduction
◮ The task of syntax analysis
◮ Automatic generation
◮ Error handling
◮ Context free grammars, derivations, and parse trees
◮ Grammar Flow Analysis
◮ Pushdown automata
◮ Top-down syntax analysis
◮ Bottom-up syntax analysis
◮ Bison: a parser generator

“Standard” Structure
source (character string)
  ↓ lexical analysis (7): finite automata
source (symbol string)
  ↓ syntax analysis (8): pushdown automata
syntax tree
  ↓ semantic analysis (9): attribute grammar evaluators
decorated syntax tree
  ↓ optimizations (10): abstract interpretation + transformations
intermediate rep.
  ↓ ...

“Standard” Structure cont’d
intermediate rep.
  ↓ code generation (11, 12): tree automata + dynamic programming + ···
machine program

Syntax Analysis (Parsing)
◮ Functionality
  Input: sequence of symbols (tokens)
  Output: parse tree
◮ Report syntax errors, e.g., unbalanced parentheses
◮ Create a “pretty-printed” version of the program (sometimes)
◮ In many cases the tree need not be generated (one-pass compilers)
Note: The input is considered as a word over a new (finite) alphabet, i.e. the set of all symbol classes.

Handling Syntax Errors
◮ Report and locate the error (symptom)
◮ Diagnose the error
◮ Correct the error
◮ Recover from the error in order to discover more errors (without reporting too many follow-up errors)
Example: a := a ∗ ( b + c ∗ d ;
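The difference between the symptom and the actual error in the example can be sketched in a few lines of Python. This is only an illustration, not the parser-based method of the lecture: the function name `locate_unclosed` and the rule "a `;` may not appear while a parenthesis is open" are assumptions made for this sketch, and an unmatched `)` is simply ignored.

```python
def locate_unclosed(tokens):
    """Return (symptom_index, cause_index) for an unclosed '(' or None.

    symptom_index: position where the problem is first noticed
    cause_index:   position of the '(' that is never closed
    """
    open_positions = []  # stack of indices of '(' still waiting for ')'
    for i, tok in enumerate(tokens):
        if tok == "(":
            open_positions.append(i)
        elif tok == ")":
            if open_positions:
                open_positions.pop()
        elif tok == ";" and open_positions:
            # the statement ends while a parenthesis is still open
            return i, open_positions[-1]
    return None


tokens = "a := a * ( b + c * d ;".split()
print(locate_unclosed(tokens))  # (10, 4)
```

The symptom is reported at the `;` (token 10), while the opening parenthesis that lacks a partner sits at token 4, which is why the reported position may be far from the place that has to be corrected.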

The Valid Prefix Property
◮ For every word u that the parser identifies as a legal prefix, there exists a word w such that uw is a valid program, i.e. u has a continuation w
◮ Property of a parsing method
◮ All the parsing methods treated, i.e. LL-parsing and LR-parsing, have the valid prefix property.

Error Diagnosis Data
◮ Line number (may be far from the actual error)
◮ The current symbol
◮ The symbols expected in the current parser state
◮ Parser configuration

Error Recovery
◮ Becomes less important in interactive environments
◮ Example heuristics:
  ◮ Search for a “significant” symbol and ignore the string up to this symbol (panic mode)
  ◮ Try to “replace” symbols for common errors
  ◮ Refrain from reporting more than 3 subsequent errors
◮ Globally optimal solutions: for every illegal input w, find a legal input w′ with a “minimal distance” from w
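Panic mode can be sketched for a deliberately tiny language. The sketch assumes a made-up statement form `name := name ;`, takes `;` as the only "significant" symbol, and uses the hypothetical names `SYNC` and `parse_assignments`; a real parser would derive the synchronizing symbols from its grammar.

```python
SYNC = {";"}  # assumed set of "significant" symbols


def parse_assignments(tokens):
    """Check a sequence of `name := name ;` statements and return the error messages."""
    errors = []
    i = 0
    while i < len(tokens):
        for kind in ("name", ":=", "name", ";"):
            tok = tokens[i] if i < len(tokens) else "<eof>"
            matches = tok.isidentifier() if kind == "name" else tok == kind
            if matches:
                i += 1
                continue
            errors.append(f"token {i}: expected {kind}, found {tok}")
            # panic mode: ignore everything up to the next significant symbol
            while i < len(tokens) and tokens[i] not in SYNC:
                i += 1
            i += 1  # consume the significant symbol itself
            break
    return errors


print(parse_assignments("a := b ; c = d ; e := f ;".split()))
# ['token 5: expected :=, found =']
```

After the error in the second statement the checker skips to the next `;` and resumes, so the third statement is still analysed and no follow-up errors are reported for the skipped tokens.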

Example Context Free Grammar (Section)
Stat → If_Stat | While_Stat | Repeat_Stat | Proc_Call | Assignment
If_Stat → if Cond then Stat_Seq else Stat_Seq fi | if Cond then Stat_Seq fi
While_Stat → while Cond do Stat_Seq od
Repeat_Stat → repeat Stat_Seq until Cond
Proc_Call → Name ( Expr_Seq )
Assignment → Name := Expr
Stat_Seq → Stat | Stat_Seq ; Stat
Expr_Seq → Expr | Expr_Seq , Expr

Context-Free-Grammar Definition
A context-free grammar is a quadruple $G = (V_N, V_T, P, S)$ where:
◮ $V_N$: finite set of non-terminals
◮ $V_T$: finite set of terminals
◮ $P \subseteq V_N \times (V_N \cup V_T)^*$: finite set of production rules
◮ $S \in V_N$: the start non-terminal

Examples
$G_0 = (\{E, T, F\},\ \{+, *, (, ), \mathrm{id}\},\ \{E \to E + T \mid T,\ T \to T * F \mid F,\ F \to (E) \mid \mathrm{id}\},\ E)$
$G_1 = (\{E\},\ \{+, *, (, ), \mathrm{id}\},\ \{E \to E + E \mid E * E \mid (E) \mid \mathrm{id}\},\ E)$
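The quadruple definition maps directly onto a small data structure. The following Python sketch encodes both example grammars; the names `Grammar` and `is_well_formed` are hypothetical, and the check that non-terminals and terminals are disjoint is an extra assumption that is standard but not stated on the slide.

```python
from typing import NamedTuple


class Grammar(NamedTuple):
    nonterminals: frozenset  # V_N
    terminals: frozenset     # V_T
    productions: tuple       # P, as pairs (A, alpha) with alpha a tuple of symbols
    start: str               # S


def is_well_formed(g):
    """Check S in V_N and P subset of V_N x (V_N u V_T)*; also assume V_N and V_T are disjoint."""
    symbols = g.nonterminals | g.terminals
    return (
        g.start in g.nonterminals
        and not (g.nonterminals & g.terminals)
        and all(
            lhs in g.nonterminals and all(s in symbols for s in rhs)
            for lhs, rhs in g.productions
        )
    )


TERMINALS = frozenset({"+", "*", "(", ")", "id"})

G0 = Grammar(
    frozenset({"E", "T", "F"}),
    TERMINALS,
    (
        ("E", ("E", "+", "T")), ("E", ("T",)),
        ("T", ("T", "*", "F")), ("T", ("F",)),
        ("F", ("(", "E", ")")), ("F", ("id",)),
    ),
    "E",
)

G1 = Grammar(
    frozenset({"E"}),
    TERMINALS,
    (
        ("E", ("E", "+", "E")), ("E", ("E", "*", "E")),
        ("E", ("(", "E", ")")), ("E", ("id",)),
    ),
    "E",
)

print(is_well_formed(G0), is_well_formed(G1))  # True True
```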

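Derivations
A context-free grammar $G = (V_N, V_T, P, S)$
◮ $\varphi \Rightarrow \psi$ if there exist $\varphi_1, \varphi_2 \in (V_N \cup V_T)^*$ and $A \in V_N$ such that
  ◮ $\varphi \equiv \varphi_1 A \varphi_2$
  ◮ $A \to \alpha \in P$
  ◮ $\psi \equiv \varphi_1 \alpha \varphi_2$
◮ $\varphi \Rightarrow^* \psi$: reflexive transitive closure of $\Rightarrow$
◮ The language defined by $G$: $L(G) = \{ w \in V_T^* \mid S \Rightarrow^* w \}$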
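The one-step derivation relation can be computed literally from its definition. The sketch below uses the grammar with the single non-terminal E and the productions E → E + E | E ∗ E | ( E ) | id; sentential forms are Python tuples, and `one_step` and `is_derivation` are hypothetical helper names.

```python
NONTERMINALS = {"E"}
PRODUCTIONS = [
    ("E", ("E", "+", "E")),
    ("E", ("E", "*", "E")),
    ("E", ("(", "E", ")")),
    ("E", ("id",)),
]


def one_step(phi):
    """All psi with phi => psi: replace one occurrence of some A by alpha, for A -> alpha in P."""
    result = []
    for i, symbol in enumerate(phi):
        if symbol in NONTERMINALS:
            for lhs, rhs in PRODUCTIONS:
                if lhs == symbol:
                    # phi_1 = phi[:i], A = phi[i], phi_2 = phi[i + 1:]
                    result.append(phi[:i] + rhs + phi[i + 1:])
    return result


def is_derivation(forms):
    """Check that each sentential form follows from its predecessor in one step."""
    return all(b in one_step(a) for a, b in zip(forms, forms[1:]))


steps = [("E",), ("E", "+", "E"), ("id", "+", "E"), ("id", "+", "id")]
print(is_derivation(steps))  # True
```

Since the last form consists of terminals only, the sequence shows that id + id belongs to the language of this grammar.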

Reduced and Extended Context Free Grammars
A non-terminal $A$ is
  reachable: there exist $\varphi_1, \varphi_2$ such that $S \Rightarrow^* \varphi_1 A \varphi_2$
  productive: there exists $w \in V_T^*$ such that $A \Rightarrow^* w$
Removal of unreachable and unproductive non-terminals and the productions they occur in doesn’t change the defined language.
A grammar is reduced if it has neither unreachable nor unproductive non-terminals.
A grammar is extended if a new start symbol $S'$ and a new production $S' \to S$ are added to the grammar.
From now on, we only consider reduced and extended grammars.
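Both properties can be computed by simple fixpoint iterations. The following Python sketch is one possible way to reduce a grammar, not necessarily the algorithm used later in the lecture; the function names and the small example grammar are made up for illustration. It removes unproductive non-terminals first and unreachable ones second, because removing an unproductive non-terminal can make further non-terminals unreachable.

```python
def productive(terminals, productions):
    """Non-terminals A with A =>* w for some terminal word w."""
    found = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in productions:
            if lhs not in found and all(s in terminals or s in found for s in rhs):
                found.add(lhs)
                changed = True
    return found


def reachable(start, nonterminals, productions):
    """Non-terminals A with S =>* phi_1 A phi_2."""
    seen, work = {start}, [start]
    while work:
        current = work.pop()
        for lhs, rhs in productions:
            if lhs == current:
                for s in rhs:
                    if s in nonterminals and s not in seen:
                        seen.add(s)
                        work.append(s)
    return seen


def reduce_grammar(terminals, productions, start):
    """Return (non-terminals, productions) of the reduced grammar.

    Assumes the start symbol itself is productive (otherwise the language is empty).
    """
    keep = productive(terminals, productions)
    productions = [(l, r) for l, r in productions
                   if l in keep and all(s in terminals or s in keep for s in r)]
    keep = reachable(start, keep, productions)
    productions = [(l, r) for l, r in productions if l in keep]
    return keep, productions


# Hypothetical grammar: C is unproductive, D is unreachable.
T = {"a", "b", "c", "d"}
P = [("S", ("a", "B")), ("S", ("C",)), ("B", ("b",)),
     ("C", ("C", "c")), ("D", ("d",))]

print(reduce_grammar(T, P, "S"))
```

For this grammar only S and B survive, together with the productions S → a B and B → b.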

Syntax-Tree (Parse-Tree)
◮ An ordered tree.
◮ Root is labeled with $S$.
◮ Internal nodes are labeled by non-terminals.
◮ Leaves are labeled by terminals or by $\varepsilon$.
◮ For every internal node $n$: if $n$ is labeled by $N$ and its children $n.1, \ldots, n.n_p$ are labeled by $N_1, \ldots, N_{n_p}$, then $N \to N_1 \ldots N_{n_p} \in P$.
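The conditions above can be checked mechanically. In this Python sketch a leaf is a string and an internal node is a pair `(label, children)`; this encoding and the name `is_syntax_tree` are assumptions of the sketch. The grammar is the one with productions E → E + E | E ∗ E | ( E ) | id.

```python
EPS = "ε"

NONTERMINALS = {"E"}
TERMINALS = {"+", "*", "(", ")", "id"}
PRODUCTIONS = {
    ("E", ("E", "+", "E")),
    ("E", ("E", "*", "E")),
    ("E", ("(", "E", ")")),
    ("E", ("id",)),
}
START = "E"


def label(node):
    """Label of a node: the string itself for a leaf, the first component otherwise."""
    return node if isinstance(node, str) else node[0]


def is_syntax_tree(tree):
    def ok(node):
        if isinstance(node, str):  # leaf: terminal or epsilon
            return node in TERMINALS or node == EPS
        name, children = node      # internal node
        rhs = tuple(label(c) for c in children)
        if rhs == (EPS,):
            rhs = ()               # a single epsilon leaf stands for A -> epsilon
        return (
            name in NONTERMINALS
            and (name, rhs) in PRODUCTIONS
            and all(ok(c) for c in children)
        )

    return not isinstance(tree, str) and label(tree) == START and ok(tree)


leaf_id = ("E", ["id"])
tree = ("E", [("E", [leaf_id, "*", leaf_id]), "+", leaf_id])  # (id * id) + id
print(is_syntax_tree(tree))                     # True
print(is_syntax_tree(("E", ["id", "+", "id"])))  # False
```

The second tree is rejected because E → id + id is not a production: the operands must themselves be internal nodes labeled E.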

Examples
[Figure: two different syntax trees for id ∗ id + id and two different syntax trees for id + id + id; all internal nodes are labeled E.]

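Leftmost (Rightmost) Derivations
Given a context-free grammar $G = (V_N, V_T, P, S)$
◮ $\varphi \Rightarrow_{lm} \psi$ if there exist $\varphi_1 \in V_T^*$, $\varphi_2 \in (V_N \cup V_T)^*$, and $A \in V_N$ such that (replace leftmost non-terminal)
  ◮ $\varphi \equiv \varphi_1 A \varphi_2$
  ◮ $A \to \alpha \in P$
  ◮ $\psi \equiv \varphi_1 \alpha \varphi_2$
◮ $\varphi \Rightarrow_{rm} \psi$ if there exist $\varphi_2 \in V_T^*$, $\varphi_1 \in (V_N \cup V_T)^*$, and $A \in V_N$ such that (replace rightmost non-terminal)
  ◮ $\varphi \equiv \varphi_1 A \varphi_2$
  ◮ $A \to \alpha \in P$
  ◮ $\psi \equiv \varphi_1 \alpha \varphi_2$
◮ $\varphi \Rightarrow_{lm}^* \psi$ and $\varphi \Rightarrow_{rm}^* \psi$ are defined as usual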
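The only difference from an ordinary derivation step is which non-terminal may be replaced. A minimal Python sketch, again for the grammar E → E + E | E ∗ E | ( E ) | id and with the hypothetical function name `step`:

```python
NONTERMINALS = {"E"}
PRODUCTIONS = [
    ("E", ("E", "+", "E")),
    ("E", ("E", "*", "E")),
    ("E", ("(", "E", ")")),
    ("E", ("id",)),
]


def step(phi, leftmost=True):
    """All psi with phi =>lm psi (or phi =>rm psi when leftmost is False)."""
    positions = [i for i, s in enumerate(phi) if s in NONTERMINALS]
    if not positions:
        return []  # phi is a terminal word, nothing to replace
    i = positions[0] if leftmost else positions[-1]
    return [phi[:i] + rhs + phi[i + 1:] for lhs, rhs in PRODUCTIONS if lhs == phi[i]]


phi = ("E", "+", "E")
print(("id", "+", "E") in step(phi))                  # True: leftmost E replaced
print(("E", "+", "id") in step(phi))                  # False: not a leftmost step
print(("E", "+", "id") in step(phi, leftmost=False))  # True: rightmost E replaced
```

In a leftmost step everything to the left of the replaced non-terminal consists of terminals only, which matches the condition that the left context is a terminal word.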

Ambiguous Grammar
A grammar is ambiguous if it has (equivalently)
◮ two different leftmost derivations for the same string,
◮ two different rightmost derivations for the same string,
◮ two different syntax trees for the same string.
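For a small grammar, ambiguity of a particular string can be demonstrated by counting its leftmost derivations. The Python sketch below does this by exhaustive search for the grammar E → E + E | E ∗ E | ( E ) | id. It is not a general ambiguity test, and the pruning relies on an assumption that holds for this grammar: there are no ε-productions and no cyclic unit productions, so sentential forms never shrink. The function name is hypothetical.

```python
NONTERMINALS = {"E"}
PRODUCTIONS = [
    ("E", ("E", "+", "E")),
    ("E", ("E", "*", "E")),
    ("E", ("(", "E", ")")),
    ("E", ("id",)),
]
START = "E"


def count_leftmost_derivations(word):
    """Number of leftmost derivations of `word` (a sequence of terminals) from START."""
    target = tuple(word)

    def count(phi):
        i = next((k for k, s in enumerate(phi) if s in NONTERMINALS), None)
        if i is None:
            return 1 if phi == target else 0
        # Prune: forms never get shorter, and the terminal prefix
        # before the leftmost non-terminal can no longer change.
        if len(phi) > len(target) or phi[:i] != target[:i]:
            return 0
        return sum(
            count(phi[:i] + rhs + phi[i + 1:])
            for lhs, rhs in PRODUCTIONS
            if lhs == phi[i]
        )

    return count((START,))


print(count_leftmost_derivations("id * id + id".split()))  # 2
print(count_leftmost_derivations("id + id".split()))       # 1
```

The two leftmost derivations of id ∗ id + id correspond to the readings (id ∗ id) + id and id ∗ (id + id), so this grammar is ambiguous.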
