CSC 1800 Organization of Programming Languages

Syntax

Questions
⚫ What is a computer program?
⚫ What is a programming language?
⚫ Describe the alphabet available to you to create programs.

Programs, formally
⚫ Let A be the alphabet of a programming language L, hence a subset of all the keyboard-producible characters.
⚫ Let A* be the set of finite sequences of characters of A.
⚫ L is a subset of A*. That is, any program P in L is a sequence of characters in A.
⚫ Why is this definition not completely correct?

Syntax
⚫ The description of a language specifying structurally correct phrases in the language.
As opposed to
⚫ Semantics
– The meaning of phrases in the language

Syntax (2)
And
⚫ Pragmatics
– The practical use of the language
– Communicating algorithms to humans
– Laying out a program in readable form
– Purpose of the language
– User interface to the language, including IDE
– Efficiency of generated code

Simple Syntax
<letter> ::= a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z
Non-terminal: <letter>
Terminals: lower case letters (26 choices)
| means “or”

Simple Syntax (2)
<letter seq> ::= <letter> | <letter seq><letter>
Notes:
⚫ Right recursion (or is it left recursion?)
⚫ Does the recursive order make a difference?
⚫ What are permissible lengths of letter sequences?

Simple Syntax (3)
<digit> ::= 0|1|2|3|4|5|6|7|8|9
<digit string> ::= <digit> | <digit string><digit>
Your turn:
⚫ Create the syntax specification for an integer
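
The <digit string> rule above can be read almost literally as a recursive membership test. The following is a minimal sketch in Python, not part of the original slides; the function names `is_digit` and `is_digit_string` are hypothetical, chosen only for illustration.

```python
DIGITS = set("0123456789")

def is_digit(ch):
    """<digit> ::= 0|1|2|3|4|5|6|7|8|9"""
    return len(ch) == 1 and ch in DIGITS

def is_digit_string(s):
    """<digit string> ::= <digit> | <digit string><digit>

    Mirrors the rule directly: s is a digit string if it is a single
    digit, or if everything except its last character is a digit string
    and its last character is a digit.
    """
    if len(s) == 0:
        return False            # no alternative derives the empty string
    if len(s) == 1:
        return is_digit(s)      # first alternative: <digit>
    # second alternative: <digit string><digit>
    return is_digit_string(s[:-1]) and is_digit(s[-1])
```

For example, `is_digit_string("2024")` is true, while `is_digit_string("")` and `is_digit_string("12a")` are false. The recursion is one level deep per character, so this direct transcription is only suitable for short inputs.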

BNF
⚫ Backus-Naur Form (or Backus Normal Form)
– John Backus, developer of Fortran
– Peter Naur
– Revised Report on the Algorithmic Language Algol 60, Computer Journal, 5 (1963), 349-367.
⚫ Use for
– Checking a symbol string for syntactic correctness
– Generating a correct symbol string
⚫ The examples on the previous slide use BNF

BNF
::=   is defined as
|     or
< >   syntactic category, grammatical category
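
The "generating a correct symbol string" use of BNF can be sketched as a random expansion of productions. This is an illustrative sketch only: the dictionary layout, the `generate` function, and the `max_depth` cutoff are assumptions made here, not something the slides prescribe. It uses the <letter seq> rules from the Simple Syntax slides.

```python
import random

# Productions keyed by non-terminal; each alternative is a list of symbols.
GRAMMAR = {
    "<letter seq>": [["<letter>"], ["<letter seq>", "<letter>"]],
    "<letter>": [[c] for c in "abcdefghijklmnopqrstuvwxyz"],
}

def generate(symbol, rng, depth=0, max_depth=8):
    """Expand symbol into a string of terminals, picking alternatives at random."""
    if symbol not in GRAMMAR:
        return symbol                       # a terminal stands for itself
    alternatives = GRAMMAR[symbol]
    if depth >= max_depth:
        # Assumed safeguard: past the cutoff, drop alternatives that mention
        # the symbol itself so that the expansion is guaranteed to stop.
        alternatives = [alt for alt in alternatives if symbol not in alt]
    chosen = rng.choice(alternatives)
    return "".join(generate(s, rng, depth + 1, max_depth) for s in chosen)

rng = random.Random(0)                      # seeded so runs are repeatable
samples = [generate("<letter seq>", rng) for _ in range(5)]
```

Every string in `samples` is a sequence of one or more lower case letters, which is exactly the language the two rules describe (up to the length limit imposed by the cutoff).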

Extended BNF
For those who dislike recursion:
⚫ [ ] surround an optional part
⚫ { } surround a repeated part, repeated 0 or more times
⚫ ( ) are used to clarify hierarchy

Sequences
In many instances, we need to specify a sequence of items. This may be done recursively:
<seq> ::= <id> | <seq> , <id>
or
<seq> ::= <id> | <id> , <seq>
or (using EBNF)
<seq> ::= <id> { , <id> }
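
The EBNF form of <seq> maps naturally onto a loop: the { } part becomes a `while`. The sketch below assumes the input is already split into tokens and that an <id> is a non-empty run of lower case letters; the slides do not define <id>, so that definition and the names `is_id` and `parse_seq` are assumptions for illustration.

```python
def is_id(token):
    # Assumption: an <id> is a non-empty run of lower case letters a-z.
    return token != "" and all("a" <= c <= "z" for c in token)

def parse_seq(tokens):
    """<seq> ::= <id> { , <id> }

    Returns the list of ids, or raises ValueError if tokens is not a <seq>.
    """
    pos = 0
    if pos >= len(tokens) or not is_id(tokens[pos]):
        raise ValueError("expected <id> at position %d" % pos)
    ids = [tokens[pos]]
    pos += 1
    # The { , <id> } part: zero or more repetitions of a comma and an id.
    while pos < len(tokens) and tokens[pos] == ",":
        pos += 1
        if pos >= len(tokens) or not is_id(tokens[pos]):
            raise ValueError("expected <id> at position %d" % pos)
        ids.append(tokens[pos])
        pos += 1
    if pos != len(tokens):
        raise ValueError("unexpected token at position %d" % pos)
    return ids
```

With these assumptions, `parse_seq(["x", ",", "y", ",", "z"])` returns `["x", "y", "z"]`, and a trailing comma such as `["x", ","]` is rejected.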

Context Free Grammar
⚫ Defined with four (4) elements:
– terminal symbols
   ⚫ formed as strings of the alphabet of the language
   ⚫ lexical elements of the language
– non-terminal symbols
   ⚫ each specifies a grammatical category, syntactic category, or set of strings of terminal symbols
– special non-terminal, the start symbol
   ⚫ appears only on LHS of a production
– productions
   ⚫ recursively define all non-terminal symbols by showing how terminal and non-terminal symbols may be combined

Context Free Language
⚫ Defined by a context free grammar
⚫ The set of all strings of terminal symbols generated by the context free grammar
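
The four elements can be written down as a small data structure. The sketch below encodes the expression grammar from the Grammar of Expressions slide; the class `Grammar`, the function `check`, and the choice to treat I and N as terminal token classes are assumptions made for this illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Grammar:
    terminals: frozenset
    nonterminals: frozenset
    start: str
    productions: tuple  # pairs (lhs, rhs); rhs is a tuple of symbols

def check(g):
    """Return a list of problems; an empty list means the four parts fit together."""
    problems = []
    if g.terminals & g.nonterminals:
        problems.append("a symbol is both terminal and non-terminal")
    if g.start not in g.nonterminals:
        problems.append("start symbol is not a non-terminal")
    symbols = g.terminals | g.nonterminals
    for lhs, rhs in g.productions:
        if lhs not in g.nonterminals:
            problems.append("LHS %r is not a non-terminal" % lhs)
        for s in rhs:
            if s not in symbols:
                problems.append("unknown symbol %r" % s)
        if g.start in rhs:
            problems.append("start symbol appears on a RHS")
    defined = {lhs for lhs, _ in g.productions}
    for n in sorted(g.nonterminals - defined):
        problems.append("non-terminal %r has no production" % n)
    return problems

# Assumption: I (identifier) and N (number) are treated as terminals here.
EXPR = Grammar(
    terminals=frozenset(["+", "-", "*", "/", "(", ")", "I", "N"]),
    nonterminals=frozenset(["S", "E", "T", "F"]),
    start="S",
    productions=(
        ("S", ("E",)),
        ("E", ("T",)), ("E", ("E", "+", "T")), ("E", ("E", "-", "T")),
        ("T", ("F",)), ("T", ("T", "*", "F")), ("T", ("T", "/", "F")),
        ("F", ("(", "E", ")")), ("F", ("I",)), ("F", ("N",)),
    ),
)
```

Calling `check(EXPR)` returns an empty list, since S is a non-terminal that appears only on a left-hand side and every non-terminal has at least one production.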

Parse Tree
⚫ A labeled rooted tree
⚫ Root labeled with start symbol
⚫ Interior nodes labeled with non-terminal symbols (syntactic categories)
⚫ Leaves labeled with terminal symbols

Parse Tree Example
A = B * ( A + C )

Grammar of Expressions
Using a simplified EBNF
S ::= E
E ::= T | E + T | E - T
T ::= F | T*F | T/F
F ::= (E) | I | N
Key:
S – statement
E – expression
T – term
F – factor
I – identifier
N – number
What is this language?

Expression Parse Trees
Your turn:
1. Build a parse tree for a+b*c
2. Build a parse tree for a-b-c
Infer rules for associativity and distributivity from the structure of the grammar.
[Note: example trees on next slide… no peeking]
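
Once the trees have been built by hand, they can be checked mechanically. The following is a minimal sketch of a parser for the S/E/T/F grammar that returns parse trees as nested tuples of the form (label, child, ...). Several things here are assumptions, not taken from the slides: identifiers are runs of letters, numbers are runs of digits, and the left-recursive rules are handled with loops that rebuild the same left-leaning tree shape.

```python
import re

def tokenize(text):
    tokens = re.findall(r"[A-Za-z]+|\d+|[-+*/()]", text)
    if "".join(tokens) != "".join(text.split()):
        raise ValueError("unexpected character in input")
    return tokens

def parse(text):
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_E():
        # E ::= T | E + T | E - T, left recursion unrolled into a loop
        node = ("E", parse_T())
        while peek() in ("+", "-"):
            op = advance()
            node = ("E", node, op, parse_T())
        return node

    def parse_T():
        # T ::= F | T*F | T/F, handled the same way
        node = ("T", parse_F())
        while peek() in ("*", "/"):
            op = advance()
            node = ("T", node, op, parse_F())
        return node

    def parse_F():
        # F ::= (E) | I | N
        tok = peek()
        if tok is None:
            raise ValueError("unexpected end of input")
        if tok == "(":
            advance()
            inner = parse_E()
            if peek() != ")":
                raise ValueError("expected )")
            advance()
            return ("F", "(", inner, ")")
        if tok[0].isalpha():
            return ("F", ("I", advance()))
        if tok[0].isdigit():
            return ("F", ("N", advance()))
        raise ValueError("unexpected token %r" % tok)

    tree = ("S", parse_E())                 # S ::= E
    if pos != len(tokens):
        raise ValueError("unexpected token %r" % tokens[pos])
    return tree

def leaves(tree):
    """Read the terminals off the tree from left to right."""
    if isinstance(tree, str):
        return [tree]
    out = []
    for child in tree[1:]:
        out.extend(leaves(child))
    return out
```

For instance, `parse("x")` returns `("S", ("E", ("T", ("F", ("I", "x")))))`, and `leaves(parse("x*(y+z)"))` gives back the original tokens in order. Running `parse` on the two exercise expressions after attempting them by hand is one way to compare results.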

Parse Trees
a+b*c
[Figure: two parse trees for a+b*c, drawn over the nodes *, +, a, b, c]

Ambiguity
Some grammars lead to distinct parse trees for the same expression.
Questions:
1. What does distinct mean?
2. Is this situation bad?
3. What can be done?

Ambiguity Revealed
A = B + C * A
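
To make the ambiguity concrete, the sketch below enumerates every parse of the right-hand side B + C * A under a deliberately flat grammar, E ::= E + E | E * E | id. That grammar is a hypothetical example chosen here because it is a standard ambiguous one; the slides do not state which grammar the figure used. Each parse is shown as a fully parenthesized string, and the names `OPERATORS` and `parse_trees` are illustrative.

```python
OPERATORS = ("+", "*")

def parse_trees(tokens):
    """Return every parse of tokens under the assumed grammar
    E ::= E + E | E * E | id, as fully parenthesized strings."""
    if len(tokens) == 1:
        # E ::= id (anything that is not an operator counts as an id here)
        return [tokens[0]] if tokens[0] not in OPERATORS else []
    trees = []
    # Try each operator position as the root of the tree.
    for k in range(1, len(tokens) - 1):
        if tokens[k] in OPERATORS:
            for left in parse_trees(tokens[:k]):
                for right in parse_trees(tokens[k + 1:]):
                    trees.append("(" + left + " " + tokens[k] + " " + right + ")")
    return trees

trees = parse_trees(["B", "+", "C", "*", "A"])
```

Here `trees` holds two entries, `(B + (C * A))` and `((B + C) * A)`, so the flat grammar is ambiguous for this input. The layered E/T/F grammar from the Grammar of Expressions slide admits only one tree for the same tokens, which is one answer to "What can be done?".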