Introduction
History, standards, syntax, semantics, grammars, parsing.
Readings: Scott, ch 1  2
See resources folder for language standards documents: C11, C++17, ECMA262, ECMA334, and JLS15.
Such as: Fortran, Pascal, C, Ada
 programs have mutable storage (state) modified by assignments
 the most common and familiar paradigm
Such as: Scheme, Lisp, ML, Haskell
 functions are firstclass values
 side effects (e.g., assignments) discouraged
Such as: Prolog, Mercury
 programs are sets of assertions and rules
Such as: Simula 67, Smalltalk, C++, Ada95, Java, C#
 data structures and their operations are bundled together
 inheritance
Such as: QCL, Q, Q#, qGCL
 performs operations on data using quantum bits (“qubits”)
 utilizes quantum properties such as superposition and entanglement
Does not add expressiveness to the language—for convenience only.
 alernation:
<Symb> ::= <Letter>  <Digit>
 sequencing:
<Id> ::= <Letter> <Symb>
 1.Numbers of a and b are equal (in any order):
<E> :== a <E> b <E>  b <E> a <E>  ε
 1.a is more:
<A> :== <E> a <A>  <E> a <E>
Encompasses everything BNF has, plus:
 repetition:

 zero or more:
{<Symb>}
or<Symb>*

 one or more:
<Digit>+
 option:
[<Digit>]
 grouping:
('+''')
 all productions can be written in the form: N ::= TN
 one nonterminal on left side; at most one on right
 generally used for scanners
 all productions can be written in the form:N ::= XYZ
 one nonterminal on the lefthand side; mixture on right
 most major programming languages
 number of symbols on the left is no greater than on the right
 no production shrinks the size of the sentential form
 used for parts of C++, but otherwise rarely used
no restrictions
Regular grammars can be used to generate regular languages. Regular expressions can be used to accept regular languages.
ε
denotes∅
 a character
x
, wherex ∈ Σ
, denotes{x}
 sequencing: a sequence of two regular expressions
RS
denotes{αβ  α ∈ [R], β ∈ [S]}
 alternation:
RS
denotes[R] ∪ [S]
 Kleene star:
R*
denotes the set of strings which are concatenations of zero or more strings from[R]
 grouping: parentheses
(AB)
Shorthands
R? ≡ εR
R+ ≡ RR*
Conventions
. ≡ any α ∈ Σ (“any character”)
[abc] ≡ (abc)
[^abc] ≡ Σ \ {a, b, c}
[09] ≡ (0123456789)
az and AZ
A parse tree describes the grammatical structure of a sentence
 leaf nodes are terminal symbols
 internal nodes are nonterminal symbols
 construction of tree from sentence is called parsing
Example:
Input string:
52.316
Grammar:
<Float> ::= <Digits>  <Digits> '.' <Digits>
<Digits> ::= <Digit>  <Digit> <Digits>
<Digit> ::= '0''1''2''3''4''5''6''7''8''9'
Parse tree:
If the parse tree for a sentence is not unique, the grammar is ambiguous.
From recitation: A CFG is ambiguous if it has more than one parse tree for some strings. i.e. there is more than 1 derivation for a string.
<S> ::= <M>  <U>
<O> ::= <V> ‘=’ <E>  <S> ‘;’ <S>
<M> ::= ‘if’ <B> ‘then’ <M> ‘else’ <M>  <O>
<U> ::= ‘if’ <B> ‘then’ <S>  ‘if’ <B> ‘then’ <M> ‘else’ <U>
<B> ::= <E> ‘===’ <E>
<V> ::= ‘x’  ‘y’  ‘z’
<E> ::= <V>  ‘0’  ‘1’  ‘2’  ‘3’
If we say operator
*
has precedence over operator +
. This means expression 5 + 2 * 3
should be evaluated as: 5 + (2 * 3)
, not (5 + 2) * 3
.Precedence can be specified in two ways:
 Write precedence directly into the rules: higher precedence appear in deeper rules.
 Write an ambiguous grammar first, then specify operator precedence separately.
Associativity tells the parser what to do with operators at the same level of precedence.
For example,
5  (2  3)
verses (5  2)  3
. Two 
operators have the same precedence. However, how you associate them will yield different mathematical results.Usually, you can specify using left associativity or right associativity.
Scanners (or tokenizers) read in text and extract tokens. Parsers read in tokens and construct a parse tree.
LL (Lefttoright, Leftmost derivation) parsers are also called topdown, recursive descent or predictive parsers. It begins at the root symbol.
LL(k)
: means k
look ahead.Problems with LL parsing:
 Left recursion: a grammar is leftrecursive if there exists nonterminal
A
such that<A> ::= <A> α
for someα
.  Common prefixes: if there exists a nonterminal
A
and terminalb
such that there exists rule R1<A> ::= b ...
and R2<A> ::= b ...
.
How to eliminate leftrecursive problem:
Original:
A → Aα1 Aα2  …  Aαm  β1  β2  …  βn
Convert to:
A → β1A'  β2 A'  …  βnA'
A' → α1A'  α2A'  …  αmA'  ε
Last modified 2yr ago