Context Free Languages
Context Free Grammars
Parsing Arithmetic Expression
Removing λ-productions
Normal forms
January 4, 2024 Formal Language Theory 1
Context Free Grammar (CFG)
Definition: A CFG, G, is a PSG G=(N, T, P, S) with
productions of the form A β, A Є N, β Є (NUT)*.
CFGs are used in defining the syntax of programming
languages and in parsing arithmetic expressions.
A language generated from CFG is called Context Free
Language (CFL).
Ex.
a) S aB b) S aB|A
B bA|b A aA|a|CBA
Aa Bλ
Cc
January 4, 2024 Formal Language Theory 2
CFG: cont’d
Let G be a CFG, then x Є L(G) iff S x in
zero or more steps over G.
x Є L(G) can as well be obtained from a
derivation tree or parse tree. The root of the
tree is S and x is the collection of leaves from
left to right.
Left most derivation: employs the reduction
of the left most non-terminal
Right most derivation: employs the reduction
of the right most non-terminal
January 4, 2024 Formal Language Theory 3
CFG: cont’d
If a derivation of a string x has two different left most
derivations, then the grammar is said to be ambiguous.
Otherwise unambiguous.
(i.e. a grammar is ambiguous if it can produce more than one
parse tree for a particular sentence.
Ex.
1. G1 = (N, T, P, S) with productions:
S AB
A aA|a
B bB|b
let x = aaabbb
a) find a left most and right most derivations for x
b) draw the parse tree for x
January 4, 2024 Formal Language Theory 4
CFG: cont’d
2. G2 = (N, T, P, S) with productions:
S SbS|ScS|a
let x = abaca Є L(G2)
a) find a left most and right most
derivations for x
b) draw the parse tree for x
3. Is G1 ambiguous? Is G2?
January 4, 2024 Formal Language Theory 5
Parsing Arithmetic Expression
Consider the following grammar:
ET|E+T|E–T
T F | T * F | T/F
F a | b | c | (E)
Draw parse trees for
a) a*b+c b) a+b*c c) (a+b)*c d) a-b-c
January 4, 2024 Formal Language Theory 6
Removing λ-productions
Let G be a CFG and A α, A Є N
1. If α = λ, then A α is a λ-production
2. If α ≠ λ for all productions, then G is λ-free
3. If A => λ in zero or more steps, then A is called nullable or λ-
generating non-terminal
Ex. Let G=(N, T, P, S) be a CFG with productions:
S ABaC
AB
B b| λ
C c| λ
Find the non-terminals which are nullable.
January 4, 2024 Formal Language Theory 7
Removing λ-productions: cont’d
Let G=(N, T, P, S) be a CFG with λ-productions.
Construct G’=(N, T, P’, S), a λ-free CFG as
follows:
1. Put all non λ-productions of P in P’
2. For all nullable non-terminals, put productions in P’ by
removing one or more nullable non-terminals on the right side
productions.
Thus, L(G) \ {λ} = L(G’)
Ex1: Construct G’ for the previous example.
Ex2: Construct G’ for a grammar G with productions:
S aS | AB
Aλ
Bλ
Db
January 4, 2024 Formal Language Theory 8
Removing λ-productions: cont’d
Theorem: For any CFG G, there exists a
CFG G’ with λ-free productions such that
L(G) \{λ} = L(G’).
A production A α is called relevant/useful
iff there exists a derivation of some xЄL(G)
that uses the production. Otherwise, it’s
called irrelevant/useless.
Ex. S aSb | A | λ
A aA | λ
January 4, 2024 Formal Language Theory 9
Normal Forms
When the productions in a CFG G satisfy certain
restrictions, G is said to be in a normal form.
We’ll see two normal forms: CNF and GNF
1. Chomsky Normal Form (CNF)
Let G=(N, T, P, S) be a CFG and A α be a
production of G.
1. If α = B, B in N, then A α is called a Unit production
2. If |α|>1 and there exists a terminal substring of α, then A α
is called a Secondary production
3. If α contains more than two non-terminals, then A α is
called a Tertiary production
January 4, 2024 Formal Language Theory 10
Chomsky Normal Form (CNF)
Definition: Let G = (N, T, P, S) be a λ-free CFG, then
G is said to be in CNF if all its productions are of the
form:
A BC where A, B, C Є N
OR
A a, a Є T, A Є N
Ex. G with productions
S AB
Aa
Bb
January 4, 2024 Formal Language Theory 11
CNF: cont’d
Theorem: Any λ-free CFL can be generated by a
CFG in CNF.
proof:
Let G = (N, T, P, S) be a λ-free grammar such that
L = L(G). Construct a grammar G which is in CNF.
Steps:
1. Replace all Unit productions as follows:
For any A Є N on the LHS of the Unit production, denote U(A)
and non-Unit productions of A by N(A)
For each A Є N and U(A) ≠ Ø replace U(A) by
{A α | A=>B in one or more steps, and Bα Є N(B)}
January 4, 2024 Formal Language Theory 12
CNF: cont’d
2. Replace all Secondary productions as follows:
For any a Є T, a substring of a Secondary production,
replace a by Aa where Aa is a new non-terminal and
Aaa.
3. Replace all Tertiary productions as follows:
If A B1B2…Bm, m>2, then replace the production by:
A B1B1’
B1’ B2B2’
B2’ B3B3’
…
Bm-2’ Bm-1Bm, where B1’, …, Bm-2’ are all unique
new non-terminals that do not appear in any other
production.
January 4, 2024 Formal Language Theory 13
CNF: cont’d
Ex. Convert the following grammars to CNF
1. Let G be a CFG with productions:
S A | ABA
A aA | a| B
B bB | b
2. Let G be a CFG with productions:
S aAD
A aB | bAB
Bb
Dd
January 4, 2024 Formal Language Theory 14
Identification of non-CFLs
Pumping lemma for CFLs
(Reading assignment)
January 4, 2024 Formal Language Theory 15
Greibach Normal Form (GNF)
Let G=(N, T, P, S) be a λ-free CFG, then if all the
productions of G are of the form
A aα, A Є N, a Є T, α Є N*
then G is said to be in GNF
Theorem G1: If A α1Bα2 is a production in a CFG
G and B β1|β2|β3|…|βk are all productions with B
on the LHS, then
A α1B α2 can be replaced by
A α1β1α2| α1β2α2| … | α1βkα2
without affecting L(G).
January 4, 2024 Formal Language Theory 16
GNF: cont’d
Theorem G2: If in a CFG there is a production A
Aα1 | Aα2 | … | Aαn | β1|β2|…|βm, such a production
is called left recursive, and A β1|β2|β3|…|βm are
the remaining productions with A on the LHS.
Then an equivalent grammar can be constructed by
introducing a new non-terminal, A’, and replacing all
these productions by:
A β1|β2|β3|…|βm| β1A’| β2A’|…| βmA’
A’ α1| α2|…| αn| α1A’| α2A’|…| αnA’
January 4, 2024 Formal Language Theory 17
GNF: cont’d
Theorem GNF: Any λ-free CFG G can be converted into a
grammar in GNF.
Proof:
Let G be a λ-free CFG. To convert G into GNF use the steps below:
1. Convert G into CNF, G’
2. Rename the non-terminals in G’ as
A1, A2, …, Am (m>=1)
3. Convert all the productions into
Ai aα or Ai Ajα with j > i
To convert to the form Ai Ajα with j > i, do the following:
Substitute for Aj according to Theorem G1.
If there exist left recursive productions with Ai on the LHS, then
introduce a new non-terminal Ai’ and apply Theorem G2.
January 4, 2024 Formal Language Theory 18
GNF: cont’d
4. After the 3rd step, the productions will be
of the form
i. Ai Ajα, j > i, αЄ(NUN’)* where N’ stands for
the new non-terminals Ai introduced.
ii. Ai aα, a Є T, αЄ(NUN’)* or
iii. Ai’ xα, xЄ(NUT), αЄ(NUN’)*
Replace (i) by using Theorem G1
(iii) by using Theorem G2
January 4, 2024 Formal Language Theory 19
GNF: cont’d
Ex. Convert to GNF
1. Let G be with productions
S AB
A BS | b
B SA | a
2. Let G be with productions
A 1 A 2A 2 | a
A 2 A 1A 2 | b
January 4, 2024 Formal Language Theory 20
Closure Properties of CFGs
Theorem: CFGs are closed under:
a) Union
b) Concatenation
c) Kleen star(*)
January 4, 2024 Formal Language Theory 21