A **formal grammar** is an abstract structure that describes a formal language precisely: i.e., a set of rules that mathematically delineates a (usually infinite) set of finite-length strings over a (usually finite) alphabet.

A **formal language** is a set of finite-length character strings drawn from some finite alphabet.

**A context-free grammar** (CFG) is a formal grammar in which every production rule is of the form

- V ?
*w*

where V is a non-terminal symbol and *w* is a string consisting of terminals and/or non-terminals. __The term “context-free” comes from the fact that the non-terminal V can always be replaced by w, regardless of in what context it occurs__.

A formal language is context-free if there is a context-free grammar that generates it.

**terminals** belong to the set of strings of the alphabet of the language. They do not get replaced during the derivation of a sentence.

**non-terminals** or **variables** or **syntactic-categories** are replaced during the derivation. They represent arbitrary strings.

As the parser accepts a sequence of tokens, it determines, based on this information, when the grammar’s respective rules are complete and verifies the syntactic correctness of the token sequence. The end result of the process is a **derivation** which represents the token sequence organized following the rules of the grammar.

A **derivation** of a string is a sequence of steps where nonterminals are expanded so that the string results.

S -> "if" E "then" S "else" S -> "if" "2" "=" "2" "then" S "else" S -> "if" "2" "=" "2" "then" "print" E "else" S -> "if" "2" "=" "2" "then" "print" "2" "=" "2" "else" S -> "if" "2" "=" "2" "then" "print" "2" "=" "2" "else" "print" E -> "if" "2" "=" "2" "then" "print" "2" "=" "2" "else" "print" "2" "=" "2"

A **leftmost** derivation is one where every step expands the leftmost nonterminal. A **rightmost** derivation expands the rightmost nonterminal.

A derivation need not be rightmost or leftmost. Many derivations can result in the same syntax tree.

A derivation starting from the start symbol gives a member of the language (after all variables have been eliminated)

Every context gree language has a **start symbol.**

The start symbol represents all valid sentences in the language

The head (of a **production**) a single variable for context free langiuages in a similar notation for non-context-free languages, the head may contain more than one symbol. The body (of the production) a sequence of symbols, variables as well as terminals are allowed.

To infer that a sentence is in the language

Recursive Inference:From body to head – Start from terminals and work up the tree to the start symbol

Derivation:From head to body – Start from the start symbol and apply productions until you get the sentence

Regular Languages are a subset of Context Free Languages All regular languages are also context free

Definition of a Context Free Grammar

G = {V,T,P,S}

V = Finite set of variables

nT = Finite set of terminals

nP = Set of productionsS = Start symbolFor language

L = { 0

n1n | n >= 1}

S -> 0 X 1

X -> e

X -> 0 X 1The grammar is

G = ( {S,X} , {0,1} , {(S,0X1),(X,

e),(X,0X1)},S )

The ancient Indian linguist Panini described Sanskrit using a context-free grammar. Recently, it has been suggested that a class of Tamil poetry called Venpa is governed by a context-free grammar.