What are the properties of sigma notation
[Back] [Back (bottom of page)] [Bottom of page] [Supplementary chapter] [Please report script error]
13.4. Formal languages and rule grammars
13.4.1. Language as a crowd
The alphabet (sigma), characters and strings
De ﬁ nition 13.4.1. An alphabet is a finite set of characters (atomic symbols). It is notated with Σ (sigma).
De ﬁ nition 13.4.2. A character string (word, string) of n characters from Σ is a finite sequence of length n over Σ.
Note on the notation
A character string is typically noted by juxtaposing the characters from left to right.
Let Σ = {a, b}, then ϵ, a, bb or ababbba are words over Σ. A more explicit notation for bb is 〈b, b〉 or {〈1, b〉, 〈2, b〉}.
Star of Sigma and Formal Languages
De ﬁ nition 13.4.3. The star of Sigma is the set of all words over an alphabet Σ. The star is used as the post ﬁ x operator Σ^{∗} (pronounced “Sigma Stern”).
De ﬁ nition 13.4.4. A formal language L over Σ is a subset of the star of Sigma.
Example 13.4.5. If Σ = {a}, then Σ^{∗}= {ϵ, a, aa, aaa, ...}. The sets L_{1}= {ϵ, a} or L_{2}= {aa, aaaa, aaaaaa} are formal languages because they are (real) subsets of Σ^{∗} are.
Empty languages vs. empty string
Hints
 The empty language is the empty set, noted as {} or ∅.
 The language, which only includes the empty string, is noted as {ϵ}.
 The empty language {} and the language {ϵ} are not the same.
ask
 Is {} a language above everyone Σ?
 Is the language {ϵ} a subset of any nonempty language?
 Is Σ^{∗} a language over Σ?
13.4.2. Concatenation
Concatenation of strings
De ﬁ nition 13.4.6. The concatenation of character strings is a twodigit function which concatenates its arguments into a word. For all u, v ∈ Σ^{∗}:
What does uv mean?
If u: 1..n → Σ and v: 1..m → Σ are words, i.e. finite sequences of signs, then uv: 1 .. (m + n) → Σ. Whereby for all character positions i ∈ 1 .. (n + m) applies:
Power notation of concatenation
Properties of concatenation
The concatenation is associative and has ϵ as a neutral element. For all u, v, w ∈ Σ^{∗}:
De ﬁ nition 13.4.7. The nfold concatenation of a character string w with itself in the power notation is de ﬁ ned recursively. For n ≥ 1, n ∈ ℕ:
Example 13.4.8 (power notation of the concatenation).
The string aaabbcccc can be used as a^{3}b^{2}c^{4} be noted.
bu ﬀ alo^{6}
Example 13.4.9 (A syntactically correct English sentence).
“Bu ﬀ alo bu ﬀ alo Bu ﬀ alo bu ﬀ alo bu ﬀ alo bu ﬀ alo Bu ﬀ alo bu ﬀ alo.”
analysis
 Bu ﬀ alo as an adjective of origin
 bu ﬀ alo as a noun (Bü ﬀ el)
 bu ﬀ alo as a verb (intimidate)
 Sense: "Bison from Bu ﬀ alo, New York who are intimidated by other bison in their community also happen to intimidate other bison in their community."
13.4.3. Grammars
Chomsky hierarchy [HOPCROFT et al. 2002]

Language class  Type  example 
regular  3  {a^{n}} 
context free  2  {a^{n}b^{n}} 
context sensitive  1  {a^{n}b^{n}c^{n}} 
generally  0  
Real subsets
For all type i languages with 0 ≤ i ≤ 2 the following applies: L_{i + 1}⊂ L_{i} .
Where are natural languages located? [HESS2005, 138ﬀ.]
At least type 2: NP^{n}VP^{n}(central embedding)
  
    
     
The man whose wife whose child is angry is sad is surprised
At least type 1 according to [SHIEBER1985, KALLMEYER2005]: NP^{i}NP^{j}V^{i}V^{j}(cross serial construction)
 
 
   
  
     
mer wänd d’Chind on Hans s’Huus laa half aaschtriiche
Complexity, grammaticality, acceptance of language
Christian Morgenstern, preface to gallows songs
Rule grammars
 A rule grammar is a powerful finite descriptive tool for specifying formal languages with a potentially infinite number of strings.
 A grammar G = 〈Φ, Σ, R, S〉 consists of:
 Alphabet Φ: finite set of nonterminal symbols
 Alphabet Σ: finite set of terminal symbols with Φ ∩ Σ = ∅
 Set R ⊆ Γ^{∗}× Γ^{∗} of rules 〈α, β〉 (with total alphabet Γ = Φ ∪ Σ), where: α ≠ ϵ and α ⁄∈ Σ^{∗}
 Start symbol S ∈ Φ
 This definition of a grammar is the most general (type 0).
 A grammar rule is an ordered pair: 〈α, β〉. Notation: α → β.
Context Free Grammars
 A contextfree grammar G = 〈Φ, Σ, R, S〉 consists of:
 Nonterminal symbols Φ
 Terminal symbols Σ
 Rule set R ⊆Φ × Γ^{∗} (Γ = Φ ∪ Σ)
 Start symbol S ∈ Φ
Example 13.4.10 (contextfree grammar).
 G_{1}= 〈{S, NP, V P, EN, V, D, N}, {Egon, Pudel, den, ass}, R, S〉
 Standard set
Example evaluation
See Fig. 13.3 on page 530.

Formal derivation of sentences
De ﬁ nition 13.4.11 (direct derivative relation). The direct derivative relation⇒ ⊆ Γ^{∗}× Γ^{∗} of a grammar is the set of all pairs 〈u, v〉 with u, v, w, z ∈ Γ^{∗}for which applies:
 there is a rule of the form w → z
 the strings u and v can be divided into substrings in such a way that: u = u_{1}wu_{2} as well as v = u_{1}to_{2}
De ﬁ nition 13.4.12 (derivation). A derivative is an ntuple 〈w_{1}, ..., w_{n}〉 Of strings w_{i}∈ Γ^{∗} with (1 ≤ i ≤ n) such that:
 w_{i − 1}⇒ w_{i} for all i ∈ {2..n}
Normal notation for derivatives
w_{1}⇒… ⇒ w_{n}
Example: Derivation with contextfree grammar
Derivation  u  rule  v 
u_{1}wu_{2}  w → z  u_{1}to_{2}  
S.  ϵ S ϵ  S → NP VP  ϵ NP VP ϵ 
⇒ NP VP  ϵ NP VP  NP → EN  ϵ EN VP 
⇒ EN VP  EN VP ϵ  VP → V NP  EN V NP ϵ 
⇒ EN V NP  EN V NP  V → ate  EN ate NP 
⇒ EN ate NP  EN ate NP ϵ  NP → D N  EN ate D N ϵ 
⇒ EN aß D N  EN ate D N  D → den  EN ate the N 
⇒ EN ate the N  EN ate the N ϵ  N → poodle  EN ate the poodle ϵ 
⇒ EN ate the poodle  ϵ EN ate the poodle  EN → Egon  ϵ Egon ate the poodle 
⇒ Egon ate the poodle  
Sentence forms, sentences and languages
De ﬁ nition 13.4.13 (derivation relation) is the re ﬂ exivetransitive envelope of ⇒.
De ﬁ nition 13.4.14 (sentence). A sentencea of a grammar G = 〈Φ, Σ, R, S〉 is a string of terminal symbols a ∈ Σ^{∗}so that:
De ﬁ nition 13.4.15 (language of a grammar G) .The languageL_{G} of a grammar G = 〈Φ, Σ, R, S〉 is the set of all its sentences a ∈ Σ^{∗}.
Grammar rules, language classes and automata
The various grammar types differ in terms of the conditions that are placed on the rule set R. Let A, B ∈ Φ, w ∈ Σ^{∗} and α, β, γ ∈ (Φ ∪ Σ)^{∗}.
Language class  Machine type  
Regular  A → w  Finite automaton  
(Type 3)  A → wB  or A → Bw  
Contextfree  A → α  Basement machine  
(Type 2)  
Context  αAγ → αβγ  with β ≠ ϵ or  
sensitive  S → ϵ  (then S is not allowed  Linear 
(Type 1)  on a right side  more limited  
a rule occur)  Machine (LBA)  
(Type 0)  α → β  (with α ≠ ϵ and α ⁄∈ Σ^{∗})  Turing machine 
The complexity of the computations for parsing increases with each type of grammar.
[Back] [Back (bottom of page)] [Top of page] [Supplementary chapter] [Please report script error]
 Which clothing brand has the best logo?
 Darker beers have more alcohol
 Can have email addresses
 Why am I happy alone?
 How often should you see a dermatologist
 What is the creative process
 What do white South Africans eat
 What is the belief according to your religion
 Why do hotels keep their windows locked?
 What is an eggplant
 Is Australia multicultural and diverse like the USA
 Why nonalcoholic beverages are so addicting
 What is replacement fertility
 What are hyperbolas used for?
 What is the agile software development process
 Can we make a ring with pure gold?
 Do countries still have spies
 Athens is a beautiful city
 Is IBM a good investment
 What is the extension of 1 + x 2
 How does Expedia compensate its affiliates
 Which country has a free visa
 What is a secondary immune response
 Like 100 has 3 significant numbers