# What are the properties of sigma notation

[Back] [Back (bottom of page)] [Bottom of page] [Supplementary chapter] [Please report script error]

### 13.4. Formal languages ​​and rule grammars

#### 13.4.1. Language as a crowd

The alphabet (sigma), characters and strings

De ﬁ nition 13.4.1. An alphabet is a finite set of characters (atomic symbols). It is notated with Σ (sigma).

De ﬁ nition 13.4.2. A character string (word, string) of n characters from Σ is a finite sequence of length n over Σ.

The empty string (empty word) is the sequence of 0 characters. It is notated with ϵ (epsilon) and has the length 0.

Note on the notation

A character string is typically noted by juxtaposing the characters from left to right.

Let Σ = {a, b}, then ϵ, a, bb or ababbba are words over Σ. A more explicit notation for bb is 〈b, b〉 or {〈1, b〉, 〈2, b〉}.

Star of Sigma and Formal Languages

De ﬁ nition 13.4.3. The star of Sigma is the set of all words over an alphabet Σ. The star is used as the post ﬁ x operator Σ (pronounced “Sigma Stern”).

De ﬁ nition 13.4.4. A formal language L over Σ is a subset of the star of Sigma. Example 13.4.5. If Σ = {a}, then Σ= {ϵ, a, aa, aaa, ...}. The sets L1= {ϵ, a} or L2= {aa, aaaa, aaaaaa} are formal languages ​​because they are (real) subsets of Σ are.

Empty languages ​​vs. empty string

Hints

• The empty language is the empty set, noted as {} or ∅.
• The language, which only includes the empty string, is noted as {ϵ}.
• The empty language {} and the language {ϵ} are not the same.

• Is {} a language above everyone Σ?
• Is the language {ϵ} a subset of any non-empty language?
• Is Σ a language over Σ?

#### 13.4.2. Concatenation

Concatenation of strings

De ﬁ nition 13.4.6. The concatenation of character strings is a two-digit function which concatenates its arguments into a word. For all u, v ∈ Σ: What does uv mean?

If u: 1..n → Σ and v: 1..m → Σ are words, i.e. finite sequences of signs, then uv: 1 .. (m + n) → Σ. Whereby for all character positions i ∈ 1 .. (n + m) applies: Power notation of concatenation

Properties of concatenation

The concatenation is associative and has ϵ as a neutral element. For all u, v, w ∈ Σ: De ﬁ nition 13.4.7. The n-fold concatenation of a character string w with itself in the power notation is de ﬁ ned recursively. For n ≥ 1, n ∈ ℕ: Example 13.4.8 (power notation of the concatenation).
The string aaabbcccc can be used as a3b2c4 be noted.

bu ﬀ alo6

Example 13.4.9 (A syntactically correct English sentence).
“Bu ﬀ alo bu ﬀ alo Bu ﬀ alo bu ﬀ alo bu ﬀ alo bu ﬀ alo Bu ﬀ alo bu ﬀ alo.”

analysis

• Bu ﬀ alo as an adjective of origin
• bu ﬀ alo as a noun (Bü ﬀ el)
• bu ﬀ alo as a verb (intimidate)
• Sense: "Bison from Bu ﬀ alo, New York who are intimidated by other bison in their community also happen to intimidate other bison in their community." #### 13.4.3. Grammars

Chomsky hierarchy [HOPCROFT et al. 2002] Figure 13.2: Subset relationships of the Chomsky language classes

 Language class Type example regular 3 {an} context free 2 {anbn} context sensitive 1 {anbncn} generally 0
with n ≥ 1

Real subsets

For all type i languages ​​with 0 ≤ i ≤ 2 the following applies: Li + 1⊂ Li .

Where are natural languages ​​located? [HESS2005, 138ﬀ.]

At least type 2: NPnVPn(central embedding)

-----------------------------------------------
|          ----------------------------         |
|         |            --------        |        |
|         |           |        |       |        |
The man whose wife whose child is angry is sad is surprised

At least type 1 according to [SHIEBER1985, KALLMEYER2005]: NPiNPjViVj(cross serial construction)

------------------
|                  |
------------------      |
|     |            |     |
--------------------      |     |
|       |     |      |     |     |
mer wänd d’Chind on Hans s’Huus laa half aaschtriiche

Complexity, grammaticality, acceptance of language

Christian Morgenstern, preface to gallows songs

Rule grammars

• A rule grammar is a powerful finite descriptive tool for specifying formal languages ​​with a potentially infinite number of strings.
• A grammar G = 〈Φ, Σ, R, S〉 consists of:
1. Alphabet Φ: finite set of nonterminal symbols
2. Alphabet Σ: finite set of terminal symbols with Φ ∩ Σ = ∅
3. Set R ⊆ Γ× Γ of rules 〈α, β〉 (with total alphabet Γ = Φ ∪ Σ), where: α ≠ ϵ and α ⁄∈ Σ
4. Start symbol S ∈ Φ
• This definition of a grammar is the most general (type 0).
• A grammar rule is an ordered pair: 〈α, β〉. Notation: α → β.

Context Free Grammars

• A context-free grammar G = 〈Φ, Σ, R, S〉 consists of:
1. Non-terminal symbols Φ
2. Terminal symbols Σ
3. Rule set R ⊆Φ × Γ (Γ = Φ ∪ Σ)
4. Start symbol S ∈ Φ

Example 13.4.10 (context-free grammar).

• G1= 〈{S, NP, V P, EN, V, D, N}, {Egon, Pudel, den, ass}, R, S〉
• Standard set Example evaluation
See Fig. 13.3 on page 530. Figure 13.3: Example of left derivation and parse tree construction

Formal derivation of sentences

De ﬁ nition 13.4.11 (direct derivative relation). The direct derivative relation⇒ ⊆ Γ× Γ of a grammar is the set of all pairs 〈u, v〉 with u, v, w, z ∈ Γfor which applies:

• there is a rule of the form w → z
• the strings u and v can be divided into substrings in such a way that: u = u1wu2 as well as v = u1to2

De ﬁ nition 13.4.12 (derivation). A derivative is an n-tuple 〈w1, ..., wn〉 Of strings wi∈ Γ with (1 ≤ i ≤ n) such that:

• wi − 1⇒ wi for all i ∈ {2..n}

Normal notation for derivatives

w1⇒… ⇒ wn

Example: Derivation with context-free grammar Derivation u rule v u1wu2 w → z u1to2 S. ϵ S ϵ S → NP VP ϵ NP VP ϵ ⇒ NP VP ϵ NP VP NP → EN ϵ EN VP ⇒ EN VP EN VP ϵ VP → V NP EN V NP ϵ ⇒ EN V NP EN V NP V → ate EN ate NP ⇒ EN ate NP EN ate NP ϵ NP → D N EN ate D N ϵ ⇒ EN aß D N EN ate D N D → den EN ate the N ⇒ EN ate the N EN ate the N ϵ N → poodle EN ate the poodle ϵ ⇒ EN ate the poodle ϵ EN ate the poodle EN → Egon ϵ Egon ate the poodle ⇒ Egon ate the poodle

Sentence forms, sentences and languages

De ﬁ nition 13.4.13 (derivation relation) is the re ﬂ exive-transitive envelope of ⇒.

De ﬁ nition 13.4.14 (sentence). A sentencea of ​​a grammar G = 〈Φ, Σ, R, S〉 is a string of terminal symbols a ∈ Σso that: De ﬁ nition 13.4.15 (language of a grammar G) .The languageLG of a grammar G = 〈Φ, Σ, R, S〉 is the set of all its sentences a ∈ Σ. Grammar rules, language classes and automata
The various grammar types differ in terms of the conditions that are placed on the rule set R. Let A, B ∈ Φ, w ∈ Σ and α, β, γ ∈ (Φ ∪ Σ).

 Language class Machine type Regular A → w Finite automaton (Type 3) A → wB or A → Bw Context-free A → α Basement machine (Type 2) Context- αAγ → αβγ with β ≠ ϵ or sensitive S → ϵ (then S is not allowed Linear (Type 1) on a right side more limited a rule occur) Machine (LBA) (Type 0) α → β (with α ≠ ϵ and α ⁄∈ Σ∗) Turing machine

The complexity of the computations for parsing increases with each type of grammar.

[Back] [Back (bottom of page)] [Top of page] [Supplementary chapter] [Please report script error]