Clever Geek Handbook
📜 ⬆️ ⬇️

Chomsky Hierarchy

The Chomsky Hierarchy is a classification of formal languages and formal grammars , according to which they are divided into 4 types according to their conditional complexity. Proposed by the professor at the Massachusetts Institute of Technology , the linguist Noam Chomsky .

Content

  • 1 Grammar Classification
    • 1.1 Type 0 - Unlimited
    • 1.2 Type 1 - context sensitive
    • 1.3 Type 2 - context-free
    • 1.4 Type 3 - Regular
  • 2 Classification of languages
  • 3 notes
  • 4 Literature

Grammar Classification

According to Chomsky, formal grammars can be divided into four types. To assign a grammar to one or another type, it is necessary that all its rules (productions) correspond to certain schemes.

Type 0 - Unlimited

A grammar with a phrase structure G is an algebraic structure , an ordered four (V T , V N , P, S), where [1] :

  • VT{\ displaystyle V_ {T}}   - alphabet (set) of terminal symbols - terminals ,
  • VN{\ displaystyle V_ {N}}   - the alphabet (set) of non-terminal characters - non - terminals ,
  • V=VT∪VN{\ displaystyle V = V_ {T} \ cup V_ {N}}   - dictionaryG {\ displaystyle G}   , andVT∩VN=∅ {\ displaystyle V_ {T} \ cap V_ {N} = \ emptyset}  
  • P{\ displaystyle P}   - a finite set of productions (rules) of grammar,P⊆V+×V∗ {\ displaystyle P \ subseteq V ^ {+} \ times V ^ {*}}  
  • S{\ displaystyle S}   - initial character ( source ).

HereV∗ {\ displaystyle V ^ {*}}   - the set of all lines above the alphabetV {\ displaystyle V}   , butV+ {\ displaystyle V ^ {+}}   - many non-empty lines above the alphabetV {\ displaystyle V}   .

According to Chomsky’s classification, type 0 includes unlimited grammars — phrases with phrasal structure , that is, all formal grammars without exception. Rules can be written as:

α→β{\ displaystyle \ alpha \ rightarrow \ beta}   ,

Whereα∈V+ {\ displaystyle \ alpha \ in V ^ {+}}   - any nonempty chain containing at least one nonterminal [2] symbol, andβ∈V∗ {\ displaystyle \ beta \ in V ^ {*}}   - any string of characters from the alphabet.

Due to their complexity, such grammars do not have practical application.

Type 1 - Context-sensitive

This type includes context-sensitive (KZ) grammars and non-shortening grammars. For grammarG(VT,VN,P,S),V=VT∪VN {\ displaystyle G (V_ {T}, V_ {N}, P, S), V = V_ {T} \ cup V_ {N}}   all rules have the form [3] :

  • αAβ→αγβ{\ displaystyle \ alpha A \ beta \ rightarrow \ alpha \ gamma \ beta}   whereα,β∈V∗,γ∈V+,A∈VN {\ displaystyle \ alpha, \ beta \ in V ^ {*}, \ gamma \ in V ^ {+}, A \ in V_ {N}}   . Such grammars are referred to as context-sensitive.
  • α→β{\ displaystyle \ alpha \ rightarrow \ beta}   whereα,β∈V+,one≤|α|≤|β| {\ displaystyle \ alpha, \ beta \ in V ^ {+}, 1 \ leq | \ alpha | \ leq | \ beta |}   . Such grammars are classified as non-shortening.

These grammar classes are equivalent. They can be used in the analysis of texts in natural languages, however, when building compilers, they are practically not used because of their complexity. For context-sensitive grammars, the statement is proved: according to a certain algorithm, in a finite number of steps, you can establish whether a chain of terminal symbols belongs to a given language or not.

Type 2 - context-free

This type includes context-free (KS) grammar . For grammarG(VT,VN,P,S),V=VT∪VN {\ displaystyle G (V_ {T}, V_ {N}, P, S), V = V_ {T} \ cup V_ {N}}   all rules are of the form:

  • A→β{\ displaystyle A \ rightarrow \ beta}   whereβ∈V+ {\ displaystyle \ beta \ in V ^ {+}}   (for non-shortening COP grammars) orβ∈V∗ {\ displaystyle \ beta \ in V ^ {*}}   (for shortening)A∈VN {\ displaystyle A \ in V_ {N}}   . That is, the grammar allows only a nonterminal symbol to appear on the left side of the rule.

KS grammars are widely used to describe the syntax of computer languages (see parsing ).

Type 3 - Regular

The third type includes regular grammars (automatic) - the simplest of formal grammars. They are context-free, but with disabilities.

All regular grammars can be divided into two equivalent classes, which for a grammar of type III will have rules of the following form:

  • A→Bγ{\ displaystyle A \ rightarrow B \ gamma}   orA→γ {\ displaystyle A \ rightarrow \ gamma}   whereγ∈VT∗,A,B∈VN {\ displaystyle \ gamma \ in V_ {T} ^ {*}, A, B \ in V_ {N}}   (for left-linear grammars).
  • A→γB{\ displaystyle A \ rightarrow \ gamma B}   ; orA→γ {\ displaystyle A \ rightarrow \ gamma}   whereγ∈VT∗,A,B∈VN {\ displaystyle \ gamma \ in V_ {T} ^ {*}, A, B \ in V_ {N}}   (for right-line grammars).

Regular grammars are used to describe the simplest constructions: identifiers , strings , constants , as well as assembly languages , command processors , etc.

Language Classification

Formal languages ​​are classified according to the types of grammars with which they are defined. However, the same language can be defined by different grammars belonging to different types. In this case, it is believed that the language belongs to the simplest of them. So, a language described by a grammar with a phrase structure, context-sensitive and context-free grammars will be context-free.

As with grammars, the complexity of a language is determined by its type. The most complex are languages ​​with a phrasal structure (natural languages ​​can be attributed here), hereinafter referred to as KZ languages, CS languages ​​and the simplest are regular languages.

Notes

  1. ↑ Cook, Base, 1990 , p. 258.264.
  2. ↑ Serebryakov V. A., Galochkin M. P., Gonchar D. R., Furugyan M. G. Theory and implementation of programming languages. - M.: MZ-Press, 2006. - S. 21. - ISBN 5-94073-094-9 .
  3. ↑ Cook, Base, 1990 , p. 268.

Literature

  • Molchanov A. Yu. System software. St. Petersburg: Peter, 2006.
  • John Hopcroft , Rajeev Motwani , Jeffrey Ullman . CHAPTER 5. Context-free grammars and languages // Introduction to the theory of automata, languages ​​and computations = Introduction to Automata Theory, Languages, and Computation. - M .: "Williams" , 2002. - S. 528. - ISBN 0-201-44124-1 .
  • Serebryakov V. A. , Galochkin M. P., Gonchar D. R., Furugyan M. G. Theory and implementation of programming languages - M .: MZ-Press, 2006, 2nd ed. - ISBN 5-94073-094-9
  • Robin Hunter Basic Compiler Concepts = The Essence of Compilers. - M .: "Williams" , 2002. - S. 256. - ISBN 5-8459-0360-2 .
  • Cook D., Baze G. Chapter 8. Languages ​​and Grammars // Computer Mathematics = Computer Mathematics. - M .: Science. Fizmatlit, 1990 .-- 384 p. - ISBN 5-02-014216-6 .
Source - https://ru.wikipedia.org/w/index.php?title=Khomsky_Hierarchy&oldid=100637587


More articles:

  • Johann Casimir Palatinate-Kleeburgsky
  • Youth Street (Izhevsk)
  • Evening Party (Plant)
  • Puma pardoides
  • Dimal
  • Joshua Book
  • Bozhkov, Dmitry Alexandrovich
  • My Eyes
  • Andrena ornata
  • Vakhsh (village)

All articles

Clever Geek | 2019