Context-free grammar - Biblioteka.sk

Upozornenie: Prezeranie týchto stránok je určené len pre návštevníkov nad 18 rokov!
Zásady ochrany osobných údajov.
Používaním tohto webu súhlasíte s uchovávaním cookies, ktoré slúžia na poskytovanie služieb, nastavenie reklám a analýzu návštevnosti. OK, súhlasím


Panta Rhei Doprava Zadarmo
...
...


A | B | C | D | E | F | G | H | CH | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Context-free grammar
 ...
Simplified excerpt of the formal grammar[1] for the C programming language (left), and a derivation of a piece of C code (right) from the nonterminal symbol . Nonterminal symbols are blue and terminal symbols are red.

In formal language theory, a context-free grammar (CFG) is a formal grammar whose production rules can be applied to a nonterminal symbol regardless of its context. In particular, in a context-free grammar, each production rule is of the form

with a single nonterminal symbol, and a string of terminals and/or nonterminals ( can be empty). Regardless of which symbols surround it, the single nonterminal on the left hand side can always be replaced by on the right hand side. This distinguishes it from a context-sensitive grammar, which can have production rules in the form with a nonterminal symbol and , , and strings of terminal and/or nonterminal symbols.

A formal grammar is essentially a set of production rules that describe all possible strings in a given formal language. Production rules are simple replacements. For example, the first rule in the picture,

replaces with . There can be multiple replacement rules for a given nonterminal symbol. The language generated by a grammar is the set of all strings of terminal symbols that can be derived, by repeated rule applications, from some particular nonterminal symbol ("start symbol"). Nonterminal symbols are used during the derivation process, but do not appear in its final result string.

Languages generated by context-free grammars are known as context-free languages (CFL). Different context-free grammars can generate the same context-free language. It is important to distinguish the properties of the language (intrinsic properties) from the properties of a particular grammar (extrinsic properties). The language equality question (do two given context-free grammars generate the same language?) is undecidable.

Context-free grammars arise in linguistics where they are used to describe the structure of sentences and words in a natural language, and they were invented by the linguist Noam Chomsky for this purpose. By contrast, in computer science, as the use of recursively-defined concepts increased, they were used more and more. In an early application, grammars are used to describe the structure of programming languages. In a newer application, they are used in an essential part of the Extensible Markup Language (XML) called the document type definition.[2]

In linguistics, some authors use the term phrase structure grammar to refer to context-free grammars, whereby phrase-structure grammars are distinct from dependency grammars. In computer science, a popular notation for context-free grammars is Backus–Naur form, or BNF.

Background

Since at least the time of the ancient Indian scholar Pāṇini, linguists have described the grammars of languages in terms of their block structure, and described how sentences are recursively built up from smaller phrases, and eventually individual words or word elements. An essential property of these block structures is that logical units never overlap. For example, the sentence:

John, whose blue car was in the garage, walked to the grocery store.

can be logically parenthesized (with the logical metasymbols ) as follows:

, whose blue car was in the garage, walked to the grocery store.

A context-free grammar provides a simple and mathematically precise mechanism for describing the methods by which phrases in some natural language are built from smaller blocks, capturing the "block structure" of sentences in a natural way. Its simplicity makes the formalism amenable to rigorous mathematical study. Important features of natural language syntax such as agreement and reference are not part of the context-free grammar, but the basic recursive structure of sentences, the way in which clauses nest inside other clauses, and the way in which lists of adjectives and adverbs are swallowed by nouns and verbs, is described exactly.

Context-free grammars are a special form of Semi-Thue systems that in their general form date back to the work of Axel Thue.

The formalism of context-free grammars was developed in the mid-1950s by Noam Chomsky,[3] and also their classification as a special type of formal grammar (which he called phrase-structure grammars).[4] Some authors, however, reserve the term for more restricted grammars in the Chomsky hierarchy: context-sensitive grammars or context-free grammars. In a broader sense, phrase structure grammars are also known as constituency grammars. The defining trait of phrase structure grammars is thus their adherence to the constituency relation, as opposed to the dependency relation of dependency grammars. In Chomsky's generative grammar framework, the syntax of natural language was described by context-free rules combined with transformation rules.[5]

Block structure was introduced into computer programming languages by the Algol project (1957–1960), which, as a consequence, also featured a context-free grammar to describe the resulting Algol syntax. This became a standard feature of computer languages, and the notation for grammars used in concrete descriptions of computer languages came to be known as Backus–Naur form, after two members of the Algol language design committee.[3] The "block structure" aspect that context-free grammars capture is so fundamental to grammar that the terms syntax and grammar are often identified with context-free grammar rules, especially in computer science. Formal constraints not captured by the grammar are then considered to be part of the "semantics" of the language.

Context-free grammars are simple enough to allow the construction of efficient parsing algorithms that, for a given string, determine whether and how it can be generated from the grammar. An Earley parser is an example of such an algorithm, while the widely used LR and LL parsers are simpler algorithms that deal only with more restrictive subsets of context-free grammars.

Formal definitionsedit

A context-free grammar G is defined by the 4-tuple , where[6]

  1. V is a finite set; each element is called a nonterminal character or a variable. Each variable represents a different type of phrase or clause in the sentence. Variables are also sometimes called syntactic categories. Each variable defines a sub-language of the language defined by G.
  2. Σ is a finite set of terminals, disjoint from V, which make up the actual content of the sentence. The set of terminals is the alphabet of the language defined by the grammar G.
  3. R is a finite relation in , where the asterisk represents the Kleene star operation. The members of R are called the (rewrite) rules or productions of the grammar. (also commonly symbolized by a P)
  4. S is the start variable (or start symbol), used to represent the whole sentence (or program). It must be an element of V.

Production rule notationedit

A production rule in R is formalized mathematically as a pair , where is a nonterminal and is a string of variables and/or terminals; rather than using ordered pair notation, production rules are usually written using an arrow operator with as its left hand side and β as its right hand side: .

It is allowed for β to be the empty string, and in this case it is customary to denote it by ε. The form is called an ε-production.[7]

It is common to list all right-hand sides for the same left-hand side on the same line, using | (the vertical bar) to separate them. Rules and can hence be written as . In this case, and are called the first and second alternative, respectively.

Rule applicationedit

For any strings , we say u directly yields v, written as , if with








Text je dostupný za podmienok Creative Commons Attribution/Share-Alike License 3.0 Unported; prípadne za ďalších podmienok.
Podrobnejšie informácie nájdete na stránke Podmienky použitia.

Your browser doesn’t support the object tag.

www.astronomia.sk | www.biologia.sk | www.botanika.sk | www.dejiny.sk | www.economy.sk | www.elektrotechnika.sk | www.estetika.sk | www.farmakologia.sk | www.filozofia.sk | Fyzika | www.futurologia.sk | www.genetika.sk | www.chemia.sk | www.lingvistika.sk | www.politologia.sk | www.psychologia.sk | www.sexuologia.sk | www.sociologia.sk | www.veda.sk I www.zoologia.sk