Clever Geek Handbook

Transformation-based machine translation

Transformation-based machine translation (also called transfer-based machine translation) is a type of machine translation (MT) and is currently one of the most widely used approaches. Unlike the simpler direct MT model, transformation-based MT divides translation into three stages: analysis of the source-language text to determine its grammatical structure, transformation of the resulting structure into one suitable for producing text in the target language, and generation of the target text. Transformation-based MT systems can therefore draw on knowledge of both the source language and the target language [1].

Design

Transformation-based MT and interlingual MT share the same idea: to produce a translation, the system must first obtain an intermediate representation that captures the meaning of the original sentence, and then build the correct translation from it. In interlingual MT this intermediate representation must be independent of both the source and the target language, whereas in transformation-based MT it depends to some degree on the particular language pair. Transformation-based MT systems differ considerably in how they work, but in general they follow the same pattern: they apply sets of linguistic rules defined by the correspondences between the structures of the source and target languages. The first stage analyzes the input text morphologically and syntactically (and sometimes semantically) to create an intermediate representation. The translation is then generated from this representation using bilingual dictionaries and grammatical rules. This strategy can yield fairly high-quality translation, with an accuracy of about 90% with respect to the original, although the figure depends heavily on the language pair and on how closely the two languages are related.
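The difference in language-pair dependence can be illustrated with a short calculation. The sketch below assumes an idealized setup that the text above does not state: every language has one reusable analyzer and one reusable generator, a pair-dependent system needs one rule set per ordered language pair, and an interlingual system needs no pair-specific rules at all. The function names are hypothetical.

```python
def transfer_modules(n: int) -> int:
    """Modules needed to cover all directions among n languages
    when transformation rules are written per ordered language pair."""
    analyzers = n                      # one per source language (assumed reusable)
    generators = n                     # one per target language (assumed reusable)
    pair_rule_sets = n * (n - 1)       # one per ordered pair, e.g. es->en and en->es
    return analyzers + generators + pair_rule_sets


def interlingua_modules(n: int) -> int:
    """Modules needed when all languages share one pair-independent representation."""
    analyzers = n
    generators = n
    return analyzers + generators


for n in (2, 4, 8):
    print(n, transfer_modules(n), interlingua_modules(n))
```

Under these assumptions the pair-dependent design grows quadratically with the number of languages while the interlingual design grows linearly. The trade-off is that the pair-specific rules only need to relate two concrete languages.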

Translation process

In a rule-based MT system, the source text is first analyzed morphologically and syntactically to obtain a syntactic representation. This representation can then be made more abstract, so that it emphasizes the parts most relevant to translation and ignores other kinds of information. The transformation step converts this final source-language representation into a representation at the same level of abstraction in the target language. These two representations are called intermediate representations. Generating the finished text from the target-language representation involves similar steps, performed in reverse order.
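The process described above can be sketched on a single Spanish phrase. This is a minimal illustration and not the design of any particular system: the `Node` type, the lookup tables, and the choice to drop gender during abstraction are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    lemma: str
    pos: str
    features: tuple = ()  # pairs such as ("number", "pl")


# Hypothetical analysis table covering only the example phrase.
ANALYSES = {
    "las": Node("el", "DET", (("gender", "f"), ("number", "pl"))),
    "casas": Node("casa", "NOUN", (("gender", "f"), ("number", "pl"))),
    "blancas": Node("blanco", "ADJ", (("gender", "f"), ("number", "pl"))),
}
LEXICON = {"el": "the", "casa": "house", "blanco": "white"}


def analyze(text):
    """Source text -> detailed source-language representation."""
    return [ANALYSES[word] for word in text.lower().split()]


def abstract(nodes):
    """Make the representation less specific: keep number, drop gender."""
    return [
        Node(n.lemma, n.pos, tuple(f for f in n.features if f[0] == "number"))
        for n in nodes
    ]


def transfer(nodes):
    """Source representation -> target representation at the same level."""
    out = [Node(LEXICON[n.lemma], n.pos, n.features) for n in nodes]
    i = 0
    while i < len(out) - 1:
        if out[i].pos == "NOUN" and out[i + 1].pos == "ADJ":
            out[i], out[i + 1] = out[i + 1], out[i]  # NOUN ADJ -> ADJ NOUN
            i += 2
        else:
            i += 1
    return out


def generate(nodes):
    """Target representation -> text; only nouns inflect in this toy example."""
    words = []
    for n in nodes:
        plural = dict(n.features).get("number") == "pl"
        words.append(n.lemma + "s" if n.pos == "NOUN" and plural else n.lemma)
    return " ".join(words)


def translate(text):
    return generate(transfer(abstract(analyze(text))))


print(translate("las casas blancas"))  # the white houses
```

Gender is discarded before the transformation step because English does not mark it on these words, while number is kept because the English noun still needs it.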

Analysis and Transformation

Various methods of analysis and transformation can be applied before the final result is obtained. They can also be combined with statistical approaches, producing hybrid systems. The methods chosen and their relative priority depend largely on the design of the system itself. However, most existing systems include at least the following steps:

  • Morphological analysis. The surface forms of the input text are classified by part of speech (noun, verb, etc.) and by grammatical category (number, gender, tense, etc.). As a rule, this stage outputs every possible analysis of each surface form, together with the dictionary form (lemma) of the word.
  • Lexical categorization. Any text may contain words with more than one possible analysis or meaning, which creates ambiguity. Lexical categorization examines the context in which a word is used in order to determine the correct reading. This process may include part-of-speech tagging as well as word sense disambiguation.
  • Lexical transformation. This is essentially dictionary translation: the lemma of the source word is looked up in a bilingual dictionary and a translation is chosen.
  • Structural transformation. Whereas the previous stages deal with individual words, this stage deals with larger units such as phrases and chunks of text. Typical operations at this stage are agreement of grammatical categories, such as gender and number, and reordering of words or phrases.
  • Morphological transformation. Using the output of the structural transformation stage, the final surface forms are generated in the target language.
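The five steps listed above can be sketched as a toy English-to-Spanish pipeline. This is only an illustration under strong assumptions: every table is a hypothetical stand-in for a real dictionary or grammar, the disambiguation rule is a single hand-written heuristic, and the structural step assumes the input is one noun phrase.

```python
# Hypothetical resources covering only the example phrase.
MORPH = {
    "the": [{"lemma": "the", "pos": "DET"}],
    "red": [{"lemma": "red", "pos": "ADJ"}],
    "houses": [  # ambiguous: "she houses ..." (verb) or "two houses" (noun)
        {"lemma": "house", "pos": "VERB", "number": "sg"},
        {"lemma": "house", "pos": "NOUN", "number": "pl"},
    ],
}
BILINGUAL = {
    ("the", "DET"): {"lemma": "el"},
    ("red", "ADJ"): {"lemma": "rojo"},
    ("house", "NOUN"): {"lemma": "casa", "gender": "f"},
}
DET_FORMS = {("m", "sg"): "el", ("f", "sg"): "la",
             ("m", "pl"): "los", ("f", "pl"): "las"}


def morphological_analysis(text):
    """Return every known analysis for each surface form."""
    return [MORPH[word] for word in text.lower().split()]


def lexical_categorization(candidates):
    """Pick one analysis per word; prefer NOUN after a DET or ADJ."""
    chosen = []
    for analyses in candidates:
        pick = analyses[0]
        if len(analyses) > 1 and chosen and chosen[-1]["pos"] in ("DET", "ADJ"):
            nouns = [a for a in analyses if a["pos"] == "NOUN"]
            pick = nouns[0] if nouns else pick
        chosen.append(pick)
    return chosen


def lexical_transformation(tokens):
    """Replace each source lemma with its dictionary translation."""
    return [{**t, **BILINGUAL[(t["lemma"], t["pos"])]} for t in tokens]


def structural_transformation(tokens):
    """Reorder ADJ NOUN -> NOUN ADJ, then copy gender and number from the noun."""
    tokens = [dict(t) for t in tokens]
    for i in range(len(tokens) - 1):
        if tokens[i]["pos"] == "ADJ" and tokens[i + 1]["pos"] == "NOUN":
            tokens[i], tokens[i + 1] = tokens[i + 1], tokens[i]
    noun = next(t for t in tokens if t["pos"] == "NOUN")
    for t in tokens:
        if t["pos"] in ("DET", "ADJ"):
            t["gender"], t["number"] = noun["gender"], noun["number"]
    return tokens


def morphological_transformation(tokens):
    """Produce the final target-language surface forms."""
    words = []
    for t in tokens:
        plural = "s" if t["number"] == "pl" else ""
        if t["pos"] == "DET":
            words.append(DET_FORMS[(t["gender"], t["number"])])
        elif t["pos"] == "ADJ":
            ending = "a" if t["gender"] == "f" else "o"
            words.append(t["lemma"][:-1] + ending + plural)
        else:
            words.append(t["lemma"] + plural)
    return " ".join(words)


def translate(text):
    tokens = lexical_categorization(morphological_analysis(text))
    tokens = structural_transformation(lexical_transformation(tokens))
    return morphological_transformation(tokens)


print(translate("the red houses"))  # las casas rojas
```

The example shows why the order of the steps matters: agreement can only be applied after the dictionary lookup, because the gender of "casa" comes from the Spanish entry and is not present in the English analysis.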

Types of Transformations

A defining feature of transformation-based MT systems is the stage at which the intermediate representation of the source-language text is converted into an intermediate representation in the target language. This conversion can take place at one of the levels of linguistic analysis, or between them. The levels are described below:

  • Surface (syntactic) transformation. At this level, syntactic structures are transferred between the source language and the target language. It is suitable for languages of the same type or from the same family, for example the Romance languages: Spanish, Catalan, French, Italian, etc.
  • Deep (semantic) transformation. At this level, a semantic representation is built that depends on the source language. It may consist of several structures that together convey the meaning. Systems working at this level typically produce predicates, and translation usually also requires structural transformation. This level is used to translate between more distantly related languages (for example, the Spanish-English or Spanish-Basque pairs).
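The role of predicates at the deep level can be illustrated with the Spanish verb "gustar", whose arguments are arranged differently from those of English "like". The sketch below is an assumption-laden toy: the `Predicate` type, the role names, and the hard-wired sentence pattern are hypothetical and cover only sentences of the form "me/te gusta el NOUN".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Predicate:
    name: str
    experiencer: str  # who has the feeling
    theme: str        # what the feeling is about


# Hypothetical lookup tables for the example sentences.
CLITICS = {"me": "1sg", "te": "2sg"}
PREDICATES = {"gustar": "like"}
NOUNS = {"libro": "book"}
SUBJECTS = {"1sg": "I", "2sg": "you"}


def analyze_es(sentence):
    """Build a source-dependent semantic representation."""
    words = sentence.lower().split()
    if len(words) != 4 or words[1] != "gusta" or words[2] != "el":
        raise ValueError("only '<clitic> gusta el <noun>' is supported")
    clitic, _, _, noun = words
    return Predicate("gustar", CLITICS[clitic], noun)


def transform(pred):
    """Map the Spanish predicate and its arguments to English ones."""
    return Predicate(PREDICATES[pred.name], pred.experiencer, NOUNS[pred.theme])


def generate_en(pred):
    """English realizes the experiencer as subject and the theme as object."""
    return f"{SUBJECTS[pred.experiencer]} {pred.name} the {pred.theme}"


def translate(sentence):
    return generate_en(transform(analyze_es(sentence)))


print(translate("me gusta el libro"))  # I like the book
```

A purely word-by-word mapping of the same sentence would keep the Spanish argument order, which is why a representation that names the roles explicitly is helpful for this language pair.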

See also

  • Statistical machine translation

Notes

  1. Jurafsky, Daniel; Martin, James H. (2009). Speech and Language Processing. Pearson. pp. 906-908.
Source - https://ru.wikipedia.org/w/index.php?title=Machine_translation_on_based_transformation&oldid=99598363


Clever Geek | 2019