
Source-linked translations

What does a translation look like? It is a document published on another site, on other pages, that nevertheless repeats the author's line of thought and the structure of the original text. The author's text sometimes changes, and this happens especially often in documentation, which is the main professional use of translations that programmers and other users of technical products encounter. If a translation is linked to the source of an article, documentation or book, so that the script displaying the translation follows the structure, checks for changes and marks untranslated parts, it becomes very easy to follow changes to the text and to translate them. New versions of a text usually do not differ much from the old ones, so old translations remain partially usable. The translation tracks its own relevance. All other texts are mere copies (snapshots), of less value than a translation created and maintained in this way. A script or server can, of course, monitor the state of such snapshots and warn their holders in time that they have become outdated. Snapshots are useful not only as a source of content for owners of third-party resources, but also for users who work offline, or as part of user programs. Still, a translation tied to the structure and design of the original is more valuable. In the case of documentation, it also lets the reader follow the original site itself rather than a copy of it.
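The change tracking described above can be sketched roughly as follows. This is only an illustration under assumptions: the paragraph-level granularity, the `fingerprint` and `classify` names and the idea of keying translations by a hash of the source paragraph are all hypothetical, not something the article prescribes.

```python
import hashlib


def fingerprint(paragraph: str) -> str:
    """Hash a paragraph after collapsing whitespace, so reflowing is not a change."""
    normalized = " ".join(paragraph.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def classify(source_paragraphs, translations):
    """Label every paragraph of the current source as translated or untranslated.

    `translations` maps the fingerprint of the source paragraph that was
    translated to the translated text (an assumed storage layout).
    """
    report = []
    for index, paragraph in enumerate(source_paragraphs):
        translated = translations.get(fingerprint(paragraph))
        if translated is None:
            # New or edited paragraph: show the original and flag it.
            report.append((index, "untranslated", paragraph))
        else:
            report.append((index, "translated", translated))
    return report


old_source = ["Install the package.", "Run the tests."]
translations = {
    fingerprint(old_source[0]): "Instale el paquete.",
    fingerprint(old_source[1]): "Ejecute las pruebas.",
}

# The author reflows the first paragraph, edits the second and adds a third.
new_source = ["Install the   package.", "Run the full test suite.", "Report bugs."]
report = classify(new_source, translations)
```

Only the edited and the new paragraph end up flagged, so the old translation stays partially usable, which is the property the paragraph above relies on.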

The technical implementation of a translation tied to the original is a userscript, add-on or browser extension. In the worst case, when there is no binding to the text structure, these tools can show links to sites with translations and fall back on online translators.

What could serve as the source of data for translations? Translations that already exist in various places on the network, and copies of them. To establish correspondences between the original and the translation you need a matching file, a technique similar to the one used in Source Maps. Matching files can be stored alongside the translations or in third-party locations. Finally, you need a place that stores the starting points: information on where to look for translation data and matching files. Evidently, these will be directories of links to the necessary resources. Users copy the links, and the resources themselves, according to their needs.
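A matching file of this kind might look like the sketch below. The layout is entirely hypothetical: the field names, the path notation such as `p[0]` and the example URLs are assumptions made for illustration, and the sketch borrows only the idea of Source Maps (a separate file relating positions in two documents), not their actual encoding.

```python
import json

# Hypothetical matching file: it relates segments of the original page
# to segments of a translation hosted elsewhere.
MATCHING_FILE = json.loads("""
{
  "source": "https://example.org/docs/intro",
  "translation": "https://example.net/es/intro",
  "segments": [
    {"source_path": "h1[0]", "translation_path": "h1[0]"},
    {"source_path": "p[0]",  "translation_path": "p[1]"},
    {"source_path": "p[1]",  "translation_path": "p[0]"}
  ]
}
""")


def build_lookup(matching):
    """Index the segment list by the path of the original segment."""
    return {seg["source_path"]: seg["translation_path"] for seg in matching["segments"]}


def translated_segment(matching, translation_doc, source_path):
    """Return the translated text for one source segment, or None if unmapped."""
    translation_path = build_lookup(matching).get(source_path)
    if translation_path is None:
        return None
    return translation_doc.get(translation_path)


# The translation is represented here as a plain path -> text dictionary.
translation_doc = {
    "h1[0]": "Introduccion",
    "p[0]": "Segundo parrafo del original.",
    "p[1]": "Primer parrafo del original.",
}
first_paragraph = translated_segment(MATCHING_FILE, translation_doc, "p[0]")
```

Because the mapping lives in its own file, the translator is free to reorder or merge paragraphs, and the file can be hosted separately from both the original and the translation.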

Very similar problems are solved by converters of information representations. Their task is to show texts without translation but in a different structural layout (for example, selecting news of interest, as an aggregator does), and to track changes in the way the information is presented, so as to warn in time about an incompatible format and necessary updates. The presentation format is chosen by the user; this is an additional task compared with showing translations strictly within the formats of the sites on which they are made.

The result is a way of working with the Internet that differs from today's. First of all, there is no need for separate websites that display translations of news and documentation. The information from such sites is very much needed, but only as plain data that follows the structure of the original. At the first stage, these sites can be used to “return” translated information to the original sites. Later it will be convenient to work with resources that provide the information directly and without registration: the data, the matching files, and the data for checking the structure of the original.

If originals become unavailable or unexpectedly change their structure, copies of their information are needed for continued use, provided the information itself is still relevant. Servers or individual resources that keep copies therefore perform a necessary function: they form a network of distributed clients, something like P2P, they help with offline or intranet work, and at the same time they indicate how relevant a piece of information is, since what is important also gets copied by those who can afford to store large volumes. This is no different from ordinary human memorization, a primordial property of the brain that has helped its owner survive in the course of evolution.

The data network is therefore considered together with the functions of caching and of recovering information when the original sources are lost.
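The recovery path can be pictured with a small sketch. It assumes, purely for illustration, that the original is reached through an injected `fetch_original` callable and that each mirror is a simple URL-to-content dictionary; none of these names come from the article, and a real network of copies would be more involved.

```python
def fetch_with_fallback(url, fetch_original, mirrors):
    """Return (content, origin), preferring the original over stored copies.

    `fetch_original` is any callable that returns the content or raises
    OSError when the original is unreachable; `mirrors` maps a mirror name
    to a {url: content} dictionary. Both are assumptions of this sketch.
    """
    try:
        return fetch_original(url), "original"
    except OSError:
        pass  # The original is gone or unreachable; try the copies.
    for name, stored in mirrors.items():
        if url in stored:
            return stored[url], name
    raise LookupError(f"no original and no copy available for {url}")


def unreachable(url):
    # Stand-in for a failed network request.
    raise ConnectionError(f"cannot reach {url}")


mirrors = {
    "intranet-cache": {},
    "volunteer-mirror": {"https://example.org/docs": "cached documentation"},
}
content, origin = fetch_with_fallback("https://example.org/docs", unreachable, mirrors)
```

The caller learns where the content came from, so the display tool can tell the reader that a copy is being shown instead of the original.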

A separate function is the system for transforming representations. How necessary is it? On the one hand, the same translations on third-party resources may present information in their own design, and this is even considered good form compared with simply copying another site's design. But is a different design good when viewing structured information (documentation, books, help pages and discussions)? First, the design of the original emphasizes where the information comes from. If a site does not pass the design off as belonging to its own source of information, no one is misled about the sources. On the other hand, inconsistent designs can make it hard to see what very different sources from different areas of knowledge have in common: descriptions of various APIs, various dictionaries, books on similar topics. It is better if the recipient of the information decides on the presentation: the reader can view it in the original format or in any other format they use. If a format cannot show part of the data, the display tool's job is to report that within the displayed output. Examples are the inability to show exotic players and the data formats tied to them, or a lack of space for displaying notes.

Custom representations are also needed for data that come with no representation of their own. A typical example is a simple RSS list, but there may also be complex cartographic data not originally tied to any map engine, or simply the structure of books and articles, which is displayed differently under different viewing conditions.

Within this whole range of tasks, translations occupy a modest but noticeable place. They are needed so that other groups of readers can understand the texts. They exist wherever they are required, and they appear even where no translation mechanism is provided. In essence, translation is a typical transformation of an information representation, and it deserves the same attention as site design. This is already felt in the backends of sites where internationalization is an important and unavoidable component.

Once a developed system for transforming representations exists, an interesting consequence follows: internationalization on the site itself becomes optional if it is handled by another subsystem, the one described here, which only needs to be “pointed” at a single-language representation and supplied with formats that carry the translations. Translations matter only at the level of representations (admittedly, the layout sometimes depends on the length of words and phrases, but that too is a matter of representation). The translation system, like the representation system, can be taken out of the core of the architecture. The site may still help multilingual users itself, but it does not need to use its backend for that.

So that users do not have to install add-ons (although in the future this may well become a browser module), the site itself can serve the scripts needed to display translations and representations.

The role and place of search engines


Search engines also work exclusively with representations of data, and to some extent they hold the frontend back, since a significant share of search engines recognize text only in an archaic format. For them, the data must be rendered at the site level. With a more progressive approach, a search engine could itself accept data as structures; this matches the trend in display and reduces costs. That is possible, of course, only once recognized presentation formats appear.

Back to our translations


To embed a translation into any site, you technically need several of the subsystems described here: analysis of the data structure in the site's representation; checking the integrity of the representation; notifying the reader and the translator about changes; matching files (map files); and embedding the translated texts in place of the original ones. The alternative is to display a custom representation generated to look like the original, again with the translation applied.
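The last of these subsystems, embedding translated text in place of the original while leaving the markup alone, can be sketched with Python's standard `html.parser` module. This is a simplified illustration under assumptions: translations are keyed by the exact text of each text node (a real system would go through a matching file), comments and doctype declarations are dropped, and the `TranslationEmbedder` and `embed_translations` names are hypothetical.

```python
import html
from html.parser import HTMLParser


class TranslationEmbedder(HTMLParser):
    """Rebuild a page, swapping text nodes for translations and keeping all tags."""

    def __init__(self, translations):
        super().__init__(convert_charrefs=True)
        self.translations = translations  # original text -> translated text
        self.output = []
        self.untranslated = []

    def handle_starttag(self, tag, attrs):
        # Reuse the tag exactly as written, so attributes and links survive.
        self.output.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.output.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.output.append(f"</{tag}>")

    def handle_data(self, data):
        key = data.strip()
        if key and key in self.translations:
            # Replace the text but keep the surrounding whitespace.
            start = data.find(key)
            data = data[:start] + self.translations[key] + data[start + len(key):]
        elif key:
            self.untranslated.append(key)  # Left as-is and reported.
        self.output.append(html.escape(data, quote=False))


def embed_translations(page, translations):
    embedder = TranslationEmbedder(translations)
    embedder.feed(page)
    embedder.close()  # Flush any text still buffered by the parser.
    return "".join(embedder.output), embedder.untranslated


page = ('<h1 class="title">Installation</h1>'
        '<p>Run the <a href="/tests">tests</a> often.</p><p>New paragraph.</p>')
translations = {"Installation": "Installazione", "Run the": "Esegui i",
                "tests": "test", "often.": "spesso."}
translated_page, untranslated = embed_translations(page, translations)
```

The link and the `class` attribute pass through untouched, and the paragraph with no translation stays in the original language and is reported, which is what lets the reader and the translator be notified about it.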

The convenience of embedding into the original is that markup and links are carried over automatically by the mapping and do not get in the way of reading or translating; on the contrary, they emphasize and link the relevant parts of the text. Maintaining links and emphasis in third-party texts is an extra, non-core task for translators, so one often sees translations with little of the original emphasis, and up-to-date links can be expected even less.

Interestingly, this approach allows terms and abbreviations to be rendered according to readers' tastes, supporting particular subject-area vocabularies. For example, abbreviations can be expanded for readers who are new to the topic. For very experienced readers, the system can strip the “water” (filler) from texts, much as advertising is removed. After all, semantic analysis is only a small step away from the mapping files that the translations will already have. Whoever takes on a translation may also take on marking up terms for adapted texts, or a program may try to do this automatically.
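Expanding abbreviations for a less experienced reader could look roughly like this. The glossary, the set of terms the reader already knows and the choice to expand only the first occurrence are all assumptions of the sketch.

```python
import re


def expand_abbreviations(text, glossary, known_terms):
    """Expand the first occurrence of each abbreviation the reader does not know."""
    if not glossary:
        return text
    seen = set()

    def replace(match):
        term = match.group(0)
        if term in known_terms or term in seen:
            return term  # Known to the reader, or already expanded once.
        seen.add(term)
        return f"{term} ({glossary[term]})"

    # Longer terms first, so a short term never shadows a longer one.
    alternatives = "|".join(re.escape(t) for t in sorted(glossary, key=len, reverse=True))
    return re.sub(rf"\b(?:{alternatives})\b", replace, text)


GLOSSARY = {"API": "application programming interface", "P2P": "peer-to-peer"}
text = "The API exposes a P2P cache. The API is stable."

for_novice = expand_abbreviations(text, GLOSSARY, known_terms=set())
for_expert = expand_abbreviations(text, GLOSSARY, known_terms={"API", "P2P"})
```

The same source text yields two renderings, and the only per-reader input is the set of known terms, which fits the idea of adapting the representation rather than the source.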

Language learning systems are closely related to all this. If a learning system knows the reader's level, it can show or translate unfamiliar terms so that the percentage of unknown words in the text stays optimal for learning.
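One way to keep the share of unknown words near a target is sketched below. The greedy strategy (gloss the most frequent unknown words first), the tokenization and the `words_to_gloss` name are assumptions; the article does not say how the optimum should be reached or what the target value is.

```python
import re
from collections import Counter


def words_to_gloss(text, known_words, target_unknown_ratio):
    """Pick unknown words to translate until the unknown share fits the target."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return []
    unknown = Counter(t for t in tokens if t not in known_words)
    remaining = sum(unknown.values())  # Unknown tokens still left untranslated.
    glossed = []
    # Most frequent first; alphabetical order breaks ties deterministically.
    for word, count in sorted(unknown.items(), key=lambda item: (-item[1], item[0])):
        if remaining / len(tokens) <= target_unknown_ratio:
            break
        glossed.append(word)
        remaining -= count
    return glossed


text = "The monad wraps the value and the functor maps the value"
known = {"the", "and", "value", "maps", "wraps"}

# 2 of 11 words are unknown (about 18%); a 10% target needs one gloss.
to_gloss = words_to_gloss(text, known, target_unknown_ratio=0.10)
```

Words left unglossed remain as deliberate learning material, while the glossed ones are shown with a translation or explanation.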

A similar problem exists with professional knowledge: when a novice reads an article that is too difficult, the difficulty comes from the many unfamiliar terms and ideas. Adapted texts and textbooks could, and indeed should, take this into account once they learn the reader's level of knowledge (which is again just a file listing the concepts the reader has mastered in the subject area, built from previous lessons and tests). If the reader is not ready for the material because of fundamental gaps (for example, no knowledge of functional programming), the system advises taking the necessary courses and shows their usual level of difficulty.

As we can see, the question of whether to have such a system for translations brings together several components of knowledge that are among the most important for human intellectual activity. Here there are echoes of the semantic web, the dissemination of knowledge, and data presentation, and at the very base lies a modest substitution of the source texts within the original format by means of translation display functions.

It becomes quite obvious that systems for viewing and working with data on the web will follow this path, so it is time to start thinking about building such translation systems.

Source: https://habr.com/ru/post/178711/

