Greetings, dear habrahabr.ru readers.
There are no objective obstacles to teaching programming to children of preschool and school age.
The main historical obstacles at the moment are:
1. the use of only ANSI characters in operators and error messages
2. an insufficiently developed approach to using national languages
For example, the following hierarchy may be useful for Russia: English -> Russian -> Tatar.
This article proposes a directory hierarchy, for use in application programs or in the localization of a programming language, that supports any national language, with inheritance from words written in ANSI encoding.
To support programming in national languages (for example, the languages of the peoples of Russia, of Eurasia, or all languages existing on the planet, whether they have a writing system or use unprintable words), a translation system based on national symbols and dictionaries is proposed. The system consists of a directory structure and an approach that makes it possible to write programs in one's native language and to convert program source text from a national language to ANSI and back to any other language for which a description exists. Thus, algorithms can be described in any language, and the description relies on a hierarchy of languages.
The main base type (universal ancestor) is the draft language, which includes only English words and the underscore character; spaces are replaced by underscores. Instead of other ANSI characters, their verbal descriptions are used: dot, comma, etc. The draft language is used as a universal basis for translating words and expressions. All strings in the including program (translator) must be represented in this encoding.
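The conversion of an English phrase into draft form can be sketched as follows. This is only an illustration under stated assumptions: the function name `to_draft` is hypothetical, and of the punctuation names only "dot" and "comma" come from the article; the other names in the table are invented for the example.

```python
import re

# Verbal names for punctuation. Only "dot" and "comma" appear in the article;
# the remaining entries are assumptions made for this sketch.
PUNCTUATION_NAMES = {
    ".": "dot",
    ",": "comma",
    ":": "colon",
    "?": "question",
    "!": "exclamation",
}


def to_draft(text: str) -> str:
    """Convert an English phrase to a draft-style string (hypothetical helper).

    Whitespace and underscores act as separators, English words are kept
    as they are, and every other character must have a verbal name.
    """
    parts = []
    for word, symbol in re.findall(r"([A-Za-z]+)|([^\s_])", text):
        if word:
            parts.append(word)
        elif symbol in PUNCTUATION_NAMES:
            parts.append(PUNCTUATION_NAMES[symbol])
        else:
            raise ValueError(f"no verbal name defined for {symbol!r}")
    return "_".join(parts)


print(to_draft("File not found."))  # File_not_found_dot
print(to_draft("Hello, world"))     # Hello_comma_world
```

Characters without a verbal name raise an error here rather than being passed through, since the article restricts draft strings to English words and underscores.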
The next language used for messages is ansi. It is the ancestor of languages that use an alphabet, and can include any characters from the range 1-127 of the encoding table. It is logical to keep common English expressions in it. Inline constants for this and all other levels except draft may include any characters in the encodings supported by the XML markup language: OEM, utf8, utf16, utf32.
For each language, the direction of writing can be indicated:
- from left to right, from top to bottom (English, Russian, etc.; the default)
- from right to left, from top to bottom (Arabic, Hebrew)
- from top to bottom, columns from right to left (traditional Japanese, Chinese)
Directory structure
The top level of the directory structure containing the dictionaries uses designations of the continents to which the languages belong, and the subdirectories contain the names of countries or languages.
Thus, top-level directories are limited to the following list:
culture/af - Africa - African cultures
culture/an - Antarctica - universal prototypes (universal "Antarctic" cultures)
culture/au - Australia - Australian cultures
culture/ea - Eurasia - Eurasian cultures
culture/na - North America - North American cultures
culture/sa - South America - South American cultures
The draft and ansi encodings are placed under the an (Antarctica) continent, to set them apart from the spoken dialects of English in the UK, the USA and other countries:
culture/an/draft
culture/an/ansi
In this description, culture refers to the directory containing the language hierarchy. For a particular program, dictionaries are created in the subdirectories corresponding to the languages, with a file name matching the application. In addition, for the most universal words, common.xml files are created in the language directory. For example, for the English language this will be the file:
culture/ea/en/common.xml
Language inheritance
For each language except draft, no more than one inherited language is specified. The draft language does not inherit from any language. The language from which a given language is inherited is indicated in the lang.xml dictionary description file.
The whole chain of inherited languages can be displayed when viewing the source code or the output of the language preprocessor. This can be convenient, for example, when programs written in a national language inherited from Russian are checked by an informatics teacher who is not sufficiently fluent in that national language. In addition, it opens up the possibility of machine translation of program source text from one national language to another on the basis of the same dictionaries.
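Displaying the chain of inherited languages amounts to following the single parent link of each language until draft is reached. A minimal sketch, assuming a hypothetical in-memory parent table (in the proposal this information would come from the lang.xml files); the assumption that ansi inherits from draft is also made only for this example:

```python
from typing import Dict, List, Optional

# Hypothetical parent table: each language names at most one inherited language.
# The identifiers follow the article's examples; the table itself is an assumption.
PARENTS: Dict[str, Optional[str]] = {
    "draft": None,        # draft does not inherit from any language
    "ansi": "draft",
    "ru_ansi": "ansi",
    "tatar_ru": "ru_ansi",
}


def inheritance_chain(language: str, parents: Dict[str, Optional[str]]) -> List[str]:
    """Return the language followed by all of its ancestors, nearest first."""
    chain: List[str] = []
    current: Optional[str] = language
    while current is not None:
        if current in chain:
            raise ValueError(f"inheritance cycle detected at {current!r}")
        if current not in parents:
            raise KeyError(f"no description found for language {current!r}")
        chain.append(current)
        current = parents[current]
    return chain


print(" -> ".join(inheritance_chain("tatar_ru", PARENTS)))
# tatar_ru -> ru_ansi -> ansi -> draft
```

The cycle check is not mentioned in the article; it is added so that a mistake in the descriptions cannot cause an endless loop.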
For each language there can be several different inheritance chains that are independent of each other. For example, chains such as ansi -> ru or draft -> ru are possible for the Russian language.
They will be contained in the directories:
culture/ea/rus/ru_ansi
culture/ea/rus/ru_draft
In addition, for multilingual countries, it is possible to create a language directory in a subdirectory of the country:
culture/ea/rus/tatar_ru
Where:
culture - the root directory of internationalization support
ea - Eurasia
rus - Russia
tatar_ru - dictionary of Tatar language with translation from Russian
Similarly, based on the culture/ea/eng/en_ansi language, you can create an American English dialect: culture/na/usa/en_en.
File structure
The entry point to the dictionary description is the lang.xml file, which is present in each directory. The file contains a link to the inherited language and the file names of the common dictionaries loaded by default, and may also describe other features, for example, the encoding of dictionaries stored in plain-text files with an OEM encoding.
The language description is stored in the culture section of the lang.xml file:
<culture>
  <language> </language>
  <codepage> </codepage>
  <from> culture </from>
  <include>
    <file codepage=" "> xml txt </file>
    <file> xml txt </file>
  </include>
</culture>
The from section for the draft language remains empty.
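Reading such a description could look like the sketch below. The element names are taken from the article; the sample values (the language code, the parent path `an/ansi`, the file names) and the function name `parse_lang_xml` are assumptions made for illustration, as is the rule that a file without its own codepage uses the language-level one.

```python
import xml.etree.ElementTree as ET

# Hypothetical lang.xml content; all values are invented for the example.
SAMPLE_LANG_XML = """\
<culture>
  <language>ru</language>
  <codepage>utf8</codepage>
  <from>an/ansi</from>
  <include>
    <file codepage="utf8">common.xml</file>
    <file>extra.txt</file>
  </include>
</culture>
"""


def parse_lang_xml(xml_text: str) -> dict:
    """Extract the language description from the text of a lang.xml file."""
    root = ET.fromstring(xml_text)
    if root.tag != "culture":
        raise ValueError("expected a <culture> root element")

    codepage = (root.findtext("codepage") or "").strip()
    # An empty <from> section means the language has no ancestor (draft).
    parent = (root.findtext("from") or "").strip() or None

    includes = []
    for file_el in root.findall("include/file"):
        name = (file_el.text or "").strip()
        # Assumption: a file without its own codepage uses the language default.
        file_codepage = (file_el.get("codepage") or "").strip() or codepage
        includes.append((name, file_codepage))

    return {
        "language": (root.findtext("language") or "").strip(),
        "codepage": codepage,
        "from": parent,
        "include": includes,
    }


description = parse_lang_xml(SAMPLE_LANG_XML)
print(description["from"])     # an/ansi
print(description["include"])  # [('common.xml', 'utf8'), ('extra.txt', 'utf8')]
```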
A simple dictionary, consisting of words in the target language and their translations into the inherited language, can be stored in a text file, although XML files are preferable. In a text file, each line contains one pair: the word, then a space, then its translation into the inherited language. Phrases can also be supported; in that case each phrase is enclosed in quotes and separated from its translation by a space.
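The text format described above can be read with the standard `shlex` module, which already understands quoted phrases. This is a sketch only: the function name and the sample words are assumptions, and words containing apostrophes would need extra handling because `shlex` treats them as quotes.

```python
import shlex

# Hypothetical dictionary text: target-language word, then its translation
# into the inherited language; phrases are enclosed in quotes.
SAMPLE_TEXT = '''\
файл file
печать print
"не найден" "not found"
'''


def parse_text_dictionary(text: str) -> dict:
    """Parse 'word translation' lines into a dict (hypothetical helper)."""
    pairs = {}
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = shlex.split(line)
        if len(fields) != 2:
            raise ValueError(
                f"line {line_number}: expected a word and a translation, "
                f"got {len(fields)} field(s)"
            )
        word, translation = fields
        # Keep the first translation if a word is listed more than once.
        pairs.setdefault(word, translation)
    return pairs


dictionary = parse_text_dictionary(SAMPLE_TEXT)
print(dictionary["файл"])       # file
print(dictionary["не найден"])  # not found
```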
In an XML file, the translations of words are placed in the words section, and one dictionary file may link to another file of the same dictionary in the include section. With a dictionary in XML format, it is also possible to add properties related to the keywords of a programming language.
<include>
  <file codepage=" "> xml txt </file>
</include>
<words>
  <word>
    <value> </value>
    <from> </from>
    <tip> </tip>
  </word>
</words>
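A sketch of reading the word entries from such a file is shown below. Several things here are assumptions rather than part of the proposal: the `<dictionary>` root element (XML needs a single root, and the article does not name one), the interpretation of `value` as the word in the target language, `from` as the word in the inherited language and `tip` as a hint, and the sample entries themselves.

```python
import xml.etree.ElementTree as ET

# Hypothetical dictionary file; the <dictionary> root and all values are assumptions.
SAMPLE_DICTIONARY_XML = """\
<dictionary>
  <include>
    <file codepage="utf8">keywords.xml</file>
  </include>
  <words>
    <word>
      <value>если</value>
      <from>if</from>
      <tip>conditional keyword</tip>
    </word>
    <word>
      <value>пока</value>
      <from>while</from>
      <tip>loop keyword</tip>
    </word>
  </words>
</dictionary>
"""


def parse_dictionary_xml(xml_text: str) -> dict:
    """Return the word entries and included file names of one dictionary file."""
    root = ET.fromstring(xml_text)

    def text_of(element, tag):
        # Missing or empty child elements are treated as empty strings.
        return (element.findtext(tag) or "").strip()

    words = [
        {
            "value": text_of(word_el, "value"),
            "from": text_of(word_el, "from"),
            "tip": text_of(word_el, "tip"),
        }
        for word_el in root.findall("words/word")
    ]
    includes = [(f.text or "").strip() for f in root.findall("include/file")]
    return {"words": words, "include": includes}


parsed = parse_dictionary_xml(SAMPLE_DICTIONARY_XML)
print(parsed["include"])            # ['keywords.xml']
print(parsed["words"][0]["from"])   # if
```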
Included XML files can themselves link to other files, which makes a modular translation structure possible. Repeated inclusion of a file is ignored; when translations conflict (different translations of the same phrase), the first translation takes precedence. If one of the chains of inherited languages has no translation, the translation in the last language where one was found is considered correct. When a word is missing from the dictionary, transliteration can be used to switch from the national language to ANSI encoding.
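The inclusion rules can be illustrated with a small in-memory model. Everything named here is hypothetical: the `FILES` table stands in for real dictionary files, the transliteration table covers only a few letters, and processing a file's own words before its includes is an assumption about ordering that the article does not spell out.

```python
# Hypothetical stand-in for dictionary files on disk:
# file name -> (list of (word, translation) pairs, list of included file names)
FILES = {
    "app.xml": ([("файл", "file")], ["common.xml"]),
    "common.xml": ([("файл", "document"), ("печать", "print")], ["app.xml"]),
}

# Tiny illustrative transliteration table (an assumption, far from complete).
TRANSLIT = {"к": "k", "о": "o", "т": "t"}


def load_dictionary(start: str, files: dict) -> dict:
    """Merge a dictionary file with everything it includes."""
    translations = {}
    visited = set()

    def visit(name: str) -> None:
        if name in visited:
            return  # repeated inclusion of a file is ignored
        visited.add(name)
        words, includes = files[name]
        for word, translation in words:
            # On a conflict, the first translation found is kept.
            translations.setdefault(word, translation)
        for included in includes:
            visit(included)

    visit(start)
    return translations


def translate(word: str, translations: dict) -> str:
    """Look a word up, falling back to transliteration when it is missing."""
    if word in translations:
        return translations[word]
    return "".join(TRANSLIT.get(ch, ch) for ch in word)


dictionary = load_dictionary("app.xml", FILES)
print(dictionary)                    # {'файл': 'file', 'печать': 'print'}
print(translate("кот", dictionary))  # kot
```

The two sample files include each other on purpose, to show that the visited set keeps the mutual inclusion from looping.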
In case there are several programs
Different programs may require different translations. Accordingly, each program can have its own dictionary, named after the application, which is loaded first; the common dictionaries from the same directory (for example, common.xml) are loaded after it. The application should specify the path to the dictionary directory, the language used, and the initial dictionary file, for example, through a configuration file. Work with the modular directory structure described above can be implemented as a library.
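The loading order for one application can be sketched as a small helper that builds the list of dictionary paths. The function name, its parameters, and the application name `editor` are hypothetical; the only rule taken from the article is that the application's own dictionary comes before the common ones.

```python
from pathlib import PurePosixPath
from typing import List, Sequence


def dictionary_load_order(
    culture_root: str,
    language_dir: str,
    application: str,
    common_files: Sequence[str] = ("common.xml",),
) -> List[PurePosixPath]:
    """Return dictionary paths in loading order (hypothetical helper).

    The application's own dictionary is listed first, followed by the
    common dictionaries from the same language directory.
    """
    base = PurePosixPath(culture_root) / language_dir
    paths = [base / f"{application}.xml"]
    paths.extend(base / name for name in common_files)
    return paths


for path in dictionary_load_order("culture", "ea/rus/ru_ansi", "editor"):
    print(path)
# culture/ea/rus/ru_ansi/editor.xml
# culture/ea/rus/ru_ansi/common.xml
```

In a real application the three arguments would come from the configuration file mentioned above.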
The proposed directory structure does not take parameterized strings into account, but it is transparent enough to allow localizations to be created in many languages, for example, collaboratively through a Git repository.
References:
habrahabr.ru/post/176243 - "National" programming languages
habrahabr.ru/post/136272 - Which programming language should be the first to learn in school?
habrahabr.ru/post/20541 - About the internationalization of applications
habrahabr.ru/post/165705 - A few words about the internationalization of applications
habrahabr.ru/company/alconost/blog/173467 - How LinkedIn localizes into 19 languages in one night
habrahabr.ru/post/267501 - Localization of Google Chrome extensions - necessary and easy
Programming languages with non-English keywords
In particular, the link to www.robomind.net is interesting: a learning environment built around a robot, available in English and Dutch.