Having created an open source project that I now maintain, I wanted to solve every multilingual-support problem for both the project and its site in one go. I have dealt with multi-language support for a long time, starting with desktop programs, so I had an idea of the likely needs when I began looking at the available solutions. Yes, almost all SaaS services are free for open-source projects, but nearly all of them focus on translating string resources. What about the site and the documentation? Unfortunately, I found nothing suitable and started building my own. I will say right away that I am satisfied with the result and have been using the system for almost six months. Be warned, though: this is not a general-purpose, complete solution but a specific implementation for my needs. Still, I hope some of the ideas will be useful to other developers.
To start, here are the requirements I set for the future system.
- Both the project's resources, stored as JSON in .js files, and all texts and documentation on the site must be localizable.
- A resource may lack a translation into other languages. For example, I can accumulate texts in Russian and hand them to a translator later, while those texts are already available in the Russian version of the site.
- The site should offer a convenient way for a user to translate resources not yet translated into their language, create a new resource (text), or check and edit existing texts in their native language. It should work roughly like this: the user selects an action (translation or verification), their native language (and, for translation, the source language as well), and the desired volume. A resource matching these parameters is found and offered to the user for translation or editing. Naturally, user actions should be logged and statistics on the work done accumulated.
- The site should have a language selector, but each page should show only the languages for which a translation of that page already exists.
- The same string can be used in several places, for example in .js and in the documentation. The resource must therefore exist as a single instance, and a change to it must show up both in JSON and in the documentation.
- Ideally there should be some kind of automatic moderation, but for now I can settle for deciding on publication personally.
Displaying changes in real time was not important to me, so I decided to keep all the internal machinery in several intermediate tables and then, on command, assemble the JSON and generate the site pages. In fact, four tables are enough.
Table structure

    CREATE TABLE IF NOT EXISTS `languages` (
      `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
      `_uptime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
      `_owner` smallint(5) unsigned NOT NULL,
      `name` varchar(32) NOT NULL,
      `native` varchar(32) NOT NULL,
      `iso639` varchar(2) NOT NULL,
      PRIMARY KEY (`id`),
      KEY `_uptime` (`_uptime`)
    ) ENGINE=MyISAM DEFAULT CHARSET=utf8 ;

    CREATE TABLE IF NOT EXISTS `langid` (
      `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
      `_uptime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
      `_owner` smallint(5) unsigned NOT NULL,
      `name` varchar(96) NOT NULL,
      `comment` text NOT NULL,
      `restype` tinyint(3) unsigned NOT NULL,
      `attrib` tinyint(3) unsigned NOT NULL,
      PRIMARY KEY (`id`),
      KEY `_uptime` (`_uptime`),
      KEY `name` (`name`),
      KEY `restype` (`restype`)
    ) ENGINE=MyISAM DEFAULT CHARSET=utf8 ;

    CREATE TABLE IF NOT EXISTS `langlog` (
      `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
      `_uptime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
      `_owner` smallint(5) unsigned NOT NULL,
      `iduser` int(10) unsigned NOT NULL,
      `idlangres` int(10) unsigned NOT NULL,
      `action` tinyint(3) unsigned NOT NULL,
      PRIMARY KEY (`id`),
      KEY `_uptime` (`_uptime`),
      KEY `iduser` (`iduser`,`idlangres`)
    ) ENGINE=MyISAM DEFAULT CHARSET=utf8 ;

    CREATE TABLE IF NOT EXISTS `langres` (
      `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
      `_uptime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
      `_owner` smallint(5) unsigned NOT NULL,
      `langid` smallint(5) unsigned NOT NULL,
      `lang` tinyint(3) unsigned NOT NULL,
      `text` text NOT NULL,
      `prev` mediumint(9) unsigned NOT NULL,
      `verified` tinyint(3) NOT NULL,
      `size` mediumint(9) unsigned NOT NULL,
      PRIMARY KEY (`id`),
      KEY `_uptime` (`_uptime`),
      KEY `langid` (`langid`,`lang`),
      KEY `size` (`size`)
    ) ENGINE=MyISAM DEFAULT CHARSET=utf8 ;
The languages table has three meaningful fields: name, native, iso639. Example record:
Russian, Russian, ru
The langid table holds the text identifiers of resources, where you can also specify a comment and a type. I divided all resources into several types for myself: JSON string, site page, plain text, and text in Markdown format. You can, of course, use your own types.
Example:
cancelbtn, Text for Cancel button, JSON
The langres table of text resources (langid, lang, text, prev) stores references to the identifier and the language, plus the text itself.
The last field, prev, provides versioning: when a text is edited, it points to the previous version of the resource.
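The prev links form a chain that can be walked back to the first version. Below is a minimal sketch of that idea on in-memory data rather than on the real table. The sample rows, the function name version_history, and the convention that prev = 0 marks the first version are assumptions for illustration; the article does not say how the first version is marked.

```php
<?php
// Hypothetical in-memory copy of a few langres rows, keyed by id.
// Assumption: prev = 0 means "no earlier version".
$langres = [
    10 => ['text' => 'Cancle',  'prev' => 0],
    14 => ['text' => 'Cancel',  'prev' => 10],
    21 => ['text' => 'Cancel.', 'prev' => 14],
];

// Returns the texts of a resource from the given version back to the first one.
function version_history(array $rows, int $id): array
{
    $history = [];
    $seen = [];
    // Stop at the first version, at a missing row, or if the chain loops.
    while ($id !== 0 && isset($rows[$id]) && !isset($seen[$id])) {
        $seen[$id] = true;
        $history[] = $rows[$id]['text'];
        $id = $rows[$id]['prev'];
    }
    return $history;
}

print_r(version_history($langres, 21));
```

The $seen guard only protects against corrupted data in which two rows point at each other.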
All changes are recorded in the langlog table (iduser, idlangres, action). The action field indicates the action performed: creation, editing, or checking.
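Per-user statistics can be derived from such a log by simple counting. The sketch below does this on hypothetical in-memory rows. The numeric action codes and the function name user_stats are assumptions, since the article names the actions but not their values.

```php
<?php
// Assumed action codes (not taken from the article).
const ACTION_CREATE = 1;
const ACTION_EDIT   = 2;
const ACTION_VERIFY = 3;

// Hypothetical langlog rows (iduser, idlangres, action).
$langlog = [
    ['iduser' => 7, 'idlangres' => 10, 'action' => ACTION_CREATE],
    ['iduser' => 7, 'idlangres' => 14, 'action' => ACTION_EDIT],
    ['iduser' => 9, 'idlangres' => 14, 'action' => ACTION_VERIFY],
    ['iduser' => 7, 'idlangres' => 21, 'action' => ACTION_EDIT],
];

// Counts how many times each user performed each action.
function user_stats(array $log): array
{
    $stats = [];
    foreach ($log as $row) {
        $user = $row['iduser'];
        $action = $row['action'];
        $stats[$user][$action] = ($stats[$user][$action] ?? 0) + 1;
    }
    return $stats;
}

print_r(user_stats($langlog));
```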
I won't dwell on user management; I will only say that a user is registered automatically when they submit a translation or a correction. Since an email address is not required, the user is shown the login and password immediately. All their changes are tied to this account. Later they can add an email address and other details, or simply forget about the registration.
I drew a diagram to show the relationships between the tables more clearly.

Since I need to be able to insert resources into other resources, I added macros of the form #identifier#. In the simplest case, if we have the resource name = "Name", we can use it in the resource entername = "Specify your #name#", which is then rendered as
Specify your Name.
Now, to generate the site pages, it is enough to go through all languages and all resources of the appropriate type, process each text with a special substitution function, and write the result into a separate table of finished pages. The processing works so that if #identifier# is not found in the current language, it is looked up in other languages. Here is a sketch of the recursive function (with a loop guard) that performs this processing.
PHP example substitution function

    public function proceed( $input, $recurse = false )
    {
        global $db, $syslang;

        // A top-level call starts with an empty chain of resources being expanded.
        if ( !$recurse )
            $this->chain = array();
        $result = '';
        $off = 0;
        $start = 0;
        $len = strlen( $input );

        while ( ($off = strpos( $input, '#', $off )) !== false && $off < $len - 2 ) {
            $end = strpos( $input, '#', $off + 2 );
            if ( $end === false )
                break;
            // Too long to be an identifier: retry with the closing # as the opener.
            if ( $end - $off > $this->lenlimit ) {
                $off = $end - 1;
                continue;
            }
            $name = substr( $input, $off + 1, $end - $off - 1 );
            $langid = $db->getone("select id from langid where name=?s", $name );
            // Skip unknown identifiers and ones already being expanded (loop guard).
            if ( $langid && !in_array( $langid, $this->chain )) {
                // Prefer the current language, otherwise fall back to another one.
                $langres = $db->getrow("select _uptime, id,text from langres where langid=?s && verified>0 order by if( lang=?s, 0, 1 ),lang", $langid, $this->lang );
                if ( $langres ) {
                    // Remember the latest modification time among the resources used.
                    if ( $langres['_uptime'] > $this->time )
                        $this->time = $langres['_uptime'];
                    $result .= substr( $input, $start, $off - $start );
                    $off = $end + 1;
                    $start = $off;
                    array_push( $this->chain, $langid );
                    $result .= $this->proceed( $langres['text'], true );
                    array_pop( $this->chain );
                    if ( $off >= $len - 2 )
                        break;
                    continue;
                }
            }
            $off = $end - 1;
        }
        // Append whatever follows the last substituted macro.
        if ( $start < $len )
            $result .= substr( $input, $start );
        return $result;
    }
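To try the idea without a database, here is a simplified, self-contained sketch of the same technique: recursive macro expansion with a language fallback and a loop guard. It is not the function above. It uses preg_replace_callback instead of manual scanning, and the $resources array, the function name expand, and the "first available language" fallback rule are assumptions for illustration.

```php
<?php
// Hypothetical verified resources: identifier => [language code => text].
$resources = [
    'name'      => ['en' => 'Name', 'ru' => 'Имя'],
    'entername' => ['en' => 'Specify your #name#'],
    'loop'      => ['en' => 'A #loop# inside itself'],
];

// Expands #identifier# macros; $chain lists the identifiers being expanded.
function expand(string $text, string $lang, array $resources, array $chain = []): string
{
    return preg_replace_callback(
        '/#(\w+)#/',
        function (array $m) use ($lang, $resources, $chain): string {
            $id = $m[1];
            // Unknown identifier or a cycle: leave the macro untouched.
            if (!isset($resources[$id]) || in_array($id, $chain, true)) {
                return $m[0];
            }
            $texts = $resources[$id];
            // Prefer the requested language, otherwise take the first available one.
            $found = $texts[$lang] ?? reset($texts);
            return expand($found, $lang, $resources, array_merge($chain, [$id]));
        },
        $text
    );
}

echo expand('#entername#', 'en', $resources), "\n"; // Specify your Name
echo expand('#entername#', 'ru', $resources), "\n"; // Specify your Имя
echo expand('#loop#', 'en', $resources), "\n";      // A #loop# inside itself
```

The second call shows the fallback: entername has no Russian text, so the English one is used, while the nested #name# is still resolved in Russian.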
In addition to replacing macros like #name#, I also convert Markdown markup to HTML right away and process my own directives. For example, I have a table of images where screenshots for different languages can be stored under one entry, and if I put the tag [img "/file/#*indexes#"] in the text, the image named indexes is substituted in the right language. Most importantly, I can generate exports for various purposes in any format. As an example, here is the code that generates the JSON files; the identifier substitution function is not used there because it is not needed.
JSON file generation for RU and EN

    function jsonerror( $message )
    {
        print $message;
        exit();
    }

    function save_json( $filename )
    {
        global $db, $original;

        // The locale code is taken from the file name, e.g. locale_en.js -> en.
        preg_match("/^\w*_(?<lang>\w*)\.js$/", $filename, $matches );
        if ( empty( $matches['lang'] ))
            jsonerror( 'No locale' );
        $lang = $db->getrow("select * from languages where iso639=?s", $matches['lang'] );
        if ( !$lang )
            jsonerror( 'Unknown locale '.$matches['lang'] );
        $list = $db->getall("select lng.name, r.text from langid as lng
            left join langres as r on r.langid = lng.id
            where lng.restype=5 && verified>0 && r.lang=?s order by lng.name", $lang['id'] );
        $out = array();
        foreach ( $list as $il )
            $out[ $il['name']] = $il['text'];
        // Language 1 is the original; other languages borrow its untranslated strings.
        if ( $lang['id'] == 1 )
            $original = $out;
        else
            foreach ( $original as $ik => $io )
                if ( !isset( $out[ $ik ] ))
                    $out[ $ik ] = $io;
        $output = "/* This file is automatically generated on eonza.org.
       Use http://www.eonza.org/translate.html to edit or translate these text resources.
    */

    var lng = {
    \tcode: '$lang[iso639]',
    \tnative: '$lang[native]',
    ";
        foreach ( $out as $ok => $ov ) {
            // Pick a quote character that does not occur in the text.
            if ( strpos( $ov, "'" ) === false )
                $text = "'$ov'";
            elseif ( strpos( $ov, '"' ) === false )
                $text = "\"$ov\"";
            else
                jsonerror( 'Wrong text:'.$ov );
            $output .= "\t$ok: $text,\r\n";
        }
        $output .= "\r\n};\r\n";
        // Append the hand-written part of the locale file, if there is one.
        $jsfile = dirname(__FILE__)."/i18n/$lang[iso639].js";
        if ( file_exists( $jsfile ))
            $output .= file_get_contents( $jsfile );
        if ( file_put_contents( HOME."tmp/$filename", $output ))
            print "Save: ".HOME."tmp/$filename<br>";
        else
            jsonerror( 'Save error:'.HOME."tmp/$filename" );
    }

    $original = array();
    $files = array( 'en', 'ru');
    foreach ( $files as $if )
        save_json( "locale_$if.js" );

    // Pack the generated files into one archive.
    $zip = new ZipArchive();
    print $zip->open( HOME."tmp/locale.zip", ZipArchive::CREATE );
    foreach ( $files as $f )
        print $zip->addFile( HOME."tmp/locale_$f.js", "locale_$f.js" );
    print $zip->close();
    print "Finish<br><a href='/tmp/locale.zip'>ZIP file</a>";
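The code above stops with an error when a string contains both single and double quotes. One possible alternative, shown here only as a sketch, is to let PHP's json_encode do the quoting and escaping. The sample data and the function name build_locale are assumptions; the merge step mirrors the fallback to the original language.

```php
<?php
// Hypothetical string sets; $original plays the role of the base language.
$original = [
    'cancelbtn' => 'Cancel',
    'savebtn'   => 'Save',
    'quote'     => 'It\'s a "test"',
];
$russian = ['cancelbtn' => 'Отмена'];

// Builds the text of a locale file as a JavaScript variable assignment.
function build_locale(string $code, string $native, array $strings, array $original): string
{
    // Array union keeps translated strings and adds missing keys from the original.
    $merged = $strings + $original;
    ksort($merged);
    // Note: a resource named "code" or "native" would be shadowed by these fields.
    $body = ['code' => $code, 'native' => $native] + $merged;
    $flags = JSON_PRETTY_PRINT | JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES;
    return 'var lng = ' . json_encode($body, $flags) . ";\n";
}

echo build_locale('ru', 'Русский', $russian, $original);
```

The output quotes the keys as well, which is still valid JavaScript, so the file can be loaded the same way as before.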
Thus, without too much effort, I implemented almost everything I wanted. The only things left undone are those that are not relevant yet because of low activity on the site. On the other hand, I added extra features that turned out to be needed along the way, for example downloading a text file with the resources that need translation and uploading the translated text back.
Those interested can take a look at the work page where users can translate, edit, and create new resources for my project.
