
Research by the development department of the registrar R01 on the .РФ domain

In May 2010, the first domain in the .РФ zone was delegated, and by October the number of domains had reached about 20,000. Judging by the count of preliminary applications, the 50,000th Cyrillic address will be registered in the zone before the new year.

However, soon after the zone was launched, it turned out that some client applications (browsers, mail programs, web services) did not work correctly with Russian-language domains. The R01 specialists conducted their own research to find out what caused these errors and what the prospects are for improving client applications.

Currently, all popular browsers and email clients work correctly with Cyrillic domains, but many problems remain in web services.

Research scope: how correctly applications work with the .РФ domain. Testing the readiness of mail services and browsers to support the national domain.

The main problem developers face when introducing domains in national languages is the large number of applications that were built to work with domain names containing only Latin letters, digits, and hyphens (ASCII).

Using domain names that contain characters of national alphabets directly would require changing hundreds of standards and even more applications.

Long before the .РФ domain appeared, developers chose a simpler and faster way of working with domains in national alphabets: a one-to-one encoding of such names into the set of Latin letters, digits, and hyphens, that is, the characters used for standard domain names.

One of the algorithms for such a transformation is called Punycode.

When Punycode encoding is used, the workflow looks like this:
1. The user enters the domain in the client application in the national language;
2. The client application converts it to Punycode and then transmits the domain name in encoded form;
3. The server returns the response to the client application (the domain name is in Punycode form);
4. The client application converts the domain name from Punycode back to the national alphabet and shows the answer to the user.
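
Steps 2 and 4 of the workflow above can be sketched in a few lines. This is a minimal illustration, assuming Python and its built-in `idna` codec (which implements the older IDNA 2003 rules); the helper names `to_punycode` and `from_punycode` are hypothetical and are not taken from the study.

```python
def to_punycode(domain: str) -> str:
    """Step 2: encode a national-language domain into its ASCII (Punycode) form."""
    # The built-in "idna" codec encodes each dot-separated label separately.
    return domain.encode("idna").decode("ascii")


def from_punycode(domain: str) -> str:
    """Step 4: decode the ASCII form back into the national alphabet."""
    return domain.encode("ascii").decode("idna")


encoded = to_punycode("пример.испытание")
print(encoded)                 # the xn--... form that is sent over the network
print(from_punycode(encoded))  # the form that is shown to the user
```

Names that are already ASCII pass through both helpers unchanged, so the same code path can serve Latin and Cyrillic domains.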

Based on the above scheme, possible difficulties fall into 2 types:
1. Converting names on the "User" <=> "Client application" link. Difficulties may arise when the client application handles IDN domains incorrectly or displays them incorrectly (for example, the domain is shown in its Punycode form, which is completely incomprehensible to the user);
2. Interaction of automatic services with one another. Problems at this stage are virtually eliminated by using the Punycode representation of the domain name.

We tested the web interfaces of the largest mail services to see whether they work correctly with Cyrillic domains.

The test consisted of 2 parts:
1. An attempt to send mail to an address in which the domain name is written in the national alphabet (mailtest@пример.испытание);
2. An attempt to send mail to an address in which the domain name is written in Punycode form (mailtest@xn--e1afmkfd.xn--80akhbyknj4f).

* пример.испытание is a special domain delegated by ICANN in 2007 for testing. More information about it can be found on the site пример.испытание

The results of the study are shown in the table. The first value is the result of the first test; the second value, in square brackets, is the result of the second test.


As you can see, only yandex.ru passed both tests, and only gmail.com failed both. The rest of the services were at least able to send a letter to the address in Punycode form.

Meanwhile, 3,066 domains already have mail servers specified.

Currently, only the domain name can be written in Cyrillic. The user name in the email address must be written in Latin letters.

For example: admin@тест.рф is a correct address, and админ@тест.рф is an incorrect one.
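
A mail service that wants to pass both parts of the test might normalize the recipient address before handing it to the mail transport. The sketch below assumes Python and its built-in `idna` codec; the function name `normalize_recipient` is hypothetical and does not describe how any of the tested services actually work.

```python
def normalize_recipient(address: str) -> str:
    """Return the address with its domain converted to Punycode form.

    Raises ValueError if the address has no user name or domain, or if the
    user name contains non-ASCII characters (only the domain may be Cyrillic).
    """
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    if not local.isascii():
        raise ValueError("user name must be written in Latin (ASCII) characters")
    # UnicodeError from the codec is a subclass of ValueError, so a malformed
    # domain is reported the same way as the other failures.
    return f"{local}@{domain.encode('idna').decode('ascii')}"


print(normalize_recipient("mailtest@пример.испытание"))
```

An address with a Latin user name and a Cyrillic domain is converted, while one with a Cyrillic user name is rejected with `ValueError`.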

Work on allowing characters of national alphabets in email addresses is underway, but it is still at the stage of discussing proposals (RFC).

We also tested browser support for Cyrillic domains. In May there were still problems in Firefox and Safari, but by August they had been solved in all popular browsers.

What to do?

The correct operation of Cyrillic domains requires the participation of the entire Internet community (especially the developers of client applications, websites, and site management systems).

The main tasks are:
1. Correct display of Cyrillic domains by web services and client applications;
2. Correct conversion of Cyrillic domains into Punycode form by client applications and web services;
3. Updating the checks that determine whether a user's e-mail address or URL is specified correctly.
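
As an illustration of the three tasks above, a host-name check written for ASCII-only names can be kept and applied to the Punycode form instead of the raw input. This is a sketch under stated assumptions: Python with its built-in `idna` codec, a hypothetical legacy pattern `ASCII_HOSTNAME`, and hypothetical helper names `is_valid_hostname` and `display_name`.

```python
import re

# Hypothetical legacy check: dot-separated labels of Latin letters, digits
# and hyphens, 1-63 characters each, not starting or ending with a hyphen.
_LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
ASCII_HOSTNAME = re.compile(rf"(?:{_LABEL}\.)+{_LABEL}")


def is_valid_hostname(name: str) -> bool:
    """Task 3: validate a host name, accepting national-alphabet domains."""
    try:
        ascii_name = name.encode("idna").decode("ascii")  # task 2: to Punycode
    except UnicodeError:
        return False
    return len(ascii_name) <= 253 and ASCII_HOSTNAME.fullmatch(ascii_name) is not None


def display_name(name: str) -> str:
    """Task 1: return the form of the name that should be shown to the user."""
    try:
        return name.encode("ascii").decode("idna")
    except UnicodeError:
        return name  # not an ASCII/Punycode name, show it as entered


print(is_valid_hostname("пример.испытание"))
print(display_name("xn--e1afmkfd.xn--80akhbyknj4f"))
```

Running the old pattern on the converted name means existing ASCII rules (label length, allowed characters) keep applying, while Cyrillic input is no longer rejected outright.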

Key numbers:
As of October 12, 2010, 7,260 domains (40% of those registered) had been delegated, of which 5,477 (29.8%) specify hosting and 3,066 (more than 16.7%) specify mail servers.

Useful links:
1) пример.испытание - a special site designed to test how client applications (email clients, web browsers) work with national domains.
2) http://datatracker.ietf.org/wg/eai/ - a page on the IETF website dedicated to adapting the mail system to work with addresses in national languages
3) http://www.gnu.org/software/libidn/ - libIDN home page (C, C#, and Java) for working with domains in national alphabets.
4) http://search.cpan.org/search?query=IDN&mode=all - a set of modules for Perl for working with IDN-domains
5) http://php.net/manual/en/ref.intl.idn.php - PHP functions for working with IDN-domains
6) http://pear.php.net/package/Net_IDNA2 - PEAR-module for working with IDN

Source: https://habr.com/ru/post/106090/
