Review and comparative testing of the "Elbrus 401-PC" computer. Supplement: questions and answers

Perhaps the main result of publishing this review, apart from giving the public its first independent impressions of the new computer, was MCST's desire to disclose more details, clear up misunderstandings, and answer the questions raised in the article and in its comments. Some of these questions are so fundamental that each deserves a separate article and therefore requires serious study. For now we will look at those that best fit the interview format.

General points

To understand MCST's position on the following issues correctly, one needs to picture its past, its present, and its plans for the future; taken out of this context, some facts may look strange.

Historically, the main customers and consumers of MCST products were law enforcement agencies. The product range and production volumes were limited, every computer was accounted for, and the company knew every customer by sight, figuratively speaking. With such a sales model, it was necessary and sufficient to release only well-proven and certified systems, for which the application software was written. Each client required a personal approach: qualified consultation when choosing equipment and during its subsequent operation, including visits by a service engineer to the installation site (anywhere across a vast part of the land, as well as at sea). In other words, dyed-in-the-wool enterprise, albeit with its own specifics.

Now MCST very much wants to enter the civilian market: first the corporate sector, which is closer to its current experience, and then the consumer segment, that is, the widest audience. Business clients are still somewhat prepared to bear additional costs (though of course not the kind that MCST's traditional customers accept), especially when the advantages of choosing an expensive exclusive product are clear. An ordinary buyer, however, votes with the wallet for the most affordable goods, accepting lower quality and sometimes a complete lack of manufacturer support. And yet ordinary people crave everything new: give them a doubling of the transistor count every year and a half, the latest versions of the kernel, system libraries, and applications; it matters little how many old bugs were fixed and how many new ones appeared, how much heavier the software became, or how slowly it now runs on previous-generation machines.

The obvious gap between the desired and the actual is well understood at every level of MCST. No one harbors the rosy illusion that it is possible to take off from a standing start and instantly overtake venerable sprinters and experienced marathoners, especially since, as in the famous fairy tale, one has to run as fast as one can just to stay in place. For such a breakthrough there is currently neither the money, nor the production capacity, nor, quite simply, the people: the staff is three orders of magnitude smaller than that of Intel or Microsoft, and it has to deal with everything at once. Even to reach commercial or government organizations, the company must first deploy a network of dealers and repair shops and set up training and technical support; for now MCST is only testing the ground in search of partners. And of course financial investment is needed: to sell computers cheaply, the cost of production must come down, which is achievable only with a significant increase in volume. The result is a vicious circle that is very hard to break.

There is also an understanding that consumer products must be as open as is possible and reasonable: there should be free access to documentation, to software installation images and timely updates, to the source code of that software, to a platform for public consultation and exchange of experience, and to educational literature for beginners and specialists. But none of this appears overnight either, and the company is still at the very beginning of its journey to win the hearts and minds of potential customers.

Since old habits are hard to abandon, especially without any practice of doing business in a completely different environment, some allowance is needed: when company employees answer the questions below and say "user", they often still have in mind the classic client, with whom there is personal contact, a direct contract, and often an additional non-disclosure agreement. Such a client is not interested in publicity and knows that any whim will be catered to for the money. But, as many of the answers show, this template is no longer regarded as the only one. Here, too, everything will develop gradually, step by step.

Production and promotion

At which plant are the CPU and the KPI (peripheral interface controller) produced? In what quantities? Is it true that production has been wound down (suspended) for two years?

There is no secret here: new chips are currently manufactured only in Asia. If the aim is to compete seriously in the open market, there is no alternative yet. Naturally, this raises the question of information security in critical applications, but for a special category of customers production can be organized at local facilities in limited quantities, at a corresponding price. The first (and successful) project of this kind is the Elbrus-2SM processor, whose dies are made at the Mikron factory in Zelenograd.

Chip production has not stopped; moreover, the chips are constantly being revised. It is simply that, by modern standards, volumes of less than a thousand dies per year count as small-scale production, so orders are placed only occasionally, and a round-the-clock conveyor is not needed.

Many processors at this stage go to internal needs, both routine and experimental. For example, a computing cluster was recently built from 32 1U modules with four Elbrus-4C processors each, for a total of 512 cores. Anyone with interesting problems for such a system can apply for machine time. (Which classes of programs run most efficiently on the E2K architecture, and how to optimize your source code, is described briefly below; the topic will be discussed in more detail in a separate publication.)

At what production volumes will it be possible to reduce the cost of the motherboard + processor kit to a level acceptable for a wide range of customers? How soon will the Russian electronics industry be able to provide such volumes?

To reach a price of about $1,000, at least 10,000 finished products must be made annually, and moving further toward the customer is possible only at about 100,000 products per year. Of course, all production would then have to be concentrated in China, or domestic factories would have to work very hard on reducing logistics and production costs. At present, all production MCST boards are assembled at Russian factories.

At what production volumes will the production of a simplified processor version for 1-socket systems be justified - without interprocessor communication units and access to remote memory?

Even 10 thousand dies a year would not justify changing the chip layout. It would make more sense to drop the unused pins, which would reduce the area of the package substrate, but in the foreseeable future even that will not be justified.

How much will an operating system license cost if components start being sold separately?

Such a sales scheme has not been worked out yet, but the experience of colleagues from Alt Linux will most likely be adopted; for personal use, the price will definitely not be burdensome.

When to expect ready-made systems based on "Elbrus-8C"? Are the characteristics of future processors determined? Will the next model have 16 cores and a clock frequency of 2 GHz, for example?

Pre-production samples of single-processor machines based on the Elbrus-8C can be seen this summer. The next step is a small increase in clock frequency (up to 1.5 GHz) and a doubling of the number of floating-point units, which are the main driving force of this platform; such a processor is already being developed under the working name Elbrus-8CB. A 16-core processor is scheduled for release in 2020.

Why is the hardware and software naming system so confusing?

When you have only a couple of basic products in your portfolio, the nomenclature is easy to follow, especially for a professional. Now that the range of hardware and software is expanding and the focus is shifting toward the average user, the naming system is gradually being brought into a form a layperson can understand.

An important clarification. It is incorrect to use the designation "Elbrus-2000" or the abbreviation "E2K" for modern products: the official name of this microprocessor architecture is "Elbrus", without any suffixes. The name "Elbrus-2000" was chosen for the architecture that was to be implemented together with Western companies in 2000. At the very beginning of 1999, Microprocessor Report published an article describing the architecture of the "Elbrus-2000" microprocessor, abbreviated there as "E2k". The current "Elbrus" architecture is a substantial improvement over E2k and is already its third version, so the old designation is not quite correct. In addition, the abbreviation E2K (with a capital "K") may be read by strict computer scientists as 2048, which is no good at all.

User support

Is there documentation in electronic form? Do you plan to make it freely downloadable by anyone (regardless of whether they have bought the equipment)?

Documentation can be downloaded already, but for now only via a link sent on request. In the coming months, however, a community and technical support site is to be launched, where all the information will be publicly available. Since dozens of documents are involved, this will take some time.

Do you plan to open Bugzilla for public viewing? To create a FAQ or organize a forum, a place for open exchange of experience?

One does not simply open up a Bugzilla in which many tickets contain "highly sensitive" information. Most likely, a separate Bugzilla will be created for the general public, open for everyone to view and add to. The experience accumulated so far on the most frequently asked questions will be turned into a FAQ, which will also be posted on the new support site. There will most likely be a forum as well.

What about publishing the source code of adapted software and sending patches upstream? Do you plan to accept patches from users? What about rewards for discovered vulnerabilities?

Source code has not been published simply because the clients themselves were not public and demand for source code among them was low; those who really needed it, rather than asking out of idle curiosity, made a request and received everything necessary privately. A public repository for the mass consumer is planned for the foreseeable future, and all borrowed code will go there. The company does not plan to open its own developments, such as the LCC compiler; after all, the Intel C++ Compiler (which MCST considers its main rival in terms of optimization) is also closed and does fine.

Patches from users are accepted as it is, so far also privately, of course. It is hard to imagine how this will look once the client base expands and enthusiasts' attention surges.

Sending changes to the authors of the original projects is undoubtedly useful, but someone has to do it: each project needs its own approach and an understanding of its community culture. A more feasible task is simply to publish all the code: if a "goodwill ambassador" turns up who is ready to work with a given upstream, so much the better. So far, MCST has no such experience.

Hardware

How do you use the integrated video adapter in a graphical environment? How comfortable is its speed expected to be for 2D work?

The easiest way is to trigger reconfiguration of the graphics setup by running the xorg-server.postinst utility. The built-in adapter has no hardware acceleration at all, but ordinary use of desktop applications should not cause any inconvenience, certainly nothing like what it was on older computers. It will probably be worth recording and posting a short video, which is worth a thousand words.

Which discrete graphics cards, in addition to the Radeon HD 6450 / R5 230, are supported by the operating system? What hardware acceleration functions are available to applications through the driver available on the system?

The entire modern Radeon line that is compatible with the open-source Linux driver is supported. Since the situation with nVidia is very sad in this respect, its products are currently not supported in the Elbrus operating system.

What can explain the abnormally low read and write speeds of a solid-state drive that do not even reach the nominal bandwidth of the SATA-2 interface through which it is connected?

This is a known limitation of the 1991VG1Y chip, which implements the peripheral interface controller (KPI). An optimized version of the controller (KPI-2), in which the problem is solved, will be installed in systems with the new Elbrus-8C and Elbrus-1C+ processors.
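To put "nominal bandwidth" into numbers, here is a minimal sketch. It relies only on the publicly documented parameters of SATA revision 2 (a 3 Gbit/s line rate with 8b/10b encoding); the function name is illustrative, and nothing here describes the KPI controller itself.

```python
def sata_payload_mb_per_s(line_rate_bits_per_s: int) -> int:
    """Upper bound on payload throughput of a SATA link, in MB/s (10**6 bytes).

    SATA uses 8b/10b encoding: every 10 bits on the wire carry 8 bits of data.
    """
    data_bits_per_s = line_rate_bits_per_s * 8 // 10  # strip encoding overhead
    return data_bits_per_s // 8 // 1_000_000          # bits -> bytes -> MB


if __name__ == "__main__":
    # SATA revision 2 signals at 3 Gbit/s.
    print(sata_payload_mb_per_s(3_000_000_000))  # prints 300
```

A drive that falls well short of roughly 300 MB/s on such a link is therefore limited by something other than the interface itself, which is the situation the question describes.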

Why does the Elbrus 401-PC computer have a 1 TB hard drive if it is not even configured in the operating system, and the main drive already provides a lot of free space?

The obvious purpose of a hard disk is to store large amounts of data, which the current user base requires. The fact that the disk is not mounted in the system is a flaw, but a minor one: some users also reformat the flash card for their own needs instead of keeping the binary translator on it, so it is impossible to please everyone at once.

What is the paint on the screw that fastens the solid-state drive for: a warranty seal, or to keep the screw from working loose?

Indeed, the paint is primarily a thread lock. You may unscrew the drive; this does not void the warranty, but if something breaks, service engineers will naturally have questions for the user about what was done.

Where do the PCI device IDs come from? Why is the Vendor ID of many onboard devices the same as Intel's?

The reason is quite prosaic: this way the Windows operating system feels more at home in binary translation mode. True, because of its paranoid binding of activation codes to the hardware in use, this "most friendly" system still sometimes acts up.

Where can I find a description of the Echelon-E trusted bootloader?

There is a misunderstanding here: this product is purely software and is only a special case of the regular "Echelon" trusted boot tool developed by the research and production association of the same name. It provides trusted boot of the computer, integrity monitoring, and user identification and authentication before control is passed to the operating system.

Is the IPMI remote control module, offered as an option for the Elbrus-4.4 servers, self-developed, or is it a finished product of foreign manufacture?

Of course it is an in-house development, though not yet a finished product: the module is at the debugging stage.

Operating system

What version designation system is used for Elbrus OS?

The correct answer was already given in the article: the version number is recorded in the /etc/mcst_version file. Version 2.2, which shipped with computers from the first batch, is in fact no longer current: 2.3 is now stable, and 3.0 (with a 3.14 kernel) is at the release candidate stage.
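A script can pick the version up from that file. The sketch below assumes only that /etc/mcst_version is a small text file; its exact format is not described in the article, so the content is returned as-is after trimming whitespace. The function name is illustrative.

```python
from pathlib import Path


def read_os_version(path: str = "/etc/mcst_version") -> str:
    """Return the contents of the version file with surrounding whitespace removed.

    The file format is an assumption: the text is not parsed, only trimmed.
    """
    return Path(path).read_text(encoding="utf-8").strip()


if __name__ == "__main__":
    try:
        print(read_os_version())
    except OSError as exc:  # e.g. the file is absent on a non-Elbrus system
        print(f"cannot read version file: {exc}")
```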

Do you plan to release regular updates that are installed automatically from a public repository? Why is not all of the software installed in the system packaged?

Yes, automatic updates are planned, although for now the process is performed manually on request. For this, of course, all software must be under the package manager's control; where that is not yet the case, it is only because the build process has not been fully worked out.

Isn't it easier to directly port one of the popular Linux distributions - for example, the same Debian?

This is exactly what one of the teams is doing at the moment. Indeed, Debian offers perhaps the most convenient infrastructure for creating derivative distributions. Moreover, Debian currently supports the widest range of architectures, at least within the Linux family, so it is logical to base new ports on it. However, the porting procedure for this particular distribution is not the smoothest or most systematic, so it takes hard work. Once the process is debugged and automated, synchronization with the mainline will be [almost] immediate. Whether this port can be given official status is a big question.

But the list of supported operating systems is not going to be limited to a single option. First of all, a port of ALT Linux, which needs no introduction, is expected. Work on adapting QNX is also under way: the Neutrino-Elbrus secure real-time operating system is already running in some form; for details, contact the developers at the "SVD Embedded Systems" competence center.

How hard is porting the Linux kernel? Why is the kernel of version 2.6.33 now being used - not the newest, but at the same time not supported as LTS?

Porting the Linux kernel to a particular hardware platform is indeed quite laborious, but the problem is not the one-time effort; it is that much has to be redone almost from scratch, since everything keeps flowing, changing, and being reshuffled. For example, the team had just moved to kernel 3.14 and started experimenting with the 4.x branch, and everything changed again.

It is unlikely that the port will be accepted into the mainline at kernel.org in the foreseeable future, when everything is very strict on one side and rather chaotic on the other. The most likely prospect, therefore, is providing the kernel to everyone who wants to build their own distribution.

Which kernel variants (default, nn, rt) should be used for which purposes?

For everyday tasks, the default kernel is obviously the best fit. The "nn" kernel is intended for network routers: interrupt handling is made cheaper there. The "rt" kernel can schedule processes while keeping their allotted time slices within set limits, which makes real-time computing possible. Real-time does not mean "fast" but "predictable", even at the cost of performance; in such setups usually only the kernel, the target application, and a minimal set of background services remain in the system.

Is it possible to quickly restart the [kernel] of the operating system without reinitializing the hardware? How to speed up the launch of the operating system in particular and the computer in general?

A quick restart of the operating system without hardware initialization is not provided. Hardware initialization can be sped up, first, in obvious ways: for example, by disabling or shortening the search timeout for ATA over Ethernet servers, which are needed only for network boot. Second, there are less obvious ways: for example, you can disable the clearing of RAM, which is normally performed for information security reasons. And speeding up operating system startup by disabling unnecessary services needs no comment.

Application software

For what purpose is Firefox 3.6 offered, given that many sites using modern web technologies are incompatible with it?

The browser version in the current release of the Elbrus operating system is 23.0, which is far more advanced in functionality and performance. For example, the JetStream test now completes successfully, with a score of 7.8 points, not far below the 8.2 points achieved by the same version of Firefox in x86 binary translation mode, where a full-featured JIT compiler for JavaScript is used.

Version 31.0 was also tested, but it performed worse and slower, so it was decided not to release it. The next version to be ported will be 44.0.

Does the system have an implementation of domestic cryptographic algorithms (including current versions) available for programs in C / C ++ languages?

OpenSSL has now been replaced by its fork, LibreSSL, in which Russian cryptography is officially integrated.

How can we explain the poor performance of the Java virtual machine, demonstrated in various tests?

The OpenJDK 1.6.0 package was in a sense a first trial run; work on 1.7.0 and 1.8.0 is already in full swing, and there performance has been increased 3-4 times, judging by tests such as SPECjvm2008. In general, of course, there is still much to optimize.

Is it planned to port Mono or .NET within the Elbrus OS or another distribution kit?

Given the popularity of this technology, it is almost inevitable. But considering the ongoing changes in Microsoft's relationship with the community and the uncertain future of the Mono project, there is a natural desire to wait until the prospects become clearer, so as not to waste time on dead-end branches of development.

In the meantime, anyone who needs to run .NET applications can use the x86 binary translation mode.

Where and how can I get a detailed reference guide on architecture and a set of machine instructions?

For now, all documentation is sent on request. But there is a nuance: the instruction set is open, while the way instructions are encoded in the command word is confidential for historical reasons. Attempts have repeatedly been made to resolve the question of full disclosure of architectural details positively, but no decision has been reached so far.

Which classes of programs run most efficiently on the E2K architecture?

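The platform's main strengths include its floating-point units and the optimized EML (Elbrus math library); the LCC compiler can itself replace calls to ordinary functions with optimized ones.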

Another strong point is the large register file: a program can access up to 256 registers at any one time, with automatic renaming. This opens the way to very large-scale optimizations. For example, the well-known CNews review included the gostcrypt test (GOST 28147-89 encryption), run on the Elbrus-4C and the Core i7-2600.

How should source code in C/C++ and Fortran be optimized for E2K?

The recommendations cover, among other things, inline functions; compiler hints such as likely / unlikely and pragma loop count; and loop constructs (for, while, break).

Use optimized functions whenever possible, for example the aforementioned EML library. As already mentioned, the compiler itself is able to replace calls to ordinary functions with optimized ones, but it is not omnipotent, and it is better to do everything explicitly.

These methods and other subtleties are to be covered in more detail, with examples, in a separate article. MCST is well aware of how important it is to spread the "secret techniques" of extracting maximum performance from Elbrus among programmers, and plans to start spreading this knowledge as soon as the community and its infrastructure take shape.

Is there a ready-made set of C/C++ source code examples with memory access errors, to demonstrate how the protected program execution technology detects such errors at compile time and at run time?

Of course, such a set of programs exists, as part of the regression testing tools that run nightly. You can also use examples from the SAMATE collection of the US institute NIST. However, for clarity (a separate article on this topic is planned), it will probably be easier to write "one-liners" that each illustrate exactly one error.

Is writing an E2K backend for the LLVM compiler being considered, as an alternative to LCC, which aims to resemble GCC?

Of course, research in this direction has been carried out, but the verdict was rather negative: the Elbrus 2000 architecture is hard to describe optimally in LLVM. That is, an alternative compiler could be released, but the machine code it generated would lose to LCC in speed. Still, the direction is not considered a dead end; an LLVM backend may yet be implemented in time.

Can LCC output errors and warnings in the format used by GCC, so that development environments (for example, Qt Creator) recognize them?

This is not available at the moment, but a Bugzilla ticket has already been filed.

Where can I get the E2K cross-compilation toolkit for an x86 desktop? Is the reverse possible, generating x86 code from the Elbrus environment, and if so, with a special version of LCC or with the usual GCC?

E2K cross-compilation tools (i.e., LCC compiler running on x86 Linux) are available on request. The reverse process is not explicitly provided for: if this is necessary, you can run some x86 system on Elbrus in binary translation mode and use the compiler available there.

What virtualization technologies are supported on the Elbrus platform?

Right now there is no support at all. However, soon it will be possible to use containers.

Binary translation of x86 code

What features and limitations does binary translation have?

This topic deserves a separate article, but in short the picture is as follows. Translation is of two kinds: at the system level and at the application level. In the first case, the guest operating system is given access to the computer's entire hardware environment; in the second, only system calls are passed from the guest program to the kernel of the Linux host system. This can be compared with the qemu-system-x86_64 and qemu-i386 emulators, respectively; however, the translator does not emulate the guest processor but recompiles the guest machine code straight into native instructions of its own architecture. Moreover, the translation is performed repeatedly, with the optimization level gradually raised for the most frequently executed code sections, and the results are stored in a long-term cache.
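The repeated, progressively more optimized translation can be pictured with a toy model. This is only a sketch, not the translator's actual design: the class name, the execution-count thresholds, and the in-memory dictionary standing in for the long-term cache are all assumptions made for illustration. Three levels are used to echo the three optimization levels mentioned later in the text.

```python
class TieredTranslator:
    """Toy model: hot code regions are re-translated at higher optimization levels."""

    # Hypothetical thresholds: executions needed to reach levels 1, 2 and 3.
    THRESHOLDS = (1, 10, 100)

    def __init__(self):
        self.exec_counts = {}  # guest address -> number of executions seen
        self.cache = {}        # guest address -> level of the cached translation

    def execute(self, guest_addr: int) -> int:
        """Record one execution of a region and return the level it runs at."""
        count = self.exec_counts.get(guest_addr, 0) + 1
        self.exec_counts[guest_addr] = count

        # Highest level whose threshold has been reached.
        wanted = sum(1 for t in self.THRESHOLDS if count >= t)
        if self.cache.get(guest_addr, 0) < wanted:
            # A real translator would recompile the region here;
            # the model only records the new level in the cache.
            self.cache[guest_addr] = wanted
        return self.cache[guest_addr]


if __name__ == "__main__":
    translator = TieredTranslator()
    levels = [translator.execute(0x1000) for _ in range(100)]
    print(levels[0], levels[9], levels[99])  # prints 1 2 3
```

The point of the model is that rarely executed code pays only for the cheapest translation, while the cost of heavier optimization is spent only on regions that have proved to be hot.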

How is binary translation started at the system level?

Very simply: when starting the computer, select the flash card as the boot disk. If the card is empty, or the user has erased the translation system from it, it can be rewritten at any time by copying the image with the dd command.

How is binary translation started at the application level?

Even more simply: by launching the translator program and passing it the path to the guest application and the path to the root directory of the recreated environment. The whole question is how to obtain an image of that environment in the first place. So far only the x86 version of the Elbrus system is regularly supplied, but others are permitted. If a client finds it hard to build an image of the required system, MCST specialists can provide the necessary help.

From within a running guest application (for example, a command interpreter), the user can launch other guest programs completely transparently. Several copies of the translator can run at once, each in its own environment; so, for example, you can try the same browser version in different distributions or, conversely, different browser versions on the same system (a contrived example, of course, but it conveys the essence).

One can come across the statement that in the binary translation mode some benchmarks start working even faster than the ones originally compiled for E2K. For which classes of programs and under what conditions is this possible?

This is indeed possible, for example, when the native version of a JVM or JS engine can only interpret user code, whereas the x86 version has a full-featured JIT compiler. Even though translation then happens several times over (first the bytecode selected for optimization is compiled into x86 machine code, then after some time it is recompiled into E2K, three times, once for each optimization level), the final gain from compilation still outweighs the cost.

As for native programs in C/C++, there is a logical explanation here too, even two. First, although the LCC compiler does a titanic job of optimizing the generated code, no one can guarantee that some x86 compiler, especially a commercial one, will not do better in a particular case. Second, and more likely, a well-optimized x86 program was simply compiled with prior profiling, while LCC was fed bare source code without hints. Other things being equal, though, native programs should run at least no slower than translated ones; if they do not, a bug report should be sent to the LCC developers.

Performance measurement

According to MCST experts, some once-popular benchmarks cannot truly reveal the potential of any current platform. Take UnixBench: with all due respect to its venerable age, it is long outdated and equally unsuitable for any modern processor or operating system. Both of its processor tests, Whetstone and Dhrystone, barely parallelize and allow hardly any significant out-of-order execution, whether on architectures with explicit parallelism or with implicit parallelism. The rest of its tests measure nothing in particular; it is better to use something more specific instead. UnixBench's only advantage is that it is cross-platform, which is why it is still used today.

Nor should the mighty power of profiling be overlooked. For example, the 7-Zip results in the CNews review, which seemed suspiciously high to many, are not a hoax but a consequence of two-pass compilation. It is another question how useful such optimization is in general, that is, on arbitrary input data. For this reason it makes little sense to profile all the components of the Pgbench test, because on real data PostgreSQL performance may be completely different. In the specific case of 7-Zip, though, it is fairly easy to double-check: run another test with a collection of varied files as input. The only problem is that if the files are not standardized, others will not be able to repeat the test identically, and there will be even less confidence in the published results.
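One way around the repeatability problem is to generate the input files from a fixed seed and publish the seed together with a checksum. The sketch below is an assumption about how that could be done, not part of the CNews or MCST methodology; the function names, the sizes, and the mix of compressible and incompressible data are illustrative.

```python
import hashlib
import random


def make_corpus(seed: int, n_files: int = 4, size: int = 4096) -> list:
    """Build a deterministic list of byte strings of mixed compressibility."""
    rng = random.Random(seed)  # same seed -> same sequence of values
    files = []
    for i in range(n_files):
        if i % 2 == 0:
            # Incompressible-looking data: uniformly random bytes.
            data = bytes(rng.getrandbits(8) for _ in range(size))
        else:
            # Highly compressible data: one short random pattern, repeated.
            pattern = bytes(rng.getrandbits(8) for _ in range(16))
            data = (pattern * (size // len(pattern) + 1))[:size]
        files.append(data)
    return files


def corpus_digest(files: list) -> str:
    """SHA-256 over all files, so others can verify they rebuilt the same input."""
    h = hashlib.sha256()
    for data in files:
        h.update(len(data).to_bytes(8, "big"))  # length prefix avoids ambiguity
        h.update(data)
    return h.hexdigest()


if __name__ == "__main__":
    print(corpus_digest(make_corpus(seed=2016)))
```

Anyone repeating the measurement can regenerate the corpus from the published seed and compare digests before feeding the files to the archiver.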

One should bear in mind that synthetic benchmarks are often written with a specific architecture in mind (not least because their authors are used to thinking that way) or tuned to a specific combination of hardware and compiler. For example, the well-known SPECcpu test claims objectivity and impartiality, yet the source code of version 2006 contains comments saying that a particular workaround was added specifically for the Intel C++ Compiler. And how can one not suspect the influence of a large vendor when its products account for 90% of the 36.6 thousand published results?

Real applications are not always indicative either, because the most critical sections may be written entirely in x86 assembly or contain many assembly inserts and calls to special functions (intrinsics); OpenSSL is a good example. As a result, machine code polished to a shine is compared with a high-level language implementation whose main purpose is to be a reference, not to be optimal.

Hence a proposal to readers: let us think together about which tests, artificial or close to real life, could show how strong Elbrus is at the tasks relevant to it. These need not be ready-made programs, especially for mathematical calculations, because matrix multiplication is matrix multiplication anywhere: the complexity of the problem is the same whether it is performed with the optimized EML or BLAS/LAPACK libraries or with a self-written function. Leave your ideas in the comments.
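As a starting point for such a test, here is a minimal sketch of a self-written matrix multiplication with a timing harness. It is a plain triple loop in pure Python, chosen only to illustrate the idea; the function names are illustrative, and no results for Elbrus or any other platform are implied.

```python
import time


def matmul(a, b):
    """Multiply two matrices given as lists of rows (naive O(n^3) algorithm)."""
    n, k, m = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions do not match")
    result = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for p in range(k):
            a_ip = a[i][p]
            for j in range(m):
                result[i][j] += a_ip * b[p][j]
    return result


def time_matmul(n: int) -> float:
    """Return the seconds taken to multiply two n-by-n matrices."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    start = time.perf_counter()
    matmul(a, b)
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"{time_matmul(100):.3f} s for 100x100")
```

The same harness can wrap a call into an optimized library, so that the self-written and library versions are compared on identical inputs.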

The author thanks the MCST staff for their detailed and interesting explanations.

Source: https://habr.com/ru/post/391259/
