The cloud: what is it and why?

We recently launched the ABBYY Cloud OCR SDK service, which runs on the Windows Azure cloud, and gained a huge amount of experience in the process. For example, we learned that many people use the word “cloud” and have heard that “clouds are fashionable,” but very few understand what a cloud is and, most importantly, why you would do anything in the cloud. The word “cloud” is used everywhere and seems to have started accumulating urban legends.

See, for example, this video:

You won't lose much if you just focus on the fact that the blonde looks good and has a nice voice.

Let us look in detail at what a public cloud is, why it might make sense to run software in one, and whether it is true that “everything will soon be in the clouds”.

Unprecedented opportunities for your customers

To start with, how does a service “in the cloud” differ from a service “not in the cloud” from the client's point of view?

It is believed that a “cloud” service has a unique property: it is accessible to all users. Clouds have nothing to do with it. Our service runs in the cloud, yet to the user it looks like a regular website (some requests even return ordinary-looking web pages). Its user account area, for example, looks like a set of regular web pages.

For comparison, look at Stack Exchange (best known for the Stack Overflow site) or Yandex.Mail: to the user they look just the same. They are also available to any user from anywhere. There is also a web server there that accepts requests over HTTP, and there too it does not matter what operating system the client runs, what architecture its machine has, or what language its programs are written in.

You can come across claims that because a service is in the cloud, “users' data is available to them from anywhere.” Yes, users of our service can upload images from anywhere and fetch the results from anywhere as well. But users of Stack Exchange or Yandex.Mail can also work with those services from anywhere: ask questions, receive answers, send and receive email.

Functionally, a cloud service looks no different to the user. In the cloud or not, there is a server (usually a web server) at some IP address that receives and processes requests. If no settings restrict access to the server from specific IP address ranges and the client is not sitting behind a paranoid firewall, the service is available from anywhere and from any device. Clouds have no effect here.

Cloud Services for Cloud Services

It is also believed that a service is built in the cloud so that other services in the cloud can interact with it, something along the lines of “for use by cloud service developers”, as the authors of one press release recently put it. In particularly delusional presentations you can find pictures of a naively schematic cloud: this is the cloud, there are services in it, and they interact there.

Let's look at this from the point of view of our service. Its purpose is to be programmatically accessible from anywhere in the world, so that third-party developers whose programs lack OCR can build software that uses our service for recognition. For example, a smartphone app that photographs a receipt, extracts data from it, and saves that data to a budgeting app on the same smartphone. Captain Obvious points out: the smartphone is not in the cloud. Our service is not only for “cloud service developers”; it is for developers of any programs that are willing to use a third-party text recognition service. Whether those programs run in the cloud or not does not matter in principle, and our service simply does not care.

It is believed that a cloud service necessarily exists to serve numerous external requests. Usually yes, but not necessarily. Nothing stops you from running, say, the factoring of numbers into primes in your service, keeping the source data somewhere outside so that the service fetches it from there, and uploading the results to an external FTP server.

Cloud Architecture for Cloud Services

Next, it is believed that a service running in the cloud is built in a fundamentally different way, and that developing it requires a fundamentally different architecture compared to a service running outside the cloud. There are some differences, but they are secondary.

Imagine that you need to build a web service that receives images from a user, puts them in a queue for processing (because recognition takes some time), processes them, and afterwards gives the user a link to download the result. How would you do it? Most likely, you would create a “task” in internal storage (most likely a database) for each received image, give it a unique identifier, recognize the image in a separate thread or a separate process, and then, on the next “how is task such-and-such doing?” request, return a link to the result. This is a completely obvious architecture for such a service, and the cloud has nothing to do with it either.
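The architecture described above can be sketched in a few dozen lines. This is only an illustration under stated assumptions, not how our service is implemented: the names TaskStore, submit_image, get_task_status and recognize are made up, an in-memory dictionary stands in for the database, and the result link is a placeholder.

```python
import threading
import uuid


class TaskStore:
    """In-memory stand-in for the internal storage (a database in a real service)."""

    def __init__(self):
        self._tasks = {}
        self._lock = threading.Lock()

    def create(self):
        task_id = uuid.uuid4().hex  # unique identifier for the task
        with self._lock:
            self._tasks[task_id] = {"status": "queued", "result_url": None}
        return task_id

    def update(self, task_id, **fields):
        with self._lock:
            self._tasks[task_id].update(fields)

    def get(self, task_id):
        with self._lock:
            return dict(self._tasks[task_id])


def recognize(image_bytes):
    """Placeholder for the slow recognition step; returns a made-up result link."""
    return "https://example.invalid/results/%d" % len(image_bytes)


def submit_image(store, image_bytes):
    """Accept an image, record a task, and process it in a separate thread."""
    task_id = store.create()

    def worker():
        store.update(task_id, status="in_progress")
        url = recognize(image_bytes)
        store.update(task_id, status="completed", result_url=url)

    thread = threading.Thread(target=worker)
    thread.start()
    return task_id, thread


def get_task_status(store, task_id):
    """Answer the client's "how is task such-and-such doing?" request."""
    return store.get(task_id)


store = TaskStore()
task_id, thread = submit_image(store, b"fake image bytes")
thread.join()  # a real client would poll get_task_status instead of joining
print(get_task_status(store, task_id))
```

Nothing in this sketch knows or cares whether it runs in a cloud, which is exactly the point.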

It is believed that the cloud uses a “cloud operating system.” Usually this is just a tweaked “ordinary operating system.” In Windows Azure it is Windows Server 2008 R2 with a few screws tightened a bit more than usual (for example, the temporary folder is very small). All the “cloudiness” in such an environment comes from additional services, for example long-term data storage that is not tied to the machine on which the user's service is running.

Some time ago we wrote that FineReader Engine now supports running in Windows Azure. That change did not require rewriting all of FRE: the team took the platform's limitations into account, made some refinements, tested, updated the documentation, and committed to supporting it going forward. Painstaking and important work, but no more than that.

Unparalleled reliability

It is also believed that a cloud service is certainly more reliable, because behind it stands a cloud-cloud-cloud provider offering many nines after the decimal point. Nines are one thing, reliability is another.

First of all, you need to read the fine print in the agreement about the nines (the SLA, or Service Level Agreement). It states exactly what those nines mean, which specific features of the service they cover, and what the provider is liable for.

Usually the provider's liability is limited to the relatively small amount of money you paid it, whereas while your service is down your company can lose far more money and suffer damage to its reputation. Yes, the provider will be held liable, but that may not make you feel any better.

A similar real-life example: on average once a year the power in the building goes out for a second, and the computers restart. From the electricity supplier's point of view that is a measly second per year (how many nines is that?), but from your point of view it is several minutes of lost work for every employee, who has to wait for the OS to boot, start all the programs, and then remember where they left off. Plenty of nines, but that does not make things any easier for you.
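The “how many nines” question can be answered with a little arithmetic. The sketch below assumes a 365-day year and counts nines as the whole number of leading nines in the availability figure; the function names are made up for illustration.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000; leap years are ignored


def availability(downtime_seconds, period_seconds=SECONDS_PER_YEAR):
    """Fraction of the period during which the service was up."""
    return 1.0 - downtime_seconds / period_seconds


def count_nines(downtime_seconds, period_seconds=SECONDS_PER_YEAR):
    """Whole number of leading nines in the availability figure."""
    unavailability = downtime_seconds / period_seconds
    return math.floor(-math.log10(unavailability))


print("availability: %.10f" % availability(1))
print("nines:", count_nines(1))
```

One second of downtime per year works out to seven nines for the supplier, while the cost on your side is measured in minutes multiplied by the number of employees.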

The agreement may guarantee the availability of only certain specific services (for example, that the virtual machines running your software will be up and connected to the network). A situation can arise in which some seemingly secondary service, for example the one that manages these virtual machines, is unavailable for a long time: the machines keep running, but you cannot launch new ones or reconfigure them. And it just so happens that you needed to increase the capacity of your service a hundredfold to absorb the peak load from a very important and generously funded advertising campaign that has just begun. The provider has not even violated the agreement, because the agreement says nothing about this secondary service.

Being hosted in the cloud does not make a service guaranteed to be more reliable or less reliable. Nobody has abolished the risks; they are simply different risks.

So what is it?

Now that the obscurantism has subsided a little, let us return to the question of what a public cloud is. It is a remotely managed service that provides you with computing power and data storage on a pay-as-you-go basis. You use the computing power to run your software (your service) and the storage to keep the data that this software (your service) works with.

The level of control you have over the capacity provided can vary. For example, a virtual machine with a specific OS can be selected and assigned to you, with remote access so that you can set it up as you need, and then left at your disposal. Or (as in Windows Azure) you can upload a special package containing the executable code of your service and a configuration file that says “run this on 5 machines with 2 cores each”. The cloud infrastructure will then find suitable virtual machines by itself, deploy, start and configure the OS on them, deploy your package there, and pass control to the entry point (a fixed function like main()). It will also watch whether anything has broken, and if so restart your service on the same machine or (if the machine itself has failed) on another one. In the first case you have more control; in the second you get more perks.

What do you gain?

You gain flexibility and the ability to delegate duties. Do you need more machines running your service? A few mouse clicks and about 10 minutes of waiting, and new virtual machines have been found for you and your service has been started on them. Need fewer? Same thing.

The same goes for storage. You need storage: a few mouse clicks, and it has been allocated and you have been given its address and access keys. The storage is usually elastic, and payment depends on the volume actually used.

The provider can also, for example, offer a database server, likewise located “somewhere” and likewise paid for according to the volume used. In Windows Azure this is SQL Azure, based on a specially configured and modified SQL Server 2008.

Need to try out a new feature, but there is a risk of breaking the service? No problem. You create another storage account and another database, configure your service to use them, and deploy it on additionally allocated virtual machines. Once you have tried it out, you release the machines, and if the storage and the database hold a lot of data, you can delete them too so as not to pay for them.

Our automated build ends by deploying our service straight to the cloud, onto a virtual machine allocated for this purpose, and running the tests there. The machine is allocated anew for every build and released afterwards, so on weekends and at night, when nobody is changing the code, we do not pay for it. The code is tested in exactly the same environment in which it will later run.

This flexibility is very convenient. It is the bright side of the cloud, the thing that makes it valuable in the first place. Need it? Rent it. Don't need it? Stop renting. Both take a few mouse clicks (or an API request) and a fairly short wait.

This is convenient for companies of any size. You do not have to push the purchase of every piece of hardware through the accounting department or buy equipment in reserve, and you can achieve much less downtime and much more flexibility in management.

On top of that, you shift part of your duties to the provider. You no longer buy servers, assemble racks, or hook up power; you do not need space for equipment; you may not even have to configure the OS (depending on the cloud). Please note that we are talking about shifting duties, not responsibility; more on this below.

As usual, there is a dark side.

The dark side of the cloud is that there are many things you cannot influence. According to the Stack Exchange team's blog, their service runs not in the cloud but on their own hardware, precisely because they are not satisfied with the level of control that cloud providers offer.

For example, the virtual machines are standardized, and you may not even know the characteristics of the real hardware. Most likely, when you deploy a service to a single one-core node in Windows Azure, you are actually given a virtual machine running on some 16-core server under Hyper-V. Perhaps something there could be tweaked to get a 15% performance boost out of nowhere, but you have no way of doing it.

If you are paranoid, or bound by strict legal or contractual requirements, you may not be happy about having very little control over the hardware. For example, you uploaded documents containing trade secrets, they were copied onto a bunch of hard drives, and you have no way of ensuring their guaranteed deletion. Yes, the provider promises it, but you cannot verify it.

The same goes for reliability. You cannot be sure that at some point, for example, condensate will not pour onto the racks from a pipe that has come loose from the air conditioning system. If your server were in the office or in colocation, you could do something about it, even something that looks insane, such as arranging drainage in the space above your equipment. Here you can do nothing: you have no control over where the equipment stands, whether it is well secured, or whether mice are running across it. All the crazy events you could have foreseen (or failed to foresee and then felt remorse over a job not done well) are now completely out of your control.

Crazy events come in many varieties. Here are examples of real data center failures.

FAIL. A car crashed into a power line pylon next to the data center; the high-voltage wires leading to the substation that supplies the data center snapped and fell to the ground. The switch to backup power began. Current flowed from the fallen wires into the ground, the protective circuits in the data center reacted to the ground leakage, and they disconnected the entire data center.

Another FAIL. Presumably because of a lightning strike, the transformer supplying the data center failed, and the switch to backup power began. For some reason the generators could not be synchronized (most likely the equipment that performs the synchronization had no power), so the data center could not switch to backup power and all the equipment shut down.

Note that we know about these cases because they affected hundreds or thousands of cloud users. We simply do not know how many similar events happen to servers located in offices.

Of course, something similar can happen to servers in the office, but in that case part of the blame is yours: you could have foreseen it and did not, and you will be ashamed of a job poorly done. When the equipment is “somewhere out there”, you do not have that option; you have to trust the provider.

There is nothing wrong with this; you just need to understand it clearly. By placing a service in the cloud, you hand over a significant share of the duties to the provider, but not the responsibility for keeping your service alive. Cloud does not automatically mean more reliable, nor does it automatically mean less reliable. You still need a risk assessment, and critical services need duplication across different data centers and load redistribution. It may well turn out that once you account for all the costs of duplicating and synchronizing data between data centers, the price tag will upset you.

Cloud Architecture for Cloud Services, Again

Finally, a word about the special requirements for cloud services. Such requirements do exist: you need to be ready for anything to break at any moment. If you like extremes, you can do what Netflix did and build a service that breaks something in your service at arbitrary moments. You especially need to be prepared for occasional short-lived failures. For example, the connection to SQL Azure will sometimes disappear for a while; your code should not panic and fall over, but wait a bit and try again.
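The “wait a bit and try again” behavior might look roughly like the sketch below. It is an illustration rather than code from our service: TransientError and call_with_retries are hypothetical names, and the delays and number of attempts are arbitrary. A real service would catch whatever exception its database driver raises for dropped connections.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a dropped connection)."""


def call_with_retries(operation, attempts=5, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep):
    """Run operation(); on a transient failure wait and retry, up to `attempts` tries."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the failure
            # Exponential backoff with a little jitter so that many nodes
            # do not all retry at the same instant.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 2))


calls = {"count": 0}


def flaky_query():
    """Simulated query that fails twice and then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("connection lost")
    return "rows"


# The sleep is replaced with a no-op so the demonstration finishes instantly.
print(call_with_retries(flaky_query, sleep=lambda seconds: None))
```

Only failures known to be temporary are retried here; anything else propagates immediately, since retrying a genuine bug only hides it.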

Just remember what usually annoys users in programs: all sorts of “could not find the server, here are 18 things to check” messages. In a distributed system such failures are absolutely normal, and your service should try to cope with them on its own, retrying several times. After a “no response from server” message in the browser, the user usually just presses F5, so your service should likewise simply try to repeat the action. For this it is important that repeating any action does no harm; the buzzword for this is idempotency. If you do not take this into account, your service will fail at the most inopportune moment because of some trifle.
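One common way to make an action safe to repeat is to attach a client-chosen key to each request and remember what was done for that key. The sketch below assumes this approach; TaskService and request_key are made-up names, and a dictionary stands in for persistent storage.

```python
class TaskService:
    """Toy service in which creating a task is idempotent per request key."""

    def __init__(self):
        self._task_by_request = {}  # stand-in for a database table
        self._next_id = 1

    def create_task(self, request_key, image_name):
        # A repeated request with the same key returns the existing task
        # instead of creating (and billing for) a duplicate.
        if request_key in self._task_by_request:
            return self._task_by_request[request_key]
        task = {"id": self._next_id, "image": image_name}
        self._next_id += 1
        self._task_by_request[request_key] = task
        return task

    def task_count(self):
        return len(self._task_by_request)


service = TaskService()
first = service.create_task("request-1", "receipt.png")
retry = service.create_task("request-1", "receipt.png")  # e.g. after a timeout
print(first == retry, service.task_count())
```

With this in place, a client that never received the first response can simply send the same request again without any harm.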

Similarly, the service should be ready to be stopped at any time, on all nodes or on some of them, and then restarted. Data must not be corrupted, at most the most recent data may be lost, and after the restart the service should be able to carry on as if nothing had happened. This happens, for example, during the automatic installation of software updates in Windows Azure: nodes are stopped one by one, and then the service is started on a node with the software already updated.
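One way to survive being stopped at an arbitrary moment is to record progress after every unit of work and to write that record atomically. The sketch below assumes this approach and uses a local JSON file as a stand-in for durable storage outside the machine; the function names and the stop_after parameter (which simulates a node being stopped) are made up for illustration.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically: a crash leaves either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)  # atomic rename over the previous checkpoint
    except BaseException:
        os.unlink(tmp_path)
        raise


def load_checkpoint(path):
    """Return the saved state, or a fresh one if nothing was saved yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"next_index": 0}


def process_all(items, path, handle, stop_after=None):
    """Process items in order, recording progress after each one."""
    state = load_checkpoint(path)
    processed = 0
    for index in range(state["next_index"], len(items)):
        if stop_after is not None and processed >= stop_after:
            return False  # simulates the node being stopped mid-run
        handle(items[index])
        save_checkpoint(path, {"next_index": index + 1})
        processed += 1
    return True
```

If the process dies between handling an item and saving the checkpoint, that one item is handled again after the restart, which is why the repeated action must do no harm.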

The requirements are substantial but achievable; it is just that Murphy will visit your service often. Whether a small FAIL turns into an epic failure depends on you.

A cloud is not a bunch of words like “scalable”, “accessibility”, “migration”, “performance” and “trend” scattered at random through marketing copy. It is just a model of owning computing power. In certain cases this model is very convenient.

By the way, we have a service for developers that runs in the cloud.

Dmitry Mescheryakov,
product department for developers

Source: https://habr.com/ru/post/140708/
