
When planning the development of a cloud service, many developers wonder what exactly they need to do and how a cloud service differs from a "usual" one.
Descriptions of clouds and cloud solutions very often mention elasticity. Amazon even named its cloud Elastic Compute Cloud (EC2). Besides being a beautiful word used in cloud marketing, "elasticity" has a quite definite meaning: the ability to rent computing resources, pay only for actual use, and start or stop the lease at any time.
This is very convenient for services with variable load: as the load changes, they can scale by increasing or decreasing the number of nodes, giving users acceptable request processing times and saving the owners money.
That is the theory. In practice, getting from a beautiful word down to business is not always easy.
Everything is fine while we are talking about showing a simple "Hello World" service at the next conference. "So, we have uploaded the package with the service to the cloud, you need to wait a bit for it to start up, and in the meantime I'll tell you about some other utilities, blah blah blah... aaand there we go, look, the service is running."
Now welcome to the real world. In the real world, keeping a "Hello World" service in the cloud may not be enough for you. You may want to do something more complicated, for example, something like
our Cloud OCR SDK service. The service itself is simple, but it performs optical character recognition, and for that it carries along an almost complete FineReader Engine distribution (minus the help and the installer) of about 750 megabytes. That is how it goes in the real world: to do something useful, you sometimes have to use something bulky.
Since we use the PaaS model of Windows Azure, each service node has to fetch this FRE distribution from somewhere on startup, deploy it, and configure it. The logical solution is to put the distribution in Blob Storage. The copying process should be as reliable as possible, so it is a good idea to reduce the number of files and avoid a sprawling directory structure. The first thing that comes to mind is the good old ZIP.
The cunning plan looks like this (a sketch of the first two steps follows the list):
- download the archive
- unpack the archive
- deploy and configure the FRE
- ???
- GET TO WORK
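To make the first two steps concrete, here is a minimal sketch of what a node would do on startup. It is written in Python with the azure-storage-blob package purely for illustration; our service is a .NET worker role and does not use this code, and the connection string, container, and blob names below are made up.

```python
import zipfile
from azure.storage.blob import BlobClient

# Hypothetical names: the real connection string, container, and blob differ.
CONNECTION_STRING = "<storage account connection string>"
CONTAINER = "distributions"
BLOB_NAME = "fre-distribution.zip"

def fetch_and_unpack(target_dir: str) -> None:
    """Download the FRE archive from Blob Storage and unpack it locally."""
    blob = BlobClient.from_connection_string(
        CONNECTION_STRING, container_name=CONTAINER, blob_name=BLOB_NAME
    )

    archive_path = "fre-distribution.zip"
    with open(archive_path, "wb") as f:
        # Stream the blob straight to local disk instead of holding ~750 MB in memory.
        blob.download_blob().readinto(f)

    # This is the step that hurts on a throttled cloud disk:
    # every file is read from the archive and written back out again.
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)

fetch_and_unpack(r"C:\FRE")  # hypothetical target directory
```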
There is one problem: deploying FRE takes quite a long time, all the power of cloud data centers notwithstanding. The reason is that a virtual machine is not alone on its physical machine; resources have to be shared, so its disk bandwidth is deliberately throttled. Because of this, unpacking the archive takes quite a lot of time (read, write, read, write, repeat the necessary number of times).
A reader raised in the old cloudless traditions is already getting ready to go to the comments and "explain" that node startup time is not that important because startup is rare. Not so fast.
Here lies an important difference between a "usual" service and a cloud one.
In a "usual" service, startup and regular operation are two different phases of life: first it starts, then it runs. In a cloud service, launching additional nodes is part of normal operation. More users arrive, and the service scales out by launching additional nodes; the number of users drops noticeably, and some of the nodes are stopped and returned to the cloud.
However well you predict the load, you usually do not control it. At some point an unpredictable surge may hit, and additional nodes will have to be launched. As fast as possible. The cloud is elastic, remember?
Every extra 10 seconds spent on startup is 10 seconds of a freshly launched node sitting idle and useless. During those 10 seconds the node could be doing something useful. Moreover, this is very often exactly the time when the load on the service spikes, and this node could be handling that load, but it is not ready yet because it is busy with something "really important", like unpacking an archive.
And that is not all. When the load surges, you may need to launch not one new node but many. Then all of the newly launched nodes waste those 10 seconds from the previous paragraph at once. Everyone seems busy, yet there is no benefit.
Every second spent on something "really important" is spent unproductively at the very moment when the node's useful work is most in demand.
In our case we obviously need to speed up the FRE deployment as much as possible. One way is to use a virtual hard disk. If you create a disk, format it as NTFS, and enable the built-in data compression, the disk image comes out at about 600 megabytes. The image can be put into Blob Storage; on startup, the service node downloads it and simply mounts it. This cuts deployment time several times over.
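A sketch of the idea: once the compressed NTFS image has been downloaded to the node (for instance with the same kind of blob download as above), attaching it on Windows can be done with the standard diskpart utility. This is an illustration of the general technique, not our production code; the local path is hypothetical, and a drive can also be mounted through the platform's own mechanisms. Mounting avoids the read-write unpack loop entirely, since NTFS decompresses the data transparently as files are read.

```python
import subprocess
import tempfile

def attach_vhd(vhd_path: str) -> None:
    """Attach a downloaded VHD using the standard Windows diskpart tool."""
    # diskpart reads its commands from a script file.
    script = f"select vdisk file={vhd_path}\nattach vdisk\n"
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write(script)
        script_path = f.name

    # diskpart /s runs the script; this requires administrator rights.
    subprocess.run(["diskpart", "/s", script_path], check=True)

attach_vhd(r"C:\fre-image.vhd")  # hypothetical local path of the downloaded image
```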
For a service to take full advantage of the cloud's elasticity, extra effort is needed during development so that new nodes can be launched as quickly as possible. Otherwise elasticity remains nothing more than a beautiful word.
Dmitry Mescheryakov,
product department for developers