The title of this post is a direct reference to the “Latency Numbers Every Programmer Should Know” chart. Several versions of this chart are in circulation, and it is difficult to establish the original author. Some attribute it to Jeff Dean.
If you are working on a project that is meant to reach a large scale, you have to balance several concerns. What assumptions are you making, and how can you validate them? How can you get to market quickly? Will your design support the expected scale?
One of the concerns with scaling is the cost of infrastructure. Cloud providers let you provision thousands of CPUs and store terabytes of data with one click. But this is expensive, and a cost that is negligible for a few thousand users can become a huge hole in the budget once you reach millions of users.
In this article, I will list some reference numbers that are useful to keep in mind when considering architecture. These numbers are not intended for accurate budget estimates. They help determine whether your design makes sense or whether it goes beyond what you can afford. Therefore, we consider the orders of magnitude and relative values, not absolute values.
Also note that your company may receive discounts from AWS, and this can make a huge difference.
Compute
What does a CPU cost today? Using the wonderful ec2instances.info interface, I collected median vCPU prices.
You can get the raw data from the GitHub repository. I copied it and processed it with a Python script, which is also available on GitHub. All prices are for the eu-west-1 region.
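To make the processing step described above more concrete, here is a minimal sketch of how a median price per vCPU could be computed. It is not the script mentioned in the text. The record layout (`instance_type`, `vcpu`, `hourly_usd`) and every price in it are hypothetical placeholders, not the real schema of the raw data or actual AWS prices.

```python
from statistics import median

# Hypothetical records: field names and prices are placeholders,
# not the real schema of the raw data or actual AWS prices.
SAMPLE_INSTANCES = [
    {"instance_type": "example.small", "vcpu": 1, "hourly_usd": 0.05},
    {"instance_type": "example.large", "vcpu": 2, "hourly_usd": 0.12},
    {"instance_type": "example.xlarge", "vcpu": 4, "hourly_usd": 0.20},
    {"instance_type": "example.2xlarge", "vcpu": 8, "hourly_usd": 0.48},
]


def median_price_per_vcpu(instances):
    """Return the median hourly price of one vCPU across instance types."""
    per_vcpu = [
        inst["hourly_usd"] / inst["vcpu"]
        for inst in instances
        if inst["vcpu"] > 0  # skip malformed rows
    ]
    if not per_vcpu:
        raise ValueError("no usable instance records")
    return median(per_vcpu)


if __name__ == "__main__":
    print(f"median $/vCPU/hour: {median_price_per_vcpu(SAMPLE_INSTANCES):.4f}")
```

Using the median rather than the mean keeps a handful of very large or specialised instance types from skewing the reference number.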
I estimated spot prices from what various users have reported. Because spot prices change throughout the day, I could not find a reliable data source for them.
AWS expresses the computing power of its machines in Elastic Compute Units (ECUs), and 4 ECUs correspond roughly to the power of one modern processor core. The prices here are therefore given per processor core, not per instance.
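The normalisation described above can be sketched as follows. The only figure taken from the text is the rule of thumb of about 4 ECUs per modern core. The function name and the sample instance are hypothetical and serve only as illustration.

```python
ECUS_PER_CORE = 4  # rule of thumb from the text: ~4 ECUs per modern core


def price_per_core_hour(instance_hourly_usd, instance_ecus):
    """Convert an instance's hourly price into a price per core-equivalent."""
    if instance_ecus <= 0:
        raise ValueError("instance_ecus must be positive")
    core_equivalents = instance_ecus / ECUS_PER_CORE
    return instance_hourly_usd / core_equivalents


# Hypothetical instance: 16 ECUs at $0.40/hour (placeholder numbers).
print(price_per_core_hour(0.40, 16))
```

Dividing by core-equivalents rather than by the advertised vCPU count makes instance families of different generations roughly comparable, at the cost of relying on an approximate conversion factor.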
Here is the price of 1 ECU, in dollars per hour, across all the instance types I evaluated:

And this is how on-demand prices compare with one- and three-year reservations (paid upfront):

Storage
So you want low latency and high throughput, and you plan to store everything in Redis? Then, on top of the CPU costs, you will need to pay for RAM.
I used the same approach to get the median price of 1 GB of RAM on EC2. ElastiCache costs about twice as much as on-demand EC2, but prices drop quite quickly with reserved instances.
Although this is the raw storage cost, you also need to look at how your data is used. How many CPUs do you need to keep it running 24/7?
The same goes for S3: how much will you pay for read and write requests? I have seen workloads where the S3 storage cost was negligible, but the cost of writing a large number of objects to S3 pushed the team to build its own file system on top of S3.
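A back-of-the-envelope sketch shows how request costs can dwarf storage costs for a workload made of many small objects. All unit prices and workload figures below are hypothetical placeholders, so check the current price list for your region before relying on any of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectStorePrices:
    """Placeholder unit prices; look up the real ones for your region."""
    usd_per_gb_month: float
    usd_per_1000_writes: float
    usd_per_1000_reads: float


def monthly_cost(prices, stored_gb, writes, reads):
    """Split a monthly bill into storage and request components."""
    storage = stored_gb * prices.usd_per_gb_month
    requests = (
        writes / 1000 * prices.usd_per_1000_writes
        + reads / 1000 * prices.usd_per_1000_reads
    )
    return {"storage": storage, "requests": requests}


# Hypothetical workload: 1 billion tiny objects written per month, 100 GB kept.
prices = ObjectStorePrices(0.02, 0.005, 0.0004)
bill = monthly_cost(prices, stored_gb=100, writes=1_000_000_000, reads=0)
print(bill)
```

With these placeholder numbers the request component is several orders of magnitude larger than the storage component, which is the kind of imbalance that can justify batching small objects into larger ones.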
Transfer
A few comments on Hacker News pointed out that I had left out the cost of data transfer. Indeed, if you serve data to end users or need cross-region replication, you need to take these costs into account.
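As a rough sketch of how transfer costs grow with the user base, the snippet below multiplies a per-user traffic estimate by a per-GB price. The traffic per user and the price per GB are hypothetical placeholders, and real price lists are usually tiered by volume, which this sketch ignores.

```python
def monthly_transfer_cost(users, gb_per_user, usd_per_gb):
    """Estimate monthly outbound transfer cost with a flat per-GB price."""
    return users * gb_per_user * usd_per_gb


# Placeholder assumptions: 0.5 GB served per user per month at $0.09/GB.
for users in (10_000, 1_000_000, 100_000_000):
    cost = monthly_transfer_cost(users, gb_per_user=0.5, usd_per_gb=0.09)
    print(f"{users:>11,} users -> ${cost:,.0f}/month")
```

The point is the order of magnitude: a line item that is easy to ignore at ten thousand users becomes a significant part of the budget at a hundred million.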