Experience using AWS ECS in our infrastructure

In this article I would like to share our experience of using AWS ECS in our infrastructure, discuss the pros and cons of the product, and describe how we solved the problems we ran into. Let's start with the definition:

Amazon EC2 Container Service is a high-performance, highly scalable container management service.

In essence, ECS is Amazon's attempt to enter the container management market, where Kubernetes, Mesos/Marathon, Docker Swarm and others already exist. Unlike them, however, Amazon provides a hosted service with an API, so the closest analog is Google Container Engine (aka Kubernetes-as-a-service). Note that ECS itself is free: you pay only for the EC2 instances.

Working with ECS looks like this:

Create several EC2 instances, install the ECS agent on them and join them into a cluster.
Create a file describing the containers and how they should be run (number of instances, deployment strategy, environment variables, etc.).
Send this file to the cluster through the API.
ECS chooses which instances to launch the containers on and restarts them if they go down.
???
PROFIT
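The steps above map roughly onto two API calls. Below is a minimal sketch, assuming a client that behaves like boto3's ECS client (register_task_definition and create_service); the function name deploy, the CPU and memory values and every name passed in are placeholders for illustration, not our actual tooling.

```python
def deploy(ecs, cluster, name, image, count):
    """Register a task definition and start a service from it.

    `ecs` is expected to behave like boto3.client("ecs"); all names passed
    in are placeholders chosen for this illustration.
    """
    # Step 2: describe the container and how it should be run.
    task_def = ecs.register_task_definition(
        family=name,
        containerDefinitions=[{
            "name": name,
            "image": image,
            "cpu": 128,
            "memory": 256,
            "essential": True,
        }],
    )
    arn = task_def["taskDefinition"]["taskDefinitionArn"]

    # Steps 3-4: hand the description to the cluster; ECS picks the
    # instances and keeps `count` copies of the container running.
    return ecs.create_service(
        cluster=cluster,
        serviceName=name,
        taskDefinition=arn,
        desiredCount=count,
    )
```

With boto3 installed and credentials configured, a call could look like deploy(boto3.client("ecs", region_name="eu-central-1"), "ecs-cluster", "demo-service", "test-image:4.2", 2). Passing the client in as an argument keeps the function easy to try against a stub.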

For readers who have already used Kubernetes, this scheme will seem very familiar. In addition to working through the API, the same actions can be performed via the web interface. Visually in the diagram:

The problem of balancing and service discovery

It is usually not enough to launch containers; you also need to reach them from outside. To do this, you need to know on which instance and on which port a container is running. To solve this problem, AWS offers ELB integration: you specify a static port for your container and attach the cluster's instances to an ELB on this port. This approach is used in the Empire PaaS.
Note: I did not look at the new AWS ALB, maybe everything is different there.

This method is not without flaws. For example, we have to pay for a separate ELB for each type of service, even if it receives no traffic. The problem is especially acute if we plan to create services dynamically. Also, we cannot run more than one copy of a service on a single instance because of port conflicts. All this leads to inefficient resource utilization and inconvenience. This is not our way.

After looking at service discovery tools, we chose Consul from HashiCorp. To register containers in Consul, we use registrator. So, in addition to the ECS agent, we also install the Consul agent and registrator. In the diagram:

After the ECS agent starts a container on a random port, registrator receives a notification through the Docker API and registers the service in Consul. Registrator lets you use special environment variables to define a health check for the service in Consul, as well as to specify tags.
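As an illustration of those environment variables, here is a small sketch that builds them for one container. It assumes registrator's documented SERVICE_* metadata variables and its Consul health check options; the helper name registrator_env, the health path, the interval and the "urlprefix-" tag value are hypothetical defaults picked for the example.

```python
def registrator_env(name, route_prefix, health_path="/health", interval="10s"):
    """Build container environment variables for registrator to read.

    SERVICE_NAME, SERVICE_TAGS, SERVICE_CHECK_HTTP and SERVICE_CHECK_INTERVAL
    are assumed to follow registrator's documented conventions; the default
    health path and interval are illustrative only.
    """
    return {
        "SERVICE_NAME": name,                         # service name in Consul
        "SERVICE_TAGS": "urlprefix-" + route_prefix,  # tag for the router to read
        "SERVICE_CHECK_HTTP": health_path,            # HTTP health check path
        "SERVICE_CHECK_INTERVAL": interval,           # how often Consul checks
    }


env = registrator_env("demo-service", "/demo")
```

These variables are set on the application container itself, so they can travel in the same service description as the rest of its environment.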

Next, we needed to balance traffic between the services registered in Consul, and we decided to try fabio, whose main advantage is that it needs no configuration. Fabio configures itself by reading the list of services from Consul and builds a routing table from their tags. When a service changes status or a new one is added, fabio rebuilds the routing table immediately. Fabio also has a web interface that shows the current routes and the amount of traffic directed to each container.
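To make the idea of tag-driven routing concrete, here is a toy sketch of building such a table. It is not fabio's implementation: the input format (dicts with "address", "port", "tags" and "healthy" keys) is a hypothetical stand-in for a Consul catalog query, and only the "urlprefix-" tag convention is taken from fabio.

```python
from collections import defaultdict


def build_routes(services):
    """Map each route prefix to the addresses of healthy instances.

    `services` is a list of dicts with hypothetical keys "address", "port",
    "tags" and "healthy", standing in for what Consul would return.
    """
    routes = defaultdict(list)
    for svc in services:
        if not svc["healthy"]:
            continue  # unhealthy instances receive no traffic
        for tag in svc["tags"]:
            if tag.startswith("urlprefix-"):
                prefix = tag[len("urlprefix-"):]
                routes[prefix].append(f'{svc["address"]}:{svc["port"]}')
    return dict(routes)
```

Rebuilding the whole table from the current service list on every change is the simplest way to keep it consistent, and it matches the "no configuration" behavior described above.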

In a simplified scheme, you can see that we managed to use only 1 ELB for the entire cluster:

Services

In ECS we deploy standard stateless 12-factor applications packaged in Docker containers. We use our own wrapper over the AWS API for ECS to make deploying services easier and more convenient. Instead of the JSON that ECS accepts, we use a simplified YAML format reminiscent of Kubernetes:

demo-service.yml

---
- name: demo-service
  image: test-image
  version: 4.2
  region: eu-central-1
  cluster: ecs-cluster
  cpu: 128
  memory: 256
  port: 8080
  instances: 2
  custom_env:
    SOMEVAR: somevalue

This file lives in the repository of every service that is deployed to ECS. You can compare it with the JSON that AWS accepts. Deployment requires only the wrapper and an AWS API key.
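For comparison, here is a rough sketch of how such a simplified description could be translated into the fields of an ECS task definition. This is a guess at the mapping, not the code of our wrapper: the function name to_task_definition is hypothetical, and the description is given as an already parsed Python dict so that no YAML library is needed.

```python
def to_task_definition(service):
    """Translate the simplified description into ECS task-definition fields."""
    container = {
        "name": service["name"],
        "image": f'{service["image"]}:{service["version"]}',
        "cpu": service["cpu"],
        "memory": service["memory"],
        "essential": True,
        # hostPort 0 asks ECS for a dynamic host port (bridge networking).
        "portMappings": [{"containerPort": service["port"], "hostPort": 0}],
        "environment": [
            {"name": key, "value": str(value)}
            for key, value in sorted(service.get("custom_env", {}).items())
        ],
    }
    return {"family": service["name"], "containerDefinitions": [container]}


# The demo-service.yml example, as it would look after parsing.
demo = {
    "name": "demo-service", "image": "test-image", "version": "4.2",
    "region": "eu-central-1", "cluster": "ecs-cluster",
    "cpu": 128, "memory": 256, "port": 8080, "instances": 2,
    "custom_env": {"SOMEVAR": "somevalue"},
}
task_definition = to_task_definition(demo)
```

The region, cluster and instances fields do not belong to the task definition itself; in this sketch they would be used when creating the API client and the service.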

We also integrated ECS with GitHub via Jenkins, so every pull request is deployed to ECS when it is opened. For developers it looks like this:

So during code review everyone can immediately check the proposed changes in the running service and run integration tests. And being able to run the application "in the cloud" right away is quite nice.

Pros

In my opinion, ECS copes well with the task of orchestrating containers, and you can live with it. Among the obvious advantages I would name the following:

Free of charge: you pay only for EC2.
A hosted solution, so you do not need to think about its fault tolerance.
Low entry threshold: everything can be “clicked” through the web interface.
Integration with EC2: ECS knows immediately when an instance becomes unavailable.
Integration with CloudWatch and Auto Scaling groups, which lets you adjust the number of EC2 instances in the ECS cluster depending on the number of running containers.

Cons

And now for what we are missing and why I will use Kubernetes for the next project.

Vendor lock-in to AWS.
The slow development of ECS as a product and of its integration with other services. For example, in eu-central-1 there is still no integration with OpsWorks.
The AWS SDK offers no direct way to check the exit code of a container when starting a one-off task. Kubernetes has one.
Although ECS is said to be aware of availability zones (AZs), the distribution of containers between AZs is not what you would expect. For example, if one AZ becomes unavailable, all containers are moved to another, but after the AZ comes back your containers are not moved back. I also observed oddities during deployment, when the whole service ended up in just one AZ because that was where the instances with the most resources were.
ECS knows nothing about container health checks: a service is considered running as soon as its container has started. Kubernetes has health checks.
If you want to run a stateful application, ECS is not for you. Kubernetes, by contrast, already has decent support, and the situation improves with each release.
A proprietary product to which you cannot send a patch.
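The exit code point from the list above can be worked around by polling. The sketch below assumes a client that behaves like boto3's ECS client, whose describe_tasks response carries lastStatus for the task and exitCode for each container; the function name wait_for_exit_code, the polling interval and the single-container assumption are illustrative.

```python
import time


def wait_for_exit_code(ecs, cluster, task_arn, poll_seconds=5, max_polls=60):
    """Poll describe_tasks until the task stops, then return its exit code.

    `ecs` is expected to behave like boto3.client("ecs"). Returns None if the
    task does not stop in time or the container reports no exit code.
    """
    for _ in range(max_polls):
        response = ecs.describe_tasks(cluster=cluster, tasks=[task_arn])
        task = response["tasks"][0]
        if task["lastStatus"] == "STOPPED":
            # Assumption: the one-off task has a single container.
            return task["containers"][0].get("exitCode")
        time.sleep(poll_seconds)
    return None
```

The task ARN would come from the response of the call that started the one-off task. This is more code than a one-off job ought to need, which is exactly the complaint.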

Conclusion

When we started using ECS (at the end of 2015), it compared very well with its competitors. Over time, however, ECS has failed to keep up with the market, and in my view the product has little chance of success in the future. You can, for example, see where the most active developer of ecs-agent commits now. ECS suits those who are just beginning to explore container orchestration, but if you are planning a serious long-term project, look towards Kubernetes or Mesos/Marathon.

Source: https://habr.com/ru/post/311976/
