A year ago, Amazon launched a new cloud product, Elastic File System, in some regions. This summer, the "novelty" finally reached Europe and Russia. We briefly asked AWS expert Corey Quinn why this service is needed at all, what its key features are, and where it is absolutely not suitable.
About the speaker
Corey Quinn: a seasoned speaker at many DevOps conferences and an AWS specialist, known for the lastweekinaws.com newsletter. Engineering manager and cloud strategist.
About the service
Elastic File System is file storage that simplifies working with the cloud for applications designed to interact with a regular file system. At the same time, because it is a cloud service, its capacity can be adjusted flexibly.
- Who is the EFS service aimed at?
Corey Quinn: In general, Amazon EFS aims to replace NFS (network file system) tooling. Since Amazon Web Services (AWS) still does not allow using NetApp in the us-east-1 data center, users historically had to run their own NFS implementation.
Where the application architecture requires data to be shared, this was until recently the only option.
- Unlike the existing Amazon S3 service, which gives users a cloud interface for accessing storage from anywhere in the world, EFS supports file locking and other aspects of "classic" file systems. Are there problems that EFS can solve but S3, for example, cannot?
Corey Quinn: The great advantage of Amazon EFS over Amazon S3 is that you don't need to rewrite your legacy applications to work with new object storage concepts. You can simply use NFS as you always have.
A great example is WordPress. Instead of teaching the console, the various nodes, and other components to interact properly with Amazon S3, you can use one mounted volume that just works out of the box, without any changes to the application. In all other respects, let's be frank, EFS is terrible.
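To make the file-locking point concrete, here is a minimal sketch of the kind of code a legacy application might already contain: several nodes appending to one shared file under an exclusive POSIX lock. Nothing in it is specific to EFS. The function name `append_with_lock` and the file names are hypothetical, and a temporary directory stands in for a real mount point. The same pattern has no direct equivalent in the S3 object API, which is why such applications would need rewriting.

```python
import fcntl
import os
import tempfile


def append_with_lock(path: str, line: str) -> None:
    """Append one line to a shared file while holding an exclusive POSIX lock."""
    with open(path, "a", encoding="utf-8") as handle:
        # Blocks until no other process holds a conflicting lock on this file.
        fcntl.lockf(handle, fcntl.LOCK_EX)
        try:
            handle.write(line + "\n")
            handle.flush()
            os.fsync(handle.fileno())  # push the write to the backing storage
        finally:
            fcntl.lockf(handle, fcntl.LOCK_UN)


if __name__ == "__main__":
    # Assumption: in production this directory would be an NFS mount
    # (for example a hypothetical /mnt/efs); a temp directory is used here.
    with tempfile.TemporaryDirectory() as shared_dir:
        log_path = os.path.join(shared_dir, "uploads.log")
        append_with_lock(log_path, "node-a wrote image.png")
        append_with_lock(log_path, "node-b wrote theme.css")
        with open(log_path, encoding="utf-8") as handle:
            print(handle.read(), end="")
```

The sketch uses `fcntl.lockf` (byte-range locks) because that is the locking style NFS clients generally forward to the server; it runs on Unix-like systems only.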
- What are some alternatives to Amazon EFS? What are their main differences?
Corey Quinn: Obviously, for many tasks Amazon S3 is the best choice, with the proviso that not every application supports a model in which objects live on the other side of an API. It's easy to sit in an idealized ivory tower and say that every application you use must be rewritten. But that ignores the reality many enterprises face. They need a different solution.
By the way, speaking of alternatives, don't forget about Elastic Block Store. It also works well. However, a volume can be attached to only one instance at a time, so it cannot be shared.
- Although Amazon EFS is primarily aimed at IoT and big data processing, is there a solution in the AWS infrastructure that is a better fit for these kinds of applications?
Corey Quinn: In principle, S3 performance is quite good, but it depends on the type of workload. However, this topic is better discussed with experts in the field. IoT and big data are a very specific area and, unfortunately, not my specialization.
- EFS is designed to work with Amazon EC2 instances. Can I use it outside the AWS infrastructure?
Corey Quinn: It is possible, but in many cases it won't be worth the effort. Most existing operating systems cope badly with the latency of sending data over the Internet for what is presented to them as local disk operations. Officially, you need to use AWS Direct Connect to access EFS. However, let's be frank, you can get the necessary access using various VPN tricks.
- Can you give an example of some hidden features or problems in the service?
Corey Quinn: The most interesting hidden EFS feature is that storage performance scales linearly and automatically as the amount of stored data grows. As a result, the only way today to improve performance on an existing EFS volume is to put large amounts of filler data into it. The system will then raise the limit on I/O operations per second in line with the amount of stored information.
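The linear scaling described above can be sketched with a little arithmetic. This is an illustration under stated assumptions, not AWS's actual accounting: the rate of 50 MiB/s of baseline throughput per TiB stored is an assumed figure that should be checked against current AWS documentation, the sketch models throughput rather than I/O operations per second, and both function names are hypothetical.

```python
# Assumption: baseline throughput grows by this many MiB/s per TiB stored.
# Verify against current AWS documentation before relying on it.
ASSUMED_MIB_PER_SEC_PER_TIB = 50.0


def baseline_throughput_mib_s(
    stored_gib: float, rate_per_tib: float = ASSUMED_MIB_PER_SEC_PER_TIB
) -> float:
    """Baseline throughput (MiB/s) for a file system holding `stored_gib` GiB."""
    if stored_gib < 0:
        raise ValueError("stored_gib must be non-negative")
    return stored_gib / 1024.0 * rate_per_tib


def padding_needed_gib(
    stored_gib: float,
    target_mib_s: float,
    rate_per_tib: float = ASSUMED_MIB_PER_SEC_PER_TIB,
) -> float:
    """Extra GiB of filler data needed to reach `target_mib_s` of baseline throughput."""
    required_gib = target_mib_s / rate_per_tib * 1024.0
    return max(0.0, required_gib - stored_gib)


if __name__ == "__main__":
    stored = 100.0  # GiB of real data (example value)
    print(baseline_throughput_mib_s(stored))
    print(padding_needed_gib(stored, target_mib_s=50.0))
```

Under the assumed rate, a file system with 100 GiB of real data would need roughly 924 GiB of filler to reach a 50 MiB/s baseline, which is the workaround the answer describes.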
- What other factors have the greatest impact on service performance?
Corey Quinn: Of course, the most important is the use case. The cloud is a flexible thing. For some applications the deciding factor is low I/O latency, while for others it is the ability to run a massively parallel workload.
But there are a number of scenarios where using cloud storage is impractical: in some places because of network restrictions, in others because of transmission delays, especially in regions with economic problems or poor availability of telecommunication links. Overall, the cloud is great. But that does not mean it should be used always and everywhere.
At our DevOops 2017 conference, which will be held on October 20 in St. Petersburg, Corey Quinn will present the talk "Come scale away".
In addition, you will certainly be interested in these hot topics: