In addition to Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (Amazon S3), Amazon announced the launch of Amazon Elastic MapReduce, which is currently in beta status.
Elastic MapReduce is a web service that allows you to easily process huge amounts of diverse data. The service is based on a combination of EC2 and S3, as well as the Hadoop framework.
According to Amazon, using Elastic MapReduce, you can easily:
- Develop applications for processing large amounts of data in any language you like: Java, Ruby, Perl, Python, PHP, R, or C++.
- Upload the data, and the applications that will process it, to Amazon S3, which provides reliability, scalability, and ease of use.
- Start a MapReduce "job flow" through the AWS Management Console. You simply select the desired Amazon EC2 instance type, specify the paths in Amazon S3 to the data and to the application that will process it, click the "Create Job Flow" button, and MapReduce starts its work.
- Monitor the job flow's status via the AWS Management Console, the command line, or a dedicated API. When the job completes, the results are placed in Amazon S3.
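Because Elastic MapReduce is built on Hadoop, a job flow's processing application can be as small as a mapper and a reducer. A minimal word-count sketch in Python (the task and the local dry run are illustrative, not from the announcement; on the service itself the two halves would live in separate scripts uploaded to S3 and run via Hadoop Streaming):

```python
from itertools import groupby

def mapper(lines):
    """Emit a (word, 1) pair for every word, mimicking a
    Hadoop Streaming mapper reading lines from stdin."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def reducer(pairs):
    """Sum the counts per word. Hadoop hands the reducer its
    input sorted by key, which groupby relies on; we sort
    explicitly here to reproduce that locally."""
    for word, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

if __name__ == "__main__":
    # Local dry run of the same map -> sort -> reduce pipeline
    # that Elastic MapReduce would execute across a cluster.
    sample = ["to be or not to be"]
    print(dict(reducer(mapper(sample))))
    # → {'be': 2, 'not': 1, 'or': 1, 'to': 2}
```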
In order to use Amazon Elastic MapReduce, you must first create an EC2 instance in the US, as MapReduce is not yet supported for EC2 instances located in Amazon's European data centers.
And, of course, the price. When using Elastic MapReduce, you pay for the EC2 instances, for the data stored in S3, and an additional fee for the MapReduce service itself.
When using Standard Amazon EC2 Instances:
- Small: $0.015 per hour
- Large: $0.06 per hour
- Extra Large: $0.12 per hour
For High-CPU Amazon EC2 Instances, respectively:
- Medium: $0.03 per hour
- Extra Large: $0.12 per hour
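Putting the pieces together, the bill for a job flow adds the Elastic MapReduce fee above to the regular EC2 hourly rate, with S3 billed separately. A rough estimator (the $0.10/hour Small-instance EC2 rate used in the example is an assumption for illustration, not part of this price list):

```python
def job_flow_cost(instances, hours, ec2_rate, emr_rate):
    """Estimate the EC2 + Elastic MapReduce charge for one
    job flow. S3 storage and data transfer are billed
    separately and ignored here."""
    return instances * hours * (ec2_rate + emr_rate)

# Ten Small instances for two hours: an assumed $0.10/hour
# EC2 rate plus the $0.015/hour MapReduce fee listed above.
print(job_flow_cost(10, 2, 0.10, 0.015))
```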
Amazon appears to be the first to offer a commercial MapReduce service built on Hadoop. Judging how effective the solution really is will have to wait for benchmarks and for working projects built on Elastic MapReduce.
For those who want more details:
aws.amazon.com/elasticmapreduce