Probably everyone has heard about the interesting solutions offered by Amazon: Elastic Compute Cloud (EC2), SimpleDB, Simple Storage Service (S3), Simple Queue Service.
Just recently, CloudFront was added to the list. CloudFront is a CDN (content delivery network). Of course, this is nothing new and there are many alternatives, but the service will be especially useful and interesting for those who already use other Amazon services.
Since we store some of our data on S3 and want our users to receive content as quickly as possible, this solution seemed tempting.
CloudFront uses 4 data centers: in the USA, Europe, Hong Kong and Japan. It acts as a cache that knows which storage holds the original files (this can be your server or S3).
When a user requests a file, the nearest data center is determined and the file is looked up in its cache. If the file is not in the cache, it is requested from the origin storage. A cached copy does not live forever: by default the lifetime is 24 hours, which is also the minimum allowed value. You can increase it using the Cache-Control, Pragma, or Expires headers.
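The lookup described above can be sketched as a toy model. This is not CloudFront's implementation: `EdgeCache`, `fetch_from_origin` and the injectable `clock` are hypothetical names used only for illustration. The only value taken from the text is the 24-hour default and minimum lifetime.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# 24 hours: the default and minimum lifetime mentioned in the text.
MIN_TTL_SECONDS = 24 * 60 * 60


class EdgeCache:
    """Toy model of a single edge cache in front of an origin (illustration only)."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], bytes],
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch_from_origin
        self._clock = clock
        # path -> (body, expiry timestamp)
        self._entries: Dict[str, Tuple[bytes, float]] = {}

    def get(self, path: str, requested_ttl: Optional[int] = None) -> bytes:
        now = self._clock()
        entry = self._entries.get(path)
        if entry is not None and entry[1] > now:
            return entry[0]  # cache hit: the origin is not contacted
        body = self._fetch(path)  # miss or expired entry: go to the origin
        # A lifetime shorter than the minimum is raised to the minimum.
        ttl = max(MIN_TTL_SECONDS, requested_ttl or 0)
        self._entries[path] = (body, now + ttl)
        return body
```

The point of the sketch is that a file changed at the origin keeps being served from the cache until its entry expires.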
An important point is that you sometimes need to version files; otherwise users will keep receiving old copies despite changes in the storage. In our case this should not be a problem, since we never change files once they are stored.
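One common way to version files is to put a short content hash into the file name, so that a changed file gets a new URL and the stale cached copy is simply never requested again. The sketch below is only one possible scheme; the helper name `versioned_key` and the choice of SHA-256 are assumptions, not something the original setup prescribes.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_key(key: str, content: bytes, digest_len: int = 8) -> str:
    """Return `key` with a short content hash inserted before the extension."""
    path = PurePosixPath(key)
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    # "css/site.css" becomes "css/site.<digest>.css"
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))
```

The same content always maps to the same name, while any change in content produces a different name.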
According to the developers, setting up S3 with CloudFront is a matter of minutes. Let's try.
We assume that you already have S3 buckets.
The main task is to create a Distribution; in plain terms, this is a personal named cache. "Named" means that you end up with a domain name, which you then use when constructing the paths to your files.
To begin with, we create a configuration file that describes the distribution.
<?xml version="1.0" encoding="UTF-8"?>
<DistributionConfig xmlns="http://cloudfront.amazonaws.com/doc/2008-06-30/">
  <Origin>mybucket.s3.amazonaws.com</Origin>
  <CallerReference>20080930090000</CallerReference>
  <Comment>Creating my first distribution</Comment>
  <Enabled>true</Enabled>
</DistributionConfig>
Origin specifies the S3 bucket in the format <bucket name>.s3.amazonaws.com. This is a standard S3 addressing form, an alternative to s3.amazonaws.com/<bucket name>. Note that this approach imposes certain restrictions on bucket names; offhand, you cannot use capital letters.
CallerReference is a unique value that guards against accidentally repeated requests.
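If you would rather generate the configuration file than write it by hand, a minimal sketch could look like this. The function name `build_distribution_config` is hypothetical, and using a UTC timestamp as the CallerReference is an assumption modelled on the example value above; any unique string should serve the same purpose. The lowercase check reflects only the restriction mentioned in the text, not the full set of bucket naming rules.

```python
import time
import xml.etree.ElementTree as ET

NAMESPACE = "http://cloudfront.amazonaws.com/doc/2008-06-30/"


def build_distribution_config(bucket: str, comment: str, caller_reference: str = "") -> bytes:
    """Build a DistributionConfig document like the hand-written one above."""
    if bucket != bucket.lower():
        # Capital letters are not allowed with <bucket>.s3.amazonaws.com addressing.
        raise ValueError("bucket name must not contain capital letters")
    if not caller_reference:
        # Assumption: a timestamp is unique enough for occasional manual requests.
        caller_reference = time.strftime("%Y%m%d%H%M%S", time.gmtime())
    root = ET.Element("DistributionConfig", xmlns=NAMESPACE)
    ET.SubElement(root, "Origin").text = f"{bucket}.s3.amazonaws.com"
    ET.SubElement(root, "CallerReference").text = caller_reference
    ET.SubElement(root, "Comment").text = comment
    ET.SubElement(root, "Enabled").text = "true"
    # xml_declaration requires Python 3.8 or newer.
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    with open("create_request.xml", "wb") as f:
        f.write(build_distribution_config("mybucket", "Creating my first distribution"))
```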
Next, run
./cfcurl.pl --keyname <friendly key name> -- -X POST -i -H "Content-Type:text/xml; charset=UTF-8" --upload-file create_request.xml https://cloudfront.amazonaws.com/2008-06-30/distribution
To avoid getting sidetracked, I'll explain what cfcurl.pl and the friendly key name are later.
We get a response like this:
201 Created
Location: https://cloudfront.amazonaws.com/2008-06-30/distribution/PDFDVBD632BHDS5
<?xml version="1.0" encoding="UTF-8"?>
<Distribution xmlns="http://cloudfront.amazonaws.com/doc/2008-06-30/">
  <Id>PDFDVBD632BHDS5</Id>
  <Status>InProgress</Status>
  <LastModifiedTime>2008-07-24T19:37:58Z</LastModifiedTime>
  <DomainName>e604721fxaaqy9.cloudfront.net</DomainName>
  <DistributionConfig>
    <Origin>mybucket.s3.amazonaws.com</Origin>
    <CallerReference>20080930090000</CallerReference>
    <Comment>Creating my first distribution</Comment>
    <Enabled>true</Enabled>
  </DistributionConfig>
</Distribution>
Id is the unique identifier of the distribution.
Status can take two values: InProgress and Deployed. InProgress means that the distribution has not been created yet; we need to wait until it is (Deployed).
You can check the status by calling
./cfcurl.pl --keyname <friendly key name> -- https://cloudfront.amazonaws.com/2008-06-30/distribution/<your distribution's ID>
The DomainName element contains the domain name that you want to use when building the path to the file.
That's all. Now, to link to image.jpg, use the path
http://<domain name>/image.jpg
where <domain name> in this example is e604721fxaaqy9.cloudfront.net.
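If you script these steps, the fields of interest can be pulled out of the response body with a few lines of Python. This is a sketch that assumes the response has exactly the shape shown above; the function name `parse_distribution` is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Union

NS = {"cf": "http://cloudfront.amazonaws.com/doc/2008-06-30/"}


def parse_distribution(xml_body: Union[str, bytes]) -> Dict[str, str]:
    """Extract Id, Status and DomainName from a Distribution document."""
    if isinstance(xml_body, str):
        xml_body = xml_body.encode("utf-8")
    root = ET.fromstring(xml_body)
    return {
        "id": root.findtext("cf:Id", default="", namespaces=NS).strip(),
        "status": root.findtext("cf:Status", default="", namespaces=NS).strip(),
        "domain": root.findtext("cf:DomainName", default="", namespaces=NS).strip(),
    }


def file_url(distribution: Dict[str, str], path: str) -> str:
    """Build the public URL of a file served through the distribution."""
    return f"http://{distribution['domain']}/{path.lstrip('/')}"
```

A distribution is ready to serve files once `parse_distribution(...)["status"]` equals `"Deployed"`.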
We use multiple buckets to store all our files, with a single goal: parallel downloads. The files are small, and there can be a fair number of them. More information about this technique can be found, for example, on Webo.in.
Thus, we need to create several distributions, one for each bucket, and make the corresponding changes to the project.
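In the project code, the change boils down to mapping each file to its bucket and each bucket to its distribution domain. The sketch below is an assumption about how this could be done: the text does not say how files are assigned to buckets, so the stable hash in `bucket_for` is purely illustrative, and the bucket and domain names in the usage example are placeholders.

```python
import zlib
from typing import Mapping, Sequence


def bucket_for(key: str, buckets: Sequence[str]) -> str:
    """Pick a bucket for a file key using a stable hash (illustrative scheme)."""
    if not buckets:
        raise ValueError("at least one bucket is required")
    return buckets[zlib.crc32(key.encode("utf-8")) % len(buckets)]


def cdn_url(key: str, domain_by_bucket: Mapping[str, str]) -> str:
    """Return the CloudFront URL of a file, given a bucket -> domain mapping."""
    bucket = bucket_for(key, sorted(domain_by_bucket))
    return f"http://{domain_by_bucket[bucket]}/{key.lstrip('/')}"


if __name__ == "__main__":
    # Placeholder bucket and domain names, not real distributions.
    domains = {
        "mybucket-1": "d1.example.cloudfront.net",
        "mybucket-2": "d2.example.cloudfront.net",
    }
    print(cdn_url("image.jpg", domains))
```

Because the hash is stable, a given file always resolves to the same domain, so it is not cached twice under different host names.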
As you can see, the implementation is quite simple. Now for the more interesting question of price.
Obviously, thanks to caching, traffic from S3 will drop, and the number of GET requests will decrease as well. By what percentage is difficult to estimate.
On top of this comes the CloudFront traffic itself. Its price is known, but the total is hard to calculate because the price differs by region (you need to know how your traffic is distributed across regions).
The description says that CloudFront traffic prices may be even lower than those of S3. This is true, but you benefit only if your traffic exceeds 10 TB in the United States and Europe, and 40 TB in Asia. Our project has not reached such numbers yet, so CloudFront will be a bit more expensive for us.
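Once you know how your traffic is split between regions, the blended price is a simple weighted average. The sketch below shows only the arithmetic; the function name is hypothetical, and the shares and prices in the usage example are made-up placeholders, not Amazon's actual rates. It also ignores volume tiers and per-request charges.

```python
import math
from typing import Mapping


def blended_price_per_gb(
    traffic_share: Mapping[str, float],
    price_per_gb: Mapping[str, float],
) -> float:
    """Weighted average price per GB, given each region's share of traffic."""
    if not math.isclose(sum(traffic_share.values()), 1.0):
        raise ValueError("traffic shares must sum to 1")
    return sum(share * price_per_gb[region] for region, share in traffic_share.items())


if __name__ == "__main__":
    # Placeholder numbers for illustration only.
    shares = {"us_europe": 0.75, "asia": 0.25}
    prices = {"us_europe": 1.0, "asia": 2.0}
    print(blended_price_per_gb(shares, prices))
```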
Using cfcurl.pl
Download the script from Amazon. Next, create a file named .aws-secrets in your home folder:
%awsSecretAccessKeys = (
    # primary account
    primary => {
        id => '<Your primary AWS Access Key ID>',
        key => '<Your primary Secret Access Key>',
    },
);
With this configuration, <friendly key name> must be set to primary.