"Gugol (from the English googol) - the decimal number system represented by the unit with one hundred zeros."(Wikipedia)Many people know the many facets of Google - this is a popular search engine and social network, as well as many other useful and innovative services. But few people thought: how did the company manage to deploy and maintain them with such high speed and fault tolerance? How is organized that provides such opportunities - Google data center , what are its features? Just about this and will be discussed in this article.
Google stopped surprising anyone with the dynamics of its growth long ago. An active champion of "green" energy, holder of numerous patents in various fields, openness and friendliness toward users - these are the first associations the name Google brings to mind for many. No less impressive is the company's data center fleet: a whole network of facilities located around the world, with a total capacity of 220 MW (as of last year). Given that investment in data centers over the past year alone amounted to $2.5 billion, it is clear that the company considers this area strategically important and promising. At the same time there is a certain dialectical contradiction: despite the publicity of Google's activities, its data centers remain a closely guarded secret. The company believes that disclosing project details would benefit competitors, so only part of the information about its most advanced solutions reaches the public - but even that information is very interesting.
To begin with, Google itself is not directly involved in building data centers. It has a department that handles development and maintenance, while implementation is carried out in cooperation with local integrators. The exact number of data centers built is not disclosed, but press estimates range from 35 to 40 worldwide. Most of them are concentrated in the USA, Western Europe and East Asia. Some equipment is also housed in rented space at commercial data centers with good communication channels; it is known that Google has placed equipment in the commercial data centers of Equinix (EQIX) and Savvis (SVVS). Strategically, however, Google is moving toward using its own data centers exclusively - the corporation explains this by the growing demands for confidentiality of the user data entrusted to it and the inevitability of information leaks in commercial facilities. Following the latest trends, it was announced this summer that Google will lease out "cloud" infrastructure to third-party companies and developers through its Compute Engine IaaS service, which provides computing power in fixed configurations with hourly billing.
The main feature of this network of data centers lies not so much in the high reliability of any single facility as in geo-clustering. Each data center has many high-capacity communication channels to the outside world and replicates its data to several other data centers distributed geographically around the world. Thus even force majeure events such as a meteorite strike would not significantly affect the safety of the data.
The geography of data centers
The first of the publicly known data centers is located in Douglas, USA. This container data center (Fig. 1) was opened in 2005 and is the most public of them all. In essence, it is a frame structure resembling a hangar, inside which containers are arranged in two rows: one row holds 15 containers on a single tier, the other holds 30 containers in two tiers. The facility hosts roughly 45,000 Google servers. The containers are standard 20-foot shipping containers; 45 of them have been commissioned to date, and the IT equipment load is 10 MW. Each container has its own connection to the hydraulic cooling circuit and its own power distribution unit, which contains not only circuit breakers but also power meters for groups of electrical consumers, used to calculate the PUE coefficient. Pumping stations, chiller cascades, diesel generator sets and transformers are located separately. The declared PUE is 1.25. The cooling system has two circuits; the second circuit uses economizers with cooling towers, which significantly reduce chiller operating time. In essence this is nothing more than an open cooling tower: water that has absorbed heat from the servers is fed into the upper part of the tower, from where it is sprayed and trickles down. As the water is sprayed, heat is transferred to air drawn in from outside by fans, while part of the water evaporates. Such a solution significantly reduces chiller operating time in the data center. Interestingly, purified drinking water was originally used to replenish the water supply in the external circuit. Google quickly realized the water did not have to be that clean, so a system was built that treats wastewater from a nearby sewage treatment plant and replenishes the external cooling circuit with it.
Inside each container, the server racks are arranged around a common "cold" aisle. Beneath it, under the raised floor, are a heat exchanger and fans that blow cooled air through grilles toward the server air intakes. The heated air from the back of the cabinets is drawn under the raised floor, passes through the heat exchanger and cools down, forming a recirculation loop. The containers are equipped with emergency lighting, EPO buttons, and smoke and temperature sensors for fire safety.
Fig. 1. Container Data Center in Douglas
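The figures quoted above (45 containers, 10 MW of IT load, roughly 45,000 servers) allow a rough back-of-the-envelope check of the per-container load and the airflow the in-container fans have to move. The sketch below uses only those published numbers plus one assumption of my own, a 15 K air temperature rise across the servers, which is not a figure Google has stated:

```python
# Rough sanity check of per-container load and airflow for the Douglas
# container data center, based on the figures quoted in the article.
# The 15 K air temperature rise is an assumed value for illustration.

IT_LOAD_W = 10e6          # total IT load, 10 MW
CONTAINERS = 45           # commissioned containers
SERVERS = 45_000          # approximate server count

RHO_AIR = 1.2             # air density, kg/m^3
CP_AIR = 1005.0           # specific heat of air, J/(kg*K)
DELTA_T = 15.0            # assumed air temperature rise across servers, K

per_container_w = IT_LOAD_W / CONTAINERS
per_server_w = IT_LOAD_W / SERVERS

# Heat balance Q = m_dot * cp * dT  =>  m_dot = Q / (cp * dT)
m_dot = per_container_w / (CP_AIR * DELTA_T)   # kg/s of air per container
v_dot = m_dot / RHO_AIR                        # m^3/s of air per container

print(f"IT load per container: {per_container_w/1000:.0f} kW")
print(f"Average power per server: {per_server_w:.0f} W")
print(f"Required airflow per container: ~{v_dot:.1f} m^3/s "
      f"({v_dot*3600:.0f} m^3/h)")
```

About 220 W per server and roughly 220 kW per container come out of this, which suggests the published numbers are at least internally consistent.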
Interestingly, Google subsequently patented the idea of a "container tower": containers are stacked on top of each other, with the entrance arranged on one side and engineering services - power supply, air conditioning and external communication channels - on the other (Fig. 3).
Fig. 2. The principle of cooling air in a container
Fig. 3. Patented “container tower”
A year later, in 2006, a data center opened in The Dalles (Oregon, USA), on the banks of the Columbia River. It consisted of three separate buildings, two of which (each with an area of 6,400 square meters) were erected first (Fig. 4) and house the machine rooms. Next to them stand the buildings housing the refrigeration plants, each with an area of 1,700 square meters. In addition, the complex includes an administrative building (1,800 square meters) and a dormitory for temporary staff accommodation (1,500 square meters).
This project was previously known under the code name Project 02. The site was not chosen by chance: an aluminum smelter consuming 85 MW had operated there before being shut down, so the necessary electrical capacity was already available.
Fig. 4. Construction of the first stage of the data center in The Dalles
In 2007, Google began building a data center in southwestern Iowa, at Council Bluffs near the Missouri River. The concept resembles the previous facility, but there are external differences: the buildings are combined, and instead of separate cooling towers the refrigeration equipment is located along both sides of the main building (Fig. 5).
Fig. 5. The data center in Council Bluffs - a typical concept for Google's data center construction
Apparently this concept was adopted as best practice, as it can be traced in later facilities. Examples include data centers in the USA:
- Lenoir (North Carolina): a building of 13,000 square meters, built in 2007-2008 (Fig. 6);
- Moncks Corner (South Carolina): opened in 2008; consists of two buildings with a site reserved between them for a third; has its own high-voltage substation;
- Mayes County (Oklahoma): construction stretched over three years, from 2007 to 2011; the data center was implemented in two stages, each involving the construction of a 12,000-square-meter building; the data center's power is supplied by a wind farm.
Fig. 6. Data center in Lenoir
But the prize for secrecy goes to the data center in Saint-Ghislain (Fig. 7a). Built in Belgium in 2007-2008, it is larger than the Council Bluffs facility. It is also notable for the fact that if you look for it on Google's own satellite map, you will see only an empty plot in an open field (Fig. 7b). Google says the data center has no negative impact on the surrounding settlements: its cooling towers use treated wastewater instead of fresh water. A special multi-stage purification station was built nearby for this purpose, and water is delivered to the station via a navigable industrial canal.
Fig. 7a. The Saint-Ghislain data center on a Bing map
Fig. 7b. But on the Google map it does not exist!
During construction, Google specialists noted that it would be more logical to take the water for cooling the external circuit from a natural source rather than from the water supply. For its next data center the corporation acquired an old paper mill in Hamina, Finland, which was converted into a data processing center (Fig. 8). The project took 18 months, involved 50 companies, and was successfully completed in 2010. Such a scale was no accident: the cooling concept developed there genuinely allowed Google to claim that the data center is environmentally friendly. The northern climate, combined with the low freezing point of salt water, made it possible to do without chillers altogether, using only pumps and heat exchangers. A typical double-circuit "water-to-water" scheme with an intermediate heat exchanger is used: sea water is pumped to the heat exchangers, cools the data center, and is then discharged into a bay of the Baltic Sea.
Fig. 8. Reconstruction of the old paper mill in Hamina turned it into an energy-efficient data center
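For a chiller-less scheme like Hamina's, the required seawater flow follows directly from the heat balance Q = m_dot * c_p * dT. The sketch below is purely illustrative: the heat load and the permitted seawater temperature rise are my assumptions, not published parameters of the Hamina site.

```python
# Illustrative estimate of seawater flow for a chiller-less "water-to-water"
# cooling scheme, as described for the Hamina data center. The heat load and
# the permitted seawater temperature rise are assumptions for the example.

HEAT_LOAD_W = 15e6        # assumed data center heat load, 15 MW
CP_WATER = 3990.0         # approx. specific heat of seawater, J/(kg*K)
RHO_WATER = 1025.0        # approx. density of seawater, kg/m^3
DELTA_T = 8.0             # assumed temperature rise of the seawater, K

# Q = m_dot * cp * dT  =>  m_dot = Q / (cp * dT)
m_dot = HEAT_LOAD_W / (CP_WATER * DELTA_T)   # kg/s
v_dot = m_dot / RHO_WATER                    # m^3/s

print(f"Required seawater flow: {m_dot:.0f} kg/s (~{v_dot*3600:.0f} m^3/h)")
```

Even with generous assumptions the flow stays in the range of a few hundred kilograms per second, which pumps and plate heat exchangers handle comfortably - hence no chillers are needed in that climate.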
In 2011, a 4,000-square-meter data center was built in the northern part of Dublin (Ireland) by converting an existing warehouse building. It is the most modest of the known data centers, built as a springboard for deploying the company's services in Europe. In the same year, development of a data center network began in Asia: three data centers are to appear in Hong Kong, Singapore and Taiwan. And this year Google announced the purchase of land for the construction of a data center in Chile.
It is noteworthy that in the Taiwanese data center under construction, Google's engineers took a different path, deciding to take advantage of the cheaper night-time electricity tariff. Water in huge tanks acting as cold accumulators is chilled at night and used for cooling during the day. Whether a phase-change coolant will be used or the company will settle on plain chilled-water tanks is not yet known; perhaps Google will share this information once the data center is commissioned.
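The economics of this scheme rest on shifting the chiller load to the cheap night tariff and storing "cold" in water. Below is a minimal sizing sketch under assumed values for the cooling load, the daytime window and the usable temperature swing of the stored water; none of these figures have been published by Google.

```python
# Minimal sizing sketch for a chilled-water thermal storage scheme like the
# one described for the Taiwan data center. All input values are assumptions.

COOLING_LOAD_W = 8e6        # assumed daytime cooling load, 8 MW
DAYTIME_HOURS = 10          # assumed hours covered by stored cold
DELTA_T = 8.0               # assumed usable temperature swing of tank water, K
CP_WATER = 4186.0           # specific heat of water, J/(kg*K)
RHO_WATER = 1000.0          # density of water, kg/m^3

# Energy to be stored as "cold" during the night:
energy_j = COOLING_LOAD_W * DAYTIME_HOURS * 3600

# Tank volume needed: E = V * rho * cp * dT  =>  V = E / (rho * cp * dT)
volume_m3 = energy_j / (RHO_WATER * CP_WATER * DELTA_T)

print(f"Cold to be stored: {energy_j/3.6e9:.0f} MWh (thermal)")
print(f"Required tank volume: ~{volume_m3:,.0f} m^3")
```

Tens of thermal megawatt-hours translate into thousands of cubic meters of water, which is consistent with the article's mention of "huge tanks" and explains the possible interest in phase-change storage, which needs far less volume.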
At the opposite pole is an even bolder project of the corporation: a floating data center, which Google patented in 2008. According to the patent, the IT equipment is located on a floating vessel, cooling is performed with cold seawater, and electricity is produced by floating generators that harvest energy from wave motion. For the pilot project it is planned to use floating generators made by Pelamis: 40 such units, deployed over an area of 50 x 70 meters, would generate up to 30 MW of electricity, enough to run the data center.
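The wave-power figures are easy to cross-check: 30 MW from 40 floating generators implies 750 kW per unit, which matches the rated output of the Pelamis P-750 converter. A one-line check (the 750 kW rating is the only externally sourced figure here):

```python
# Cross-check of the floating data center power figures from the 2008 patent.
GENERATORS = 40
RATED_KW_PER_UNIT = 750        # Pelamis P-750 rated output, kW

total_mw = GENERATORS * RATED_KW_PER_UNIT / 1000
print(f"Total wave-power capacity: {total_mw:.0f} MW")   # -> 30 MW
```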
Incidentally, Google regularly publishes the PUE energy efficiency indicator for its data centers, and the measurement method itself is interesting. In the classic Green Grid definition, PUE is the ratio of total data center power consumption to the power consumed by the IT equipment; Google, however, measures PUE for the whole site, including not only the data center life-support systems but also conversion losses in transformer substations and cables, power consumption in office premises and so on - that is, everything inside the site perimeter. The reported PUE is given as an average over a yearly period. As of 2012, the average PUE across all Google data centers was 1.13.
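The difference between the "classic" Green Grid PUE and Google's site-wide measurement comes down to what is counted in the numerator. A small illustrative calculation with invented meter readings, included purely to show the effect of widening the measurement perimeter:

```python
# Illustrative PUE calculation. All meter readings below are invented,
# purely to show how widening the measurement perimeter changes the result.

it_energy_mwh = 1000.0            # energy consumed by IT equipment
cooling_mwh = 110.0               # chillers, pumps, fans
power_distribution_mwh = 30.0     # UPS and PDU losses inside the data hall
transformer_cable_mwh = 15.0      # substation and cable losses (site level)
office_mwh = 12.0                 # offices and other buildings on the site

# "Classic" PUE: data hall overhead only.
classic_pue = (it_energy_mwh + cooling_mwh
               + power_distribution_mwh) / it_energy_mwh

# Google-style PUE: everything inside the site perimeter.
site_pue = (it_energy_mwh + cooling_mwh + power_distribution_mwh
            + transformer_cable_mwh + office_mwh) / it_energy_mwh

print(f"Classic PUE:   {classic_pue:.2f}")   # -> 1.14
print(f"Site-wide PUE: {site_pue:.2f}")      # -> 1.17
```

The site-wide figure is always at least as large as the classic one, so a published annual average of 1.13 measured this way is a conservative claim.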
Features of choosing a site for a data center
Clearly, when building such huge data centers Google does not choose their locations at random. What criteria do the company's specialists consider first of all?
- Sufficiently cheap electricity, the ability to deliver it, and its environmentally friendly origin. In line with its environmental policy, the company uses renewable energy sources: one large Google data center consumes about 50-60 MW, enough to be the sole customer of an entire power plant. Renewable sources also provide independence from energy price fluctuations. Hydroelectric power stations and wind farms are currently used.
- The presence of a large volume of water that can be used by the cooling system, whether a canal or a natural body of water.
- The presence of buffer zones between roads and settlements, allowing a protected perimeter to be built and maximum confidentiality to be preserved. At the same time, highways are needed for normal transport links to the data center.
- The plot of land purchased for construction must allow further expansion and the building of auxiliary facilities or the company's own renewable power sources.
- Communication channels. There must be several of them, and they must be reliably protected. This requirement became especially relevant after recurring losses of connectivity at a data center in Oregon (USA): the communication lines ran overhead along power transmission poles, whose insulators became something of a target for local hunters' shooting practice. During hunting season the connection to the data center was constantly being cut, and restoring it took a great deal of time and effort. In the end the problem was solved by laying underground communication lines.
- Tax breaks. A logical requirement, given that the "green" technologies used are considerably more expensive than traditional ones. Accordingly, in payback calculations tax benefits should offset the high initial capital costs.
Features in detail
Let's start with the server fleet. The number of servers is not disclosed, but various sources put the figure at one to two million, adding that even the latter number is not the limit and that the existing data centers are not completely filled (given the floor area of the server rooms, it is hard to disagree). Servers are selected on the basis of price/performance rather than absolute quality or performance. The server platform is x86, the operating system is a modified version of Linux, and all servers are combined into a cluster.
As far back as 2000 the company was thinking about reducing power transmission and conversion losses in its servers. The power supply units therefore meet the Gold level of the Energy Star standard, with an efficiency of at least 90%. All components not required to run the applications were also removed from the servers: for example, there are no graphics cards, the fans are speed-controlled, and the components can scale their power consumption in proportion to the load. Interestingly, in large data centers and container data centers, where servers are effectively consumables, it was apparently decided that the lifetime of a server is comparable to the lifetime of a battery. If so, then instead of a central UPS a battery is installed right in the server chassis. This reduced UPS conversion losses and eliminated the problem of low UPS efficiency at light load. A dual-processor x86 platform is known to have been used, with the well-known company Gigabyte producing motherboards specifically for Google. Curiously, the server lacks the familiar enclosed case: there is only the bottom tray, which holds the hard drives, motherboard, battery and power supply (Fig. 9). Installation is very simple: the administrator pulls a metal blank panel out of the mounting frame and slides a server in instead; the server is ventilated freely from front to back. After installation, the battery and power supply are connected.
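The effect of replacing a central double-conversion UPS with a battery on each server can be illustrated with a simple loss comparison. The efficiency values below are typical assumptions chosen for illustration, not Google's published data:

```python
# Illustrative comparison of power delivery losses: a central double-conversion
# UPS chain versus a per-server battery behind a high-efficiency PSU.
# All efficiency values are assumptions chosen for illustration.

IT_LOAD_KW = 1000.0          # useful power drawn by the servers

# Path 1: central double-conversion UPS feeding the server PSUs.
UPS_EFFICIENCY = 0.90        # assumed UPS efficiency at typical partial load
PSU_EFFICIENCY = 0.90        # Energy Star Gold level mentioned in the article
input_with_ups = IT_LOAD_KW / (UPS_EFFICIENCY * PSU_EFFICIENCY)

# Path 2: no central UPS; the backup battery sits inside the server itself.
PSU_ONLY_EFFICIENCY = 0.90
input_without_ups = IT_LOAD_KW / PSU_ONLY_EFFICIENCY

saving_kw = input_with_ups - input_without_ups
print(f"Input power with central UPS:      {input_with_ups:.0f} kW")
print(f"Input power with on-board battery: {input_without_ups:.0f} kW")
print(f"Saving: ~{saving_kw:.0f} kW ({saving_kw/input_with_ups:.1%} of input)")
```

Under these assumptions roughly a tenth of the incoming power is saved, before even counting the fact that a lightly loaded UPS is usually less efficient than its nameplate figure.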
The status and performance of every server hard disk is monitored, and data is additionally archived to tape. The disposal of failed storage media - hard drives - is handled in a peculiar way. First each disk goes onto a kind of press: a metal ram punches into the drive, crushing the chamber with the platters so that they cannot be read by any currently available means. The disks then go into a shredder, where they are ground up, and only after that may they leave the data center grounds.
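The article does not describe Google's actual disk-monitoring tooling, but the kind of periodic health polling it refers to can be sketched with standard SMART utilities. A minimal illustrative example using the common smartctl tool; the device paths and the alert action are placeholders:

```python
# Minimal sketch of periodic disk health polling with smartctl (smartmontools).
# This illustrates the general approach, not Google's actual tooling.
# Device paths and the alert action are placeholders; smartctl usually
# requires root privileges.

import subprocess

DISKS = ["/dev/sda", "/dev/sdb"]   # placeholder device list


def disk_is_healthy(device: str) -> bool:
    """Return True if smartctl reports the overall SMART health as PASSED."""
    result = subprocess.run(
        ["smartctl", "-H", device],
        capture_output=True, text=True, check=False,
    )
    return "PASSED" in result.stdout


def main() -> None:
    for disk in DISKS:
        if disk_is_healthy(disk):
            print(f"{disk}: OK")
        else:
            # Placeholder alert: in a real fleet this would flag the drive
            # for data migration and physical decommissioning.
            print(f"{disk}: SMART health check failed - schedule replacement")


if __name__ == "__main__":
    main()
```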
Security for personnel is equally strict: a guarded perimeter, rapid-response teams on duty around the clock, identification of employees first by a badge made with lenticular printing (which reduces the likelihood of forgery) and then by biometric iris scanning.
Fig. 9. Typical "Spartan" Google server - nothing superfluous
All servers are installed in 40-inch two-frame open racks arranged in rows with a common "cold" aisle. Interestingly, Google's data centers do not use special enclosures to contain the "cold" aisle; instead they hang rigid movable polymer slats (strip curtains), having found this to be a simple and inexpensive solution that allows cabinets to be added quickly to existing rows and, if necessary, the slats above a cabinet to be folded away.
On the software side, Google is known to use the Google File System (GFS), designed for large data sets. Its distinguishing feature is that it is clustered: information is split into 64 MB chunks and stored in at least three places at once, with the ability to locate the replicated copies. If any node fails, the replicated copies are found automatically by specialized programs built on the MapReduce model. The model itself implies parallelizing operations and executing tasks on several machines simultaneously; information inside the system is encrypted. The BigTable system uses distributed storage arrays to hold large volumes of information with fast access, for services such as web indexing, Google Earth and Google Finance. Google Web Server (GWS) and Google Front-End (GFE) with an optimized Apache core serve as the basic web applications. All of these systems are closed and custom-built; Google explains this by the fact that closed, custom systems are very resistant to external attacks and have significantly fewer vulnerabilities.
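The MapReduce model mentioned above can be illustrated without any of Google's internal systems: the user supplies a map function and a reduce function, and the framework shards the input, runs the functions in parallel and merges the results. Below is a minimal single-machine sketch of that contract; real MapReduce adds distribution across machines, replication-aware scheduling and fault recovery on top of it.

```python
# Minimal single-machine sketch of the MapReduce programming model described
# in the article: the user writes map() and reduce(); the framework splits the
# input, runs map tasks in parallel, groups by key and runs reduce tasks.
# Real MapReduce adds distribution across machines and fault tolerance.

from collections import defaultdict
from multiprocessing import Pool


def map_words(document: str) -> list[tuple[str, int]]:
    """Map phase: emit (word, 1) for every word in a document."""
    return [(word.lower(), 1) for word in document.split()]


def reduce_counts(item: tuple[str, list[int]]) -> tuple[str, int]:
    """Reduce phase: sum all counts emitted for one word."""
    word, counts = item
    return word, sum(counts)


def mapreduce(documents: list[str]) -> dict[str, int]:
    with Pool() as pool:
        # Run map tasks in parallel over the input shards (documents).
        mapped = pool.map(map_words, documents)
        # Shuffle: group intermediate values by key.
        groups: dict[str, list[int]] = defaultdict(list)
        for pairs in mapped:
            for word, count in pairs:
                groups[word].append(count)
        # Run reduce tasks in parallel over the grouped keys.
        reduced = pool.map(reduce_counts, groups.items())
    return dict(reduced)


if __name__ == "__main__":
    docs = ["the cluster stores each chunk three times",
            "the cluster re-replicates a chunk when a node fails"]
    print(mapreduce(docs))
```

The appeal of the model is that fault tolerance lives in the framework: if a machine running a map or reduce task fails, the task is simply rerun elsewhere against one of the replicated copies of the data.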
Summing up, I would like to note a few important points that cannot fail to impress. Google plans its data center costs and development strategy sensibly, applying a "best price/performance" approach instead of chasing the "best solution." There is no unnecessary functionality and no decorative excess, only "Spartan" equipment, even if it does not look aesthetically pleasing to everyone. The company actively uses "green" technologies, not as an end in themselves but as a means of reducing operating costs for electricity and fines for environmental pollution. When building data centers, the emphasis is not on extensive redundancy of individual systems; instead the data center as a whole is made redundant, thereby minimizing the influence of external factors. The main focus is on the software level and on non-standard solutions. The reliance on renewable energy and natural water resources suggests that the company is trying to be as independent as possible of rising energy prices, and the environmental friendliness of its solutions correlates well with their energy efficiency. All of this shows that Google not only has strong technical competence, but also knows how to invest money properly and look ahead, keeping up with market trends.
Konstantin Kovalenko, TsODy.RF magazine, issue No. 1