Almost a year has passed since the release of the final versions of Windows 7 and Windows Server 2008 R2. That seems like a good reason to revisit these operating systems. I would like to draw attention to what are, in my opinion, the two most interesting new Windows features: BranchCache and DirectAccess. This article is about BranchCache.
What is BranchCache
BranchCache is a caching technology that is built into Windows 7 and Windows Server 2008 R2 and is designed to reduce the network traffic transmitted over WAN links. Accordingly, BranchCache is aimed mainly at organizations with branches and remote offices that are connected to each other and to the central office by relatively slow links.
BranchCache supports caching of HTTP and SMB traffic. Client computers must run Windows 7 (Ultimate or Enterprise editions; BranchCache does not work in other editions), and servers must run Windows Server 2008 R2. In other words, BranchCache works only with the combination of Windows 7 and Windows Server 2008 R2. If that has not discouraged you from reading further, let's discuss the main features of the technology.
BranchCache Features
What is the key difference between BranchCache and other caching technologies, such as Offline Files or ISA Server Cache? Data is returned to the client application from the cache only if the original data has not changed. Let me explain with an example. Suppose a branch user tries to open a document from a file server in the central office, say, a template for a leave request. The BranchCache module on the user's computer requests information about the file from the server and checks whether the requested file is in the local cache. If it is not, the file is, of course, downloaded from the central office server. If the file is already in the local cache, the central office server is still contacted to check whether the original file has changed. If it has changed, the file is downloaded from the server again. Only if the original file on the server and the file in the cache are absolutely identical is the data from the cache used. The real algorithm for processing a request is more complicated, but this should be enough to understand the essence.
Two important BranchCache features follow from this example.
1. Data in BranchCache is always current. More precisely, if an application retrieves data from the cache, BranchCache ensures that this data is up to date.
2. No access to the server - no access to the cache. In other words, if the BranchCache module cannot verify the identity of the original and cached files (the server is turned off, problems with the communication channel, etc.), then the data from the cache is not used.
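The two rules above can be illustrated with a minimal Python sketch. It is not the BranchCache implementation: the function `open_file`, the exception `ServerUnavailable`, and the idea of comparing a single whole-file SHA-256 hash are assumptions made for illustration only.

```python
import hashlib
from typing import Callable, Dict, Tuple


class ServerUnavailable(Exception):
    """Hypothetical error raised when the content server cannot be reached."""


def open_file(
    get_server_hash: Callable[[], str],
    download: Callable[[], bytes],
    cache: Dict[str, bytes],
    name: str,
) -> Tuple[bytes, str]:
    """Return (data, source), where source is "cache" or "server"."""
    # Rule 2: the server is always asked first. If this call raises
    # ServerUnavailable, the cached copy is never touched.
    server_hash = get_server_hash()

    cached = cache.get(name)
    # Rule 1: the cached copy is served only if it matches the original.
    if cached is not None and hashlib.sha256(cached).hexdigest() == server_hash:
        return cached, "cache"

    # Cache miss or stale copy: download again and refresh the cache.
    data = download()
    cache[name] = data
    return data, "server"
```

With this sketch, the first call for a file returns it from the server, a repeated call returns it from the cache, and a call made after the original changes (or while the server is down) never returns the stale copy.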
It is also worth adding that BranchCache is transparent to applications and users. Unlike the Offline Files mechanism, for example, the Windows interface gives no indication that the document the user has just opened was taken from the cache.
Metadata
Let us now ask a fundamental question: how are the original and the cached information checked and compared? BranchCache uses so-called metadata. The requested file (a document on a file server, an HTML page on a web server, and so on) is divided into 32 MB segments. A file smaller than 32 MB consists, by definition, of one segment. Segments, in turn, are divided into 64 KB blocks. If the file is smaller than 64 KB, it is always downloaded directly from the server, and BranchCache is not used. For each block and each segment, a hash is calculated using the SHA-256 algorithm. All these calculations are performed on the BranchCache-enabled server where the requested file is located. The hash values of the segments and blocks together form a hash list, which is the basis of the file metadata. It is this metadata that is transmitted to the client computer, where it is compared with the hash list of the cached file. The hashes are approximately 2,000 times smaller than the data itself (a 32-byte SHA-256 hash per 64 KB block gives a ratio of 2,048 to 1), so the load on the WAN link during the transfer of metadata is minimal.
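As a rough illustration of the segment and block scheme, here is a minimal Python sketch. The function name `build_hash_list` is hypothetical, and deriving the segment hash from the concatenated block hashes is an assumption made for this example; the article only states that both block and segment hashes use SHA-256.

```python
import hashlib

BLOCK_SIZE = 64 * 1024           # 64 KB blocks, as described in the article
SEGMENT_SIZE = 32 * 1024 * 1024  # 32 MB segments, as described in the article


def build_hash_list(data, segment_size=SEGMENT_SIZE, block_size=BLOCK_SIZE):
    """Return a list of (segment_hash, [block_hashes]), or None for small files."""
    if len(data) < block_size:
        # Per the article, files under 64 KB bypass BranchCache entirely.
        return None
    hash_list = []
    for s in range(0, len(data), segment_size):
        segment = data[s:s + segment_size]
        block_hashes = [
            hashlib.sha256(segment[b:b + block_size]).digest()
            for b in range(0, len(segment), block_size)
        ]
        # Assumption for illustration: the segment hash covers its block hashes.
        segment_hash = hashlib.sha256(b"".join(block_hashes)).digest()
        hash_list.append((segment_hash, block_hashes))
    return hash_list


# Each 64 KB block is represented by one 32-byte hash.
RATIO = BLOCK_SIZE // hashlib.sha256().digest_size  # 2048
```

The `RATIO` constant reproduces the "approximately 2,000 times smaller" figure: 65,536 bytes of data per 32 bytes of hash.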
Splitting files into segments and blocks optimizes both searching for data and downloading it. The segment hash is the unit of search. As already mentioned, when accessing a file on a remote server (at the central office or at another branch office), the first thing the BranchCache module of the client computer does is request the file metadata from the server. Based on the received hash list, BranchCache checks whether the local cache contains the file segments. If so, the file is opened from the local cache. If not, the client computer sends a search request to the network: "Who has a segment with this hash?" Depending on the BranchCache mode (see below), this request is sent either to a specially configured server running Windows Server 2008 R2 in the same branch, or to the Windows 7 computers located in the same IP subnet. If the response is positive, the blocks of the desired segment are downloaded from the "neighbor". In this sense, a block is the unit of download. Thus, segments reduce the number of search queries, while blocks allow the requested data to be passed to the application more quickly.
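The distinction between the unit of search and the unit of download can be sketched in Python as follows. The function `fetch_segment`, the `stats` counter, and the representation of each neighbor as a plain dictionary are all hypothetical and serve only to show that one search request per segment is followed by several block downloads.

```python
def fetch_segment(segment_hash, block_hashes, peers, stats):
    """Look up one segment among neighbors and download it block by block.

    peers: list of dicts mapping segment_hash -> {block_hash: block_bytes}
    stats: dict with "queries" and "downloads" counters
    Returns the segment bytes, or None if no neighbor has the segment.
    """
    # One search request per segment: "Who has a segment with this hash?"
    stats["queries"] += 1
    for peer in peers:
        blocks = peer.get(segment_hash)
        if blocks is None:
            continue
        parts = []
        for block_hash in block_hashes:
            # The data itself is transferred one block at a time.
            stats["downloads"] += 1
            parts.append(blocks[block_hash])
        return b"".join(parts)
    return None
```

For a segment made of two blocks that one neighbor holds, this sketch issues a single query and two downloads.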
BranchCache Modes
To use BranchCache, you need to configure it on both the client and the server. There are two BranchCache modes of operation: distributed cache and hosted cache (called dedicated cache in this article).
Distributed cache
In distributed mode, data is cached on the Windows 7 computer that was the first in the branch office (more specifically, in the IP subnet) to download this data from a remote server. After that, the data becomes available to the other computers of the branch. BranchCache behaves as follows:
1. A user at a branch office computer tries to open a document from a remote server. The computer establishes a connection with the server and requests the required file as if BranchCache did not exist.
2. The server authenticates the client and verifies that the client has appropriate access rights to the file. If not, access to the file is denied.
3. If BranchCache is configured on both the server and the client, the server returns metadata, including a hash list, instead of the file.
4. If the file segments are not in the local cache and the link to the server is slow (latency exceeds the specified threshold, 80 ms by default), the client searches for the missing segments using the Web Services Dynamic Discovery (WS-Discovery) protocol. These are multicast requests that stay within the subnet unless the routers are configured otherwise.
5. If someone has the requested segments, they are returned in blocks to the client computer. The computer checks the integrity of the received blocks, stores them in its cache, and passes the data to the application. The user sees the opened document.
6. If none of the "neighbors" has the necessary data, it is downloaded from the server over the WAN link and stored in the local cache.
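Steps 4 to 6 can be sketched in Python as below. This is a simplified model, not the real protocol: the function `get_block` is hypothetical, neighbors are plain dictionaries rather than WS-Discovery responders, segments are collapsed into single blocks for brevity, and the behavior on a fast link (going straight to the server) is an assumption inferred from the latency condition in step 4.

```python
import hashlib

LATENCY_THRESHOLD_MS = 80  # default threshold mentioned in the article


def get_block(block_hash, local_cache, peers, wan_download, latency_ms):
    """Return (data, source); source is "local", "peer" or "wan"."""
    if block_hash in local_cache:
        return local_cache[block_hash], "local"

    # Step 4: neighbors are asked only when the link to the server is slow.
    if latency_ms > LATENCY_THRESHOLD_MS:
        for peer_cache in peers:
            data = peer_cache.get(block_hash)
            # Step 5: verify integrity against the hash from the server
            # metadata before trusting data received from a neighbor.
            if data is not None and hashlib.sha256(data).hexdigest() == block_hash:
                local_cache[block_hash] = data
                return data, "peer"

    # Step 6: nobody nearby has valid data, so use the WAN link.
    data = wan_download(block_hash)
    local_cache[block_hash] = data
    return data, "wan"
```

Because the block hash comes from the server's metadata, a neighbor that returns corrupted data is simply ignored and the block is fetched over the WAN instead.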
Distributed mode is recommended for small branches where all machines are located within the same subnet. BranchCache on client machines is easy to configure using group policies, and no Windows Server 2008 R2 server is required in the branch. However, keep in mind that when a computer is turned off, its cache becomes inaccessible to the other clients in the branch.
Dedicated cache
In this mode, the cache is concentrated on an appropriately configured branch server running Windows Server 2008 R2. Every Windows 7 computer sends its search queries to the server with the dedicated cache, and only to it. The sequence is as follows:
1. A user at a branch office computer tries to open a document from a remote server. The computer establishes a connection with the server and requests the required file as if BranchCache did not exist.
2. The server authenticates the client and verifies that the client has appropriate access rights to the file. If not, access to the file is denied.
3. If BranchCache is configured on both the server and the client, the server returns metadata, including a hash list, instead of the file.
4. If the file segments are not in the local cache and the link to the server is slow (latency exceeds the specified threshold, 80 ms by default), the client contacts the local server with the dedicated cache directly and requests the segment or segments. The IP address or FQDN of that server must be specified in the client settings manually or through group policies.
5. If the data is in the dedicated cache, it is returned in blocks to the client computer. The computer checks the integrity of the received blocks, stores them in its cache, and passes the data to the application. The user sees the opened document.
6. If the dedicated cache does not contain the requested data, the client downloads it from the remote server and saves it in its local cache.
7. The client then notifies the server with the dedicated cache that new data is available for it.
8. The server asks the client to send the new data.
9. The client copies the blocks to the dedicated cache on the server so that other machines in the branch can use them.
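The exchange in steps 4 to 9 can be modeled with a short Python sketch. The class `HostedCache`, its methods `lookup`, `offer` and `upload`, and the function `client_fetch` are hypothetical names, and the integrity check from step 5 is omitted to keep the example short.

```python
class HostedCache:
    """Toy model of the branch server that holds the dedicated cache."""

    def __init__(self):
        self.blocks = {}

    def lookup(self, block_hash):
        # Steps 4-5: a client asks for data by hash.
        return self.blocks.get(block_hash)

    def offer(self, block_hashes):
        # Steps 7-8: the client announces new data; the server answers
        # with the hashes it does not have yet and wants to receive.
        return [h for h in block_hashes if h not in self.blocks]

    def upload(self, blocks):
        # Step 9: the client copies the requested blocks to the server.
        self.blocks.update(blocks)


def client_fetch(block_hash, local_cache, hosted, wan_download):
    """Return (data, source); source is "local", "hosted" or "wan"."""
    if block_hash in local_cache:
        return local_cache[block_hash], "local"

    data = hosted.lookup(block_hash)
    if data is not None:
        local_cache[block_hash] = data
        return data, "hosted"

    # Step 6: download over the WAN and keep a local copy.
    data = wan_download(block_hash)
    local_cache[block_hash] = data

    # Steps 7-9: offer the new data to the dedicated cache and upload
    # whatever the server asks for.
    wanted = hosted.offer([block_hash])
    hosted.upload({h: local_cache[h] for h in wanted})
    return data, "wan"
```

In this model, the first client in the branch to request a block pays the WAN cost, and every later client receives the same block from the dedicated cache.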
Dedicated cache provides a higher level of data availability because, unlike client computers, the server runs constantly and is not turned off. Usually, at least, although in small offices anything can happen. In addition, there are no restrictions on the network topology: a request to the dedicated cache is a unicast request that is routed normally. However, this mode requires a server running Windows Server 2008 R2 in the branch office.
To conclude this review of the BranchCache modes of operation, note that the two modes are mutually exclusive. A given Windows 7 client cannot operate in both modes at the same time.
HTTP support
BranchCache supports caching of HTTP and SMB traffic. The caching mechanism has some specific features in the context of each of these protocols.
Let's start with HTTP. Since the BranchCache module is integrated only in Windows Server 2008 R2 and Windows 7, it is probably already clear that BranchCache for HTTP is applicable only if IIS 7.5 from Windows Server 2008 R2 is used as a web server.
The second feature is related to the generation of hash lists for website files. The hash list for any website file (HTML, JPG, and so on) is generated after the first access to this file. As a result, the file body can be obtained from BranchCache only on the third access. Suppose a branch client accesses a web page for the first time. IIS returns the page to the client over HTTP or HTTPS and generates metadata for it. The client has therefore received the page but not a hash list, and so cannot put the page into its own cache or the dedicated cache. The second time the client accesses the same page, IIS returns not the data but the metadata, which is now available. However, since the data was not cached after the first request, the client has no choice but to download the whole page again. This time, though, the page can be placed in the cache, and a third request for this page can be served from BranchCache.
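The three-request behavior can be reproduced with a small Python simulation. The class `WebServer` and the function `client_get` are hypothetical stand-ins for IIS and the BranchCache client, and a single whole-page SHA-256 hash stands in for the hash list.

```python
import hashlib


class WebServer:
    """Toy server that builds a page's hash only after its first access."""

    def __init__(self, pages):
        self.pages = pages
        self.hash_lists = {}

    def request(self, url):
        if url not in self.hash_lists:
            # First access: serve the body and generate metadata afterwards.
            self.hash_lists[url] = hashlib.sha256(self.pages[url]).hexdigest()
            return "data", self.pages[url]
        # Later accesses: metadata is ready, so return it instead of the body.
        return "hashes", self.hash_lists[url]

    def download(self, url):
        return self.pages[url]


def client_get(server, url, cache):
    """Return (body, source) for one request."""
    kind, payload = server.request(url)
    if kind == "data":
        # No hash list came with the body, so it cannot be cached.
        return payload, "server, not cacheable"
    if payload in cache:
        return cache[payload], "cache"
    body = server.download(url)
    cache[payload] = body
    return body, "server, now cached"
```

Calling `client_get` three times for the same URL with the same `cache` dictionary yields the sources "server, not cacheable", "server, now cached" and "cache", in that order.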
Finally, because BranchCache does not depend on how the data is transported, caching does not interfere with SSL, and vice versa. That is, BranchCache works effectively with both HTTP and HTTPS. For the same reason, this applies equally to IPSec. In this video, I demonstrate the configuration and operation of BranchCache for HTTP.
SMB support
The amount of data on file servers can be very large, and under heavy load, hash calculation is a very expensive operation. For this reason, in the case of SMB the metadata is generated in advance. In response to the first request, the client therefore already receives a hash list, and the second access to the file can be served from the cache. After the initial generation, the metadata is updated automatically every time the file changes. In addition, the administrator can update the hash list for a given file or folder with the hashgen command-line utility.
On the client side, BranchCache for SMB also uses the Offline Files service. If you disable this service, SMB traffic caching will stop working. This will not affect the caching of HTTP traffic.
You can look at the settings and features of BranchCache for SMB.
Applications and data
In terms of architecture, BranchCache is located below the SMB and HTTP drivers. The operation of this module is transparent to applications. In other words, caching works with any application that uses the SMB or HTTP stack built into Windows.
However, I would point out that the effect of BranchCache depends heavily on the nature of the data. Recall the example already considered, in which a branch user opens a document from a remote server. The client receives a hash list from the server and downloads the file body either from the remote server or from BranchCache (its own cache or a "neighbor"). What happens if the user changes the contents of this document, saves the changes, and closes it? The entire file is saved to the remote server. Once the file has changed, its hash list must be recalculated, and this is done on the server side, so the modified file cannot be saved directly to the cache. If the user immediately opens the file again, then, following the algorithm described above, the client computer receives updated metadata from the server and has no choice but to download the whole file body from the server. The conclusion is simple: BranchCache gives a tangible effect on relatively static data.
Security
BranchCache security deserves an article of its own, so if Habr readers are interested, I am ready to write about it in more detail. In the meantime, I would like to mention just a few important points.
First, BranchCache does not provide any special protective measures when transferring data from a remote server to a branch office. If, for example, a file is downloaded via HTTP, not HTTPS, then the body of the file is transmitted in clear text, and BranchCache, for its part, does not add any encryption for the data.
Secondly, the cache itself, that is, the file on the hard disk in which the cached blocks are stored, is not encrypted. If you need additional security measures, you can use the appropriate tools, for example, built-in Windows EFS or BitLocker.
Finally, BranchCache's own security mechanisms come into play when information is exchanged with "neighboring" computers or with a server with a dedicated cache. All requests and responses in such an exchange are encrypted to prevent an attacker from deliberately substituting incorrect data.
To sum up, I would like to emphasize once more that BranchCache:
- reduces the load on the WAN links connecting the branches of the enterprise, and reduces the associated costs;
- improves the responsiveness of applications in the branches;
- is a built-in feature of Windows 7 and Windows Server 2008 R2 and is controlled by standard tools.