SAS interface: history, storage organization examples

Last time, we looked at SCSI technology in a historical context: who invented it, how it developed, what variants it has, and so on. We finished with the fact that the most modern and current standard is Serial Attached SCSI (SAS). It appeared relatively recently but developed rapidly. The first implementation "in silicon" was shown by LSI in January 2004, and in November of the same year SAS became one of the most popular search queries on storagesearch.com.

Let's start with the basics. How do SCSI devices work? In the SCSI standard, everything is built on a client/server concept.

The client, called the initiator, sends commands and waits for their results. Most often, of course, a SAS controller acts as the client. Today, SAS controllers are HBAs and RAID controllers, as well as the storage controllers located inside external storage systems.

The server is called the target. Its task is to accept the initiator's request, process it, and return the data or a confirmation that the command was executed. The target can be a single disk or an entire disk array. In the latter case, the SAS HBA inside the disk array (the so-called external storage system), which servers connect to, operates in target mode. Each target is assigned a separate SCSI Target ID.

To connect clients to the server, the service delivery subsystem is used. In most cases, this tricky name hides nothing more than cables. There are cables for external connections and for connections inside servers, and they change from one SAS generation to the next. Today there are three generations of SAS:

- SAS-1, or 3 Gbit/s SAS
- SAS-2, or 6 Gbit/s SAS
- SAS-3, or 12 Gbit/s SAS, getting ready for release in mid-2013

Internal and external SAS cables

Sometimes this subsystem also includes SAS expanders. Expanders are devices that help deliver information from initiators to targets and back but are transparent to the targets. A typical example is an expander that lets you connect several targets to a single initiator port, such as the expander chip in a disk shelf or on a server backplane. Thanks to this arrangement, servers can have more than 8 disks (the controllers currently used by leading server manufacturers are usually 8-port), and disk shelves can have as many as needed.

An initiator connected to a target by the service delivery subsystem is called a domain. Any SCSI device contains at least one port, which can be an initiator port, a target port, or combine both functions. Ports can be assigned an identifier (PID).

A target contains at least one logical unit, identified by a logical unit number (LUN). The LUN identifies which of the target's disks or partitions the initiator will work with. It is sometimes said that the target presents a LUN to the initiator. Thus, the pair SCSI Target ID + LUN fully addresses the required storage.
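The Target ID + LUN addressing described above can be sketched as a small data structure. This is only an illustration: the class names, the `resolve` method and the sample volumes are hypothetical and do not come from any SCSI library.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class StorageAddress:
    """Full address of a piece of storage: SCSI Target ID plus LUN."""
    target_id: int
    lun: int


@dataclass
class Target:
    """A target device that presents one or more logical units."""
    target_id: int
    luns: Dict[int, str] = field(default_factory=dict)  # LUN -> description

    def resolve(self, address: StorageAddress) -> str:
        """Return the logical unit an initiator would reach at this address."""
        if address.target_id != self.target_id:
            raise LookupError("address belongs to another target")
        if address.lun not in self.luns:
            raise LookupError(f"target {self.target_id} has no LUN {address.lun}")
        return self.luns[address.lun]


# Hypothetical disk array with two logical units.
array = Target(target_id=3, luns={0: "RAID 6 volume", 1: "SSD volume"})
```

Here `array.resolve(StorageAddress(3, 1))` reaches the second logical unit, while an unknown LUN raises `LookupError`.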

As in the well-known joke ("I don't lend money, and the First National Bank doesn't sell sunflower seeds"), the target usually does not send commands, and the initiator does not present LUNs. It is worth noting that the standard allows one device to be both an initiator and a target, but in practice this is rarely used.

For devices to "communicate", SAS has a protocol that, by good tradition and following the OSI recommendation, is divided into several layers (from top to bottom): Application, Transport, Link, PHY, Architecture and Physical.

SAS includes three transport protocols. Serial SCSI Protocol (SSP) is used to work with SCSI devices. Serial ATA Tunneling Protocol (STP) is used to interact with SATA drives. Serial Management Protocol (SMP) is used to manage the SAS fabric. Thanks to STP, we can connect SATA drives to SAS controllers. Thanks to SMP, we can build large systems (up to 1000 disk/SSD devices in one domain) and use SAS zoning (for more details, see the article about the SAS switch).
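As a rough illustration of how the three transport protocols divide the work, here is a minimal lookup sketch. The device kind strings and the `transport_for` function are hypothetical names chosen for this example, not part of any real SAS stack.

```python
from enum import Enum


class Transport(Enum):
    SSP = "Serial SCSI Protocol"
    STP = "Serial ATA Tunneling Protocol"
    SMP = "Serial Management Protocol"


# Hypothetical device kinds mapped to the protocol used to talk to them.
TRANSPORT_FOR = {
    "sas_drive": Transport.SSP,   # native SCSI device
    "sata_drive": Transport.STP,  # SATA drive behind a SAS controller
    "expander": Transport.SMP,    # fabric management traffic
}


def transport_for(device_kind: str) -> Transport:
    """Pick the SAS transport protocol for a given kind of device."""
    try:
        return TRANSPORT_FOR[device_kind]
    except KeyError:
        raise ValueError(f"unknown device kind: {device_kind!r}") from None
```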

The link layer manages connections and transfers frames. The PHY layer handles things like negotiating the connection speed and encoding. The architecture layer covers expanders and topology. The physical layer defines voltages, signal waveforms, and so on.
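To get a feel for what the connection speed and encoding mean in practice, here is a small back-of-the-envelope sketch. It assumes the 8b/10b line encoding used by the three SAS generations listed earlier (a detail not spelled out in this article) and ignores all protocol overhead, so the numbers are upper bounds. The function name is ours.

```python
# Line rates of the three SAS generations, in Gbit/s.
LINE_RATES_GBIT = {"SAS-1": 3, "SAS-2": 6, "SAS-3": 12}


def payload_mb_per_s(line_rate_gbit: float, lanes: int = 1) -> float:
    """Upper bound on payload throughput in MB/s, assuming 8b/10b encoding.

    With 8b/10b, every 8 payload bits travel as 10 bits on the wire.
    """
    payload_bits_per_s = line_rate_gbit * 1e9 * 8 / 10
    return payload_bits_per_s / 8 / 1e6 * lanes


if __name__ == "__main__":
    for name, rate in LINE_RATES_GBIT.items():
        print(name, payload_mb_per_s(rate), "MB/s per lane")
```

Under these assumptions, a single 6 Gbit/s lane carries at most about 600 MB/s of payload, and four such lanes together about 2400 MB/s.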

All interaction in SCSI is based on commands that the initiator sends to the target, after which it waits for the result. These commands are sent as command descriptor blocks (CDBs). A block consists of a one-byte command code followed by its parameters. The first parameter is almost always the LUN. CDBs are from 6 to 32 bytes long, although recent SCSI versions allow variable-length CDBs.
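To make the CDB layout more concrete, here is a minimal sketch that packs two common commands into bytes. The field layouts (READ(10) with opcode 28h, INQUIRY with opcode 12h, big-endian multi-byte fields) follow the commonly documented formats; the function names are ours, and nothing here actually sends a command to a device.

```python
import struct


def build_read10(lba: int, blocks: int) -> bytes:
    """Build a 10-byte READ(10) CDB: opcode, flags, LBA, group, length, control."""
    if not 0 <= lba <= 0xFFFFFFFF:
        raise ValueError("READ(10) carries a 32-bit LBA")
    if not 0 <= blocks <= 0xFFFF:
        raise ValueError("READ(10) carries a 16-bit transfer length")
    # Byte 1 holds flags; older SCSI versions carried the LUN in its top bits.
    return struct.pack(">BBIBHB", 0x28, 0, lba, 0, blocks, 0)


def build_inquiry(allocation_length: int = 96) -> bytes:
    """Build a 6-byte INQUIRY CDB asking for standard inquiry data."""
    if not 0 <= allocation_length <= 0xFFFF:
        raise ValueError("allocation length is a 16-bit field")
    return struct.pack(">BBBHB", 0x12, 0, 0, allocation_length, 0)


if __name__ == "__main__":
    print(build_read10(lba=0x1000, blocks=8).hex())
    print(build_inquiry().hex())
```

The first byte is the command code, and everything after it is parameters, exactly as described above; only the number and meaning of the parameter bytes change from command to command.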

After receiving a command, the target returns a status code: 00h means that the command completed successfully, 02h indicates an error, and 08h means the device is busy.
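A sketch of how an initiator might interpret these three status codes is shown below. The names GOOD, CHECK CONDITION and BUSY are the usual SCSI names for 00h, 02h and 08h; the helper functions and the retry policy are hypothetical and only illustrate the idea.

```python
# Status codes mentioned above, with their usual SCSI names.
STATUS_NAMES = {
    0x00: "GOOD",
    0x02: "CHECK CONDITION",
    0x08: "BUSY",
}


def describe_status(code: int) -> str:
    """Return a readable name for a SCSI status byte."""
    return STATUS_NAMES.get(code, f"other status {code:02X}h")


def next_action(code: int) -> str:
    """Hypothetical initiator policy: what to do after a given status."""
    if code == 0x00:
        return "continue"
    if code == 0x02:
        return "ask the target for error details"
    if code == 0x08:
        return "retry later"
    return "treat as failure"
```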

Commands are divided into 4 large categories. N, from "non-data", is for operations not directly related to data exchange. W, from "write", is for writing data that the target receives from the initiator. R, as is easy to guess, comes from "read" and is used for reading. Finally, B is for bidirectional data exchange.

There are quite a few SCSI commands, so we list only the most frequently used ones.

- Test Unit Ready (00h): check whether the device is ready, whether media is loaded (for example, in a tape drive), whether the disk has spun up, and so on. Note that the device does not perform a full self-test here; there are other commands for that.
- Inquiry (12h): get the main characteristics of the device and its parameters.
- Send Diagnostic (1Dh): run device self-diagnostics. The results are retrieved afterwards with the Receive Diagnostic Results command (1Ch).
- Request Sense (03h): get the execution status of the previous command. The result can be either a "no error" message or various failures, from missing media in the drive to serious problems.
- Read Capacity (25h): find out the capacity of the target.
- Format Unit (04h): destructively format the target and prepare it for data storage.
- Read (4 variants): read data. Exists as 4 different commands that differ in CDB length.
- Write (4 variants): write data. Like Read, exists in 4 variants.
- Write and Verify (3 variants): write data and verify it.
- Mode Select (2 variants): set various device parameters.
- Mode Sense (2 variants): return the current device parameters.
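The "variants" of Read and Write differ in CDB length, and for the classic fixed-length commands that length can be derived from the opcode itself: the top three bits of the opcode form a group code. The sketch below lists the opcodes from the list above plus the commonly documented Read and Write variants (these extra opcodes are not given in the article), and the helper name is ours.

```python
# Opcodes of the commands listed above, plus the usual Read/Write variants.
OPCODES = {
    "TEST UNIT READY": 0x00,
    "REQUEST SENSE": 0x03,
    "FORMAT UNIT": 0x04,
    "INQUIRY": 0x12,
    "RECEIVE DIAGNOSTIC RESULTS": 0x1C,
    "SEND DIAGNOSTIC": 0x1D,
    "READ CAPACITY(10)": 0x25,
    "READ(6)": 0x08, "READ(10)": 0x28, "READ(12)": 0xA8, "READ(16)": 0x88,
    "WRITE(6)": 0x0A, "WRITE(10)": 0x2A, "WRITE(12)": 0xAA, "WRITE(16)": 0x8A,
}

# Group code (top 3 bits of the opcode) -> fixed CDB length in bytes.
GROUP_CDB_LENGTH = {0: 6, 1: 10, 2: 10, 4: 16, 5: 12}


def cdb_length(opcode: int):
    """Return the fixed CDB length for an opcode, or None if the group
    is reserved, variable-length or vendor-specific."""
    return GROUP_CDB_LENGTH.get(opcode >> 5)


if __name__ == "__main__":
    for name, opcode in OPCODES.items():
        print(f"{name:28s} {opcode:02X}h  {cdb_length(opcode)} bytes")
```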

Now let's look at some typical examples of organizing data storage on SAS.

Example one: a storage server.

What is it, and what is it for? Large companies like Amazon, YouTube, Facebook, Mail.ru and Yandex use servers of this type to store content. Content means video, audio, pictures, the results of indexing and data processing (for example, Hadoop, which has recently become popular in the USA), mail, and so on. To understand the task and choose the right equipment for it, you need to know a few starting points. First and foremost: the more drives, the better.

Data center of one of the Russian Web 2.0 companies

Processors and memory in such servers are not heavily used. Second: in the world of Web 2.0, information is stored geographically distributed, as several copies on different servers. Usually 2-3 copies are stored. Sometimes, if the information is requested frequently, more copies are stored for load balancing. And third, following from the first and second: the cheaper, the better. In most cases, all of this leads to the use of high-capacity Nearline SAS or SATA drives, typically enterprise-class. This means that the disks are designed to work 24x7 and are significantly more expensive than their desktop counterparts. The chassis is usually chosen to hold as many disks as possible. For 3.5" drives, that is 12 disks in 2U.

Typical 2U storage server

Or 24 x 2.5" in 2U. Or other options in 3U, 4U, and so on. Now that we have the chassis, the number of disks and their type, we must choose the type of connection. The choice is actually not very wide: it comes down to a backplane with an expander or without one. If we use an expander backplane, the SAS controller can be 8-port. Without an expander, the number of SAS controller ports must equal or exceed the number of disks. Finally, the choice of controller. We know the number of ports (8, 16 or 24, for example) and choose a controller accordingly. Controllers come in 2 types, RAID and HBA. RAID controllers support RAID levels 5, 6, 50 and 60 and have a fairly large amount of memory (512 MB to 2 GB today) for caching. HBAs have either no memory or very little. In addition, HBAs either cannot do RAID at all or support only simple levels that do not require much computation; RAID 0/1/1E/10 is a typical HBA set. Here we need HBAs: they are much cheaper, and since copies of the data live on other servers, we do not need data protection at all and strive to minimize server cost.
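The port-count rule above can be written down as a tiny helper. This is only a sketch of the reasoning in this example: the function name is ours, and the default list of controller sizes (8, 16, 24 ports) is simply the one mentioned in the text.

```python
def required_controller_ports(disks: int, expander_backplane: bool,
                              available: tuple = (8, 16, 24)) -> int:
    """Pick the smallest controller that can serve the given number of disks.

    With an expander backplane the smallest controller is enough, because
    the expander fans its ports out to all disks. Without an expander,
    every disk needs its own controller port.
    """
    if disks < 1:
        raise ValueError("need at least one disk")
    sizes = sorted(available)
    if expander_backplane:
        return sizes[0]
    for ports in sizes:
        if ports >= disks:
            return ports
    raise ValueError(f"no single controller has {disks} ports; "
                     "use an expander backplane or several controllers")


if __name__ == "__main__":
    print(required_controller_ports(12, expander_backplane=True))
    print(required_controller_ports(12, expander_backplane=False))
```

For the 12-disk 2U chassis from this example, the helper picks an 8-port controller with an expander backplane and a 16-port one without.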

16-port SAS HBA

Example two: an Exchange mail server, as well as MDaemon, Notes and other similar servers.

Here things are not as obvious as in the first example. The recommendations differ depending on how many users the server must support. In any case, we know that the Exchange database (the so-called Jet DB) is best stored on RAID 5/6 and caches well on SSD. Based on the number of users, we determine the storage volume needed "today" and "for growth". Remember that a server lives 3-5 years, so "for growth" can be limited to a 5-year horizon; after that, it will be cheaper to replace the server entirely. Choose the chassis based on the number of disks. The backplane is simpler: expanders are recommended, since the price requirements are not as strict as in the previous case, and we can easily accept the server costing $50-$100 more, sometimes even more, in exchange for reliability and functionality. Choose SAS or NL-SAS / enterprise SATA drives depending on the required capacity. Next come data protection and caching. Choose a modern 4- or 8-port controller that supports RAID 5/6/50/60 and SSD caching. For LSI, this is any MegaRAID except the 9240, with the CacheCade 2.0 caching feature, or a Nytro MegaRAID with SSD onboard. For Adaptec, these are controllers that support MaxIQ. For caching in both cases (except for Nytro MegaRAID), you will need a pair of enterprise-class SSDs based on eMLC technology from Intel, Seagate, Toshiba and so on; pick the price and vendor to taste. If you do not mind paying a premium for the brand, find similar products in the IBM, Dell or HP server lines and go!
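When sizing the array for "today" and "for growth", it helps to remember how much capacity RAID 5 and RAID 6 actually leave for data. The sketch below uses the standard rule that RAID 5 spends one disk's worth of capacity on parity and RAID 6 spends two. It ignores hot spares, formatting overhead and SSD cache, and the function names are ours.

```python
import math

PARITY_DISKS = {"RAID5": 1, "RAID6": 2}  # capacity spent on parity
MIN_DISKS = {"RAID5": 3, "RAID6": 4}     # smallest valid array


def usable_tb(disks: int, disk_tb: float, level: str) -> float:
    """Usable capacity of a single RAID 5/6 group of equal-size disks."""
    if disks < MIN_DISKS[level]:
        raise ValueError(f"{level} needs at least {MIN_DISKS[level]} disks")
    return (disks - PARITY_DISKS[level]) * disk_tb


def disks_needed(required_tb: float, disk_tb: float, level: str) -> int:
    """Number of disks needed to reach the required usable capacity."""
    data_disks = math.ceil(required_tb / disk_tb)
    return max(data_disks + PARITY_DISKS[level], MIN_DISKS[level])


if __name__ == "__main__":
    n = disks_needed(required_tb=10, disk_tb=2, level="RAID6")
    print(n, "disks give", usable_tb(n, 2, "RAID6"), "TB usable")
```

The resulting disk count is what then drives the choice of chassis in this example.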

SSD Caching RAID Controller Nytro MegaRAID

Example three: a do-it-yourself external storage system.

So, the most serious knowledge of SAS is, of course, required by those who produce storage systems or want to build them themselves. We will focus on a fairly simple storage system with software from Open-E. Of course, you can also build storage on Windows Storage Server, Nexenta, AVRORAID, Open NAS, or any other suitable software. I have only outlined the main directions; the manufacturers' websites will help you from there. If this is an external system, we almost never know how many disks the end user will need, so we must be flexible. For this there are so-called JBODs, external disk shelves. They contain one or two expanders, each of which has an input (a 4-port SAS connector) and an output to the next expander, while the remaining ports are routed to the disk connectors. In two-expander systems, the first port of each disk is routed to the first expander and the second port to the second expander. This allows building fault-tolerant chains of JBODs. The head server may contain internal disks or have none at all. In the latter case, "external" SAS controllers are used, that is, controllers with outward-facing ports. The choice between a SAS RAID controller and a SAS HBA depends on the management software you choose. In the case of Open-E, it is a RAID controller. You can also consider the SSD caching option. If your storage system will have a lot of disks, the daisy-chain solution (where each subsequent JBOD is connected to the previous one or to the head server) is unsuitable for many reasons. In that case, the head server is either equipped with several controllers, or a device called a SAS switch is used. It lets you connect one or more servers to one or more JBODs. We will cover SAS switches in more detail in upcoming articles. For external storage systems, it is strongly recommended to use only SAS disks (including Nearline) because of the higher fault-tolerance requirements.
The point is that the SAS protocol has many more functions than SATA, for example, checksum-based control of data along the entire write and read path (T10 End-to-End protection). And the path, as we already know, is very long.
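To illustrate the idea behind end-to-end protection, here is a simplified sketch. It assumes the commonly described T10 protection information format, which this article does not detail: each data block gets 8 extra bytes consisting of a 2-byte guard CRC (polynomial 8BB7h), a 2-byte application tag and a 4-byte reference tag. The function names are ours, and a real implementation lives in the controller and the drive, not in Python.

```python
import struct


def crc16_t10dif(data: bytes) -> int:
    """Bitwise CRC-16 with polynomial 0x8BB7, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8BB7) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def protect(block: bytes, app_tag: int, ref_tag: int) -> bytes:
    """Append 8 bytes of protection information to a data block."""
    return block + struct.pack(">HHI", crc16_t10dif(block), app_tag, ref_tag)


def verify(protected: bytes, expected_ref_tag: int) -> bool:
    """Check the guard CRC and the reference tag of a protected block."""
    block, info = protected[:-8], protected[-8:]
    guard, _app_tag, ref_tag = struct.unpack(">HHI", info)
    return guard == crc16_t10dif(block) and ref_tag == expected_ref_tag
```

Every component on the long path from the application to the platter can run the same check, so a block that was corrupted or written to the wrong place is caught instead of being silently stored.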

Multiple JBOD

Finally, I want to share some information about the current adoption of SAS by global equipment manufacturers. SAS today is the de facto standard for server systems and professional workstations. The server systems of the overwhelming majority of both A- and B-brands include SAS controllers, both HBA and RAID. In external storage systems, the main equipment manufacturers (HP, EMC, NetApp, IBM) have spent the last several years moving the internal architectures of their systems to SAS. As a result, Fibre Channel drives have become truly exotic over the last couple of years. Fibre Channel continues to live and evolve, mainly as a way to connect servers to storage systems, although in low-end, mid-range and professional systems SAS is winning an increasing share.

This concludes our excursion into SCSI history and theory in general and SAS in particular. Next time I will tell you in more detail about the use of SAS in real life.

Source: https://habr.com/ru/post/175313/
