
Mike Shapiro, DSSD / EMC: “When the pieces of the puzzle came together, we got a solution ten times ahead of competitors' products”

In our post of February 29 we already talked about the quantum leap in data storage that came with the arrival of the DSSD D5, a high-performance rack-scale flash storage system. Mike Shapiro, co-founder of the DSSD startup that was later acquired by EMC and now vice president of software development at EMC, spoke about some of the product's features, why it is needed, and the plans for developing the fastest flash storage in the industry.



As we listened to the DSSD presentation, a question naturally arose: why did this product appear right now? You said yourself that development took about five years. How did you know that the product was ready and that it was time to bring it to market?

I think it is fair to speak of a particular convergence of circumstances. On the one hand, there are problems that require a new level of storage performance, and customers willing to pay to have them solved. On the other hand, the technology itself has matured. We now have smartphones on our desks that simply could not have been built five years ago: there were not enough sufficiently powerful components with low energy consumption. It is a similar story with DSSD: five years ago we could only anticipate many of the technologies we needed, but in practice they did not exist.

For example, five years ago the first experiments with devices using the third-generation PCI Express bus were only beginning. When we started developing DSSD, we quickly concluded that we needed a dual PCI Express Gen3 interface, which in effect gives us the speed of the future PCI Express Gen4. That standard is expected to be approved only by the end of 2016, but we were thinking about reaching its bandwidth even before broad discussion of it had begun.
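The "two Gen3 links equal one Gen4 link" arithmetic can be checked with a short sketch. The per-lane signalling rates (8 GT/s for Gen3, 16 GT/s for Gen4) and the 128b/130b encoding come from the PCI Express specifications; the lane count is an assumption chosen only for illustration and is not stated in the interview, and protocol overhead is ignored.

```python
# Per-lane raw signalling rates in GT/s, per the PCI Express specifications.
GEN3_GT_PER_S = 8.0
GEN4_GT_PER_S = 16.0

# Gen3 and Gen4 both use 128b/130b line encoding.
ENCODING_EFFICIENCY = 128 / 130

# Assumption for illustration only: the interview does not give a lane count.
LANES = 4


def link_bandwidth_gb_per_s(gt_per_s: float, lanes: int) -> float:
    """Bandwidth of one link direction in GB/s, ignoring protocol overhead."""
    return gt_per_s * ENCODING_EFFICIENCY * lanes / 8  # 8 bits per byte


dual_gen3 = 2 * link_bandwidth_gb_per_s(GEN3_GT_PER_S, LANES)
single_gen4 = link_bandwidth_gb_per_s(GEN4_GT_PER_S, LANES)

print(f"Dual Gen3 x{LANES}:   {dual_gen3:.2f} GB/s")
print(f"Single Gen4 x{LANES}: {single_gen4:.2f} GB/s")
```

Because Gen4 doubles the per-lane rate while keeping the same encoding, the two figures match for any lane count.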

Also, five years ago processors could not cope well with such volumes of data at such speeds directly. Now they handle it without trouble.



Five years ago the NVM Express logical interface used in the D5 existed only in its very first version, which was not mature enough and did not take the needs of enterprise users into account. We now have version 1.2, which, among other things, allows drive firmware to be updated without stopping work.

And, of course, there is the price of flash memory itself. In 2006 I developed the very first hybrid storage system, and at that time we paid $20 per 1 GB of memory. For customers it was even more expensive. Back then, the flash alone in a storage system comparable to the DSSD D5 would have cost about $3 million. Now, of course, everything is different.
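As a rough sanity check of these figures, here is a minimal sketch using only numbers from the interview: $20 per GB in 2006, the quoted total of about $3 million, and the D5's maximum usable capacity of 100 TB. Decimal units (1 TB = 1000 GB) are an assumption, and the variable names are purely illustrative.

```python
PRICE_2006_USD_PER_GB = 20         # price Shapiro quotes for 2006
QUOTED_FLASH_COST_USD = 3_000_000  # "about $3 million" for a D5-class system
D5_USABLE_TB = 100                 # maximum usable capacity quoted for the D5
GB_PER_TB = 1000                   # assumption: decimal units

# Flash capacity that $3 million would have bought at 2006 prices.
implied_capacity_tb = QUOTED_FLASH_COST_USD / PRICE_2006_USD_PER_GB / GB_PER_TB

# 2006 cost of just the 100 TB of usable capacity.
usable_cost_usd = D5_USABLE_TB * GB_PER_TB * PRICE_2006_USD_PER_GB

print(f"Implied flash capacity: {implied_capacity_tb:.0f} TB")
print(f"Cost of {D5_USABLE_TB} TB at 2006 prices: ${usable_cost_usd:,}")
```

The quoted total implies more flash than the usable 100 TB, which would be consistent with the estimate covering raw rather than usable capacity, although the interview does not say so.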

So the original idea was significantly ahead of what the industry could deliver. As soon as the industry caught up, the D5 appeared.

Do I understand correctly that you are not using the most expensive types of flash memory?

Let's just say that we are not trying to buy the most expensive. I work directly with flash memory manufacturers, and for the D5 we choose chips that provide the necessary level of speed and reliability. In our case, though, the lion's share of the result depends not on the memory chips but on the controllers that drive them. Of course, we are not trying to save money at any cost, but we can afford some flexibility.



Is the module design proprietary to EMC?

For the foreseeable future, yes, but we cannot rule out that it will become an industry standard. After all, we did not develop it simply because we wanted something of our own. It is just that existing designs could not supply more than 25 W to the memory, and we raised that figure to 50 W. Other drives run into the bottleneck of the PCIe interface; we eliminated it. Comparable modules are not serviceable; ours are. And so on.



It is quite possible that such modules will eventually appear in other products. We certainly will not stand in the way.

By tradition, you are not disclosing the price?

No, it depends heavily on the specific customer. But in terms of dollars per IOPS, we beat everyone hands down.

Let's go back to the five-year development cycle. When you conceived the D5, no one was yet talking about the Internet of Things as a mass phenomenon. I do not recall the term Big Data appearing in the media either. In short, as they say, there were no signs of what was coming. What made you start thinking about products with this level of performance?

My favorite athlete, Wayne Gretzky, said: "You have to skate to where the puck is going to be, not to where it is now." I follow this principle in my career. If you want to make not just another product that is slightly better and a bit cheaper than comparable ones, but something truly breakthrough, you must know very well where the puck is going.

Five years ago we drew charts: what the speed of flash memory would be in 2016, how much capacity requirements would grow, how PCI Express bus bandwidth would change. We studied them and began to prepare: developing technologies, selecting components, negotiating with suppliers. And when the pieces of the puzzle came together, we got a solution ten times ahead of our competitors' products.

Everything now points to user data becoming more important than money for some services. For example, it looks as though antivirus products will become free for end users, because the data they help collect improves developers' awareness of attacks and threats and, in turn, makes enterprise solutions more reliable. Will the growing importance of data drive demand for D5-class solutions?

Strictly speaking, data has always been the most valuable thing users have. But today it can be measured not only in volume but also in time. Everything around us (people, computers, cars, airplanes, other equipment) generates huge amounts of information, and the figures keep growing. If we want not only to store this data but also to extract something useful from it, speed is essential. What we are offering now is the ability to analyze data ten times faster, or to obtain ten times more useful information in the same amount of time. Provided, of course, that it is paired with other hardware of matching performance.



Those who have already learned to make money from data will appreciate this opportunity immediately. As for the rest, we will wait until they learn.

And how do you see the D5 idea developing over the next five years?

The first thing, of course, is more capacity. We now have a 5U system with a maximum usable capacity of 100 terabytes. I see no reason to stop at this level. We will also experiment with various form factors.

We also want to make upgrading the system even easier, so that it takes just a few minutes.

We will definitely use higher-capacity flash modules (the current limit is 4 TB; there will be 8 TB and even 16 TB modules). We will also refine the software so that many tasks are already optimized at the storage level.

Source: https://habr.com/ru/post/278975/

