
A brief overview of the Quantcast File System

As information technology develops, the need to store and process large amounts of information keeps growing. Distributed file systems are used to store large volumes of data, and this article discusses one of them.
While working on a project, I ran into the need to store a large amount of data. One of the project's programs, written in C++, generates a large volume of statistics that has to be stored somewhere. Running autonomously on a server, the program produces hundreds of gigabytes of information, and that volume is projected to keep growing.

This raises the question of where to store all this data. If the data is kept on the server where the program runs, disk space will run out very soon. It became clear that a distributed file system was needed for storage, but which one? A web search turns up various distributed file systems, but the eye is drawn to a new one: QFS (Quantcast File System). Let's see why. It should not be confused with QFS (Quick File System), a file system from Sun Microsystems.

What is QFS (Quantcast File System)


Quantcast File System (QFS) is a high-performance, fault-tolerant, distributed file system designed to support MapReduce technology and other applications that sequentially read and write large files.
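
To make the "sequentially read and write large files" access pattern concrete, here is a minimal C++ sketch. It is only an illustration under stated assumptions: it uses the standard library on a local file, not the QFS client API, and the function names `write_sequential` and `read_sequential` are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Illustration only: local file I/O via <fstream>, NOT the QFS client library.
// Writes `total` bytes to `path` front to back in fixed-size blocks.
// Returns the number of bytes actually written.
std::uint64_t write_sequential(const std::string& path,
                               std::uint64_t total,
                               std::size_t block_size) {
    if (block_size == 0) return 0;
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    std::vector<char> block(block_size, 'x');  // placeholder payload
    std::uint64_t written = 0;
    while (out && written < total) {
        const std::size_t n = static_cast<std::size_t>(
            std::min<std::uint64_t>(block_size, total - written));
        out.write(block.data(), static_cast<std::streamsize>(n));
        if (out) written += n;
    }
    return written;
}

// Reads the whole file front to back in fixed-size blocks, never seeking.
// Returns the number of bytes read.
std::uint64_t read_sequential(const std::string& path, std::size_t block_size) {
    if (block_size == 0) return 0;
    std::ifstream in(path, std::ios::binary);
    std::vector<char> block(block_size);
    std::uint64_t total = 0;
    while (in) {
        in.read(block.data(), static_cast<std::streamsize>(block.size()));
        total += static_cast<std::uint64_t>(in.gcount());  // counts a final partial block too
    }
    return total;
}
```

A statistics generator like the one described above would call something like `write_sequential` once per output file and later stream the file back with `read_sequential`; there are no random seeks or small in-place updates, which is the workload this kind of file system targets.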
QFS is an open-source distributed file system released under the Apache 2.0 license, developed by Quantcast and positioned as an alternative to HDFS. It grew out of KFS (Kosmos File System), which Quantcast had started using for secondary storage. In 2011 the company moved its main data processing to QFS and stopped using HDFS. Over the following year it wrote and read more than 4 exabytes of data, which was taken as evidence that the product was ready for public release. QFS 1.0 was released in September 2012. The source code is available at github.com/quantcast/qfs

A little about Quantcast, from Wikipedia:
Quantcast is a web analytics company that acts as an independent audience meter. Its main activity is helping advertisers and advertising platforms find each other, and then acting as a neutral third-party measurer that determines the amount of advertising sold.


QFS architecture



QFS consists of 3 components


QFS features




Known issues and limitations




Quick start


Installing QFS in a test configuration is fairly straightforward.


An example of a C++ application that works with QFS can be found in the project archive at /examples/cc/qfssample_main.cc

Known pitfalls


Experiments with this file system revealed some problems.


Interesting links (English)




Conclusion


This post is not meant as a survey of distributed file systems, nor as a comparison of QFS with, for example, Hadoop's HDFS. Its purpose is to draw the attention of the domestic IT community to QFS (Quantcast File System), a new distributed file system that stands out against the others. It is still young and not without flaws, but it is certainly ready to take its place in the DFS ecosystem in the near future.

As far as I know, this is the first Russian-language article about this project, so I chose an introductory style, without going deeply into the project's architecture, installation, or configuration.

Source: https://habr.com/ru/post/203120/

