
Big data is an integral part of our lives.

The previous article covered how Big Data in general, and LSI products in particular, make it possible to predict the weather, and why that matters so much. Since it was published, an interesting event has confirmed the importance of the topic. Monsanto, the world leader in plant biotechnology, acquired The Climate Corporation of San Francisco, a company that analyzes "big data" related to weather and climate, for $930 million. According to Monsanto's CEO: "Climate Corporation focuses on providing agriculture more opportunities through the science of data processing." But of course, "big data" is useful for more than predicting the state of the atmosphere. Let's look at a couple of other interesting applications.

Every year, in late autumn and early winter, we all await the start of the inevitable flu epidemic with a certain resignation. Although the disease is considered relatively "safe", it can often cause serious complications, and according to WHO the annual number of deaths worldwide ranges from 250 to 500 thousand people.



Influenza viruses belong to the Orthomyxoviridae family, which includes the genera Influenza A, B and C. Which of these types a virus belongs to is determined by the antigenic properties of the internal proteins of the virion (M1 and NP). Further division is based on the subtypes of the surface proteins hemagglutinin (HA) and neuraminidase (NA). Currently, 16 subtypes of hemagglutinin and 9 subtypes of neuraminidase are known. Viruses carrying three HA subtypes (H1, H2, H3) and two NA subtypes (N1, N2) cause the epidemics that are dangerous for people. These characteristics are what give viruses their "code names". If the flu virus of a particular year differed in some way from the "classic" one, the year is added to the name as well (probably everyone remembers the epidemic caused by the 2009 H1N1 strain).

In general, WHO classifies influenza viruses by a whole set of features:

- Antigenic type: A, B or C
- Host of origin: swine, horse, chicken, etc.
- Geographical area of detection: China, the Netherlands, etc.
- Strain number: 7, 15, etc.
- Year of detection: 56, 2009, etc.
- Hemagglutinin and neuraminidase subtypes: H1N1, H5N1, etc.
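
To make the list above concrete, here is a minimal Python sketch that splits a strain designation into these fields. It assumes the slash-separated layout commonly used for such names (type/host/place/strain number/year, with the HA/NA subtype in parentheses, and the host left out for viruses isolated from humans). The article itself does not spell this layout out, and the names `StrainName` and `parse_strain_name` are hypothetical, chosen only for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed layout (not taken from the article):
#   TYPE/[HOST/]PLACE/STRAIN_NUMBER/YEAR [(HxNy)]
_SUBTYPE = re.compile(r"\((H\d+N\d+)\)\s*$")


@dataclass
class StrainName:
    antigenic_type: str      # "A", "B" or "C"
    host: str                # host of origin; "human" when the field is omitted
    place: str               # geographical area of detection
    strain_number: str       # strain number
    year: str                # year of detection, kept as written ("76", "2009")
    subtype: Optional[str]   # e.g. "H1N1"; None when no subtype is given


def parse_strain_name(name: str) -> StrainName:
    """Split a strain designation into the classification fields."""
    text = name.strip()

    # Pull off the trailing "(HxNy)" subtype, if there is one.
    subtype = None
    match = _SUBTYPE.search(text)
    if match:
        subtype = match.group(1)
        text = text[:match.start()].strip()

    parts = [part.strip() for part in text.split("/")]
    if len(parts) == 5:
        antigenic_type, host, place, number, year = parts
    elif len(parts) == 4:
        # Assumption: a missing host field means a human isolate.
        antigenic_type, place, number, year = parts
        host = "human"
    else:
        raise ValueError(f"unrecognized strain name: {name!r}")

    if antigenic_type not in {"A", "B", "C"}:
        raise ValueError(f"unknown antigenic type: {antigenic_type!r}")

    return StrainName(antigenic_type, host, place, number, year, subtype)


if __name__ == "__main__":
    # Illustrative input strings in the assumed layout.
    for raw in ("A/duck/Alberta/35/76 (H1N1)", "A/California/7/2009 (H1N1)"):
        print(parse_strain_name(raw))
```

A structured record like this is the kind of key that surveillance statistics could be grouped by before any large-scale analysis; real pipelines would need to handle many more naming variants than this sketch does.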

As you can see, influenza viruses differ greatly from one another, and they can change even within a single epidemic as they pass from person to person. It is easy to guess that this is where the big data we are familiar with comes into play: the disease statistics collected by national health organizations are extensive and detailed, and fighting an epidemic effectively requires analyzing this data quickly.

The US Centers for Disease Control and Prevention analyze this data with a variety of tools to determine which virus strains will threaten the United States, and a vaccine is created on that basis. The accuracy of this prediction determines how effective vaccination will be, how many people will get sick, and how many will stay healthy. For example, in the 2012/2013 season the main strain was influenza A H3N2, but small outbreaks of influenza B and influenza A H1N1 were also observed.



In addition to determining the dominant virus, the CDC (the abbreviated name of the Centers) analyzes the data to track the spread of the virus and its potential effect on the population. To do this, huge amounts of data are analyzed, including information about past epidemics, vaccinations, population data, and even weather forecasts. The result is a set of predictions: where the virus is expected to strike first, how severe the epidemic will be, and how long it will last. This helps to produce enough vaccine, to time its production and the vaccination campaign correctly, and to distribute it properly. How effective the vaccine turns out to be in a given year depends directly on these predictions.

As with weather forecasting, a tool such as Apache Hadoop proves effective here. To speed it up, LSI offers Nytro hardware solutions, which you can learn more about on our website.

Weather forecasts and influenza predictions have one thing in common: people's lives depend on their accuracy. Unfortunately, there are many areas in which life and health are at stake. Are there less vital areas in which Big Data nevertheless plays a big role? In fact, there are plenty of them. I'll tell you about one rather unexpected use of big data: the clothing and fashion industry.

In many countries around the world, August means not only the end of summer but also the start of the school year. Schoolchildren and their parents head to the stores to buy stationery, school supplies and, often, school uniforms. In our parents' day the choice in stores was not so great, so many students wore the same jackets and coats.

In our age of developed consumerism, the choice has become much wider. Huge mega-malls and smaller supermarkets, specialty stores and online portals all offer a considerable selection. Add to this a huge variety of cuts, materials, styles, manufacturers and sizes, and the problems that a typical retailer has to face become clear.

All participants in the production chain rely on Big Data in their work. It all begins with the fabric manufacturers. They analyze last year's orders, competitors' offers, fashion trends, the raw materials market and production costs. Keeping track of any one of these factors is not an easy task, and as the number of related factors grows, the complexity of the analysis increases many times over. In one of its 2012 reports, Gartner analysts stressed that the main Big Data problems arise precisely when the interaction between two or more data sources has to be analyzed.

The next participants in the production chain are the large clothing companies. They are the ones who set market trends, so the tasks facing them are even more difficult. They use big data tools to create production plans. By analyzing information such as historical sales data, weather forecasts, and demographic and economic data, they choose the right colors, styles, models and price ranges for the clothes they produce.

Last in this chain are the consumers, the ones who actually buy the clothes. Yet everything that hangs on racks and lies on store shelves (or is put up for sale online) was selected and ordered six to nine months earlier. Take Kohl's, one of the largest retailers in the US market, as an example. It has to take into account weather forecasts, to know where swimsuits will sell and where warm jackets will; the economic situation and data on competitors, to set its pricing policy correctly; and demographic data, to better assess people's needs and the range of sizes required. The more accurate these forecasts are, the fewer goods will later have to be sold at deep discounts, and the higher the company's profit will be.



Of course, a company's profit cannot be compared with human lives, but here too Big Data proves to be an important and valuable tool for success. LSI is one of the companies able to offer solutions that really speed up and simplify work with big data.

Source: https://habr.com/ru/post/197462/

