Lately the term big data seems to be everywhere, and the concept has in many ways gone mainstream. Terms such as data science, data analysis, data analytics, data mining and machine learning are closely related to it.
Why has everyone become so obsessed with big data, and what do all these terms mean?

Why everyone is obsessed with big data
The more data there is, the harder it is to process and analyze. Mathematical models that work on small data sets will most likely not work on big data. Nevertheless, big data occupies an important place in data science: the larger the data set, the more interesting the results that can be extracted from it.

The advantages of big data:
- It is interesting to work with.
- The larger the data set, the less likely the researcher is to make the wrong decision.
- Accurate studies of the behavior of Internet users are almost impossible without a large amount of data.
- Data storage has become cheaper and more accessible, so storing and analyzing big data pays off far better than making forecasts that are known in advance to be inaccurate.
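
The claim that more data makes a wrong decision less likely can be illustrated with a rough sketch. It assumes independent observations with a known spread; the function names, the true mean of 50 and the standard deviation of 10 are all made up for the illustration. Under these assumptions, the standard error of a sample mean shrinks as the square root of the sample size grows.

```python
import math
import random
import statistics


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the sample mean for n independent observations."""
    return sigma / math.sqrt(n)


def mean_estimate_error(n: int, true_mean: float = 50.0,
                        sigma: float = 10.0, seed: int = 0) -> float:
    """Absolute error of the sample mean on n simulated observations.

    The data is simulated (hypothetical), drawn from a normal distribution.
    """
    rng = random.Random(seed)
    sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
    return abs(statistics.fmean(sample) - true_mean)


if __name__ == "__main__":
    # Print the theoretical and the simulated error for growing sample sizes.
    for n in (100, 10_000, 100_000):
        print(n, standard_error(10.0, n), mean_estimate_error(n))
```

A single simulated run can deviate from the theoretical value, so the simulated column is only indicative of the trend.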

Data science
Data science is deep knowledge about what can be inferred from data. Doing data science requires advanced mathematics, algorithmic techniques, business analytics and even psychology. All of this is needed to dig through a huge pile of information and discover useful insights or interesting patterns.
Data science is built on rigorous analytical evidence and works with both structured and unstructured data. In principle, everything related to selecting, preparing and analyzing data falls within data science.
Examples of data science applications:
- Tactical optimization - improving marketing campaigns and business processes.
- Predictive analytics - forecasting demand and events.
- Recommender systems - such as those used by Amazon and Netflix.
- Automated decision-making systems - for example, face recognition or even drones.
- Social research - processing questionnaires or data obtained in any other way.
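
As a minimal sketch of the demand-forecasting item above, the following fits a straight-line trend by ordinary least squares and extrapolates it one step ahead. The weekly sales figures and the function names are invented for the example; real forecasting would usually need seasonality, more data and proper validation.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(xs, ys, x_new):
    """Predict y at x_new from the fitted linear trend."""
    slope, intercept = fit_line(xs, ys)
    return slope * x_new + intercept


# Hypothetical data: units sold in weeks 1 to 4.
WEEKS = [1, 2, 3, 4]
UNITS_SOLD = [3, 5, 7, 9]

if __name__ == "__main__":
    print(forecast(WEEKS, UNITS_SOLD, 5))
```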
In simple terms, data science encompasses all the concepts listed in the title.

Analytics
Analytics is the science of analysis: the application of data analysis to decision making.
Data analytics is meant to extract insights from a data set and involves information queries and procedures for combining data. It reveals dependencies between input parameters, for example automatically identified, non-obvious connections between purchases.
In data science, raw data is used to build a predictive model. In analytics, the data is often already prepared, and the reports can be interpreted by almost any user. An analyst does not need deep knowledge of higher mathematics; it is enough to be comfortable working with data and to build sound forecasts.
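
One simple way to surface connections between purchases, such as those mentioned above, is to count how often pairs of products appear in the same basket. The sketch below assumes each purchase is available as a list of product names; the baskets and the `pair_counts` helper are hypothetical. Real systems typically go further and use measures such as support and confidence.

```python
from collections import Counter
from itertools import combinations


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of products."""
    counts = Counter()
    for basket in baskets:
        # Sort and deduplicate so that each pair has one canonical form.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts


# Hypothetical purchase data.
BASKETS = [
    ["bread", "butter"],
    ["bread", "butter", "milk"],
    ["milk", "diapers"],
    ["bread", "milk"],
]

if __name__ == "__main__":
    for pair, count in pair_counts(BASKETS).most_common(3):
        print(pair, count)
```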

Data analysis
Data analysis is the work of a specialist who aims to obtain information about a data set. The analyst can use various tools and can draw conclusions and make predictions based on accumulated experience. For example, a Forex trader can open and close trading positions based on simple observation and intuition.

Machine learning
Machine learning is closely related to data science. It is an artificial intelligence technique for learning from big data. In simple terms, it is the ability to train a system or algorithm to extract various patterns from a data set.
In machine learning, an initial set of knowledge (the training data) is used to build a predictive model of target variables. Machine learning covers many kinds of models, from regression models and support vector machines to neural networks. At its center is a computer that learns to recognize and predict.
Examples of algorithms:
- Models that can predict user behavior.
- Classification models that can recognize and filter spam.
- Recommender systems - study user preferences and try to guess what users might need.
- Neural networks - not only recognize images but can also generate them.
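
To make the spam-filtering item above more concrete, here is a minimal sketch of a naive Bayes text classifier with Laplace smoothing, written from scratch. The four training messages, the `spam`/`ham` labels and the function names are assumptions for the illustration. This is one common approach to the task, not the only one.

```python
import math
from collections import Counter


def tokenize(text):
    """Split a message into lowercase words."""
    return text.lower().split()


def train(messages):
    """Build word and label counts from (text, label) pairs.

    Assumes both labels, 'spam' and 'ham', occur at least once.
    """
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocabulary = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Return the label with the higher log-probability score."""
    word_counts, label_counts, vocabulary = model
    total_messages = sum(label_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        # Start from the log prior of the label.
        score = math.log(label_counts[label] / total_messages)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            # Laplace smoothing keeps unseen words from zeroing the score.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            )
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical training data.
TRAINING = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting moved to monday", "ham"),
    ("lunch on monday at noon", "ham"),
]

if __name__ == "__main__":
    model = train(TRAINING)
    print(classify("free prize now", model))
    print(classify("meeting on monday", model))
```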
Researchers use machine learning techniques to automate certain tasks. Such systems are essential for some very complex projects. For example, to find out which country has the happiest people, scientists identified smiles in photos uploaded to Instagram.

Data mining
Raw data is inherently messy and inconsistent, gathered from various sources and unverified records. Uncleaned data can hide the truth buried deep in big data and mislead the analyst.
Data mining is the process of cleaning up large data sets and preparing them for later analysis or for use in machine learning algorithms. A data miner needs exceptional pattern-recognition skills, excellent intuition and the technical skills to combine and transform huge amounts of data.
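
The cleaning step described above might look roughly like the sketch below. It assumes the records arrive as dictionaries with `email` and `age` fields; these field names, the sample rows and the `clean_records` helper are invented for the illustration. The function normalizes values, skips rows that cannot be used and removes duplicates.

```python
def clean_records(raw_records):
    """Return deduplicated records with normalized fields.

    Rows with a missing email or a non-numeric age are skipped.
    """
    cleaned = []
    seen_emails = set()
    for record in raw_records:
        email = (record.get("email") or "").strip().lower()
        age_text = str(record.get("age") or "").strip()
        if not email or not age_text.isdigit():
            continue  # unusable row
        if email in seen_emails:
            continue  # duplicate of an earlier row
        seen_emails.add(email)
        cleaned.append({"email": email, "age": int(age_text)})
    return cleaned


# Hypothetical raw data with typical problems.
RAW = [
    {"email": " Ann@Example.com ", "age": "34"},   # stray spaces, mixed case
    {"email": "ann@example.com", "age": "34"},     # duplicate
    {"email": "", "age": "27"},                    # missing email
    {"email": "bob@example.com", "age": "unknown"},  # non-numeric age
    {"email": "carol@example.com", "age": 41},     # age already a number
]

if __name__ == "__main__":
    for row in clean_records(RAW):
        print(row)
```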

Summary
- The more data there is, the harder it is to analyze.
- Data science is knowledge about what can be inferred from data, covering its selection, preparation and analysis.
- Machine learning is used to analyze data sets and build predictive models.
- Data mining is the process of cleaning up large data sets and preparing them for later analysis.