

Yandex launches a new line of business: Yandex Data Factory
How we built the polar chart in DevExtreme
Data analyst salaries and tools, based on the O'Reilly survey results
Why is R difficult to learn? - An updated version of an article from the blog r4stats.com about the R programming language.
In which areas is Data Science applied?
Data Science forecast for 2015 from Data Science Central
Data Science forecast for 2015 from KDnuggets.com
Analytics forecasts for 2015 from the International Institute of Analytics (IIA)
A large list of public data sets - an excellent list of data sets on various topics.
Big Data dictionary - a short list of Big Data terms from the Data Science Central portal that everyone should know.
Big Data myths
5 major Big Data trends in 2014
Big Data analytics market forecasts for 2015 from Forbes
Everything you need to know to become an analyst - a good selection of links to useful data analysis materials from the author of the Analytics Vidhya blog.
How content quality is used in Bing ranking
Interesting articles from Vincent Granville - a short list of 3 articles that the author of the Data Science Central portal recommends reading.
Interesting articles from Vincent Granville - 3 more recommended articles from Vincent Granville.

Hacker's guide to neural networks. Real-valued circuits. Circuits with multiple gates

Comparing the speed of fitting linear models in R and EViews

Corner detectors
Data Science without the use of statistics is not only possible, but also desirable - interesting reflections from Vincent Granville on the subject of Data Science.
Use Random Forest: testing 179 classifiers on 121 datasets - an interesting article that reflects on the correct choice of machine learning algorithm in various situations.
Comparing the bootstrap and cross-validation - a continuation of a series of articles from the author of the book Applied Predictive Modeling on the use of cross-validation in machine learning.
3 questions to answer before choosing a machine learning algorithm - a good set of tips for choosing a machine learning algorithm appropriate for the task.
12 tips for the naive Bayes classifier - an excellent set of tips on using the naive Bayes classifier from the author of the Machine Learning Mastery blog.

Naive Bayes classifier from scratch in Python - the author of the Machine Learning Mastery blog describes in detail how to implement the naive Bayes classifier from scratch in Python.

Naive Bayes with Python
Deeppy: Deep Learning Library for Python
Ask a Data Scientist: confounding variables - another article from the popular portal insideBIGDATA in the “Ask a Data Scientist” series; this installment covers confounding variables.
Using Apache Hadoop to predict flight delays (part 2) - the second part of a series of articles from the Hortonworks blog about the practical use of Apache Hadoop to predict flight delays.
Spark usage example (1): finding a person with a similar list of links
Spark usage example (2): text search using SQL
Introduction to sentiment analysis from Kaggle - a new and quite interesting machine learning competition has started on the Kaggle website, focused on sentiment analysis of text. It is especially appealing because it includes four lessons describing how NLP and sentiment analysis work.
The study guide "Statistical Analysis and Data Visualization with R" - a free book in Russian on the R programming language from the author of the blog "R: Data Analysis and Visualization".
Announcement of the new online course “Statistical Learning” from Stanford University - in about a month, Stanford Online launches an interesting machine learning course called Statistical Learning.
Materials from AMP Camp 5 - a collection of materials from AMP Camp 5, devoted to Big Data, data analysis, and machine learning, and held under the auspices of UC Berkeley in California in November of this year.


Introduction to data analysis
Hadoop for network engineers
Time series, metrics, and statistics: getting to know InfluxDB
5 rules for organizing data - a set of tips from Vincent Granville that help organize the structure of data better. Includes a very interesting comparison of these rules with similar rules dated 1999.
5 main problems of measuring the performance of Big Data systems - an interesting article from the Cloudera blog about 5 problems that arise when evaluating and comparing the performance of various Big Data systems.
Collection of useful tips on Cloudera Impala
Interesting from the world of R (December 1-7, 2014)
The best materials for the week from KDnuggets.com (November 30 - December 6)
Weekly Digest from DataScienceCentral (December 15)
The best resources for the week from Data Elixir (#13)
Weekly collection of the best materials from R1Soft (December 12)
The most interesting materials from Freakonometrics #191
The most interesting materials from Freakonometrics #192
The most interesting materials on High Scalability (December 12)
This month in the Hadoop Ecosystem (November 2014)

Source: https://habr.com/ru/post/245795/