Overview of the most interesting materials on data analysis and machine learning No. 12 (September 1-8, 2014)
Here is the next issue of my review of the most interesting materials on data analysis and machine learning. This issue turned out to be quite large, with a lot of material on Data Engineering. More and more material from the KDD 2014 conference is appearing. As usual, there are articles about various machine learning competitions, including the recent ImageNet Large Scale Visual Recognition Challenge (ILSVRC). There are also quite a few code examples in the R and Python programming languages, and a mention of what I think is a very interesting online course, "Introduction to Computational Finance and Financial Econometrics".
Data analysis and machine learning materials
Analysis of the results of ILSVRC. An analysis of the results of the recently held ImageNet Large Scale Visual Recognition Challenge (ILSVRC), an annual image recognition competition in which the Google team took first place.
MongoDB Data Modeling Guide. Data Modeling Adviser for MongoDB was recently published on Daprota's website. It is a very useful guide to data modeling in the MongoDB NoSQL database.
AVITO.ru competition on Kaggle. The author describes the experience gained while participating in the AVITO.ru competition on Kaggle and analyzes the various approaches that other participants used to solve the problem.
A framework for constructing a dictionary when analyzing text. The continuation of a series of articles on text analysis and working with unstructured data. In this article, the author discusses possible approaches to building a dictionary when analyzing text data.
Improved image processing algorithms. A short article about the annual image recognition competition in which the Google team took first place, doubling last year's result.
Online course "Introduction to Computational Finance and Financial Econometrics". This online course recently started on Coursera. It will be useful to those interested in statistics and the R programming language, as well as to those interested in applying statistical methods in finance.
Stinger.next: Improved SQL with Hadoop and Hive. An article from the Hortonworks blog about plans for the new product Stinger.next, which is intended to significantly improve the performance of SQL queries on Hadoop.
Deep Learning at Google. A short news article about Google's progress in machine learning, specifically Deep Learning. The article does not go into the technical details of how the Deep Learning algorithms are implemented.
ShinyTree: jsTree + shiny. A short visualization example using the shinyTree library for the R programming language and the jsTree JavaScript library.
NoSQL Trends: August 2014. Current trends for the main NoSQL systems, based on data from online job sites (Indeed, SimplyHired).
My favorite graphs. The author discusses several types of graphs that make it possible to visualize various kinds of source data simply and clearly.
Machine learning with R. The author of the MachineLearningMastery blog explains how to quickly get started with machine learning algorithms in the R programming language.
How to translate MapReduce jobs to Apache Spark. A useful article from the Cloudera blog on how to translate MapReduce jobs to the increasingly popular Apache Spark and how the concepts differ between the two approaches.
What is Big Data? More than 40 experts answer this question on the Berkeley blog.
Evaluating the accuracy of a predictive model using R Caret. Five ways to assess the accuracy of a predictive model that are available in the Caret machine learning library for the R programming language, described by the author of the popular MachineLearningMastery blog.
Introduction to Predictive Analytics. The first part of a new series of articles from the insideBIGDATA portal, this time on Predictive Analytics.
Working with MongoDB from R. A useful and timely article on how to work with the MongoDB NoSQL database from the R programming language.
Machine Learning and Data Analysis Newsletters. It is often difficult to keep track of all the news in data analysis and machine learning. The author of the popular MachineLearningMastery blog offers a short list of newsletters that make it easier to follow the latest news in Data Science.
Notifications in R. A code example that lets you receive a notification when a script in the R programming language has finished running.
Error notifications in R. Another code example, this one for sending notifications when an error occurs while a script in the R programming language is running.
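
The last two items link to R code that is not reproduced here. To illustrate the general idea, here is a minimal sketch in Python (not the linked authors' R code): a wrapper runs a task and sends one message on success and another on failure. The names `run_with_notifications` and `notify` are hypothetical. In practice `notify` might send an email or a chat message, but here it simply appends to a list.

```python
from typing import Callable, List


def run_with_notifications(task: Callable[[], object],
                           notify: Callable[[str], None]) -> object:
    """Run `task`, then report completion or failure through `notify`."""
    try:
        result = task()
    except Exception as exc:
        # Report the error, then re-raise so the caller still sees it.
        notify(f"Script failed: {type(exc).__name__}: {exc}")
        raise
    notify("Script finished successfully")
    return result


# Illustration only: collect notifications in a list instead of sending them.
messages: List[str] = []

run_with_notifications(lambda: sum(range(10)), messages.append)

try:
    run_with_notifications(lambda: 1 / 0, messages.append)
except ZeroDivisionError:
    pass

for message in messages:
    print(message)
```

Re-raising the exception after notifying keeps the failure visible to whatever launched the script, so the notification supplements normal error handling rather than replacing it.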