Overview of the most interesting materials on data analysis and machine learning №7 (July 28 - August 4, 2014)
Here is the next issue of the review of the most interesting materials on data analysis and machine learning. This issue includes several articles aimed at beginners and several video lectures on Data Science. As usual, there are many articles on machine learning and data analysis with code examples in R and Python. Readers may also be interested in several book reviews on data analysis.
Data analysis and machine learning materials
Introduction to Gaussian Processes An introductory article on Gaussian processes, which are often used in machine learning for non-parametric regression and classification, with examples in Python.
HighlightHTML library for R A brief article about the useful HighlightHTML library for the R programming language, which works with the HTML markup of R Markdown documents.
Data Science using Python (Part 1) The first part of a series of articles on Data Science using the Python programming language. It includes a video from the PyCon 2014 conference and covers collecting data for analysis with Python.
Create and publish interactive ggplot2 charts An article on creating interactive charts with the ggplot2 package for the R programming language and publishing them online using plot.ly. It provides several practical examples of using this service.
Yelp data analysis competition The popular portal Yelp has announced a new data analysis competition based on its own data. The competition runs until December 31, 2014.
The book "Neural Networks and Deep Learning" A book on a popular area of machine learning. It is not yet finished, but about half of the chapters have already been written and are available to readers.
Xavier Amatriain video lectures on recommender systems Netflix's Xavier Amatriain presents another lecture series from the Machine Learning Summer School (MLSS '14) in Pittsburgh. This series of video lectures is devoted to recommender systems.
Using Cassandra in real-time systems A Data Engineering article on how the popular NoSQL database Apache Cassandra can be used in real-time systems.
Recommendations everywhere A short and fairly simple article from the Microsoft TechNet Machine Learning Blog on the principles behind recommendation systems.
Want to learn SQL? There is an excellent starting course for beginners. The popular data analysis blog Data Science 101 has published a news item for those who want to master SQL, which clearly remains relevant despite the growing popularity of various NoSQL solutions.
100 million Flickr images from Yahoo Labs Yahoo Labs announced that it has published a large dataset of 100 million images and video clips under the Creative Commons license for use in research.
What is machine learning? A short article by John Platt, who has worked at Microsoft for 17 years and actively uses machine learning in his daily work. In it, he explains how machine learning is applied to various tasks in Microsoft projects.
Nonlinear regression with decision trees Another article from the author of Machine Learning Mastery. This time the topic is non-linear regression with decision trees, with code examples in the Python programming language.
20 years of Microsoft machine learning A short article on how Microsoft has long used machine learning technologies and has accumulated rich experience in this area. The author also mentions Microsoft Azure Machine Learning, a new cloud service from Microsoft for problems that call for machine learning techniques.
Real-Time Queries to Cassandra with Spark and Shark Evan Chan, a developer at Ooyala in Silicon Valley, talks about his experience using the Spark and Shark frameworks on top of Cassandra to run queries in real time.