Review of the most interesting materials on data analysis and machine learning, No. 2 (June 16-23, 2014)
This issue of the review of the most interesting materials on data analysis and machine learning pays considerable attention to Deep Learning, a popular family of machine learning algorithms, and to its practical application. Several articles discuss how to grow as a specialist in data analysis and machine learning. Several others cover Data Engineering topics and look at popular products such as Cassandra and Apache Kafka. The issue opens with an overview of online courses on data analysis and machine learning that are starting soon.
Online courses (MOOCs) on Data Science starting soon
Machine Learning (Coursera - Stanford University) One of the best-known Machine Learning courses, taught by Stanford University professor Andrew Ng. The course began on June 16 and will last 10 weeks. It is simple and clear, requires no special background to complete, and still covers quite a lot of Machine Learning areas. You can still register for this session in time to take the first test.
Mathematical Biostatistics Boot Camp 1 (Coursera - Johns Hopkins University) The first part of the biostatistics course from Johns Hopkins University. It began on June 16 and will last 7 weeks. It is an unofficial addition to the Data Science specialization from the same university and covers the basics of statistics and probability theory well. Here too, you can still register for this session in time to take the first test.
Introduction to Data Science (Coursera - University of Washington) One of the most popular online courses on the basics of Data Science, from the University of Washington. The course starts on June 30 and will last 8 weeks.
SABR101x Sabermetrics 101: Introduction to Baseball Analytics (edX - Boston University) Although the course started at the beginning of May, it is not too late to join, since the deadline for completing the tests for all modules is July 18. The course explains many aspects of Data Science and Big Data through the analysis of sports statistics (in this case, baseball).
Data analysis and machine learning materials
A series of materials on Deep Learning, a popular machine learning method:
Deep Learning Lecture [EN] Deep Learning has been gaining popularity lately. In this video, Adam Gibson explains the details of the technology at a level that is accessible to beginners.
Basics of Deep Learning [EN] An excellent collection of articles on the basics of deep learning.
Preparing data for analysis using the Pandas library [EN] Data for analysis usually starts out raw and requires additional processing. This material will be of interest to those who use Python SciPy for data analysis. The article describes the practical use of Pandas, a library for data processing and analysis.
RStudio Product Family [EN] An article about the RStudio product line and its capabilities for data analysis.
Startup Ideas in Data Science [RU] A set of potentially interesting ideas for a startup in the field of Data Science.
Kaggle competitions won't teach you machine learning [EN] An interesting point of view on how Kaggle competitions relate to real-life machine learning tasks. The ideas are quite controversial, but they are certainly worth reading.
Summer season in machine learning [EN] Summer is usually holiday season, which also means you can spend more time on machine learning competitions. This article lists interesting opportunities for developing your data analysis and machine learning skills over the summer.
The best algorithm for machine learning [EN] Another useful article from the author of MachineLearningMachinery.com. It addresses a question that is popular in the data analysis community: which machine learning algorithm is the best, and whether it is even correct to pose the question this way.
KDnuggets data analysis tools popularity rating [EN] Analysis of the popularity of various tools in the field of Data Mining and Data Science from one of the most popular resources on this topic.
Build a ML Portfolio [EN] The article gives valuable advice on why it is important to build a small Machine Learning portfolio of your own. This can be an important step in developing your data analysis career.
Required machine learning equipment [EN] A useful article about how to approach your equipment for data analysis and machine learning.
Cassandra architecture and performance [EN] A fresh overview of the popular NoSQL solution Cassandra and a comparison of its performance with other NoSQL leaders such as MongoDB, Couchbase, and HBase.