New library for machine learning in Java An article that discusses the advantages and disadvantages of Datumbox, a new open source machine learning framework for the Java programming language.
MIT Scientists Predict Bitcoin Price An article about a group of MIT scientists who built a regression-based model to predict short-term fluctuations in the Bitcoin price, which allowed them to double their investment in two months.
An Introduction to In-Memory Computing (Part 4) The next installment in the insideBIGDATA series of articles on In-Memory Computing. This part covers measuring the performance of In-Memory Computing.
An Introduction to In-Memory Computing (Part 5) The fifth and final part of the insideBIGDATA series of articles on In-Memory Computing. This part focuses on the GridGain Data Fabric product.
SQL or NoSQL? Another short article with the author's thoughts on a popular question: which technology to choose for data storage.
15 timeless articles on Data Science A list of 15 articles from the DataScienceCentral portal that were published 1-2 years ago but have not lost their popularity or relevance.
Theory and algorithms of machine learning, code examples
How to master machine learning algorithms Five great tips from the author of the MachineLearningMastery blog on how to approach studying various machine learning algorithms.
Nonlinear regression A fairly simple description of the concept of nonlinear regression.
First look at Distributed R A short note about Distributed R, a very interesting project from HP Labs.
How MKL Improves Revolution R Open A previous review linked to the announcement of Revolution R Open. This article discusses the implementation details of this distribution of the R programming language, namely the acceleration of certain operations using the Intel Math Kernel Library (MKL).
Tips for choosing a model in machine learning competitions Continuing the earlier discussion of how to choose the final model in machine learning competitions, this article gives the opinion of the author of the popular MachineLearningMastery blog on the question.
Online courses, educational materials and literature
New courses on Big Data from MIT on edX MIT recently announced on its website an interesting initiative on edX: the launch of the first session of the Tackling the Challenges of Big Data course, which will be available to everyone.
3 great free books on Data Science A set of three books on Data Science, each with a short description, that can be obtained free of charge.
Book Data Fluency A review of the curious new book Data Fluency from the authors.
Scaling fuzzy search algorithms An interesting talk by Ken Krugler (President, Scale Unlimited) at the Cassandra Summit 2014 conference on scaling fuzzy matching, using the example of comparing the similarity of customer records in the banking sector with Apache Cassandra.