Review of the most interesting materials on data analysis and machine learning №14 (September 15 - 21, 2014)
Here is the next issue of my review of the most interesting materials on data analysis and machine learning. I would also like to note that I have published the first digest on high performance and Data Engineering: Review of the most interesting materials on high performance (September 15-21, 2014). I think it may be of interest to some readers as well.
General
KDD 2014: Google KV and Topic Modeling The authors of the URX blog share their impressions of the recent KDD 2014 conference in New York. They discuss Google Knowledge Vault, a system that Google uses to improve search quality, and another interesting subject, topic modeling.
cuDNN Library for Deep Learning NVIDIA's announcement of a library for Deep Learning algorithms that runs its computations on the GPU, an approach that can make machine learning algorithms more efficient.
Statistics against heuristics The author's interesting thoughts on when it is reasonable to use heuristic approaches.
The "Effective Applications of the R Language" conference in London The author of the “R: Data Analysis and Visualization” blog talks about the “Effective Applications of the R Language (EARL)” conference, which is devoted to practical uses of the R programming language.
Introduction to Predictive Analytics (Part 2) The second part of the new series of articles from the insideBIGDATA portal on Predictive Analytics. This part focuses on where Predictive Analytics is applied in the enterprise.
Introduction to Predictive Analytics (Part 3) The third part of the new series of articles from the insideBIGDATA portal on Predictive Analytics. It describes the main approaches used in machine learning: regression and classification, which are supervised learning methods, as well as clustering.
Vincent Granville on Big Data Vincent Granville, the author of DataScienceCentral, shares his thoughts on Big Data and offers his definition of the term.
How to succeed in Big Data A small article with infographics that tells about the main factors that influence the success of the company in the field of Big Data.
R support in Azure ML A short article from the Microsoft TechNet Machine Learning blog about using R in the Azure ML cloud service.
5 key ideas for understanding Big Data An interesting post from the Smart Data Collective portal that describes 5 key points for getting the most benefit out of your data.
Morse code decoding competition on Kaggle in Class This short post covers a new competition that has started on Kaggle in Class, called Morse Learning Machine - v1. Participants are expected to build a system that decodes Morse code messages contained in audio files.
Microsoft Hackathon An article from the Microsoft TechNet Machine Learning blog about a recent machine learning hackathon held under the auspices of Microsoft.
Theory and algorithms of machine learning, code examples
GPS data visualization A good example of code for visualizing data from a GPS device using the R programming language.
Configure .Rprofile This article covers a useful and interesting topic: setting R startup options with the .Rprofile configuration file.
Data Visualization with R Caret The author of the MachineLearningMastery blog describes the data visualization capabilities of Caret, a popular machine learning library for the R programming language.
Using R Caret for Predictive Modeling The author of the blog MachineLearningMastery talks about using the popular Caret library for the R programming language for Predictive Modeling.
Improving model training with R Caret The author of the MachineLearningMastery blog talks about how model training can be improved using the Caret library for the R programming language.
A series of slides on data analysis in R In this slide set, Yanchang Zhao covers seven interesting topics in data analysis, with code examples in the R programming language.
Diagnostics of linear regression models. Part 1 The first part of a series of articles on diagnosing linear regression models, from the “R: Data Analysis and Visualization” blog. The code examples in the article are written in the R programming language.
Sentiment analysis of film reviews An interesting example of text analysis, namely sentiment analysis of film reviews, using the popular graph database Neo4j and the Java programming language.
Machine learning in a live environment Colin Ristig talks about an interesting and important question that is sometimes forgotten: how a machine learning algorithm operates in a live (production) environment.
Deep Learning Bibliography A large list of scientific materials on Deep Learning, a popular machine learning method, organized by category.
Video
Andrew Ng on Deep Learning Andrew Ng from Stanford University gave an interesting talk about Deep Learning at the 2014 Robotics: Science and Systems Conference.
Moscow Data Science. September 2014 Meetup On September 5, I attended an interesting meetup called Moscow Data Science - “September 2014 Meetup”, organized by Mail.ru. The link lets you watch the video of the meeting. For convenience, I have marked the start time and duration of each speaker's talk.
Data engineering
Who uses Hadoop and how An interesting article about the current state of affairs in the Hadoop ecosystem: who uses it and how, as well as the prospects for development.
Upcoming meetings on Data Science in Moscow Several interesting meetings are planned for the near future, so I decided to publish a short list of upcoming events on data analysis and high performance in Moscow.
Welcome to HadoopKitchen Announcement of a meeting dedicated to Hadoop, which will be held at the Mail.ru office. I am also going to attend this event.
Introduction to HBase An article with video and explanatory material on HBase, a data store from the Hadoop ecosystem, covering the situations in which this solution should be used and those in which it should not.
Apache Spark 1.1 Announcement Announcement of the new version of Apache Spark 1.1 and a description of the main innovations.
P.S. I think that many readers would like to see more material on this subject in Russian, so if anyone can recommend such resources, I would be very grateful and will add them to the list of resources that I follow.