Review of the most interesting materials on high performance (September 15-21, 2014)
I present to you the first issue of the review of the most interesting materials on high performance. While preparing the next issue of my review of materials on data analysis and machine learning, I realized that the high-performance materials I had collected form a self-contained topic of their own. I hope that this type of review will also be useful and interesting. I will try to expand the list of resources I follow when preparing these reviews.
High Performance Materials
Using Apache Samza at LinkedIn An article from the LinkedIn blog about how they use Apache Samza in their applications and how it helped them solve problems when working with data.
Who uses Hadoop and how An interesting article about the current state of the Hadoop ecosystem: who uses it and how, as well as its development prospects.
Upcoming meetings on Data Science in Moscow Several interesting meetings are planned for the near future, so I decided to publish a short list of upcoming meetings on data analysis and high performance in Moscow.
New type of aggregation in Elasticsearch An article from the Elasticsearch blog about the new top_hits aggregation, which was added to the long list of aggregations in version 1.3.0.
New version of Apache Tez A short article from the Hortonworks blog about the capabilities of the new version of Apache Tez, 0.5.
10 lessons from Microsoft Azure A very interesting post that gives 10 useful recommendations, based on the authors' own experience, for properly scaling an application in the Microsoft Azure cloud.
Using Redis at Twitter An interesting video in which Yao Yu talks about using Redis at Twitter for scaling. The linked article contains excellent material based on the presentation.
KDD 2014: Google KV and Topic Modeling The authors of the URX company blog share their impressions of the KDD 2014 conference recently held in New York. They discuss Google Knowledge Vault, a system that Google actively uses to improve search quality, and they also cover topic modeling.
FireBox: building block for Warehouse-Scale Computers in 2020 A video from the FAST'14 conference titled “FireBox: A Hardware Building Block for 2020 Warehouse-Scale Computers”, in which Krste Asanović (University of California, Berkeley) presents his view on the future development of warehouse-scale computers (WSC).
About caching at @Scale The authors of the OpenDNS company blog share their impressions of the @Scale conference organized by Facebook and discuss the various modern approaches to caching that were described there.
Facebook completely shut down one data center to test fault tolerance At the @Scale conference held in San Francisco, Jay Parikh of Facebook described an interesting experiment conducted at Facebook: completely disconnecting one of its data centers to check the overall resiliency of the system.
Apache Spark 1.1 Announcement An announcement of the new Apache Spark 1.1 release, with a description of its main new features.
Introduction to HBase An article with video and explanatory material on HBase, a data store from the Hadoop ecosystem, covering when this solution should be used and when it should not.
Welcome to HadoopKitchen Announcement of a meeting dedicated to Hadoop, which will be held at the Mail.ru office. I am also going to attend this event.
How to succeed in Big Data A short article with an infographic about the main factors that influence a company's success in the field of Big Data.
Vincent Granville on Big Data Vincent Granville, the author of DataScienceCentral, shares his thoughts on Big Data and gives his definition of it.
5 key ideas for understanding Big Data An interesting post from the Smart Data Collective portal that presents 5 key ideas that will help you derive the most benefit from data.