12 tools that every Big Data programmer needs to know about
Whether you are designing a system for analyzing big data or simply trying to collect and process data from your mobile applications, you cannot do without high-quality analytics tools. The good news is that many companies are now bringing tools to market that take into account the needs of developers and their skills. Over the past year, I have come across many startups, projects and tools designed to give programmers advanced analysis capabilities. In some cases, these took the form of simple scripts that add up to quite powerful solutions. In others, the tools simply deliver data to developers in a form better suited to analysis, which spares them the lion's share of the dirty work and makes everything that follows easier. I think this is a significant trend in this area.
In today's world of mobile applications and cloud technologies, it has become easier than ever to build a business on a fairly simple application. Even in large companies, developers have to fight for enterprise resources by proving that their applications are more attractive or by finding more profitable ways to monetize them. Sometimes this even leads to building some data processing into the application itself. In any case, if your work is about writing code rather than data streams, you will probably need a little help. Below are 12 tools (in alphabetical order) designed to help you with this difficult task. As often happens with lists like this, I may have missed some good examples, so I invite you to join the discussion in the comments.
1. BitDeli
BitDeli, a startup launched in November, lets programmers measure whatever they want, with whatever metrics they choose, using Python scripts. Co-founder and CEO Ville Tuulos said that the scripts can be as simple or as complex as needed, up to machine learning. Unlike the heavyweight Hadoop, BitDeli positions itself as a lighter solution, comparable to the Ruby on Rails framework, but for analytics.
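To give a sense of what a simple metric script can look like, here is a minimal sketch in plain Python. It does not use BitDeli's actual API; the event records, their field names and the two helper functions are assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical event records; the field names are assumptions for illustration.
events = [
    {"user": "alice", "action": "login", "day": "2012-11-01"},
    {"user": "bob", "action": "login", "day": "2012-11-01"},
    {"user": "alice", "action": "purchase", "day": "2012-11-02"},
    {"user": "alice", "action": "login", "day": "2012-11-02"},
]


def daily_active_users(records):
    """Return a dict mapping each day to the number of distinct users seen."""
    users_by_day = {}
    for record in records:
        users_by_day.setdefault(record["day"], set()).add(record["user"])
    return {day: len(users) for day, users in users_by_day.items()}


def action_counts(records):
    """Count how often each action occurs across all records."""
    return Counter(record["action"] for record in records)


print(daily_active_users(events))
print(action_counts(events))
```

A more complex script would follow the same shape: read events, compute a number, and hand it to whatever displays the result.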
2. Continuuity
The brainchild of Yahoo's former chief cloud architect, Todd Papaioannou, and former Facebook HBase engineer Jonathan Gray, Continuuity is designed to help any company work at the same high level as those two firms. The team has built a layer that adds a new level of abstraction over the complex connections to Hadoop and HBase clusters, and that includes a complete set of development tools. The main goal of the project is to simplify the process of creating big data applications for both internal and external audiences.
3. Flurry
Flurry, a one-stop shop for mobile application developers, brings in about $100 million a year because it does its job well. The company helps developers not only build mobile apps, but also analyze all the data those apps produce in order to make them even better. In addition, this data can form the basis of an advertising campaign, bringing advertisers together with their target audience.
4. Google Prediction API
Of all the development tools from Google, the Google Prediction API may well be the coolest. If you have the right data to train the Prediction API, it can recognize any number of patterns and give your application the right answers. The examples provided by the company itself include a spam detection engine, sentiment analysis and a recommendation engine, and Google also gives step-by-step instructions on how to build these models.
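The train-then-predict idea behind a spam detection engine can be sketched in a few lines of plain Python. This is not the Google Prediction API and makes no claim about how Google builds its models; it is a small naive Bayes classifier, and the function names and training messages are assumptions for illustration.

```python
import math
from collections import Counter


def train(examples):
    """Build per-label word counts from (text, label) pairs."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, label_counts


def predict(model, text):
    """Return the label with the highest naive Bayes log-probability."""
    word_counts, label_counts = model
    vocabulary = set()
    for counts in word_counts.values():
        vocabulary.update(counts)
    total_examples = sum(label_counts.values())

    best_label, best_score = None, float("-inf")
    for label in label_counts:
        # Start from the prior probability of the label.
        score = math.log(label_counts[label] / total_examples)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Laplace smoothing keeps unseen words from zeroing the score.
            numerator = word_counts[label][word] + 1
            score += math.log(numerator / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training data, purely for illustration.
training_data = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch with the team monday", "ham"),
]
model = train(training_data)
print(predict(model, "free prize"))
print(predict(model, "team meeting monday"))
```

A hosted prediction service replaces both functions with remote calls, but the workflow stays the same: supply labeled examples first, then ask for answers on new input.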
5. Infochimps
Although Infochimps is trying to remake itself as an IT company (and get closer to the money), its platform of the same name is still of real value to developers. At the top of its stack of big data configuration and management technologies is Wukong, a framework for working with Hadoop and its data streams using Ruby scripts.
6. Keen IO
This project won first place in our Structure 2012 Launchpad competition as the most powerful analytics tool for mobile application developers. With just a single line inserted into the source code to specify what to track, programmers can follow everything that interests them in their applications. From there, getting the data into a form suitable for analysis is just a matter of setting up a convenient dashboard.
7. Kontagent
Kontagent's main business is its platform for analyzing mobile, social and web applications, which runs on Hadoop and can process truly huge amounts of information. Earlier this year, the company launched a product that lets users query the data from their applications using Hive, the SQL-like query language for Hadoop. Instead of being limited to tracking predefined variables, users are free to ask whatever questions they want.
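The difference between predefined variables and free-form querying is easiest to see with an actual query. The sketch below uses Python's built-in sqlite3 module as a stand-in for Hive, since the SELECT statement has the same general shape; the events table, its columns and its rows are assumptions invented for illustration and have nothing to do with Kontagent's real schema.

```python
import sqlite3

# In-memory database standing in for a Hadoop-backed event store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (user_id TEXT, platform TEXT, event TEXT, revenue REAL)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        ("u1", "ios", "purchase", 4.99),
        ("u2", "android", "purchase", 0.99),
        ("u1", "ios", "level_up", 0.0),
        ("u3", "ios", "purchase", 1.99),
    ],
)


def revenue_by_platform(connection):
    """Ad hoc question: how many buyers and how much revenue per platform?"""
    query = """
        SELECT platform,
               COUNT(DISTINCT user_id) AS buyers,
               ROUND(SUM(revenue), 2) AS revenue
        FROM events
        WHERE event = 'purchase'
        GROUP BY platform
        ORDER BY platform
    """
    return connection.execute(query).fetchall()


for row in revenue_by_platform(conn):
    print(row)
```

Nobody had to decide in advance that revenue per platform would be tracked; the question is expressed at query time against the raw events.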
8. Mortar Data
Mortar Data is Hadoop for developers, plain and simple. Almost a year ago, the company launched its cloud service, which replaces MapReduce with a combination of Pig and Python. In November, it released the open-source Mortar framework with the aim of building a community for sharing data and experience of working with Hadoop. Currently, Mortar Data runs on top of Amazon Web Services and supports Amazon S3 and MongoDB (hosted on Amazon EC2) as data sources.
9. Placed Analytics
Placed does away with scripts, APIs and other hard work and simply gives its users a ready-made result. In this case, the result is detailed information about where and when consumers used a mobile application or site, right down to the name of the business. This information can be very useful for attracting advertisers, as well as for informing the design of the application (for example, adding voice features for people who use the application while driving).
10. Precog
At first glance, Precog may look like an ordinary business, but on closer inspection it is not that simple. The company offers a service called Labcoat, an interactive development environment for building analytic models in the open Quirrel query language. The IDE includes a tutorial for the language and some complex functions, and Precog CEO Jeff Carr said that even non-technical people can learn the language in a few hours.
11. Spring for Apache Hadoop
Although Hadoop is written in Java, that does not mean Java developers find it easy to work with. That is why, in early 2012, SpringSource announced the release of Spring for Apache Hadoop. It makes it possible to integrate with other Spring applications and to write scripts in JVM-based languages, and it greatly simplifies the process of creating applications that use Hadoop and related technologies such as Hive and HBase.
12. StatsMix
Along the same lines as BitDeli and Keen IO, StatsMix lets developers collect and process large amounts of data coming from their applications using only the languages they already know. The service tracks some metrics automatically, but the list can be expanded significantly through the StatsMix API and standard libraries. The results are presented as dashboards, which users can customize to fit their needs, share with others, or use to bring several sources of information together in one view.
Do you have experience with any of these services? Is there anything you would add?