
Taking advantage of the recent New Year holidays, I analyzed projects on Odesk, the largest international freelance marketplace. I think the results, along with a description of the research methodology and the auxiliary scripts, will be of interest to a wide audience. Unlike the previous excellent article on this topic, I decided to approach the research from the other side. First of all, I was looking for an answer to the question “what does Odesk actually pay the most for, and what would be the best way for me to earn there?” Second, I wanted a bird's-eye view of what kind of work is done on Odesk in general. Under the cut you will find:
+ examples of scripts for working with the Odesk API and a description of several pitfalls
+ an analysis of more than 200,000 completed projects worth over $40,000,000 in total
+ an introduction to Tibco Spotfire, a program for visualizing reports
+ a lot of interesting graphs
+ scandals, intrigues, investigations
Before the analysis, a few important remarks about the nature of the research and the content of the article. I am an independent web developer and consultant and, among other things, a freelancer. I did this research primarily for myself, in order to better define my strategic and tactical goals for the year. So there was never a goal of digging through absolutely all Odesk data from every angle; I analyzed the data selectively. If you want to do additional research, it is quite easy with the tools described here. Please also note that I am not a professional statistician and may well be mistaken in my reasoning. Part of the purpose of this article is to discuss the data together and perhaps look at it from a different angle, so I will be glad to see any constructive discussion in the comments. And now, let's go!
1) Preparation for data collection
Odesk provides a fairly rich API for working with data, with pretty good documentation and examples in different languages. All you need to get started is an individual API key. This is where I ran into the first difficulty. All applications are moderated, and when you apply for a key you are asked to describe what your application does. At first I wrote that I wanted to do personal research on project data. That was not enough, and in my correspondence with support I also had to explain that I was not planning a commercial application, did not want to make money on it, and was going to use the API mostly for myself and for writing articles like this one. In the end, I was given a key in less than a day and could start collecting data.
For convenient storage and later analysis, I first set up a small MySQL database with three tables: project descriptions, tags, and the links between projects and tags. As you will see later in the article, these tables turned out not to be enough, and I had to add one more to analyze payments for hourly projects.
2) Data collection
The following Ruby script downloads project data for the tags that interest you; just list them on line 32. If you wish, you can also download data for open vacancies by replacing “completed” with “open” on line 34. I downloaded the database in chunks, starting with the tags I was interested in and gradually adding the adjacent tags that showed up alongside them in the downloaded projects. This way I got the data I cared about fairly quickly and, after several iterations, also captured a lot of popular tags. The main ones are:
html, html5, css, css3, javascript, jquery, php, wordpress, magento, drupal, python, django, ruby, ruby-on-rails, mysql, postgresql, mongodb, linux.
Data was downloaded without any time limits, which gave 212,618 records. Unfortunately, this script alone does not return payment data for hourly projects, but that is not a problem, for several reasons. First, the analysis of fixed-price projects is interesting in its own right. Second, we will still have data on the number of hourly projects and their properties. Third, the budgets of hourly projects can be analyzed in the end, and I describe how below; for now I am telling the story in the order in which the study unfolded. By the way, this script also works around a bug where tags are sometimes returned as a comma-separated string rather than as an array of strings.
require 'odesk/api'
require 'odesk/api/routers/auth'
require 'odesk/api/routers/jobs/search'
require 'mysql2'
require 'sequel'

db = Sequel.connect('mysql2://XXX:XXX@127.0.0.1/odesk')

config = Odesk::Api::Config.new({
  'consumer_key' => 'XXX',
  'consumer_secret' => 'XXX',
  'access_token' => 'XXX',
While the script is running, you can have some tea; there is no need to leave it overnight. If you rewrote it in Node.js and processed the data in several threads, it would run many times faster, but in this case I did not bother.
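The tag bug mentioned above (tags arriving sometimes as an array and sometimes as one comma-separated string) can be handled with a small normalization step. This is only a minimal sketch: the helper name `normalize_tags` is hypothetical, and the trimming, lowercasing and de-duplication are my own assumptions rather than something the original script is known to do.

```ruby
# Hypothetical helper: turn the raw "tags" value of a job record
# into a clean array of strings.
# Assumption: the value is either an Array or a comma-separated String
# (nil is treated as "no tags").
def normalize_tags(raw)
  list = raw.is_a?(Array) ? raw : raw.to_s.split(',')
  list.map { |tag| tag.to_s.strip.downcase } # trim and lowercase each tag
      .reject(&:empty?)                      # drop blanks such as "a,,b"
      .uniq                                  # keep each tag once
end

p normalize_tags(['php', 'mysql'])
p normalize_tags('php, mysql,javascript')
p normalize_tags(nil)
```

Normalizing at import time means the tag table and the project-tag links never have to deal with the two shapes separately.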
3) Preparation for data processing
To build visualizations interactively, it is convenient to connect a business analytics program directly to the database. I used Tibco Spotfire Desktop. The official website says the program is paid, but when I downloaded the evaluation version I did not find any restrictions in it (although I looked very carefully) and used it with pleasure. You can also use the free products from QlikView. As for Excel, I did not even consider it for this task: it lacks the tools needed for advanced graphs. Perhaps the most advanced program I have seen in this niche is Tableau Desktop. It is more intuitive than Spotfire and has more settings, but it is free only during the 14-day trial period; after that it costs $2,000.
First of all, let's look at the big picture. As a reminder, from here on we consider only completed projects: experience shows that there is often a tangible difference between the projects people promise to pay for and those they actually pay for. Here is the distribution of fixed-price project budgets and their number by year:

In 2007 there were only 613 such projects with a total budget of $250,000. For the earlier years, going back to 2004, the budgets are even smaller, so I simply dropped that data. Also, a surprisingly large “hump” initially appeared around 2011-2012; I found the reason when I looked separately at the breakdown of projects by budget size. Fewer than a dozen obviously test projects with budgets from $50,000 to $1,000,000 had made it into the database; together they added as much as $5,000,000 and slightly distorted all the statistics. Naturally, I removed these anomalies as well. There were only 92 projects with a budget above $10,000; they were of little interest to me, so I left them out of the study too. In total, for 2007-2014 this left 103,159 completed fixed-price projects with a combined budget of $20,000,000. This is the data set we will continue to analyze. In the best year, 2012, more than 23,000 such projects were completed with a total budget of over $5,000,000.
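The cleaning rules just described (keep 2007 and later, drop projects above $10,000) followed by a per-year summary can be sketched as below. This is an illustration under assumptions, not the actual processing, which was done in the database and Spotfire: the `Project` struct, the constant names and the sample rows are all made up.

```ruby
# Hypothetical in-memory representation of a fixed-price project.
Project = Struct.new(:year, :budget)

MIN_YEAR   = 2007    # earlier years were dropped from the study
MAX_BUDGET = 10_000  # larger budgets (including test anomalies) were dropped

# Keep only the projects that fall inside the studied range.
def clean(projects)
  projects.select { |proj| proj.year >= MIN_YEAR && proj.budget <= MAX_BUDGET }
end

# Number of projects and total budget for each year, ordered by year.
def totals_by_year(projects)
  projects.group_by(&:year).sort.to_h.transform_values do |rows|
    { count: rows.size, budget: rows.sum(&:budget) }
  end
end

# Made-up sample rows, for illustration only.
sample = [
  Project.new(2005, 300),        # too early, dropped
  Project.new(2012, 500),
  Project.new(2012, 1_000_000),  # anomaly, dropped
  Project.new(2013, 250),
  Project.new(2012, 1_000)
]

p totals_by_year(clean(sample))
```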
Looking ahead a bit, I'll admit that at first I was so absorbed in the various graphs that the question “where's the money, Zin?” did not occur to me right away. After all, Odesk officially states (www.marketwired.com/press-release/odesk-reaches-1-billion-spent-cumulatively-via-its-online-workplace-1817652.htm) that by August 2013 customers had spent a billion dollars on the marketplace! Against this background, the 20 million analyzed here look like mere crumbs and do not fit reality at all. 20 million in pay for web developers is pennies when we are talking about developers all over the world! So I later had to conduct a separate mini-investigation on this. As promised, the scandals and intrigues await you at the end of the article :) In the meantime, let's try to get the most out of the data collected so far.
To begin, let's do a quantitative analysis. The tag map clearly shows the most popular tags (by the number of projects that list them). Unfortunately, I did not find a quick and easy way to visualize tag intersections (something like Venn diagrams), so the sum of projects across tags here exceeds the actual number of projects. However, I calculated some of the intersections manually, and they turned out to be quite predictable.

Everything is clear almost without words. PHP rules the roost with 124,182 projects. Its companions html, javascript, mysql and css are right there too, with about 45,000 projects each on average. Among CMSs, WordPress is the clear leader, and Joomla has exactly half as many projects. Drupal, in turn, is half as common as Joomla. Magento is only 1.5 times less popular than Drupal and is the only specialized e-commerce CMS visible on this map at all. Python and Ruby on Rails are roughly equal in the number of projects, around 5,000 each.
If we now look at the total budgets by tag, broken down by year, we see roughly the same picture from a different angle:

And, for the sake of interest, let's look at the “long tail” of tags:

So where are the frameworks like Yii, Symfony, Zend, Laravel, AngularJS and other fancy stuff? They are simply not there! On Odesk, at least, customers pay the most for ubiquitous, simple technologies that have been tested over the years. I suppose that all the cool modern things are either the lot of a chosen few, or a non-critical wish of the customer (and therefore not tagged), or left to the developer's discretion. For example, at the very end of this chart, right before ecommerce-consulting, we find node.js with a total budget of only $155,000, compared with $12,000,000 for PHP, i.e. roughly 80 times more! This was a revelation to me. Of course, I expected a difference of at least an order of magnitude, but I could not imagine it was this large. Almost the same can be said about Python and Ruby developers: their total budgets are less than $600,000.
However, total budgets are far from the only interesting indicator. After all, PHP clearly means not only a huge number of projects but also a huge number of developers. So a small total budget for a tag does not mean that its developers will live hand to mouth. I am sure that a lot of PHP projects pay pennies. Let's see which skills are valued more on average, by estimating the cost of an “average” project for each tag. Unfortunately, Spotfire did not let me use the median to calculate the budget in the following graph, so we will settle for the arithmetic mean, comparing it with the total budget:

Now that's a fresh look! PHP is actually a black hole that sucks developers into projects with an average budget of $200 :) It is worse than Joomla and WordPress, both in the number of projects and in their average cost. Well, that is understandable in principle. At the same time, Ruby on Rails developers are the best paid on average: an average project involving Rails costs $380. If you remove PHP from the chart, you can clearly see that ajax pays well at $347, followed by flash at $323, html/xml at $294 and mysql at $282.
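The difference between the arithmetic mean used in the chart and the median that Spotfire did not offer is easy to see on a small example. The sketch below uses made-up budgets for a single hypothetical tag; the function names are mine.

```ruby
# Arithmetic mean of a list of budgets (nil for an empty list).
def mean(values)
  return nil if values.empty?
  values.sum.to_f / values.size
end

# Median: the middle value, or the average of the two middle values.
def median(values)
  return nil if values.empty?
  sorted = values.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid].to_f : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Made-up budgets: four small projects and one large one.
budgets = [50, 100, 150, 200, 5_000]

puts "mean:   #{mean(budgets)}"
puts "median: #{median(budgets)}"
```

A single large project pulls the mean far above what a typical project pays, which is why the averages in the chart should be read with some caution.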

I also calculated the intersections of the most popular tags manually, using SQL. I was curious how things stand for those who know only PHP without JS, or vice versa. As one KVN sketch put it: “I can guess, of course, but I would like to know for sure...” Projects with the php tag: 119,427; of those, without javascript: 98,800. Projects with the javascript tag: 45,179; of those, without php: 24,553. That is, if you take on a PHP project, be so kind as to usually know JS as well! With JS, on the other hand, you need to know PHP only about half the time. I did not go on to check html/css; everyone knows them anyway :)
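The “with tag A but without tag B” counts were obtained with SQL; the same idea can be sketched in plain Ruby over in-memory data. The function name and the sample projects below are hypothetical and serve only to show the counting logic.

```ruby
# Each project is represented by the list of its tags (made-up data).
PROJECT_TAGS = [
  %w[php mysql],
  %w[php javascript jquery],
  %w[javascript html5],
  %w[php],
  %w[javascript php css]
].freeze

# Count projects that carry `with_tag`, optionally excluding
# those that also carry `without_tag`.
def count_projects(projects, with_tag, without_tag = nil)
  projects.count do |tags|
    tags.include?(with_tag) && (without_tag.nil? || !tags.include?(without_tag))
  end
end

puts "php: #{count_projects(PROJECT_TAGS, 'php')}"
puts "php without javascript: #{count_projects(PROJECT_TAGS, 'php', 'javascript')}"
puts "javascript: #{count_projects(PROJECT_TAGS, 'javascript')}"
puts "javascript without php: #{count_projects(PROJECT_TAGS, 'javascript', 'php')}"
```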
And now let's see how popular projects of various sizes were among the top tags in 2014:

Note the bars that stand out at $500 and $1000. Apparently these are psychologically important levels for customers. As a trader friend of mine would say, they are support and resistance levels. Given the turnover of projects at these amounts, I think it is very promising to target them when looking for work. If, of course, you develop in PHP/JS :) The “death valley” between $500 and $1000 is also interesting. Choose which side you want to be on, left or right. It's funny that starting from $500, html5, css3 and jquery appear on this graph in addition to the usual html, css and javascript.
We can, however, also compare the number and budgets of projects between different technologies. Next comes a chart that is ugly from an aesthetic point of view but nevertheless very interesting to me. It shows, as a map, the total budgets of projects with particular amounts. The color indicates the number of projects. The sampling period is 2014. Overall, it confirms the hypothesis that $500 and $1000 are simply very popular amounts.
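One simple way to check for such popular price points without a visualization tool is to count how many projects share each exact budget and look at the top of the list. A minimal sketch, with a hypothetical function name and made-up budgets:

```ruby
# Return the `top` most frequent budget amounts as [amount, count] pairs,
# most frequent first (ties are ordered by the smaller amount).
def popular_amounts(budgets, top = 3)
  counts = budgets.each_with_object(Hash.new(0)) { |amount, acc| acc[amount] += 1 }
  counts.sort_by { |amount, count| [-count, amount] }.first(top)
end

# Made-up budgets for illustration only.
budgets_2014 = [500, 500, 1000, 250, 500, 1000, 750]

popular_amounts(budgets_2014, 2).each do |amount, count|
  puts "$#{amount}: #{count} projects"
end
```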

Well, now it's time for scandals, intrigues and investigations! :) So far, the budgets of hourly projects and the search for the ghostly billion dollars have remained behind the scenes. Let's start with the simple part, the hourly projects. Of course, their budgets cannot be analyzed as accurately as those of fixed-price projects, but at least we can estimate the volumes! For that, one more script is needed to pull the payment data for such projects. Unfortunately, a separate request has to be made for each project... but what can you do! At least you now have a ready-made script. By the way, it again works around a pitfall: apparently, the information in the database was structured differently at different times, so the data sometimes arrives in different structures. Also, surprisingly, the data on old projects is often unavailable because the account was blocked or because of privacy settings, but the overall picture still seems about as complete as it can be.
require 'odesk/api'
require 'odesk/api/routers/auth'
require 'odesk/api/routers/jobs/profile'
require 'mysql2'
require 'sequel'

db = Sequel.connect('mysql2://XXX:XXX@127.0.0.1/odesk')

config = Odesk::Api::Config.new({
  'consumer_key' => 'XXX',
  'consumer_secret' => 'XXX',
  'access_token' => 'XXX',
Before running this script, I had two thoughts. On the one hand, since the numbers of completed fixed-price and hourly projects are approximately equal, their total budgets are likely to be approximately equal too. On the other hand, the option of hourly payment is exactly what sets Odesk apart from Russian freelance marketplaces, so such projects might be more popular here. In fact, the budgets of the two types of projects really did turn out to be approximately equal, which, by the way, surprised me a little. On the bright side, this means the significance of the tags was probably analyzed correctly. Altogether, the total budget of all projects in the database doubled to approximately $40,000,000. And here I started thinking hard... How can this be if Odesk shows beautiful graphs with a total turnover of more than a billion over recent years?
I do not want to blame anyone in any way; first of all I want to find a reasonable explanation. Maybe I downloaded data for only a small part of the projects? Judging by another chart (see slide 23), web development budgets make up almost 30% of the marketplace's total turnover, and PHP is used in more than half of the projects in this area. So, by modest estimates, PHP projects alone should account for about $160,000,000, which is 4 times more than in my database! Maybe the API did not give me all the data? I checked this by querying open projects with different tags through the API and through the site. Surprisingly, there really was a difference. It was not very large, though: the site showed about 10% more projects, which is nowhere near a fourfold difference. So what is going on? Could the missing money simply be hidden in private projects to which customers invite contractors directly? Technically and practically, this is possible. However, I doubt that there are 4 times more private projects on the marketplace than public ones. So for now this remains the biggest mystery of the study for me. I really hope that knowledgeable people in the comments will help me sort this out, so that I can do a more accurate analysis and understand the overall state of the marketplace today.
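The back-of-the-envelope estimate above can be written out explicitly. The sketch below uses only figures quoted in the article; taking PHP's share as exactly one half is an assumption on my part (the text says “more than half”), so the result is a lower bound that lands slightly under the roughly $160,000,000 mentioned above.

```ruby
TOTAL_TURNOVER  = 1_000_000_000  # reported cumulative spend, USD
WEB_DEV_PERCENT = 30             # "almost 30%" of turnover is web development
PHP_FRACTION    = 0.5            # assumption: exactly half ("more than half" in the text)
COLLECTED       = 40_000_000     # total budget found in the collected database, USD

web_dev_budget = TOTAL_TURNOVER * WEB_DEV_PERCENT / 100
expected_php   = web_dev_budget * PHP_FRACTION
gap            = expected_php / COLLECTED

puts "expected PHP budget: at least $#{expected_php.round}"
puts "gap versus collected data: about #{gap.round(2)}x"
```

Even with the conservative one-half share, the expected PHP budget is close to four times the total found in the database.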
In any case, if you want to, you can earn money on Odesk, and quite good money at that. So I wish everyone interested success, good luck and good use of the information in this article.