The other day a fairly significant event took place: one of the largest Russian companies announced that it now publishes open data on its website. The company is Sberbank, and the data lives in a dedicated section of its site. The launch was announced in a press release on the site, and dozens of financial and non-financial media outlets covered it as an important event.
Did Sberbank really do something incredible? How common is this, and is what Sberbank has published actually open data? That is what this article discusses.
Before turning to Sberbank, let's go back to the term "open data" itself.
First, the official definition from Law 112- (these are amendments to 8-):
Information that its owners post on the Internet in a format allowing automated processing without prior human modification, for the purpose of reuse, is publicly available information published in the form of open data.
Second, the definition from Wikipedia:
Open data is a concept reflecting the idea that certain data should be freely available for machine-readable use and further republication without restrictions of copyright, patents or other control mechanisms. Data can be freed from copyright restrictions through free licenses, such as the Creative Commons licenses. If a data set is not in the public domain and is not covered by a license that permits free reuse, it is not considered open, even if it is published on the Internet in machine-readable form.
Third, the definition from the Open Data Charter:
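Open data is digital data that is made available with the technical and legal characteristics necessary for it to be freely used, reused, and redistributed by anyone, anytime, anywhere.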
Or, in a rough Russian rendering:
Open data is digital data made publicly available with the technical and legal characteristics necessary for it to be freely used, reused and redistributed by anyone, anytime and anywhere.
Open data also has clearly formulated publication principles, set out in the Open Data Charter.
These principles are:
In the seven years that I have personally worked on open data in Russia, I have seen and heard the label applied to a great many things that are not open data. The most remarkable piece of nonsense came in response to the definition "freely accessible machine-readable data": "Is machine-readable data the data I can read in the machine?"
But in all of these definitions one thing is important to remember: open data is aimed at a technologically qualified consumer. The state does not produce new information products itself; it enables startups, IT companies and public figures to do so.
To make sense of this particular case, it is important to understand why data owners publish data at all, especially companies and government agencies, for whom it can sometimes seem completely strange.
PR, obligation, or benefit.
These are the three main reasons why anyone publishes data (I am deliberately leaving fun and vanity out of the picture).
If you see any organization being active in open data, or in openness and transparency in general, look for the explanation in one of these three reasons.
Take, for example, how PR works with open data. Its main distinguishing feature is that it is aimed at the mass consumer, the mass voter, the mass citizen.
Technology and data issues stay on the sidelines. Traffic, media coverage and the number of articles mentioning the project come out on top.
A vivid example is the Moscow open data portal: the city authorities push out news about publications even when what was posted is a meaningless data set of 28 lines.
Obligation, or compulsion, is when open data is published because the law requires it. The data owner may not be interested in openness at all, but complies with the law and publishes the data.
For example, the Central Bank collects reporting forms from banks and discloses them in a special section of its website; these are statutory obligations of the banks and of the Central Bank.
Another example is the 112- and 8- mentioned above. The authorities are obliged to disclose basic data sets, and they publish them precisely as an obligation for whose non-fulfillment they answer before the law.
Obligation is the foundation of openness. It is also why many of those who are obliged to disclose data take no extra steps to make it accessible. They meet the mandatory requirements and nothing more, and they do not write promotional press releases about it.
For example, the fact that the Moscow Government publishes a data set with the addresses of 28 Voentorg stores and spreads it across news sites does not at all mean that it will publish the income declarations of city officials as open data and promote those through the media as well.
In other words, an obligation is fulfilled as quietly and inconspicuously as possible.
Why would anyone benefit from publishing their own data? It would seem simpler to own it and stay silent; nobody else needs to know.
Nevertheless, there are reasons why government and commercial organizations publish open data. For example, the Datasets section on Kaggle is filled in the hope of new findings, solutions and insights, which take thousands of data scientists to produce.
Or take the Federal Treasury, which for many years (since before the open data story began) has distributed data from the public procurement portal through an FTP server, because that is the simplest and cheapest way to distribute a database needed by hundreds of contractors across the federation.
Some companies organize hackathons to look for employees. Others publish open data to maintain their reputation in the community, as Google does with its Transparency Report.
If you now look again at the Sberbank open data section, you will find the following features:
Instead of the freedom to use and distribute the data, there is only a disclaimer that reads:
The information provided is the result of data analysis by PJSC Sberbank, 4th quarter, 2016. The data are not managerial, accounting or financial statements. When using references to the above information, the mention of Sberbank is mandatory. Not an advertisement.
That is not even close to a free license.
To download the data, you have to find a special button on the chart; its menu also offers export to XLSX, CSV or JSON. The catch is that all of these exports are generated on the client side from JavaScript files.
All the data is in fact stored in 13 JavaScript files, from http://www.rdatascience.ru/opendata/data1.js up to http://www.rdatascience.ru/opendata/data13.js
The export to CSV and the other formats is done by JavaScript code, and there is no way to fetch any data set directly. The emphasis is on visualization, not on working with this analytical data.
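Because the data sits inside JavaScript files rather than in downloadable data sets, a consumer who wants it in machine-readable form has to pull it out themselves. Below is a minimal sketch of what that could look like. It assumes that each file embeds a strict JSON literal (for example `var data1 = [...];`); the article does not describe the actual file layout, so this is only an assumption, and both the function names and the sample string are hypothetical.

```python
import json

# URL pattern taken from the article: data1.js ... data13.js
BASE_URL = "http://www.rdatascience.ru/opendata/data{n}.js"


def data_file_urls(count=13):
    """Build the list of JavaScript file URLs named in the article."""
    return [BASE_URL.format(n=n) for n in range(1, count + 1)]


def extract_json_literal(js_source):
    """Return the first JSON array or object embedded in a JavaScript source.

    Assumption: the file contains a strict JSON literal, e.g.
    `var data1 = [...];`. Real files may use unquoted keys or other
    JavaScript-only syntax, in which case this raises ValueError.
    """
    decoder = json.JSONDecoder()
    for pos, ch in enumerate(js_source):
        if ch in "[{":
            try:
                # raw_decode parses one JSON value starting at `pos`
                # and ignores whatever follows it (such as `;`).
                value, _end = decoder.raw_decode(js_source, pos)
                return value
            except json.JSONDecodeError:
                continue
    raise ValueError("no JSON literal found in the JavaScript source")


if __name__ == "__main__":
    # Hypothetical sample, not real Sberbank data.
    sample = 'var data1 = [{"region": "Moscow", "value": 1.5}];'
    print(extract_json_literal(sample))
    print(data_file_urls()[:2])
```

The very fact that such a workaround is needed illustrates the point: a real open data section would offer each data set at a stable, direct address.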
Although the site even uses the term "dataset passport", which is actively used for the real passports of data sets on government portals, there is of course nothing of the kind there: no information about who is responsible, no description of the structure of the sets, nothing.
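For comparison, here is a rough sketch of the kind of metadata a dataset passport carries and a check for what is missing. The field names are hypothetical and are not taken from any specific government portal; the article itself names only the responsible party and the structure description, and the license field is added here on the basis of the definitions quoted earlier.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnDescription:
    """Description of one column in the data set (hypothetical schema)."""
    name: str
    dtype: str
    description: str


@dataclass
class DatasetPassport:
    """Minimal, illustrative passport; not an official format."""
    title: str
    responsible: str = ""   # who answers for the data set
    license: str = ""       # terms of reuse
    columns: list = field(default_factory=list)  # structure description


def missing_items(passport):
    """List the passport items that are absent."""
    missing = []
    if not passport.responsible:
        missing.append("responsible")
    if not passport.license:
        missing.append("license")
    if not passport.columns:
        missing.append("structure")
    return missing


if __name__ == "__main__":
    # A passport with only a title, as an illustration of an empty one.
    bare = DatasetPassport(title="Example data set")
    print(missing_items(bare))
```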
The section ends with a sales pitch for Sberbank research and a note that all of it is built on big data. The page format itself looks more like the longread of some infobusiness than like an open data section.
From all this only one conclusion can be drawn: Sberbank's purpose for this section was PR and nothing more. I can only hope that someday Sberbank finds a way of working with open data that benefits both the bank and the community. For now it looks more like an attempt to use a popular term to promote its commercial services.
Source: https://habr.com/ru/post/316186/