Open government and open data are gaining momentum and popularity in many countries, among their governments and organizations. A law on open data was recently passed in Russia, which indicates growing interest in the topic, and in Ukraine the government is also moving towards publishing open data. And since the topic is popular, you can make money on it or simply join a fashionable movement. Contests, festivals and hackathons for creating websites and applications that publish open data are also held every year.
Open data is publicly available information presented in machine-readable form: a form in which developers can load it into databases, analyze it and present it far more visually and understandably than state systems do.

I would like to share my personal experience of creating a site for publishing open data. I used the open source platform CKAN. Whether you follow a similar path, use another platform or write your website from scratch is up to you. I hope my article will help you make the right choice.
CKAN is a data management system that makes data available through tools that simplify publishing, sharing, finding and using it. More than 50 countries, organizations and cities have chosen this platform to publish their data, among them the UK, the USA, the Czech Republic, Australia and Brazil. In general, the list is impressive. The platform itself is written in Python.
Here is a detailed article in English.
Here is a detailed article in Russian.
CKAN installation
Detailed instructions for installing the platform are available at this address. True, not everything works as smoothly as described there: it took me quite a few days to figure things out and get the platform installed. The developers, for their part, offer paid installation, hosting and maintenance of the platform. They used to post prices on the site, but no longer do. We, however, are interested in CKAN as a free platform. You can also fork the project if you wish. One of the most popular forks is the open data hub of the UK government.
You are offered two ways to install the platform: from a package or from source. The first way saves a huge amount of nervous energy, but it only suits you if you have a supported system. At the moment that is Ubuntu 12.04 (until recently it was 10.04), and this is the system I recommend installing the platform on. If you are confident in your abilities, or you already have a configured system and do not want to give it up, the project wiki will help you. My own experience is with Ubuntu 12.04 on OpenVZ.
So, the first way is installation from a package. I did not succeed with it, for the reason given above (a mismatch of OS versions). But even here I can give you a couple of tips. Since this was my first experience of administering a virtual server (and of administration in general), my advice may seem childish to experienced (bearded) admins, but I hope it will be useful for beginners.
!!! Pay attention to the version of the platform you install. CKAN is currently being translated into more than 30 languages, with varying success. The translation is done by volunteers, and each new version is released with a different set of translations. Check the translation status of the version you intend to install at this address. I had to take part in translating the Russian and Ukrainian locales (versions 2.0 - 2.1), since the translation was not ready. Translation is carried out on the Transifex site. You have a choice: either install the latest version that has a translation, or take part in the translation.
Translation status of the Russian locale.
1. Install the CKAN Package
We do everything according to the instructions. If there are no errors, go ahead; if there are errors, go to the second method. This rule applies to every step. But first check the cause of the error: the problem may be on your side or in the server settings.
2. Install PostgreSQL and Solr
Before installing the database, we should give ourselves permission to write to /dev/null, otherwise we get the error /dev/null: Permission denied.
The fix is simple: get root rights and recreate the device:
# rm /dev/null && mknod -m 0666 /dev/null c 1 3
Checking:
# ls -la /dev/null
The permissions should look like this:
crw-rw-rw-
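The same check can be scripted so that it can be repeated after every reboot of the container. This is only a minimal sketch: the function name check_null_device is made up for illustration, and it simply compares the first ten characters of the ls output with the permissions shown above.

```shell
#!/bin/sh
# Hypothetical helper: report whether a path looks like a healthy null
# device, i.e. a character device with permissions crw-rw-rw-.
check_null_device() {
    path="${1:-/dev/null}"
    if [ ! -c "$path" ]; then
        echo "not a character device: $path"
        return 1
    fi
    # The first 10 characters of `ls -ld` are the type and permission bits.
    perms=$(ls -ld "$path" | cut -c1-10)
    if [ "$perms" != "crw-rw-rw-" ]; then
        echo "unexpected permissions on $path: $perms"
        return 1
    fi
    echo "ok: $path ($perms)"
}

check_null_device /dev/null
```

If the function reports a problem, repeat the rm and mknod step above as root.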
After installing PostgreSQL, you must set the locale and text encoding. Install the language pack for Russian:
apt-get install language-pack-ru-base
or for Ukrainian:
apt-get install language-pack-uk-base
Stop and drop the existing cluster (this deletes any data already in it):
pg_dropcluster --stop 9.1 main
And create a new cluster with the required locale (note that all databases will have the same locale). For Russian:
pg_createcluster --locale ru_RU.UTF8 9.1 main
or for Ukrainian:
pg_createcluster --locale uk_UA.UTF8 9.1 main
Reboot and check that the databases now have the locale and encoding we need:
reboot
sudo -u postgres psql -l
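Instead of reading the psql -l table by eye, the check can be automated. The sketch below is an illustration, not part of the CKAN instructions: the function name check_db_locales and the two EXPECTED_* variables are invented here, and it assumes the Russian locale was chosen. It queries the standard pg_database catalog and flags every non-template database whose encoding or collation differs from the expected one.

```shell
#!/bin/sh
# Assumed target values; change ru_RU to uk_UA for the Ukrainian locale.
EXPECTED_ENCODING="UTF8"
EXPECTED_COLLATE_PREFIX="ru_RU"

# Hypothetical helper: reads "name encoding collate" lines on stdin,
# prints every database that does not match, and returns non-zero
# if at least one mismatch was found.
check_db_locales() {
    status=0
    while read -r name encoding collate; do
        [ -n "$name" ] || continue
        case "$encoding:$collate" in
            "$EXPECTED_ENCODING:$EXPECTED_COLLATE_PREFIX"*) ;;
            *) echo "mismatch: $name ($encoding, $collate)"; status=1 ;;
        esac
    done
    return $status
}

# Query the server only when psql is installed and we are root,
# so that sudo does not stop to ask for a password.
if command -v psql >/dev/null 2>&1 && [ "$(id -u)" -eq 0 ]; then
    sudo -u postgres psql -At -F ' ' -c \
        "SELECT datname, pg_encoding_to_char(encoding), datcollate FROM pg_database WHERE NOT datistemplate;" \
        | check_db_locales && echo "all databases use the expected locale"
fi
```

Matching on a prefix is a deliberate choice: depending on the system, the collation may be reported as ru_RU.UTF8 or ru_RU.UTF-8.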
The developers recommend installing the solr-jetty package. But in my experience it does not work, and I do not know why. I tried everything, to no avail, and had to find a workaround. If you cannot get Solr running the standard way either, here is the fix:
Set the version of Jetty (the latest at the time of writing):
JETTY_VERSION=7.6.10.v20130312
Download it:
wget http://download.eclipse.org/jetty/$JETTY_VERSION/dist/jetty-distribution-$JETTY_VERSION.tar.gz
Unpack:
tar xfz jetty-distribution-$JETTY_VERSION.tar.gz
Download Solr (the latest version at the time of writing):
wget apache-mirror.telesys.org.ua/lucene/solr/3.6.2/apache-solr-3.6.2.zip
Unpack:
unzip -q apache-solr-3.6.2.zip
Go to the example directory:
cd apache-solr-3.6.2/example/
Run Solr in the background:
nohup java -jar start.jar&
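Since Solr started this way takes a few seconds to come up, it helps to wait until it answers before continuing. The following is a rough sketch: wait_for_url is a made-up helper name, and the ping address in the comment assumes the defaults of the Solr 3.6 example (port 8983 and the /solr/admin/ping handler), which may differ in your setup.

```shell
#!/bin/sh
# Hypothetical helper: poll a URL with curl until it answers successfully.
# Arguments: URL, number of attempts (default 10), delay in seconds (default 3).
wait_for_url() {
    url="$1"
    tries="${2:-10}"
    delay="${3:-3}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        # -f makes curl fail on HTTP errors, -sS keeps it quiet otherwise.
        if curl -fsS -o /dev/null "$url" 2>/dev/null; then
            echo "up: $url"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "no answer from $url after $tries attempts"
    return 1
}

# Typical call for the example Solr started above (assumed defaults):
#   wait_for_url http://localhost:8983/solr/admin/ping
```

If the function gives up, the nohup.out file created next to start.jar is the first place to look for the reason.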
Follow the rest of the instructions in the manual carefully, and soon you will see a working site.
Now the second way (installing from source), if you do not have Ubuntu 12.04
Once again I draw your attention to the wiki on installing CKAN.
1. Install the required packages
We are offered this set of packages:
sudo apt-get install python-dev postgresql libpq-dev python-pip python-virtualenv git-core solr-jetty openjdk-6-jdk
I recommend installing the following set instead (do not forget apt-get update and the /dev/null fix described above):
sudo aptitude install python-dev postgresql-9.1 libpq-dev python-pip python-virtualenv git-core openjdk-6-jdk curl nginx gcc bcc tcc
3. Setup a PostgreSQL database
+ additional configuration described above
5. Setup Solr
described above
9. You're done!
The instructions offer this command:
paster serve /etc/ckan/default/development.ini
My suggestion is to run it in the background:
nohup paster serve /etc/ckan/default/development.ini&

For testing on a local machine, these steps are sufficient. But if you want to move your platform to a server, here is one more piece of advice.
My advice (for which many thanks to ibegtin) is this: use Nginx. It will greatly speed up your site. Here there is a great instruction on how to set up the paster + Nginx bundle. It really helped me solve the problem with platform virtualization.
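To give an idea of what such a bundle looks like, here is a minimal sketch of an Nginx reverse proxy in front of paster. It is not taken from the instruction mentioned above. It assumes that paster listens on 127.0.0.1:5000 (the port set in the default development.ini), uses example.org as a placeholder domain, and writes the configuration to a temporary file named by the made-up variable CONF_OUT instead of touching /etc directly.

```shell
#!/bin/sh
# Where to write the generated config (hypothetical variable, safe default).
CONF_OUT="${CONF_OUT:-/tmp/ckan-nginx.conf}"

# The heredoc delimiter is quoted so that $host and friends reach Nginx
# unchanged instead of being expanded by the shell.
cat > "$CONF_OUT" <<'EOF'
server {
    listen 80;
    server_name example.org;  # placeholder domain

    location / {
        # Forward every request to the paster process (assumed port 5000).
        proxy_pass http://127.0.0.1:5000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
EOF

echo "wrote $CONF_OUT"
```

After reviewing the file, it can be copied to /etc/nginx/sites-available/, linked from sites-enabled and activated with a reload of Nginx. Caching, which is where most of the speed-up comes from, is left out of this sketch.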
In all other respects just follow the instructions, and everything will work out. If you have any questions, you can ask me or write to the developers. You can also subscribe to the newsletter or follow the development of the project on Twitter.
Useful resources
CKAN Storage Extension for Google Refine
Integrating CKAN and Drupal
Sites on the CKAN platform
List of sites working on this platform
Directory site running on CKAN that collects data about existing data hubs.
Hub of open data in the Russian Federation
Hub of open data in the Russian Federation on the activities of law enforcement authorities
International hub running on the CKAN platform. You do not need to create your own hub: you can upload any open data here and use the API or link to this resource. The choice is yours. Good luck!
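For those who choose the API route, here is a small sketch of how a CKAN site can be queried from the command line. It relies on the Action API (version 3) and its package_list and package_search actions. The helper name ckan_action_url and the CKAN_URL variable are invented for illustration, and no real hub address is assumed: set CKAN_URL to the site you want to query.

```shell
#!/bin/sh
# Hypothetical helper: build the URL of a CKAN Action API call.
# Arguments: site base URL, action name, optional query string.
ckan_action_url() {
    base="${1%/}"   # drop a trailing slash, if any
    action="$2"
    query="$3"
    url="$base/api/3/action/$action"
    if [ -n "$query" ]; then
        url="$url?$query"
    fi
    echo "$url"
}

# Run real requests only when a site address has been provided.
if [ -n "${CKAN_URL:-}" ]; then
    # Names of all datasets on the site.
    curl -fsS "$(ckan_action_url "$CKAN_URL" package_list)"
    # The first five datasets matching the word "budget".
    curl -fsS "$(ckan_action_url "$CKAN_URL" package_search 'q=budget&rows=5')"
fi
```

Both calls return JSON, so the answer can be piped straight into any tool or script that loads it into a database.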