The first meeting of the DZ club: MongoDB, Clojure, MapReduce and Azure

Yesterday I attended an interesting event and want to share my impressions. It was an informal meetup with open discussion, conversation and plenty of practical information.

Some statistics can be found in the LiveJournal of the main organizer, Dmitry Zavalishin of Digital Zone.

Briefly, the topics were MongoDB, Clojure, MapReduce and Azure. During the opening round of introductions it turned out that most people had come to hear about the world without SQL, as represented by MongoDB.

The full program looked like this:

As mentioned, Entarena Inc. is an ambitious Californian startup with part of its development team in Russia. The prototype, in development since last fall, is scheduled to be completed in 2-3 months.

Ilya explained the choice of MongoDB and Clojure by how convenient they are for developers, which lets the team build faster and more efficiently. The audience asked about performance under “combat” conditions: millions of records and so on. There are no exact test figures at this stage, but judging by the “feel of the architecture” and the experience of other projects, the forecasts are optimistic. Ilya promised to share specifics after the prototype launches, which would be really interesting to hear.

There was a question: why Clojure? What else did they consider? They looked at languages that run on the JVM, so as to have access to all the Java libraries (“which have everything!”). I remember that they compared it with Scala, which seemed too complicated.

Dmitry Martynov from Microsoft spoke about cloud storage, which can be either regular relational or non-relational (NoSQL). As I understand it, the real convenience of this service lies in its integration with the rest of the Microsoft technology stack: there are convenient interfaces in C# and so on. But in general the storage has a RESTful interface, and you can work with it “even from curl”.

What I remembered and liked most was the Yandex talk by Pavel Aleshin and Alexander Serkov about conquering terabytes of statistics. It triggered a flurry of questions from almost everyone. The problem was clear: the data keeps growing, while capacity is “not made of rubber” (over 8 years the amount of data grew 2,000 times, from 2 GB to 4 TB per day (!), while hardware performance grew only 10 times). So what to do?
Oracle RAC did not help; the limit was already on the horizon. The team decided to use its own in-house MapReduce implementation (with the developer right there, it was more accessible than an external Hadoop). The most interesting part is that this is not just an idea but an already implemented and tested system that “really works”. The most that can be “lost” in a failure is the last few minutes of statistics.
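For readers who have not met the MapReduce idea mentioned above, here is a minimal single-process sketch of it in Python. It is only an illustration of the general pattern (map, group by key, reduce) and says nothing about how the Yandex system is actually built; the log format, the sample data and all function names are assumptions made up for this example.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical statistics log: (page, visitor) pairs. Invented for illustration.
SAMPLE_LOG = [
    ("/index", "u1"),
    ("/search", "u2"),
    ("/index", "u3"),
    ("/index", "u1"),
]


def map_phase(record):
    """Map step: turn one log record into (key, value) pairs, one hit per page."""
    page, _visitor = record
    yield page, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single total."""
    return key, sum(values)


def map_reduce(records):
    """Run map, shuffle and reduce in one process over an iterable of records."""
    mapped = chain.from_iterable(map_phase(record) for record in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(map_reduce(SAMPLE_LOG))
```

In a real cluster the map and reduce steps run on many machines at once, and each key can be reduced independently of the others, which is what lets this kind of system grow by adding hardware.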

In general, the developers breathed a sigh of relief, and now Yandex feels "dry and comfortable". In addition, the system scales linearly, and the guys are not afraid even of petabytes.
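To see why linear scaling matters here, the growth figures quoted in the talk can be turned into a rough comparison. This is a back-of-the-envelope sketch, assuming decimal units (1 TB = 1000 GB) and steady compound growth over the 8 years; neither assumption comes from the speakers.

```python
YEARS = 8
DATA_START_BYTES = 2 * 10**9    # 2 GB per day at the start
DATA_END_BYTES = 4 * 10**12     # 4 TB per day now
HARDWARE_GROWTH = 10            # performance gain of one machine over the same period

# Total growth of the daily data volume.
data_growth = DATA_END_BYTES / DATA_START_BYTES

# How much the data outpaced a single machine, i.e. the factor
# that has to be covered by adding more machines.
gap = data_growth / HARDWARE_GROWTH

# Average yearly growth factors, assuming steady compound growth.
data_per_year = data_growth ** (1 / YEARS)
hardware_per_year = HARDWARE_GROWTH ** (1 / YEARS)

if __name__ == "__main__":
    print(f"data grew {data_growth:.0f}x, hardware {HARDWARE_GROWTH}x, gap {gap:.0f}x")
    print(f"per year: data x{data_per_year:.2f}, hardware x{hardware_per_year:.2f}")
```

Under these assumptions the data grew about 2.6 times a year against about 1.3 times for hardware, leaving a 200-fold gap that only spreading the work across more machines can close.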

Besides the talks there were tea, coffee and buns. In short, everything needed for pleasant conversation.

To sum up: interesting, pleasant, useful. Thanks to Dmitry for organizing it!
The next meeting is in two weeks, on Thursday.

Source: https://habr.com/ru/post/86545/