
Microsoft DocumentDB: Article One, Introduction

Hello,
In August we released a large number of new features on Microsoft Azure (proof link), and naturally one of the most interesting for our audience was the NoSQL document database service called DocumentDB. The time has come to start writing about it; the first article, as usual, is an introduction.






What is Azure DocumentDB?



In the modern world, applications constantly produce, and often also consume, large amounts of data. The data, like the application, evolves over time, and the data schema changes with it, which periodically leads to the idea that a schema-free NoSQL database is a good fit for such scenarios: a quick, simple and customizable solution. However, many of these technologies do not support complex queries or transactional processing, which makes nontrivial data models harder to manage.

Microsoft Azure DocumentDB is a document-oriented NoSQL database designed specifically for web and mobile applications, with guaranteed fast reads and writes, schema flexibility, and the ability to quickly scale the database up and down. DocumentDB also offers complex queries in a SQL dialect, JavaScript support, transaction processing across multiple documents, and more. By default, DocumentDB works with JSON documents, and deep integration with JavaScript lets you execute business logic right inside the engine, within a transaction.

Azure DocumentDB is:



Azure DocumentDB Resources



In Azure DocumentDB, data is replicated and every resource is addressable by a URI; simple RESTful access is provided for all resources. A database account is a unique global namespace. All resources within that namespace are represented as JSON documents with metadata, or as collections of such items. The figure shows the relationships between DocumentDB resources.



An account consists of a set of databases, each of which consists of several collections, each of which contains stored procedures, triggers, UDFs, documents, and related attachments. Users can be added to a database and given specific permissions to access collections, stored procedures, triggers, UDFs, documents, and so on.
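The hierarchy described above maps naturally onto nested resource paths. The Python sketch below shows one way to build them. The path segment names (`dbs`, `colls`, `docs`, `sprocs`, and so on), the `documents.azure.com` host suffix, and every identifier such as `myaccount` or `appdb` are assumptions made for illustration and do not come from this article.

```python
# Illustrative sketch only: segment names and the host suffix are assumptions,
# and every account, database and document identifier below is hypothetical.
CHILD_SEGMENTS = {
    "document": "docs",
    "stored_procedure": "sprocs",
    "trigger": "triggers",
    "udf": "udfs",
}


def resource_path(database, collection=None, kind=None, resource_id=None):
    """Build a relative path mirroring database -> collection -> item."""
    parts = ["dbs", database]
    if collection is not None:
        parts += ["colls", collection]
    if kind is not None:
        if collection is None or resource_id is None:
            raise ValueError("an item needs both a collection and an id")
        parts += [CHILD_SEGMENTS[kind], resource_id]
    return "/".join(parts)


def resource_uri(account, path):
    """Prefix the path with the account host, which acts as the global namespace."""
    return f"https://{account}.documents.azure.com/{path}"


doc_path = resource_path("appdb", "users", "document", "user-42")
print(resource_uri("myaccount", doc_path))
```

Keeping the path builder separate from the host prefix reflects the idea that the account is the namespace and everything else is addressed relative to it.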

Develop with Azure DocumentDB



Since Azure DocumentDB exposes resource operations through a REST API, requests can be made from any language that can speak HTTP/HTTPS. For several languages, there are dedicated libraries that simplify working with DocumentDB:



JavaScript transactions and execution



As mentioned above, in Azure DocumentDB you can write logic in JavaScript; these “programs” are registered on collections and operate on the documents within those collections. JavaScript code can be registered as triggers, stored procedures, and UDFs. Triggers and stored procedures can perform CRUD operations, while UDFs have no write access. All JavaScript logic executes inside an ambient ACID transaction with snapshot isolation, and JavaScript is positioned as a kind of modern replacement for T-SQL. If the JavaScript code throws an exception at runtime, the entire transaction is rolled back.
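The all-or-nothing behavior can be illustrated with a small conceptual model. The Python sketch below is not the DocumentDB API and not how the engine is implemented: the `Collection` class, `run_in_transaction`, and `transfer_points` are hypothetical names, and the "snapshot" is just a deep copy. It only shows the idea that writes made by a procedure become visible together, or not at all if the procedure raises.

```python
import copy


class Collection:
    """In-memory stand-in for a collection (hypothetical, not the DocumentDB API)."""

    def __init__(self):
        self.documents = {}

    def run_in_transaction(self, procedure, *args):
        # The procedure works on a private copy, a crude form of snapshot.
        snapshot = copy.deepcopy(self.documents)
        result = procedure(snapshot, *args)  # an exception here skips the commit
        self.documents = snapshot  # commit every write at once
        return result


def transfer_points(docs, source_id, target_id, amount):
    """Hypothetical multi-document update: move points between two users."""
    docs[target_id]["points"] += amount
    docs[source_id]["points"] -= amount
    if docs[source_id]["points"] < 0:
        raise ValueError("insufficient points")


users = Collection()
users.documents = {"alice": {"points": 10}, "bob": {"points": 0}}

users.run_in_transaction(transfer_points, "alice", "bob", 5)  # commits
try:
    users.run_in_transaction(transfer_points, "alice", "bob", 50)  # rolled back
except ValueError:
    pass

print(users.documents)
```

In the second call the target document is modified before the error is raised, yet the stored data stays unchanged, which is the property the paragraph above describes.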

Let's look at an example: MSN.com. MSN is a huge portal visited by half a billion users per month. Hence the need for large, scalable, distributed storage with a flexible schema. At some point, the development team decided to move everything to Azure and build a single distributed storage system there, the User Data Store (UDS), with the following requirements:

  1. Scaling to 425+ million unique users and 100+ million authenticated users
  2. 20 terabytes of storage
  3. Write latency of up to 15 ms
  4. No fixed schema
  5. Transaction support
  6. Hadoop analytics over the data
  7. Geographic distribution and availability

The choice fell on Azure DocumentDB. One part of the system, Health and Fitness, consists of the following components:



The new MSN portal stores user data in DocumentDB using 150 throughput units backed by SSDs, across three geographic regions.

Documents range in size from 1 to 10 kilobytes and share no common schema. Most collections are configured for optimal throughput with minimal indexing overhead.



UDS distributes user information across collections; each user's data is stored in documents. Horizontal scaling is achieved by distributing the data by user ID.
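The article does not say how UDS maps a user ID to a collection. As one possible illustration, the Python sketch below uses a stable hash of the user ID to pick a collection; the hashing scheme, the function name `collection_for_user`, and the collection names are all assumptions.

```python
import hashlib


def collection_for_user(user_id, collection_names):
    """Pick a collection deterministically from a stable hash of the user ID."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(collection_names)
    return collection_names[index]


# Hypothetical collection names, used only for this example.
collections = [f"userdata-{i}" for i in range(4)]

for uid in ("user-1", "user-2", "user-3"):
    print(uid, "->", collection_for_user(uid, collections))
```

A cryptographic hash is used here instead of Python's built-in `hash()` because the latter is randomized per process for strings, so it would not route the same user to the same collection across runs. Note that a plain modulo scheme reassigns most users when the number of collections changes, so a real system would likely need something more elaborate.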

Create a DocumentDB account



Go to the new Microsoft Azure Management Portal
Click New -> DocumentDB Account.


Or you can do the same by going to the “Data, storage, + backup” category and choosing DocumentDB.



In New DocumentDB (Preview) select the desired configuration.



In Name, enter the account name; it is used in the account's address (host). The Pricing Tier setting cannot be changed yet, since the feature is in preview and only one pricing option is available (more about pricing here). In the optional settings you can specify the capacity allocated to the account. Capacity is measured in units, and by adding or removing units you can quickly scale the solution (a unit combines a set amount of storage and throughput; the default for an account is 1 unit). Read more about performance and throughput here.

Creating an account takes a few minutes.





The account is created and ready to use. The default consistency level is Session.



You can see what is happening with DocumentDB accounts in the Browse window.



To sum up: we looked at what DocumentDB is, covered the basic concepts of the service, went through a usage example, and created an account. The next part goes deeper into the concepts and usage.



Source: https://habr.com/ru/post/240955/

