
Selecting random documents from a MongoDB collection

Recently I ran into a fairly trivial task: randomly selecting posts written by the site's users from the database. The project is written in Rails, with MongoDB as the database and the Mongoid gem to work with it. The task was not hard, but surprisingly there is no dead-simple solution such as sort_by_random or the like. Below are a couple of ways to solve it.


First, let's look at a simple way to solve the problem. Mongoid has a method that lets you skip a number of records or, in other words, set the cursor's starting point. The method is called skip, and it takes the number of records to skip. If we have a collection with three entries, we can get the second one with Post.skip(1).first. Knowing the number of documents in the collection, we can shift by a random number of documents and start reading from there:
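    proxy = Post.where(...)
    # Random offset that still leaves COUNT_OF_POSTS_TO_SHOW documents to read
    max_skip = [proxy.count - COUNT_OF_POSTS_TO_SHOW, 0].max
    skip = rand(max_skip + 1)
    @posts = proxy.skip(skip).limit(COUNT_OF_POSTS_TO_SHOW)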


If you have no special conditions for the query, the code is simpler still. Usually, though, some conditions are present, such as creation date or status. The sample is random, but only partly: we pick the starting point at random, and then all the documents follow in a row. This kind of randomness may suit some cases, especially if you only need one record. But it can be completely unacceptable when, say, we select products and end up showing products from the same category or with the same price (depending on the collection's indexes).
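
To see why the skip-based sample comes out as one contiguous run, here is a minimal sketch that imitates skip and limit on a plain Ruby array instead of a real collection. The random_window helper and the sample data are hypothetical, made up for illustration; they are not part of Mongoid.

```ruby
# Imitates proxy.skip(offset).limit(count) on an in-memory array.
# random_window is a hypothetical helper, not a Mongoid method.
def random_window(items, count, rng: Random.new)
  # Largest offset that still leaves `count` items to read (never negative).
  max_offset = [items.size - count, 0].max
  offset = rng.rand(max_offset + 1) # integer in 0..max_offset
  items[offset, count]
end

posts = (1..20).to_a # stand-in for documents in index order
sample = random_window(posts, 5)

# The starting point is random, but the items always follow each other.
puts sample.inspect
```

Whatever offset is drawn, the five values are consecutive, which is exactly the "same category, same price" effect described above.
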
My solution for getting truly random records was a little more involved, but it gave better results. I added a new field, which I called rand_order, to the collection being sampled. It stores a random floating-point number from 0 to 1. The most reliable way to fill this field is to add a before_save callback to the model, which might look like this:
    def set_rand_order
      self.rand_order = rand(0.0..1).round(15) unless rand_order
    end

Thus, every time an object is saved, we check whether rand_order is filled in and fill it if it is empty. Fetching random entries now looks like this:
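    proxy = Post.where(...)
    # Random offset that still leaves COUNT_OF_POSTS_TO_SHOW documents to read
    max_skip = [proxy.count - COUNT_OF_POSTS_TO_SHOW, 0].max
    skip = rand(max_skip + 1)
    @posts = proxy.asc(:rand_order).skip(skip).limit(COUNT_OF_POSTS_TO_SHOW)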
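
The effect of ordering by rand_order can be tried out without a database. The sketch below is a plain-Ruby imitation under stated assumptions: Doc is a hypothetical stand-in for a Mongoid document, and random_sample mimics asc(:rand_order).skip(...).limit(...) on an array.

```ruby
# Plain-Ruby stand-in for a Mongoid document; Doc is hypothetical.
Doc = Struct.new(:id, :rand_order) do
  # Same guard as the callback: only fill the field when it is empty.
  def set_rand_order(rng = Random.new)
    self.rand_order ||= rng.rand.round(15)
  end
end

# Imitates proxy.asc(:rand_order).skip(offset).limit(count).
def random_sample(docs, count, rng: Random.new)
  ordered = docs.sort_by(&:rand_order)
  max_offset = [ordered.size - count, 0].max
  ordered[rng.rand(max_offset + 1), count]
end

docs = (1..20).map { |id| Doc.new(id, nil) }
docs.each(&:set_rand_order)

# The ids are no longer neighbours in the original order.
puts random_sample(docs, 5).map(&:id).inspect
```

One thing the sketch makes visible: because rand_order is written once, documents that are neighbours in that order keep appearing together across queries. Whether that matters depends on how often the same visitor sees the sample.
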


Keep in mind that if you apply this method to an existing collection that already contains documents, you need to generate rand_order values for them too. This can be done in a migration; since the field is filled in a before_save callback, it is enough to call save on each object:
    Post.all.each { |post| post.save }

Source: https://habr.com/ru/post/210706/

