An index across a field in a database can potentially speed up a SELECT operation with a condition on this field, can speed up a query like this: ORDER BY is a LIMIT 20 field, but the index significantly slows down table changes, etc.
When is an index needed, when will it help and will be used for SELECTs? It all depends on the selectivity of the index, i.e. from the count of lines that we get if we set the condition:
indexed_field = value
An excellent candidate for indexing is selectivity 1, i.e. a unique index (for example, id), when by the specified value we find a maximum of one record.
')
Consider as an example a table of users with fields of information about the region: country (country_id) and city (city_id). Well, when the selectivity is <5% (for example, the field city_id user). At the same time, PostgreSQL is smart, it considers not selectivity “in general” across the field, but selectivity in the form of a histogram for individual field values. Those. if we set a condition like
country = Russia
then we get 10% of the records from the database, and if the condition
country = Uruguay
then we get 2 entries, and this is what PostgreSQL understands. (Of course, here we propose that there are far fewer users from Uruguay on our server than there are users from Russia).
So, if the selectivity is bad (we get a lot of records), PostgreSQL will prefer to perform a full database scan without using an index. And this index only hinders.
PS
Cross post from my blog