Today let's talk about how we solve the following tasks:
- Detecting downtime;
- Eliminating false positives;
- Calculating uptime: the optimistic and pessimistic scenarios.

Detecting outages and eliminating false positives
After the user adds a site for monitoring, the system starts polling it at a specified interval. The interval can range from one minute to one hour.
Checks are carried out from geographically distributed monitoring points. These are all independent servers located around the world. There are currently more than 20 of them.
For each check, an agent is selected at random from the pool of agents that are currently working. If that agent returns an error, a recheck is started from 5-7 other independent agents.
After the recheck, the site is considered "down" if the majority of agents confirm the problem. Otherwise, we assume that a local problem occurred on the agent that reported the “initial error”.

The same algorithm is used to determine that the site has come back "up".
This approach reduces the number of false positives to almost zero.
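The recheck logic described above can be sketched roughly as follows. This is only an illustration, not the service's actual code: the `Agent` type and the names `confirm_state_change` and `check_site` are hypothetical, and the default of 5 rechecking agents is simply the lower end of the 5-7 range mentioned above.

```python
import random
from typing import Callable, Sequence

# Hypothetical type: an agent is any callable that takes a URL and
# returns True if the site responded correctly from that location.
Agent = Callable[[str], bool]


def confirm_state_change(url: str, agents: Sequence[Agent], first_agent: Agent,
                         observed_up: bool, recheck_count: int = 5) -> bool:
    """Return True if a majority of other agents also observe `observed_up`."""
    others = [a for a in agents if a is not first_agent]
    sample = random.sample(others, min(recheck_count, len(others)))
    if not sample:
        # Assumption: with nobody to recheck, the change is not confirmed.
        return False
    confirmations = sum(1 for agent in sample if agent(url) == observed_up)
    return confirmations > len(sample) / 2


def check_site(url: str, agents: Sequence[Agent], currently_up: bool,
               recheck_count: int = 5) -> bool:
    """Run one scheduled check and return the site's (possibly updated) state."""
    first_agent = random.choice(list(agents))
    observed_up = first_agent(url)
    if observed_up == currently_up:
        return currently_up  # nothing changed, no recheck needed
    # The first agent disagrees with the known state: ask independent agents.
    if confirm_state_change(url, agents, first_agent, observed_up, recheck_count):
        return observed_up   # the majority confirmed the fall or the recovery
    return currently_up      # treated as a local problem on the first agent
```

Because the same function handles both directions, a single faulty agent can neither mark a healthy site as down nor mark a failed site as recovered on its own.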
Calculating statistics
We judge whether the site is unavailable only on the basis of checks made at the specified interval. It is impossible to say with 100% certainty what the site was doing between checks. However, between two consecutive failed checks the site was, with high probability, down. But if an error is followed by a recovery, the site could have been either down or up during that interval. Based on this, we calculate a pessimistic and an optimistic uptime. The picture illustrates the difference.
The optimistic uptime is used when calculating statistics. In the alerts sent to users, downtime is reported according to the pessimistic scenario.
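As a rough illustration of the two scenarios, here is a minimal sketch that computes both figures from a list of checks. The function name `uptime_bounds` and the data layout are hypothetical. The sketch also makes one assumption that the text above does not state: an interval that starts with a successful check is counted as uptime in both scenarios, so downtime begins at the first failed check.

```python
from typing import List, Tuple

# Hypothetical layout: (timestamp in seconds, True if the check succeeded).
Check = Tuple[float, bool]


def uptime_bounds(checks: List[Check]) -> Tuple[float, float]:
    """Return (pessimistic, optimistic) uptime as fractions of the period."""
    if len(checks) < 2:
        raise ValueError("need at least two checks")
    total = checks[-1][0] - checks[0][0]
    if total <= 0:
        raise ValueError("timestamps must be increasing")

    certain_down = 0.0  # between two failed checks: almost surely down
    uncertain = 0.0     # failed check followed by a recovery: unknown
    for (t0, ok0), (t1, ok1) in zip(checks, checks[1:]):
        span = t1 - t0
        if not ok0 and not ok1:
            certain_down += span
        elif not ok0 and ok1:
            uncertain += span
        # Assumption: intervals starting with a successful check count as up.

    pessimistic = (total - certain_down - uncertain) / total
    optimistic = (total - certain_down) / total
    return pessimistic, optimistic


# Checks every 60 seconds: up, down, down, up, up.
checks = [(0, True), (60, False), (120, False), (180, True), (240, True)]
print(uptime_bounds(checks))  # (0.5, 0.75)
```

In this example 60 of the 240 seconds lie between two failed checks and another 60 lie between a failed check and the recovery, so the pessimistic scenario gives 50% uptime and the optimistic one gives 75%.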
May Uptime be with you!
As a reminder, to improve your uptime you can use our service to monitor the availability of your site, as well as run an online check of the site’s performance and speed. In addition, our service lets you quickly find out about problems with your website or server via SMS or Gtalk.