Hello, dear Habr users!
Continuing the series of posts on multithreaded programming, I would like to touch on one fundamental problem with using condition variables on Linux which, unfortunately, still has no elegant universal solution (or perhaps I simply do not know of one). Many people, unfortunately, do not even realize that the problem exists.
Consider a simple example of using a condition variable:
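struct timeval now;
struct timespec timeout;

gettimeofday(&now, 0);
timeout.tv_sec = now.tv_sec + 2;        // absolute deadline: 2 seconds from now
timeout.tv_nsec = now.tv_usec * 1000;   // microseconds -> nanoseconds
// timeout is then passed to pthread_cond_timedwait as the absolute deadline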
The point of pthread_cond_timedwait is that we either wait for a signal (pthread_cond_signal or pthread_cond_broadcast) notifying us that somethingHappens(), or give up waiting once the timeout we specified expires. The potential problem lies in the second half of that sentence. Note that the time passed as the third parameter of pthread_cond_timedwait is an absolute time! What if the system clock is set back (!) after we read the current time (gettimeofday) but before we go to sleep in pthread_cond_timedwait? The deadline then lies further in the future by the size of the clock shift, and the thread sleeps correspondingly longer.
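For reference, the whole waiting pattern described here might look like the sketch below. The names `Event`, `initEvent`, and `waitForEvent`, as well as the `happened` flag standing in for somethingHappens(), are illustrative assumptions and not taken from the original code. The comments mark the window in which a backward clock change does the damage.

```cpp
#include <pthread.h>
#include <sys/time.h>
#include <cerrno>

// Hypothetical event guarded by a mutex / condition variable pair.
struct Event {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    bool            happened;   // stands in for somethingHappens()
};

void initEvent(Event *e)
{
    pthread_mutex_init(&e->mutex, 0);
    pthread_cond_init(&e->cond, 0);
    e->happened = false;
}

// Returns 0 if the event was signalled, ETIMEDOUT if the deadline passed.
int waitForEvent(Event *e, int seconds)
{
    struct timeval now;
    struct timespec timeout;

    gettimeofday(&now, 0);                    // (1) read the wall clock
    timeout.tv_sec  = now.tv_sec + seconds;   // (2) build an ABSOLUTE deadline
    timeout.tv_nsec = now.tv_usec * 1000;

    // If the system clock is set back between (1) and the wait below,
    // the deadline ends up much further in the future than intended.
    int rc = 0;
    pthread_mutex_lock(&e->mutex);
    while (!e->happened && rc == 0)
        rc = pthread_cond_timedwait(&e->cond, &e->mutex, &timeout);
    const bool ok = e->happened;
    pthread_mutex_unlock(&e->mutex);
    return ok ? 0 : rc;
}
```

The loop around pthread_cond_timedwait guards against spurious wakeups; it reuses the same `timeout` on every pass, which is exactly why a stale deadline is dangerous here.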
And what does pthread_cond_timedwait do if our thread is already sleeping inside the call when the clock changes? Here everything is fine. On every platform where I ran the experiment of setting the clock back, the change was simply ignored; that is, inside the call the time is in fact converted from an absolute to a relative value. I wonder why this is not exposed in the function's interface. That would solve all the problems!
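A related option, on platforms that provide it, is to take the condition variable off the wall clock altogether: POSIX defines pthread_condattr_setclock, which lets timed waits be measured against CLOCK_MONOTONIC. The sketch below assumes that both this call and clock_gettime(CLOCK_MONOTONIC) are available; the helper names `initMonotonicCond` and `monotonicDeadline` are hypothetical. It is not the approach taken in this article, and not every platform implements it, so it is no universal answer either.

```cpp
#include <pthread.h>
#include <cerrno>
#include <ctime>

// Initialise a condition variable whose timed waits are measured against
// CLOCK_MONOTONIC, which is not affected by changes to the system time.
// Returns 0 on success or an error number.
int initMonotonicCond(pthread_cond_t *cond)
{
    pthread_condattr_t attr;
    int rc = pthread_condattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
    if (rc == 0)
        rc = pthread_cond_init(cond, &attr);
    pthread_condattr_destroy(&attr);
    return rc;
}

// Build an absolute deadline `ms` milliseconds from now on the monotonic clock.
struct timespec monotonicDeadline(long ms)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    ts.tv_sec  += ms / 1000;
    ts.tv_nsec += (ms % 1000) * 1000000L;
    if (ts.tv_nsec >= 1000000000L) {   // carry nanoseconds into seconds
        ts.tv_sec  += 1;
        ts.tv_nsec -= 1000000000L;
    }
    return ts;
}
```

The deadline passed to pthread_cond_timedwait is still absolute, but it is absolute on a clock that nobody can set back, so the race with gettimeofday does not arise.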
Critics may argue that this is a negligibly improbable situation: the system clock has to change exactly while this tiny piece of code is running. Let me disagree. On the one hand, if the probability of an event is not zero, it will inevitably happen (this is commonly called the “general's effect”); on the other hand, everything depends heavily on the specific program. We ran into this problem while developing a video surveillance system with dozens of threads, each of which calls pthread_cond_timedwait 25 times per second. Setting the clock back by an hour meant that, with a probability close to 100%, some thread would fall asleep for that hour plus 1/25 of a second!
What to do?
As I said at the beginning, there is no elegant solution to this problem, but leaving it unsolved is not an option. In our system we set up a separate thread, call it the “system time monitoring thread”, which watches for backward clock changes and, when it detects one, wakes up all condition variables. In essence, the solution assumes a dedicated manager in the system with which every condition variable in use must be registered. It came out something like this:
#include <pthread.h>
#include <unistd.h>
#include <ctime>
#include <map>
#include <utility>

class SystemTimeManager
{
public:
    SystemTimeManager();
    ~SystemTimeManager();

    // Register a condition variable together with the mutex that guards it.
    // Do not call these while holding a registered mutex: the monitoring
    // thread takes the locks in the opposite order.
    void registerCond(pthread_mutex_t *mutex, pthread_cond_t *cond);
    void unregisterCond(pthread_cond_t *cond);

private:
    static void *runnable(void *ptr);

private:
    typedef std::map<pthread_cond_t *, pthread_mutex_t *> Container;

    time_t          _prevSystemTime;
    pthread_t       _thread;
    bool            _finish;      // guarded by _mutex
    pthread_mutex_t _mutex;
    Container       _container;   // guarded by _mutex
};

SystemTimeManager::SystemTimeManager()
    : _prevSystemTime(time(0)), _finish(false)
{
    pthread_mutex_init(&_mutex, 0);
    pthread_create(&_thread, 0, runnable, this);
}

SystemTimeManager::~SystemTimeManager()
{
    pthread_mutex_lock(&_mutex);
    _finish = true;
    pthread_mutex_unlock(&_mutex);

    pthread_join(_thread, 0);
    pthread_mutex_destroy(&_mutex);
}

void SystemTimeManager::registerCond(pthread_mutex_t *mutex, pthread_cond_t *cond)
{
    pthread_mutex_lock(&_mutex);
    _container.insert(std::make_pair(cond, mutex));
    pthread_mutex_unlock(&_mutex);
}

void SystemTimeManager::unregisterCond(pthread_cond_t *cond)
{
    pthread_mutex_lock(&_mutex);
    Container::iterator it = _container.find(cond);
    if (it != _container.end())
        _container.erase(it);
    pthread_mutex_unlock(&_mutex);
}

void *SystemTimeManager::runnable(void *ptr)
{
    SystemTimeManager *me = static_cast<SystemTimeManager *>(ptr);
    for (;;) {
        pthread_mutex_lock(&me->_mutex);
        const bool finish = me->_finish;
        const time_t now = time(0);
        if (!finish && now < me->_prevSystemTime) {
            // The clock went backwards: wake every registered waiter.
            for (Container::iterator it = me->_container.begin();
                 it != me->_container.end(); ++it) {
                pthread_mutex_lock(it->second);
                pthread_cond_broadcast(it->first);
                pthread_mutex_unlock(it->second);
            }
        }
        me->_prevSystemTime = now;
        pthread_mutex_unlock(&me->_mutex);

        if (finish)
            break;
        sleep(1);   // poll the system time once per second
    }
    return 0;
}
Now we just need to create an instance of the SystemTimeManager class and remember to register with it every condition variable we use.
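One detail on the waiting side is worth spelling out: a wake-up from the monitoring thread helps only if the woken thread rebuilds its deadline from the new current time. If it loops back into pthread_cond_timedwait with the old deadline, it goes straight back to sleep for the same hour. A minimal sketch of such a waiter follows; the names `deadlineFromNow` and `waitRecomputing` and the `happened` flag are hypothetical and not part of the original code.

```cpp
#include <pthread.h>
#include <sys/time.h>
#include <cerrno>
#include <ctime>

// Absolute wall-clock deadline `ms` milliseconds from the current time.
static struct timespec deadlineFromNow(long ms)
{
    struct timeval now;
    gettimeofday(&now, 0);

    struct timespec ts;
    ts.tv_sec  = now.tv_sec + ms / 1000;
    ts.tv_nsec = now.tv_usec * 1000L + (ms % 1000) * 1000000L;
    if (ts.tv_nsec >= 1000000000L) {   // carry nanoseconds into seconds
        ts.tv_sec  += 1;
        ts.tv_nsec -= 1000000000L;
    }
    return ts;
}

// Waits until *happened becomes true or one full period passes without it.
// Returns 0 if the event was seen, ETIMEDOUT otherwise.
int waitRecomputing(pthread_mutex_t *mutex, pthread_cond_t *cond,
                    const bool *happened, long periodMs)
{
    int rc = 0;
    pthread_mutex_lock(mutex);
    while (!*happened && rc == 0) {
        // Rebuilt on every pass, so a wake-up after a clock change
        // never reuses a deadline computed from the old time.
        const struct timespec deadline = deadlineFromNow(periodMs);
        rc = pthread_cond_timedwait(cond, mutex, &deadline);
    }
    const bool ok = *happened;
    pthread_mutex_unlock(mutex);
    return ok ? 0 : rc;
}
```

The trade-off in this sketch is that any wake-up without the event restarts the full period, so the total wait can exceed one period. For short periods such as 1/25 of a second that is probably acceptable.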
In conclusion, I would like to draw the community's attention to this article's theme: “problem, solution, discussion”. The problem, I hope, is described clearly enough. I have given a solution, though not the most elegant one, and I hope someone finds it useful. The last part, the discussion, I cannot do without you, dear Habr readers. Perhaps someone knows other, more elegant solutions to this problem?