Windows server for 24x7 tasks - a myth or my “crooked hands”?
History of the last days. We have two SQL Servers (2016 with an SSD disk) and Express Edition (2012 with a traditional HDD). Hardware both computers are about the same (CPU / RAM / LAN). In general, 2016 "gives" data 2-5 times faster, with the exception of a certain set of tables, for which 2012 runs faster by 1.5-2 times. Such behavior completely contradicts any logic. Any manipulations with database settings only worsen the situation. 2016 is increasingly slowing down, but only for this set of tables.
To understand the paradox of the situation, the same database is deployed on both servers (from the same backup file). There are about 600 tables in this database. Those 5-6 who behave surprisingly are no different from dozens of similar (in structure and number of records) with which there is no such problem. On both “servers” - Windows 10 with the latest updates (these are development servers, not productive ones). On both SQL Server's latest service packs (without hot fixes). No special “modes” (trace flags, etc.) of SQL Server are included.
This riddle was great, but my colleague decided it. You even know how ... He rebooted the computer with a 2016 SQL server. After that, all the "relapses" are gone. How!? Or rather, “As long as !!!!!!!” I want to exclaim. ')
What we just did. What kind of "shamanism" did not indulge in, to explain this strange and completely illogical behavior. We rebuilt the indexes and changed the data types (in one of these “problem” tables there was a type of data marked as “obsolete” in 2016, the fact that it also applied in other “non-problematic” tables did not stop us) the number of records on the page and compared the page sizes of SQL servers, increased the version of database compatibility and rebuilt the statistics and so on, so forth.
Another problem with Windows, which was solved as in that classic joke, “Did you not just try to log out and log in again?”.
The same joke
Three are driving in the car - a technician, a businessman and a programmer. Suddenly the car stalled. The technician got out, opened the hood, began to dig there. The businessman took out his mobile phone and began to call technical assistance. The programmer looked at them in surprise and asked: - Guys, and you did not try to just go out and enter again?
An attentive reader (and a supporter of Microsoft) should immediately catch me saying that I am “distorting”. I am using a non-server operating system for the database server (Windows 10). The problem is clear - this is the desktop (!!!!!!!) operating system, but on Windows servers, of course, there is no such problem.
I hasten to upset you - with enviable regularity, I observe such “freaks of fate” on Windows servers as well.
Who is guilty!?
Well, you know - Windows Updater. He sets up something for himself there in the background and everyone hopes that he will give a ride. I mean that if you replace without restarting some of the executable files of the operating system, some settings, etc. (everything that any software update does), then it can unpredictably affect processes that are already running and which cannot be restarted imperceptibly (as in my case with SQL Server). Updating executable files and libraries that are already loaded into server memory will most likely be delayed until the computer is restarted, but everything else is a mystery. As I see it, a background Windows update will always lead to similar “surprises”.
The problem, in general, is as old as the world. The first time I encountered Windows Updater with such consequences back in 2007, it seems. Since then, nothing, unfortunately, has radically improved. In order to justify Windows Updater, I’ll say that I somehow watched as a colleague at the dawn of the 2000s installed a service pack on a Windows Server during the lunch break and could not restart the server until the end of the working day. Interesting effects were observed :)
But could it be otherwise?
Perhaps, if the executable files were “self-sufficient” - i.e. included in one form or another copies of all the libraries that they use (except for the lowest-level). The operating system updates were not background ... and we return to the world of earlier versions of Windows when all (?) Updates were “postponed” until the reboot, users did not wait until the updates were installed and turned off the computers, and as a result turned out to be non-bootable “operating systems”.
Those. most likely for server solutions you can find a solution - update everything on command and with the knowledge of the “admin”, and for “desktop” we see “the least of evils”.
My Windows Update is disabled!
As the saying goes, "If you are paranoid, it does not mean that you are not being watched." I mean, the farther away, the harder and harder it is to disable Windows Updater. Even in the off state (I’m talking about the "Control Panel"), Windows installs "critical updates" (including on server operating systems).
I strongly recommend that everyone for server solutions disable the auto-update of Windows, but at the same time perform it regularly “manually”, as is done in Unix-like operating systems. I'm talking about package managers apt, yum or the same brew for OSX. If there is something similar for Windows servers and updates, please tell me about it. I want to remind you that even if you do not have access to the Internet on your windows server, Windows Update can “pull down” updates on the local network (such infection).
Ha, these are all windows problems ...
Not really. Critical updates without your knowledge on workstations - many suppliers “roll out”. The same Apple, if I do not confuse anything (this is not about notifications for which you need to click something in the App Store, but that you have been installed without your knowledge). Link
I hope no one except Microsoft installs updates to the servers without the knowledge of the admin. In principle, I also watched “whistles” even worse than Unix, when the “curved” packet brought by apt-get did not stop the old service instance and replaced the executable files. The fact that the matter is in the “cant” of updating the package and the service needs to be stopped before the update, I guessed after the second recovery of the server from the backup (since it was a “virtual” and it was done in a couple of clicks).
Which of all this we can draw conclusions?
I would not use Microsoft server solutions without good reason (I’m not about the active domain or where there is a file / print server), but about services that should work 24x7. I can only say that over the past year I remember at least three situations where problems with our systems were solved by restarting the OS (these were all windows servers of different versions with auto-updates turned off). No, we do not have a "flowing memory." No, restarting our application servers did not solve the problem. And how do you get the situation when the date comparison operation stops working correctly for some values? I, unfortunately, watched it this year. Why we (sometimes) use Windows operating systems on application servers is a customer requirement.
Doctor, I do it and it hurts me - don't do it!
But what about the SQL Server that started this monologue? How to be with him!? The answer is simple - SQL Server 2016 on Linux . Like many, it’s probably scary for me to launch it into “industrial exploitation”, but if someone has similar experience with it, please share it.