
Monitoring Windows services with PowerShell and Python

Background:
I work in the technical department of a brokerage company in Toronto, Canada. We also have an office in Calgary. After a scheduled installation of Windows updates on the only domain controller at the remote office, the W32Time service, which is responsible for synchronizing time with an external source, failed to start. As a result, over about a week the server clock drifted by roughly 20 seconds. At that time our workstations took their time from the domain controller by default, so you can guess what happened. In trading, time is very important, and a difference of a few seconds can decide a lot. Unfortunately, it was our brokers who first noticed the discrepancy. Our technical support department, essentially three people, got a scolding for it. Something had to be done urgently. The immediate fix was a group policy that pointed all machines to an internal NTP server running on CentOS. We also had problems with the Barracuda DC Agent, a service that connects the domain controllers to our web filter, and a couple of other services occasionally gave us trouble. So we decided to come up with a way to keep an eye on a handful of services. I googled a bit and found that there are many solutions to this problem, mostly commercial, but since I wanted to learn a scripting language anyway, I volunteered to write a script in Python with the help of our local Linux guru. It eventually grew into a script that checks all services, comparing their presence and status against a list of desired services, which unfortunately has to be compiled by hand for each machine.

Solution:

On one of the Windows servers, I created the following PowerShell script:
    echo "Servername" > C:\Software\Services\Servername.txt
    get-date >> C:\Software\Services\Servername.txt
    Get-Service -ComputerName Servername | Format-Table -Property status, name >> C:\Software\Services\Servername.txt


In my case there were 10 such blocks, one for each server.
I added the following batch file to Task Scheduler (this seemed easier to me than launching the PowerShell script from there directly):

 powershell.exe C:\Software\Services\cal01script.ps1 


Now every day I get a list of all services, in a separate file for each server, in a format like this:

    Servername

    Friday, October 26, 2012 1:24:03 PM

    Status  Name
    ------  ----
    Stopped Acronis VSS Provider
    Running AcronisAgent
    Running AcronisFS
    Running AcronisPXE
    Running AcrSch2Svc
    Running ADWS
    Running AeLookupSvc
    Stopped ALG
    Stopped AppIDSvc
    Running Appinfo
    Running AppMgmt
    Stopped aspnet_state
    Stopped AudioEndpointBuilder
    Stopped AudioSrv
    Running Barracuda DC Agent
    Running BFE
    Stopped BITS
    Stopped Browser
    Running CertPropSvc
    Running WinRM
    Stopped wmiApSrv
    Stopped WPDBusEnum
    Running wuauserv
    Stopped wudfsvc
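
The dump layout above can be read back with a few lines of Python. This is only a minimal sketch under stated assumptions, not the script used in this article: `parse_dump` and `SAMPLE` are hypothetical names, and the line positions (server name on line 1, date on line 3, services from line 7) are taken from the comments in the monitoring script that follows.

```python
# Hypothetical sample in the layout assumed here:
# line 1 = server name, line 3 = date, services start on line 7.
SAMPLE = """Servername

Friday, October 26, 2012 1:24:03 PM

Status  Name
------  ----
Stopped Acronis VSS Provider
Running AcronisAgent
Running Barracuda DC Agent
Stopped BITS
"""


def parse_dump(text):
    """Return (server_name, dump_date, {service_name: status})."""
    lines = text.splitlines()
    server_name = lines[0].strip() if lines else ""
    dump_date = lines[2].strip() if len(lines) > 2 else ""
    services = {}
    for line in lines[6:]:
        # Split on the first run of whitespace only, so that names
        # containing spaces ("Acronis VSS Provider") stay in one piece.
        parts = line.split(None, 1)
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        status, name = parts
        services[name.strip()] = status
    return server_name, dump_date, services


server, date, services = parse_dump(SAMPLE)
print(server, date)
for name, status in sorted(services.items()):
    print(status, name)
```

Splitting with a maximum of one split is what keeps multi-word service names intact; a plain `split()` would cut "Barracuda DC Agent" into three fields.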


Now the most important part. On a separate machine running CentOS, I wrote this script:

    import sys
    import smtplib
    import optparse
    import glob


    # Builds and sends the email we get about the status
    def message(subjectMessage, msg):
        SUBJECT = subjectMessage
        FROM = "address@domain.com"
        TO = 'address@domain.com'
        BODY = "\r\n".join((
            "From: %s" % FROM,
            "To: %s" % TO,
            "Subject: %s" % SUBJECT,
            "",
            msg
        ))
        s = smtplib.SMTP('mail.domain.com')
        #s.set_debuglevel(True)
        s.sendmail(FROM, TO, BODY)
        s.quit()
        sys.exit(0)


    def processing(runningServicesFileName, desiredServicesFileName):
        try:
            desiredServicesFile = open(desiredServicesFileName, 'r')
        except (IOError, NameError, TypeError):
            print("The list with the desired state of services either does not exist or the name has been typed incorrectly. Please check it again.")
            sys.exit(0)
        try:
            runningServicesFile = open(runningServicesFileName, 'r')
        except (IOError, NameError, TypeError):
            print("The dump with services either does not exist or the name has been typed incorrectly. Please check it again.")
            sys.exit(0)

        # Defining variables
        readtxt = desiredServicesFile.readlines()
        desiredServices = []
        nLines = 0
        nRunning = 0
        nDesiredServices = len(readtxt)
        faultyServices = []
        missingServices = []
        currentServices = []
        serverName = ''
        dumpdate = ''
        errorCount = 0

        # Trimming the file to get a list of desired services.
        # readlines alone leaves \n at the end of each line.
        for line in readtxt:
            line = line.rstrip()
            desiredServices.append(line)

        # Finding the number of currently running services and those that failed to start
        for line in runningServicesFile:
            nLines += 1
            # 1 is the line where I append the name of each server
            if nLines == 1:
                serverName = line.rstrip()
            # 3 is the line in the dump that contains the date
            if nLines == 3:
                dumpdate = line.rstrip()
            # 7 is the first line that contains valuable data.
            # It is just the way we get these dumps from Microsoft servers.
            if nLines < 7:
                continue
            # The last line in these dumps seems to have a blank character
            # that we have to ignore while iterating.
            if len(line) < 3:
                break
            line = line.rstrip()
            serviceStatusPair = line.split(None, 1)
            currentServices.append(serviceStatusPair[1])
            if serviceStatusPair[1] in desiredServices and serviceStatusPair[0] == 'Running':
                nRunning += 1
            if serviceStatusPair[1] in desiredServices and serviceStatusPair[0] != 'Running':
                faultyServices.append(serviceStatusPair[1])
        if nLines == 0:
            statusText = 'Dumps are empty on %s' % (serverName)
            detailsText = 'Dumps are empty'

        # Checking if there are any missing services
        for i in range(nDesiredServices):
            if desiredServices[i] not in currentServices:
                missingServices.append(desiredServices[i])

        # Preparing the text of the email with results
        if nRunning == nDesiredServices:
            statusText = '%s: OK' % (serverName)
            detailsText = '%s: OK\nEverything works correctly\nLast dump of running services was taken at:\n%s\nThe list of desired services:\n%s\n' % (serverName, dumpdate, '\n'.join(desiredServices))
        else:
            statusText = '%s: Errors' % (serverName)
            detailsText = '%s: Errors\n%s out of %s services are running.\nServices failed to start:%s\nMissing services:%s\nLast dump of the running services was taken at:\n%s\n' % (serverName, nRunning, nDesiredServices, faultyServices, missingServices, dumpdate)
            errorCount = errorCount + 1
        return (statusText, detailsText, errorCount)


    # Defining switches that can be passed to the script
    usage = "type -h or --help for help"
    parser = optparse.OptionParser(usage, add_help_option=False)
    parser.add_option("-h", "--help", action="store_true", dest="help", default=False, help="this is help")
    parser.add_option("-d", "--desired", action="store", dest="desiredServicesFileName", help="list of desired services")
    parser.add_option("-r", "--running", action="store", dest="runningServicesFileName", help="dump of currently running services")
    parser.add_option("-c", "--config", action="store", dest="configServicesDirectoryName", help="directory with desired services lists")
    (opts, args) = parser.parse_args()

    # Outputting a help message and exiting in case the -h switch was passed
    if opts.help:
        print("""
    This script checks all services on selected Windows machines and sends out a report.
    checkServices.py [argument 1] [/argument 2] [/argument 3]
    Arguments:          Description:
    -c, --config   - specifies the location of the directory with desired list of services and finds dumps automatically
    -d, --desired  - specifies the location of the file with the desired list of services.
    -r, --running  - specifies the location of the file with a dump of running services.
    """)
        sys.exit(0)

    statusMessage = []
    detailsMessage = []
    body = []
    errorCheck = 0

    if opts.configServicesDirectoryName:
        directory = '%s/*' % opts.configServicesDirectoryName
        check = glob.glob(directory)
        check.sort()
        if len(check) == 0:
            message('Server status check:Error', 'The directory has not been found. Please check its location and spelling.')
            sys.exit(0)
        for i in check:
            desiredServicesFileName = i
            runningServicesFileName = i.replace('desiredServices', 'runningServices')
            #print runningServicesFileName
            status, details, errors = processing(runningServicesFileName, desiredServicesFileName)
            errorCheck = errorCheck + errors
            statusMessage.append(status)
            detailsMessage.append(details)
        body = '%s\n\n%s' % ('\n'.join(statusMessage), '\n'.join(detailsMessage))
        if errorCheck == 0:
            message('Server status check:OK', body)
        else:
            message('Server status check:Errors', body)

    # Single-server mode needs both the desired list and the dump
    if opts.desiredServicesFileName and opts.runningServicesFileName:
        status, details, errors = processing(opts.runningServicesFileName, opts.desiredServicesFileName)
        message(status, details)
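
The `message` function above assembles the email by joining header lines by hand. As an alternative sketch (not what the author ran), the standard library's `email.message.EmailMessage` in Python 3 can build the same kind of report. `build_report` is a hypothetical helper, `tor01` is a made-up server name, and the details strings are invented for illustration; the subject wording and placeholder addresses follow the script.

```python
from email.message import EmailMessage


def build_report(results, sender, recipient):
    """Build one summary email from (server_name, ok, details) tuples."""
    all_ok = all(ok for _, ok, _ in results)
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    # Same subject wording as in the script: OK only if every server is OK.
    msg["Subject"] = "Server status check:" + ("OK" if all_ok else "Errors")
    summary = ["%s: %s" % (name, "OK" if ok else "Errors") for name, ok, _ in results]
    details = [text for _, _, text in results]
    msg.set_content("\n".join(summary) + "\n\n" + "\n".join(details))
    return msg


# Hypothetical results for two servers.
report = build_report(
    [
        ("cal01", True, "cal01: everything works correctly"),
        ("tor01", False, "tor01: 3 out of 4 services are running"),
    ],
    "address@domain.com",
    "address@domain.com",
)
print(report["Subject"])
print(report.get_content())
# Sending is left out of this sketch; with a reachable mail server,
# smtplib.SMTP(host).send_message(report) would deliver it.
```

Keeping message construction separate from sending makes the report easy to inspect without a mail server.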


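The dump and the list of desired services for a server must have the same file name (the script derives the path of the dump by replacing desiredServices with runningServices in the path of the list). The list of services we are watching (desiredServices) should look like this: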

    Acronis VSS Provider
    AcronisAgent
    AcronisFS
    AcrSch2Svc
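
The core of the check is a comparison between this list and the services found in the dump. The sketch below restates that comparison in isolation, assuming the dump has already been turned into a name-to-status mapping. `compare` is a hypothetical helper and the statuses in `current` are made up for illustration.

```python
def compare(desired, current):
    """Classify desired services against a {name: status} mapping."""
    running = [n for n in desired if current.get(n) == "Running"]
    faulty = [n for n in desired if n in current and current[n] != "Running"]
    missing = [n for n in desired if n not in current]
    return running, faulty, missing


# Desired list as in the example above; statuses are hypothetical.
desired = ["Acronis VSS Provider", "AcronisAgent", "AcronisFS", "AcrSch2Svc"]
current = {
    "Acronis VSS Provider": "Stopped",
    "AcronisAgent": "Running",
    "AcronisFS": "Running",
}

running, faulty, missing = compare(desired, current)
print("running:", running)
print("not running:", faulty)
print("missing:", missing)
```

A server counts as healthy only when every desired service is in the first group, which mirrors the `nRunning == nDesiredServices` test in the script.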


The script checks the services and combines everything into a single email. Depending on the result, the subject line says either that everything is in order or that there are errors, and the body spells out what the errors are. One check per day is enough for us, so early every morning we receive a notification about the status of our Windows servers. To copy the files from the Windows server to the Linux machine, my colleague helped me with the following bash script:

    #!/bin/bash
    mkdir -p runningServices
    smbclient --user="user%password" "//ServerName.domain.com/software" -c "lcd runningServices; prompt; cd services; mget *.txt"
    cd runningServices
    for X in *.txt; do
        iconv -f utf16 -t ascii "$X" > "$X.asc"
        mv "$X.asc" "$X"
    done


This script also converts the encoding, because I did not much like working with UTF-16 on my Linux machine. Finally, to clear the dumps out of the folder, I added another batch file to Task Scheduler that launches a PowerShell script which deletes them.
Batch file:
 powershell.exe C:\Software\Services\delete.ps1 


PowerShell script:
 remove-item C:\Software\Services\ServerName.txt 
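
As a side note on the encoding step in the bash script earlier: the same UTF-16 to ASCII conversion could also be done in Python 3 instead of `iconv`. This is a sketch under assumptions, not part of the original setup. `utf16_to_ascii` is a hypothetical helper, the file content is invented, and replacing unconvertible characters with `?` is my choice here, whereas `iconv` as invoked in the bash script would be expected to stop with an error on such characters.

```python
import os
import tempfile


def utf16_to_ascii(path):
    """Rewrite a UTF-16 text file in place as ASCII.

    Characters that do not exist in ASCII are replaced with '?'.
    newline="" keeps the original line endings untouched.
    """
    with open(path, "r", encoding="utf-16", newline="") as src:
        text = src.read()
    with open(path, "w", encoding="ascii", errors="replace", newline="") as dst:
        dst.write(text)


# Demonstration on a temporary file with hypothetical content.
tmp = os.path.join(tempfile.mkdtemp(), "Servername.txt")
with open(tmp, "w", encoding="utf-16", newline="") as f:
    f.write("Servername\r\nRunning AcronisAgent\r\n")

utf16_to_ascii(tmp)

with open(tmp, "rb") as f:
    converted = f.read()
print(converted)
```

Doing the conversion inside the monitoring script would remove one step from the copy job, at the cost of tying the script to Python 3.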


The project had two goals: monitoring services and learning Python. This is my first post on Habr, so I am expecting a wave of criticism. If you have any comments, especially on how to improve this system, you are welcome to share them. I hope someone finds this article useful, because I could not find a free solution like this with email notification. Maybe I just did not search well.

Source: https://habr.com/ru/post/156597/

