
Simple monitoring of DFS Replication in Zabbix

Introduction


In a sufficiently large, distributed infrastructure that uses DFS as a single point of access to data and DFSR to replicate that data between data centers and branch office servers, the question of how to monitor the state of replication inevitably arises.
As it happened, almost immediately after we started using DFSR we also began rolling out Zabbix, in order to replace the existing zoo of assorted tools and make infrastructure monitoring more informative, complete and consistent. This article covers using Zabbix to monitor DFS replication.

First, we need to decide which DFS replication data to collect in order to judge its health. The most relevant indicator is the backlog: the files that have not yet been synchronized with the other members of the replication group. Its size can be viewed with the dfsrdiag utility, which is installed together with the DFSR role. When replication is healthy, the backlog size should tend to zero; accordingly, a large number of files in the backlog indicates replication problems.

Now about the practical side of the issue.
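To monitor the backlog size via the Zabbix Agent, we will need:
A parser script for the dfsrdiag output
A discovery script
The scripts registered in the Zabbix agent configuration
A change of the account under which the Zabbix Agent service runs
A monitoring template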



Parser script


I wrote the parser in VBS, as the most universal language available in all versions of Windows Server. The logic of the script is simple: it takes the name of the replication group, the replicated folder, and the names of the sending and receiving servers from the command line. These parameters are passed to dfsrdiag, and depending on its output the script prints:
The number of files - if dfsrdiag reports files in the backlog,
0 - if dfsrdiag reports that the backlog is empty ("No Backlog"),
-1 - if dfsrdiag returns an error message while executing the query ("[ERROR]").

get-backlog.vbs
' Arguments: replication group, replicated folder, sending member, receiving member
strReplicationGroup = WScript.Arguments.Item(0)
strReplicatedFolder = WScript.Arguments.Item(1)
strSending = WScript.Arguments.Item(2)
strReceiving = WScript.Arguments.Item(3)

Set WshShell = CreateObject("WScript.Shell")
Set objExec = WshShell.Exec("dfsrdiag.exe Backlog /RGName:""" & strReplicationGroup & """ /RFName:""" & strReplicatedFolder & """ /SendingMember:" & strSending & " /ReceivingMember:" & strReceiving)

' Collect the whole output, joining the lines with "\\" as a separator
strResult = ""
Do While Not objExec.StdOut.AtEndOfStream
    strResult = strResult & objExec.StdOut.ReadLine() & "\\"
Loop

If InStr(strResult, "No Backlog") > 0 Then
    intBackLog = 0
ElseIf InStr(strResult, "[ERROR]") > 0 Then
    intBackLog = -1
Else
    ' The file count follows the first colon on the second line of the output
    arrLines = Split(strResult, "\\")
    arrResult = Split(arrLines(1), ":")
    intBackLog = arrResult(1)
End If

WScript.Echo intBackLog
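The mapping performed by the parser can be checked without access to a DFSR server. The following is a minimal Python sketch of the same three-way logic; it is not part of the original setup. The function name parse_backlog and the sample dfsrdiag outputs are illustrative assumptions, and real output may differ between Windows versions.

```python
def parse_backlog(output: str) -> int:
    """Mirror get-backlog.vbs: turn dfsrdiag output into the value sent to Zabbix."""
    if "No Backlog" in output:
        return 0   # nothing is waiting to be replicated
    if "[ERROR]" in output:
        return -1  # dfsrdiag could not execute the query
    # Like the VBS script, take the text after the first colon on the second line.
    second_line = output.splitlines()[1]
    return int(second_line.split(":")[1])


# Hypothetical sample outputs, assumed for illustration only.
SAMPLE_FILES = "\nMember <SERVER2> Backlog File Count: 12\nBacklog File Names (first 12 files)\n"
SAMPLE_IN_SYNC = "\nNo Backlog - member <SERVER2> is in sync with partner <SERVER1>\n"
SAMPLE_ERROR = "\n[ERROR] (details omitted)\n"

if __name__ == "__main__":
    for sample in (SAMPLE_FILES, SAMPLE_IN_SYNC, SAMPLE_ERROR):
        print(parse_backlog(sample))
```

The order of the checks follows the script: "No Backlog" is tested first, then "[ERROR]", and only then is the second line parsed for a number.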


Discovery script


For Zabbix to determine by itself all the replication groups present on a server, along with every parameter the query needs (the folder name and the names of the partner servers), we first have to obtain this information and then present it in a format Zabbix understands. The format that discovery expects looks like this:

{
 "data":[
  { "{#GROUP}":"Share1", "{#FOLDER}":"Folder1", "{#SENDING}":"Server1", "{#RECEIVING}":"Server2"},
  ...
  { "{#GROUP}":"ShareN", "{#FOLDER}":"FolderN", "{#SENDING}":"Server1", "{#RECEIVING}":"ServerN"}]}

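To make the shape of this payload concrete, here is a small Python sketch that builds it from a list of connections using the standard json module. The function name build_discovery and the sample rows are hypothetical; the article itself produces this output from VBS.

```python
import json


def build_discovery(rows):
    """Build a discovery payload from (group, folder, sending, receiving) tuples."""
    data = [
        {
            "{#GROUP}": group,
            "{#FOLDER}": folder,
            "{#SENDING}": sending,
            "{#RECEIVING}": receiving,
        }
        for group, folder, sending, receiving in rows
    ]
    # One object per outbound connection, all wrapped in a top-level "data" array.
    return json.dumps({"data": data}, indent=1)


if __name__ == "__main__":
    # Placeholder names taken from the format example above.
    print(build_discovery([
        ("Share1", "Folder1", "Server1", "Server2"),
        ("ShareN", "FolderN", "Server1", "ServerN"),
    ]))
```

Letting a JSON library serialize the structure removes the need to place commas and closing brackets by hand.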

The easiest way to obtain this information is through WMI, by querying the DfsrReplicationGroupConfig class and the related configuration classes. The result is a script that queries WMI and outputs the list of groups, their folders and servers in the required format.

DFSRDiscovery.vbs
Dim strComputer, strSeparator
Set wshNetwork = WScript.CreateObject("WScript.Network")
strComputer = wshNetwork.ComputerName
Set oWMIService = GetObject("winmgmts:\\" & strComputer & "\root\MicrosoftDFS")
Set colRGroups = oWMIService.ExecQuery("SELECT * FROM DfsrReplicationGroupConfig")

WScript.Echo "{"
WScript.Echo " ""data"":["
' Printed before every entry except the first, so the list never ends with a stray comma
' and the closing brackets are always emitted, even if the last connection is skipped
strSeparator = ""
For Each oGroup In colRGroups
    strRGName = oGroup.ReplicationGroupName
    Set colRGFolders = oWMIService.ExecQuery("SELECT * FROM DfsrReplicatedFolderConfig WHERE ReplicationGroupGUID='" & oGroup.ReplicationGroupGUID & "'")
    For Each oFolder In colRGFolders
        strRFName = oFolder.ReplicatedFolderName
        Set colRGConnections = oWMIService.ExecQuery("SELECT * FROM DfsrConnectionConfig WHERE ReplicationGroupGUID='" & oGroup.ReplicationGroupGUID & "'")
        For Each oConnection In colRGConnections
            ' Report only enabled outbound connections: this server sends, the partner receives
            If oConnection.Enabled = True And oConnection.Inbound = False Then
                WScript.Echo strSeparator & " {"
                WScript.Echo " ""{#GROUP}"":""" & strRGName & ""","
                WScript.Echo " ""{#FOLDER}"":""" & strRFName & ""","
                WScript.Echo " ""{#SENDING}"":""" & strComputer & ""","
                WScript.Echo " ""{#RECEIVING}"":""" & oConnection.PartnerName & """}"
                strSeparator = ","
            End If
        Next
    Next
Next
WScript.Echo " ]}"


Admittedly, the script does not shine with elegance and parts of it could surely be simplified, but it does its main job successfully: it reports the replication group parameters in a format Zabbix understands.
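Because the discovery output is assembled by hand from echoed strings, it is worth confirming that it parses as JSON and that every entry carries all four macros before pointing Zabbix at it. Below is a minimal Python sketch of such a check, assuming the script output has been captured as text (for example by redirecting the cscript output to a file). The name check_discovery is hypothetical.

```python
import json

REQUIRED_MACROS = {"{#GROUP}", "{#FOLDER}", "{#SENDING}", "{#RECEIVING}"}


def check_discovery(text):
    """Return a list of problems found in discovery output; an empty list means OK."""
    try:
        payload = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(payload, dict):
        return ["top level must be a JSON object"]
    entries = payload.get("data")
    if not isinstance(entries, list):
        return ['top-level "data" array is missing']
    problems = []
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {index} is not an object")
            continue
        missing = REQUIRED_MACROS - set(entry)
        if missing:
            problems.append(f"entry {index} lacks {sorted(missing)}")
    return problems
```

A trailing comma after the last entry or a missing closing bracket shows up here as a JSON parsing error, which is the kind of mistake string-built output is most prone to.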


Registering the scripts in the Zabbix agent configuration


Everything is very simple here. Add the following lines to the end of the agent configuration file:

UserParameter=check_dfsr[*],cscript /nologo "C:\Program Files\Zabbix Agent\get-Backlog.vbs" $1 $2 $3 $4
UserParameter=discovery_dfsr[*],cscript /nologo "C:\Program Files\Zabbix Agent\DFSRDiscovery.vbs"

Of course, adjust the paths to wherever your scripts are located. I put them in the same folder where the agent is installed.

After making the changes, restart the Zabbix agent service.
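To see which command line the agent ends up running for a given set of key parameters, the positional substitution of $1 to $4 can be modelled in a few lines of Python. This is a simplified sketch, not the agent's actual implementation: it handles only $1 through $9, treats missing parameters as empty strings, and the function name expand_user_parameter is hypothetical.

```python
import re

# The command part of the check_dfsr UserParameter line from the configuration above.
COMMAND = r'cscript /nologo "C:\Program Files\Zabbix Agent\get-Backlog.vbs" $1 $2 $3 $4'


def expand_user_parameter(command, params):
    """Replace $1..$9 in command with the corresponding key parameters."""
    def substitute(match):
        position = int(match.group(1))
        # Parameters that were not supplied expand to an empty string.
        return params[position - 1] if position <= len(params) else ""
    return re.sub(r"\$([1-9])", substitute, command)


if __name__ == "__main__":
    print(expand_user_parameter(COMMAND, ["Share1", "Folder1", "Server1", "Server2"]))
```

In this model a group or folder name that contains spaces expands into several space-separated words, which cscript would then receive as separate arguments, so such names deserve a test before being relied on.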

Changing the account under which the Zabbix Agent service runs


To obtain information through dfsrdiag, the utility must be run under an account that has administrative rights on both the sending and the receiving members of the replication group. The Zabbix agent service, which runs under the system account by default, cannot perform such a query. I created a separate domain account, granted it administrative rights on the required servers, and configured the service on those servers to run under it.

There is another way: since dfsrdiag essentially works through the same WMI, you can follow the description of how to grant a domain account the rights to use it without granting administrative rights. However, if there are many replication groups, granting rights on each group individually becomes laborious. Still, if we want to monitor replication of the Domain System Volume on domain controllers, this may be the only acceptable option, since giving domain administrator rights to a monitoring service account is not a good idea.

Monitoring template


Based on the data, I created a template that:

Download the template for Zabbix 2.2 here.

Summary


After importing the template into Zabbix and creating an account with the necessary permissions, all that remains is to copy the scripts to the file servers whose DFSR we want to monitor, add two lines to the agent configuration on each of them, set the Zabbix agent service to run under the required account, and restart it. No other manual configuration is needed to monitor DFSR.

Source: https://habr.com/ru/post/212953/

