It is important for the database administrator to know what tasks were performed and how this happened (by duration, successfully or not successfully, etc.). To not do this manually on each server, it is better to automate this process.
In this article I will give an implementation of the automatic daily collection of information about completed tasks of the Agent in MS SQL Server.
Algorithm:
1) create a submission for the selection of tasks:
USE [__] GO SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE view [srv].[vJobRunShortInfo] as SELECT sj.[job_id] as Job_GUID ,j.name as Job_Name ,case sj.[last_run_outcome] when 0 then '' when 1 then '' when 3 then '' else case when sj.[last_run_date] is not null and len(sj.[last_run_date])=8 then ' ' else NULL end end as LastFinishRunState ,sj.[last_run_outcome] as LastRunOutcome ,case when sj.[last_run_date] is not null and len(sj.[last_run_date])=8 then DATETIMEFROMPARTS( substring(cast(sj.[last_run_date] as nvarchar(255)),1,4), substring(cast(sj.[last_run_date] as nvarchar(255)),5,2), substring(cast(sj.[last_run_date] as nvarchar(255)),7,2), case when len(cast(sj.[last_run_time] as nvarchar(255)))>=5 then substring(cast(sj.[last_run_time] as nvarchar(255)),1,len(cast(sj.[last_run_time] as nvarchar(255)))-4) else 0 end, case when len(right(cast(sj.[last_run_time] as nvarchar(255)),4))>=4 then substring(right(cast(sj.[last_run_time] as nvarchar(255)),4),1,2) when len(right(cast(sj.[last_run_time] as nvarchar(255)),4))=3 then substring(right(cast(sj.[last_run_time] as nvarchar(255)),4),1,1) else 0 end, right(cast(sj.[last_run_duration] as nvarchar(255)),2), 0 ) else NULL end as LastDateTime ,case when len(cast(sj.[last_run_duration] as nvarchar(255)))>5 then substring(cast(sj.[last_run_duration] as nvarchar(255)),1,len(cast(sj.[last_run_duration] as nvarchar(255)))-4) when len(cast(sj.[last_run_duration] as nvarchar(255)))=5 then '0'+substring(cast(sj.[last_run_duration] as nvarchar(255)),1,len(cast(sj.[last_run_duration] as nvarchar(255)))-4) else '00' end +':' +case when len(cast(sj.[last_run_duration] as nvarchar(255)))>=4 then substring(right(cast(sj.[last_run_duration] as nvarchar(255)),4),1,2) when len(cast(sj.[last_run_duration] as nvarchar(255)))=3 then '0'+substring(right(cast(sj.[last_run_duration] as nvarchar(255)),4),1,1) else '00' end +':' +case when len(cast(sj.[last_run_duration] as nvarchar(255)))>=2 then substring(right(cast(sj.[last_run_duration] as nvarchar(255)),2),1,2) when len(cast(sj.[last_run_duration] as nvarchar(255)))=2 then '0'+substring(right(cast(sj.[last_run_duration] as nvarchar(255)),2),1,1) else '00' end as [LastRunDurationString] ,sj.last_run_duration as LastRunDurationInt ,sj.[last_outcome_message] as LastOutcomeMessage ,j.enabled as [Enabled] FROM [msdb].[dbo].[sysjobservers] as sj inner join msdb.dbo.sysjobs_view as j on j.job_id=sj.job_id; GO
It uses two system views sysjobservers and sysjobs_view
2) create a table for storing the selected information:
USE [__] GO SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE TABLE [srv].[ShortInfoRunJobs]( [Job_GUID] [uniqueidentifier] NOT NULL, [Job_Name] [nvarchar](255) NOT NULL, [LastFinishRunState] [nvarchar](255) NULL, [LastDateTime] [datetime] NOT NULL, [LastRunDurationString] [nvarchar](255) NULL, [LastRunDurationInt] [int] NULL, [LastOutcomeMessage] [nvarchar](255) NULL, [LastRunOutcome] [tinyint] NOT NULL, [Server] [nvarchar](255) NOT NULL, [InsertUTCDate] [datetime] NOT NULL, [ID] [int] IDENTITY(1,1) NOT NULL, CONSTRAINT [PK_ShortInfoRunJobs] PRIMARY KEY CLUSTERED ( [ID] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] GO ALTER TABLE [srv].[ShortInfoRunJobs] ADD CONSTRAINT [DF_ShortInfoRunJobs_InsertUTCDate] DEFAULT (getutcdate()) FOR [InsertUTCDate] GO
3) create a task in the Agent and daily collect information on those tasks that have either been completed for a long time (more than 30 seconds) or failed in the last 2 days:
USE [__]; GO truncate table [srv].[ShortInfoRunJobs]; INSERT INTO [srv].[ShortInfoRunJobs] ([Job_GUID] ,[Job_Name] ,[LastFinishRunState] ,[LastDateTime] ,[LastRunDurationString] ,[LastRunDurationInt] ,[LastOutcomeMessage] ,[LastRunOutcome] ,[Server]) SELECT [Job_GUID] ,[Job_Name] ,[LastFinishRunState] ,[LastDateTime] ,[LastRunDurationString] ,[LastRunDurationInt] ,[LastOutcomeMessage] ,LastRunOutcome ,@@SERVERNAME FROM [srv].[vJobRunShortInfo] where [Enabled]=1 and ([LastRunOutcome]=0 or [LastRunDurationInt]>=30) and LastDateTime>=DateAdd(day,-2,getdate()); GO
Here or in step 2, you can configure the filter to remove unnecessary tasks. For example, associated with replication, because they work long
4) generate an HTML report for further sending to administrators about the results by mail:
USE [__] GO SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE PROCEDURE [srv].[GetHTMLTableShortInfoRunJobs] @body nvarchar(max) OUTPUT AS BEGIN /* HTML- */ SET NOCOUNT ON; SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; declare @tbl table ( Job_GUID uniqueidentifier ,Job_Name nvarchar(255) ,LastFinishRunState nvarchar(255) ,LastDateTime datetime ,LastRunDurationString nvarchar(255) ,LastOutcomeMessage nvarchar(max) ,[Server] nvarchar(255) ,ID int identity(1,1) ); declare @Job_GUID uniqueidentifier ,@Job_Name nvarchar(255) ,@LastFinishRunState nvarchar(255) ,@LastDateTime datetime ,@LastRunDurationString nvarchar(255) ,@LastOutcomeMessage nvarchar(max) ,@Server nvarchar(255) ,@ID int; insert into @tbl( Job_GUID ,Job_Name ,LastFinishRunState ,LastDateTime ,LastRunDurationString ,LastOutcomeMessage ,[Server] ) select Job_GUID ,Job_Name ,LastFinishRunState ,LastDateTime ,LastRunDurationString ,LastOutcomeMessage ,[Server] from srv.ShortInfoRunJobs --order by LastRunDurationInt desc; if(exists(select top(1) 1 from @tbl)) begin set @body=' , , , 30 :<br><br>'+'<TABLE BORDER=5>'; set @body=@body+'<TR>'; set @body=@body+'<TD>'; set @body=@body+'№ /'; set @body=@body+'</TD>'; set @body=@body+'<TD>'; set @body=@body+''; set @body=@body+'</TD>'; set @body=@body+'<TD>'; set @body=@body+''; set @body=@body+'</TD>'; set @body=@body+'<TD>'; set @body=@body+''; set @body=@body+'</TD>'; set @body=@body+'<TD>'; set @body=@body+' '; set @body=@body+'</TD>'; set @body=@body+'<TD>'; set @body=@body+''; set @body=@body+'</TD>'; set @body=@body+'<TD>'; set @body=@body+''; set @body=@body+'</TD>'; set @body=@body+'<TD>'; set @body=@body+''; set @body=@body+'</TD>'; set @body=@body+'</TR>'; while((select top 1 1 from @tbl)>0) begin set @body=@body+'<TR>'; select top 1 @ID = [ID] ,@Job_GUID = Job_GUID ,@Job_Name = Job_Name ,@LastFinishRunState = LastFinishRunState ,@LastDateTime = LastDateTime ,@LastRunDurationString = LastRunDurationString ,@LastOutcomeMessage = LastOutcomeMessage ,@Server = [Server] from @tbl order by LastRunDurationInt desc; set @body=@body+'<TD>'; set @body=@body+cast(@ID as nvarchar(max)); set @body=@body+'</TD>'; set @body=@body+'<TD>'; set @body=@body+cast(@Job_GUID as nvarchar(255)); set @body=@body+'</TD>'; set @body=@body+'<TD>'; set @body=@body+coalesce(@Job_Name,''); set @body=@body+'</TD>'; set @body=@body+'<TD>'; set @body=@body+coalesce(@LastFinishRunState,''); set @body=@body+'</TD>'; set @body=@body+'<TD>'; set @body=@body+rep.GetDateFormat(@LastDateTime, default)+' '+rep.GetTimeFormat(@LastDateTime, default);--cast(@InsertDate as nvarchar(max)); set @body=@body+'</TD>'; set @body=@body+'<TD>'; set @body=@body+coalesce(@LastRunDurationString,''); set @body=@body+'</TD>'; set @body=@body+'<TD>'; set @body=@body+coalesce(@LastOutcomeMessage, ''); set @body=@body+'</TD>'; set @body=@body+'<TD>'; set @body=@body+coalesce(@Server, ''); set @body=@body+'</TD>'; delete from @tbl where ID=@ID; set @body=@body+'</TR>'; end set @body=@body+'</TABLE>'; end else begin set @body=' , , , 30 , '; end set @body=@body+'<br><br> __.srv.ShortInfoRunJobs'; END GO
This stored procedure generates an HTML report on completed tasks that were performed for longer than 30 seconds or that ended with an error (in accordance with paragraph 3).
In the article above, an example of the implementation of the system of daily automatic collection of information about the Agent’s tasks performed was considered. Using this information, you can identify tasks that were performed for a long time or ended with an error. This allows the administrator to take timely measures to prevent errors in the future.
For example, you can improve the task so that it runs faster, or set the maximum time for a given task to be higher than the others.
This solution also helps to keep track of problems with creating backup copies (but more on that later, because it is not enough to notify you about critical errors once a day, you need to notify immediately and constantly repeat the notification after a certain period of time until the error is corrected ).
If you need to collect information from multiple servers, you can combine the result and send it in one message.
» Sysjobs
» Sysjobservers
Source: https://habr.com/ru/post/314628/