To keep the system under control and make timely decisions (for example, quickly disconnecting a user whose session places a serious load on the system and prevents other users from working, or alerting administrators that an uncontrolled process has started), the available information must be tracked correctly and efficiently, making full use of the tools provided by the monitoring program.
The outcome of a problem search should be clear, justified conclusions: which resource is unavailable, which of the queries to the database server are the heaviest, and which calling module and application code line are responsible, with an approximate estimate of their contribution to the system load.
There are only a few typical ways to detect bottlenecks on the server, and they can be used to narrow down the possible causes of high system load. Performance troubleshooting is a sequence of steps that localizes and identifies the cause of degraded application performance; the first step is to determine the source of the performance problem.
Practice shows that in most cases the administrator can quickly identify the problem using a number of standard analysis algorithms.
USE OF STATISTICS AND ANALYSIS OF COLLECTED DATA
After a server has been running for some time, its performance may degrade. The conditions in which the server operates usually change dynamically, so to identify existing problems and choose a way to solve them, data must be collected over a period of time in order to track the server state and find bottlenecks in its operation.
There are five server areas where problems are most likely to occur:
- CPU
- Memory
- Disk subsystem
- Network load
- Locks
Server operating conditions may also vary with the time of day. Therefore, at the start of the analysis, decide which time range of the collected data to examine: the whole period, a specific interval, or working versus non-working hours. To do this, set the filter by period and working time in the control panel of the main monitoring window.
To find server bottlenecks, first visually analyze the monitoring graphs, taking the filter into account and using the ruler counters (fixed and current). If the graphs you need are not shown in the main monitoring window, click the “Graphs selection” button and adjust which graphs are displayed.
The server trouble spots listed above are represented by the following graphs, which should be studied and analyzed first:
- Graph Server: CPU Load. This graph shows the total load on the central processing unit (CPU).
If the CPU load stays at 80-100% for a long time (10 minutes or more), you most likely need to start analyzing information about the processes that are loading it.

- Graph Server: Average disk queue length. If the average disk queue length exceeds 4-8 for a long time (1-2 minutes or more), its cause must first be established. In most cases this is a shortage of RAM, which in turn may be caused by the execution of heavy queries or, for example, by heavy use of the disk subsystem by third-party processes.

- Graph Server: Free RAM (MB). Memory can potentially be a bottleneck if free RAM stays below 1 GB for a long time.
- Graph SQL (Name): Page life expectancy. Memory can potentially be a bottleneck if page life expectancy stays below 300 s for a long time.
- Graphs SQL (Name): Locks and SQL (Name): Total locking time.
Locks are often a bottleneck in multi-user systems.
A lock is a mechanism that synchronizes the access of several users to the same piece of data.
A lock wait is a situation that occurs when two users try to capture the same data at the same time. In this case one of the users is blocked, that is, it must wait until the first user's transaction ends.
For the analysis it is necessary to use the graphs of the number of locks and the waiting time on the locks.
Unfortunately, there is no criterion by which it can be determined that locking has a significant impact on system performance. In general, the fewer locks and the shorter the wait on locks, the better.
The most dangerous in terms of performance degradation are long-running transactions that hold locked resources and use a high lock level.
The MS SQL Sessions panel shows the lock tree. The hierarchy is represented as follows: the parent process is the one that blocks; a child process is one that is blocked. Using the lock tree, you can quickly decide whether to disconnect blocking users.
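The same blocking hierarchy can also be inspected directly on the SQL Server, independently of the monitoring tool. A minimal sketch using the standard sys.dm_exec_requests and sys.dm_exec_sessions DMVs (the query itself is illustrative and not part of PERFEXPERT):

    -- Sessions that are currently blocked, together with the session blocking them.
    -- The "parent" in the lock tree corresponds to blocking_session_id.
    SELECT
        r.session_id           AS blocked_session,
        r.blocking_session_id  AS blocking_session,
        r.wait_type,
        r.wait_time            AS wait_time_ms,
        s.login_name,
        s.host_name,
        DB_NAME(r.database_id) AS database_name,
        t.text                 AS blocked_query
    FROM sys.dm_exec_requests AS r
    JOIN sys.dm_exec_sessions AS s
        ON s.session_id = r.session_id
    CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
    WHERE r.blocking_session_id <> 0
    ORDER BY r.wait_time DESC;

    -- If a blocking session really has to be disconnected, an administrator can run:
    -- KILL <session_id>;  (this rolls back the open transaction of that session)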
After identifying potential server bottlenecks, analyze the process statistics in the TOP 10 processes panel for additional processes that load the CPU and memory (antivirus, archiving, and so on).
To reduce the load, you can try moving some of the resource-intensive processes to another server or changing the time at which they run.
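The memory thresholds mentioned above (free RAM, page life expectancy) can likewise be cross-checked directly on the SQL Server. A minimal sketch against the standard sys.dm_os_performance_counters DMV (illustrative, not part of the monitoring tool):

    -- Current page life expectancy, in seconds; values persistently below ~300 s
    -- indicate memory pressure, matching the graph threshold described above.
    SELECT [object_name],
           counter_name,
           cntr_value AS page_life_expectancy_s
    FROM sys.dm_os_performance_counters
    WHERE counter_name = 'Page life expectancy'
      AND [object_name] LIKE '%Buffer Manager%';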
ANALYSIS OF MS SQL TRACES
Identifying problem areas in the operation of the database by analyzing trace results and the problems encountered during query execution makes it possible, in most cases, to pinpoint the specific causes of poor performance.
It is advisable to proceed to this analysis after visually examining the graphs in the main monitoring window. Depending on the problems detected, either the Reads traces (for disk queue or memory problems) or the Duration traces (for increased CPU load) are studied. Their display window is opened from the “Traces” submenu of the monitoring program using the options of the same names:
“Duration” (query duration), or
“Reads” (number of reads).
When analyzing traces, the filter is set by time and by the range of working hours, depending on the area being analyzed. Clicking the “Apply Filter” button selects data according to the filter settings.
Grouping by information system module and line number in the Statistics tab, or applying custom filters in the Group By Fields group, produces the selection required for the analysis.
By sorting the table by the “% CPU share” or “% reading share” column, you can identify the constructs that contribute the most to CPU load or generate the most reads.
It is advisable to consider only the first 3-8 records, down to entries whose share falls below 3-5%.
Double-clicking the selected record lets you view and analyze its queries in the central part of the trace analysis window. By sorting the “Duration”, “Reads” and “CPU time” columns, you can identify the heaviest queries.
If necessary, select a query in the central table of the window and switch to the Query Text tab to examine it in detail.
For some information systems (1C and others), SQL queries can be matched with the names of the information system's internal objects. This supplements the standard SQL information with the module and configuration code line, which the IS developers can use to optimize those queries.
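PERFEXPERT builds these lists from the collected traces; a comparable ad hoc check of the heaviest statements can also be made directly against SQL Server's cached query statistics. A minimal sketch using the standard sys.dm_exec_query_stats and sys.dm_exec_sql_text DMVs, ranked by logical reads (change ORDER BY to total_worker_time to rank by CPU); the TOP (8) limit mirrors the 3-8 records suggested above:

    -- Top 8 cached statements by logical reads accumulated since they were cached.
    SELECT TOP (8)
        qs.total_logical_reads,
        qs.total_worker_time / 1000  AS total_cpu_ms,
        qs.total_elapsed_time / 1000 AS total_duration_ms,
        qs.execution_count,
        SUBSTRING(st.text,
                  qs.statement_start_offset / 2 + 1,
                  (CASE WHEN qs.statement_end_offset = -1
                        THEN DATALENGTH(st.text)
                        ELSE qs.statement_end_offset END
                   - qs.statement_start_offset) / 2 + 1) AS statement_text
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    ORDER BY qs.total_logical_reads DESC;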
USE OF USER MEASUREMENTS (MARKERS) IN PERFORMANCE ANALYSIS
Creating and using user measurements (markers) when evaluating the information system allows you to estimate the time spent on specific operations or code sections. The markers, and the parameters for displaying their values, are configured in advance with the “Marker settings” option in the “Settings” submenu.

The measurement results are displayed in the main monitoring window, below the main form with the regular graphs, as a three-zone graph, and in tabular form in the “Markers” tab of the “Additional Information” panel.
Because the current user measurement data is aligned in time with the graphs above it, the user can compare the measurement results with the other graph values and determine the reasons for a slowdown: lack of RAM, disk queues, increased CPU load, and so on. Fixing the ruler values makes it possible to study the resulting data in detail in the corresponding tables.
In the example shown in Figures 83 and 84, the user measurement “Any Marker Name” is configured with a value range of 1.5 to 3.0 seconds: times up to 1.5 seconds are displayed in green (by default), times within the specified range in yellow, and times exceeding 3.0 seconds in red. On the marker scale, hovering over the rulers displays the counts of red / yellow / green markers separated by slashes (in this example, 3/4/1). The obtained values can be studied in detail in the “Additional Information” panel on the right of the main monitoring window (enabled in the “View” submenu).
USE OF STATISTICAL INFORMATION ON TABLES AND INDEXES
The most common cause of non-optimal system operation is incorrect or untimely execution of routine maintenance operations at the DBMS level.
Therefore, with a sharp deterioration in performance, it is first recommended to thoroughly check the implementation of database maintenance procedures, and then proceed to analyze available software and hardware resources, application code, networks, disk arrays, etc.
To determine whether a database's maintenance procedures need to be improved, analyze the state of its statistics and indexes, since these, above all, are the basis of efficient query execution.
MS SQL Server will not be able to execute queries in a reasonable time if the databases have the following problems:
- Indexes have become heavily fragmented.
- Some of the rows that once participated in an index have already been deleted, so the index takes up more disk space and requires more I/O operations when executing queries.
- Table statistics have become substantially inaccurate, and the server discovers this only at the very moment they are needed.
As a rule, index maintenance is handled by scheduled index rebuild/reorganize jobs; however, this is not always reliable, since the script may not work correctly or may not take new tables and indexes into account.
A more accurate assessment of the degree of index fragmentation is given by the ScanDensity parameter.
In PERFEXPERT, this parameter is shown in the index statistics window, which is opened with the corresponding option of the Statistics submenu. Using this window, you can determine which indexes require additional maintenance, evaluate how effectively they are being maintained, and see what state they are in.
The described techniques are primarily advisable for tables with more than 10 thousand rows: small tables of a few hundred rows are read completely very quickly, so even heavy fragmentation usually has no noticeable effect on performance.
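To see which tables are large enough to be worth checking, row counts can be obtained from the standard catalog views. A minimal sketch (standard SQL Server views; the 10,000-row threshold comes from the text above):

    -- Tables in the current database with more than 10,000 rows.
    SELECT t.name      AS table_name,
           SUM(p.rows) AS row_count
    FROM sys.tables AS t
    JOIN sys.partitions AS p
        ON p.object_id = t.object_id
       AND p.index_id IN (0, 1)   -- heap or clustered index only
    GROUP BY t.name
    HAVING SUM(p.rows) > 10000
    ORDER BY row_count DESC;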
Select the database to analyze in the upper part of the window and the table in the middle; the lower part of the window then shows the degree of fragmentation of all indexes of the selected table. Enabling the “Display only the last measurement” option shows only the latest poll results; disabling it lets you examine how they change over time.
Scan Density shows, as a percentage, how close the page placement is to the ideal: the closer the result is to 100%, the lower the fragmentation. In the example shown, the table is clearly quite fragmented: a scan constantly switches back and forth from one extent to another instead of simply following the link from one page to the next within an extent.
As a rule, defragmenting an index by reorganizing it is advisable when Scan Density is between 95% and 85%, and rebuilding it when Scan Density is below 85%.
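The same Scan Density value can be obtained outside the monitoring tool, and the decision rule above can be applied manually. A minimal sketch with a hypothetical table and index name; the thresholds are the ones given above:

    -- Scan Density for all indexes of one table. DBCC SHOWCONTIG is deprecated but
    -- still reports the ScanDensity column discussed above; the modern equivalent is
    -- sys.dm_db_index_physical_stats (avg_fragmentation_in_percent).
    DBCC SHOWCONTIG ('dbo.SomeLargeTable') WITH ALL_INDEXES, TABLERESULTS, NO_INFOMSGS;

    -- Depending on the result, reorganize or rebuild the affected index:
    ALTER INDEX IX_SomeLargeTable_SomeColumn ON dbo.SomeLargeTable
        REORGANIZE;                  -- Scan Density roughly between 85% and 95%

    ALTER INDEX IX_SomeLargeTable_SomeColumn ON dbo.SomeLargeTable
        REBUILD;                     -- Scan Density below 85%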
Statistics describe the distribution of data in tables, and the query optimizer relies on them to choose the optimal query execution plan. Analyzing them shows how much the data has changed and how outdated the statistics are, and therefore whether they need to be updated to restore performance.
If the statistics are out of date, a sub-optimal execution plan will be built for the SQL query, and the execution time of the same query may increase severalfold.
If the analysis shows that queries against a particular table are slowing down, first check its statistics for the number of changes made.
Such an analysis should also be performed if it is known in advance that a massive insert took place in the table or that it has not been maintained for a long time.
The “Statistics by Tables” window shows how much data has changed and how outdated the statistics are, which helps to decide whether they need to be updated.
The “Histogram” tab at the bottom of the window shows a graph of the total number of changes for all tables in the selected database. The top of the window shows a histogram of the 5 most modified tables.
To read the values of these 5 tables on the chart at a specific point in time, use the ruler: a vertical line that appears when the mouse cursor moves over the chart window.
In the TOP 5 statistics, the green number shows the current changes at the selected point in time (changes since the last statistics poll), and the red number shows the cumulative changes (changes since the statistics on these tables were last updated).

In the "Statistics" tab there is information for each table for each statistic. In the "Table" table is selected, in the "Statistics" - the statistics for which you want to track changes (see the number of changes). The statistics for indexed columns are named by the name of the corresponding index, and the statistics with names that start with WA_Sys are statistics that SQL Server automatically creates for columns that do not have an index.
In the right part there is a table where all updates are displayed since the last rebuild and a graph that dynamically shows the number of rows that have changed since the last recalculation of the statistics during the selected time period.
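The same information (when each statistic was last updated and how many modifications have accumulated since then) can be cross-checked directly in SQL Server. A minimal sketch using the standard sys.stats view and the sys.dm_db_stats_properties function (available from SQL Server 2008 R2 SP2 / 2012 SP1); the table name is hypothetical:

    -- Last update time and accumulated modifications for every statistic
    -- on one table (hypothetical name dbo.SomeLargeTable).
    SELECT s.name                  AS statistic_name,
           sp.last_updated,
           sp.rows,
           sp.modification_counter AS changes_since_last_update
    FROM sys.stats AS s
    CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
    WHERE s.object_id = OBJECT_ID('dbo.SomeLargeTable')
    ORDER BY sp.modification_counter DESC;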
The table, in the corresponding columns, contains information about:
- The statistic name (for indexed columns, the name of the corresponding index)
- Last recalculation time
- The number of changes since the last statistics recalculation
- The number of changes in the last 10 minutes
- Date and time of measurement
The information obtained will help in determining the rules of database maintenance.
For example, if checking the statistics shows that about a million changes have accumulated, it is advisable to run an additional recalculation of the statistics on the table being studied, since such a volume of changes clearly degrades the query optimizer's decision-making.
Each time a table's statistics are updated, the counter of changes made to the index key fields, or to the columns for which statistics were collected, is reset.
In addition, analyzing table information helps to selectively identify tables for which it makes sense to update statistics manually. As a rule, these are tables with a large number of rows whose data changes very often.
In some cases, however, updating table statistics too often can itself significantly reduce database performance. In a production system, therefore, different tables require different statistics update frequencies. The optimal update frequency depends on the size and nature of the system load and is determined experimentally.
By analyzing the collected table information and correlating it with the trace data, you can determine which tables need their statistics updated more often than others and set up several different maintenance procedures: for example, one for frequently updated tables and one for all other tables. This approach can significantly reduce both the time spent updating statistics and the impact of those updates on the system as a whole.
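A minimal sketch of such a split maintenance scheme; the statements are standard T-SQL, but the table names and the job split are purely illustrative assumptions:

    -- Job 1 (frequent, e.g. every few hours): refresh statistics only on the
    -- hypothetical "hot" tables identified from the trace and table analysis.
    UPDATE STATISTICS dbo.Documents     WITH FULLSCAN;
    UPDATE STATISTICS dbo.DocumentLines WITH FULLSCAN;

    -- Job 2 (infrequent, e.g. nightly): refresh sampled statistics for the rest
    -- of the database; sp_updatestats touches only statistics with pending changes.
    EXEC sp_updatestats;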