"Do you see the gopher? No? But it is there!"
Something similar can easily happen with the uncontrolled growth of the number of virtual machines (virtual machine sprawl, or VM sprawl): up to a certain point we do not believe that the problem exists in our infrastructure, yet it does. It simply has not shown itself at full scale.
And what is the matter?
Uncontrolled growth of the number of virtual machines is typical of rapidly growing virtual infrastructures. It all starts with the very reasonable desire of one division or another to run its specialized applications on dedicated servers. Thanks to virtualization, of course, they get this opportunity. And now, dear readers, IT staff, developers and testers, try to estimate at least roughly how many virtual machines you have claimed over, say, the past six months. And how many of them were abandoned and forgotten after a successful release?
It turns out that the deeper virtualization penetrates the work environment and the more production problems it helps to solve, the more "virtual waste" appears in that environment (as, indeed, in almost any production process). This is what leads to VM sprawl.
The machines are virtual, but the money is real
These "imaginary" machines continue to consume resources: even when inactive they take up space in the storage system, and some of them also use CPU time, although these resources are sometimes in short supply for important and necessary applications. As a result, the company may well have to spend a serious amount on additional storage. For a data center with 1000 virtual machines, the possible financial losses from VM sprawl are estimated at several tens of thousands of dollars per year (see the VMware article).
In addition, "virtual garbage" is a threat to information security, because forgotten virtual machines may fall out of regular maintenance processes: patch installation, anti-virus database updates, group policy changes and so on.
Is it possible to somehow deal with VM sprawl?
The answer is yes. For example, Veeam ONE, a component of the Veeam Availability Suite, includes more than 80 reports on VMware and Hyper-V infrastructures, as well as on the Veeam backup infrastructure. Some of them reveal signs of uncontrolled growth of the number of virtual machines and provide recommendations for effective planning and allocation of resources. These reports are discussed below.

“Everything has gone somewhere, nothing is left ...”
To assess the current situation, we will use the Veeam Availability Suite features for planning and forecasting the use of storage, memory and processor resources.
Take a look at the Capacity Planning Dashboards: a shortage of space (or memory, or both) will make you think hard. Do you need to file an urgent request for additional storage right now, or can you still "live until tomorrow"?
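
To make the "can we live until tomorrow" question concrete, here is a minimal sketch of a capacity forecast. It assumes one usage sample per day and a simple linear trend. The function name and the input format are hypothetical, and this is not a description of how the Veeam dashboards compute their forecasts.

```python
from typing import Optional, Sequence


def days_until_full(used_gb: Sequence[float], capacity_gb: float) -> Optional[float]:
    """Estimate the days left before a datastore fills up.

    used_gb holds one usage sample per day, oldest first (an assumption).
    A least-squares line gives the growth rate; the remaining free space
    after the latest sample is divided by that rate.
    Returns None when usage is flat or shrinking.
    """
    n = len(used_gb)
    if n < 2:
        raise ValueError("need at least two daily samples")
    mean_x = (n - 1) / 2
    mean_y = sum(used_gb) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(used_gb))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    growth_per_day = numerator / denominator  # GB per day
    if growth_per_day <= 0:
        return None
    remaining_gb = max(capacity_gb - used_gb[-1], 0.0)
    return remaining_gb / growth_per_day


if __name__ == "__main__":
    history = [500.0, 510.0, 520.0, 530.0]  # made-up sample data
    print(days_until_full(history, capacity_gb=1000.0))
```

A straight line is the crudest possible model; it ignores bursts such as a release week, so the result is a rough planning aid rather than a deadline.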

"Calm, only calm!"
Rather than get depressed, let's see how, instead of "knocking out" additional funding, you can actually help the company save money by following 5 simple steps to rationalize the use of existing resources.
Step number 1. Count the "zombies"
Let us call them that for brevity: although they do not "devour brains", they do consume disk space and other vital infrastructure resources. These are virtual machines that are used very little or not at all, for purposes nobody can recall, as opposed to the useful machines that are in demand.
a) Run the Idle VMs report to get a list of such virtual machines, then decide what to do with them: power them off, reduce the resources allocated to them, or hand them over to other tasks. Before launching the report, do not forget to specify the parameters:
- the period of time for which we want to see the data
- the values to be used as resource utilization thresholds (processor, memory, disk, network)
- how much time (in % of the selected interval) a machine must spend in the idle state to get into the report
b) To find unused virtual machine templates, run the Idle Templates report, setting the time of the last use of the template as the report parameter.
The output is a list of "orphan" objects with their size and location. These templates can be deleted or moved to a more spacious datastore.
c) Use the Inefficient Datastore Usage report to investigate the "zombies" (barely living virtual machines): it shows at once where these machines are located, when they were last used and how much space they occupy.
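
The three parameters of the Idle VMs report described in item a) (period, utilization thresholds, share of time spent idle) can be illustrated with a minimal sketch. All names, the sample format and the default threshold values below are hypothetical and chosen only for illustration; they are not the defaults or the logic of Veeam ONE.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Sample:
    """One performance sample of a VM within the chosen reporting period."""
    cpu_pct: float
    mem_pct: float
    disk_kbps: float
    net_kbps: float


@dataclass
class Thresholds:
    """Utilization thresholds; the defaults are made-up example values."""
    cpu_pct: float = 5.0
    mem_pct: float = 10.0
    disk_kbps: float = 10.0
    net_kbps: float = 5.0


def is_idle_sample(s: Sample, t: Thresholds) -> bool:
    # A sample counts as idle only if every metric is below its threshold.
    return (s.cpu_pct < t.cpu_pct and s.mem_pct < t.mem_pct
            and s.disk_kbps < t.disk_kbps and s.net_kbps < t.net_kbps)


def idle_vms(samples_by_vm: Dict[str, List[Sample]],
             thresholds: Thresholds,
             min_idle_pct: float) -> List[str]:
    """Return VMs that were idle for at least min_idle_pct % of their samples."""
    result = []
    for name, samples in samples_by_vm.items():
        if not samples:
            continue  # no data for the period: nothing can be concluded
        idle_count = sum(1 for s in samples if is_idle_sample(s, thresholds))
        if 100.0 * idle_count / len(samples) >= min_idle_pct:
            result.append(name)
    return sorted(result)
```

The period is expressed here simply by which samples are passed in. Raising `min_idle_pct` makes the list shorter and safer to act on; lowering it catches more candidates at the cost of false positives.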

Step number 2. Find extra backups
What if the Capacity Planning for Backup Repository report shows that space in the backup repository is running out?
We recommend checking whether any machines are included in several backup jobs at once. To do this, run the VMs Backed Up by Multiple Jobs report and see which jobs back up such machines and where their backups are stored.
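
The idea behind this check is a simple inversion of the "job to machines" mapping. The sketch below assumes the job contents are available as a plain dictionary; the function name and the data are hypothetical and do not use any Veeam API.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def vms_in_multiple_jobs(jobs: Dict[str, Iterable[str]]) -> Dict[str, List[str]]:
    """Map each VM that appears in more than one backup job to those jobs."""
    jobs_by_vm = defaultdict(set)
    for job_name, vm_names in jobs.items():
        for vm in vm_names:
            jobs_by_vm[vm].add(job_name)
    return {vm: sorted(job_names)
            for vm, job_names in sorted(jobs_by_vm.items())
            if len(job_names) > 1}


if __name__ == "__main__":
    example_jobs = {  # made-up job definitions
        "Nightly": ["web01", "db01"],
        "Weekly": ["db01", "app01"],
        "SQL": ["db01"],
    }
    for vm, job_names in vms_in_multiple_jobs(example_jobs).items():
        print(vm, "is backed up by:", ", ".join(job_names))
```

Not every duplicate is a mistake: a machine may deliberately be in two jobs with different retention. The list is a starting point for review, not for automatic cleanup.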

Step number 3. Remove the "garbage"
Accumulating "garbage" is a side effect of the life of a virtual infrastructure, that is, of the multitude of changes that occur in it every day. Temporary virtual machine files and configuration files can remain on the storage system after their parent objects have been deleted, and this is an additional waste of disk space.
This is where the Garbage Files report helps: it determines which objects are no longer in use and where the corresponding "waste" files are located.
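
Conceptually, finding such files comes down to comparing what is on the datastore with what the registered objects still reference. The sketch below assumes both lists have already been collected by some other means; the function name, the path format and the sizes are hypothetical, and the real report may use different criteria.

```python
from typing import Dict, List, Set, Tuple


def garbage_files(datastore_files: Dict[str, int],
                  referenced: Set[str]) -> List[Tuple[str, int]]:
    """Return (path, size_bytes) for files no registered object refers to.

    datastore_files maps file path to size in bytes.
    referenced holds paths still used by registered VMs and templates.
    The result is sorted with the largest files first.
    """
    orphans = [(path, size) for path, size in datastore_files.items()
               if path not in referenced]
    return sorted(orphans, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    files = {  # made-up datastore listing
        "[ds1] old/old.vmdk": 40_000,
        "[ds1] web01/web01.vmdk": 20_000,
        "[ds1] old/old.vmx": 3,
    }
    in_use = {"[ds1] web01/web01.vmdk"}
    waste = garbage_files(files, in_use)
    print(waste, "total bytes:", sum(size for _, size in waste))
```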

Step number 4. Apply categorization
Another useful option of the Veeam Availability Suite is the ability to group infrastructure objects by business criteria. What is it, and how does it help defeat the uncontrolled growth of the number of machines? It is very simple: for each virtual machine we specify its "organizational data", such as which department uses it, for which project and in what capacity. When necessary, we select the appropriate category in the view and perform a mass operation.
Suppose the R&D department used a number of virtual machines to work on the Temp project. After the project is completed, we review the list of these machines and delete the unnecessary ones.
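
The R&D example can be sketched as a filter over "organizational data" stored as key/value tags. The tag names (`department`, `project`) and the inventory below are hypothetical; in the product the grouping is done in the user interface rather than in code.

```python
from typing import Dict, List


def select_vms(tags_by_vm: Dict[str, Dict[str, str]], **criteria: str) -> List[str]:
    """Return VMs whose tags match every given key=value criterion."""
    return sorted(vm for vm, tags in tags_by_vm.items()
                  if all(tags.get(key) == value for key, value in criteria.items()))


if __name__ == "__main__":
    inventory = {  # made-up organizational data
        "rd-build-01": {"department": "R&D", "project": "Temp"},
        "rd-test-02": {"department": "R&D", "project": "Temp"},
        "rd-git": {"department": "R&D", "project": "Core"},
        "fin-erp": {"department": "Finance", "project": "ERP"},
    }
    # Candidates for review once the Temp project is finished.
    print(select_vms(inventory, department="R&D", project="Temp"))
```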

Step number 5. Find unnecessary snapshots
What do snapshots have to do with saving resources, redundant virtual machines and the like? In fact, an excessive number of snapshots also affects the infrastructure negatively, so we recommend including this step in the fight against VM sprawl.
Consider a situation where a snapshot "drops out" of a virtual machine's snapshot chain. This can happen when the host crashes, when snapshot consolidation fails, or when a backup is not created correctly. Such an orphaned snapshot nevertheless continues to occupy disk space.
By the way, VMware recommends monitoring the length of a virtual machine's snapshot chain: it should consist of no more than 3 snapshots, and each of them should be kept for no more than 3 days.
Let's build our own Custom Infrastructure report, specifying a virtual machine and a virtual disk as the object types and the snapshot-related properties as the properties of interest. To do this, in the Select Columns dialog, select Name, VMDK file, Virtual Disk: Label, Snapshot: File name, Snapshot: File size.
Then set the Custom Filter value to the expression: VMDK file - Contains - 0000.

The output is a list of "orphan" snapshots detected in our VMware infrastructure.
For more detailed instructions on generating such a report, see the Veeam Knowledge Base: KB1757: Using Veeam ONE Reporter for Detecting Orphaned Snapshots in VMware.
It is also useful to monitor the age of snapshots, for which we use the Active Snapshots report. It shows which snapshots are the largest and which are the oldest (you are unlikely ever to need to roll a virtual machine back to such an old state).
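
The guideline quoted earlier (no more than 3 snapshots per chain, each kept no longer than 3 days) is easy to express as a check. The sketch below is only an illustration with hypothetical names and made-up data; it assumes the snapshot list has already been exported from somewhere and does not reproduce the logic of the Active Snapshots report.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List


@dataclass
class Snapshot:
    vm: str
    name: str
    created: datetime


def snapshot_violations(snapshots: List[Snapshot],
                        now: datetime,
                        max_chain: int = 3,
                        max_age: timedelta = timedelta(days=3)) -> List[str]:
    """List VMs whose chain is too long and snapshots that are too old."""
    by_vm: Dict[str, List[Snapshot]] = {}
    for snap in snapshots:
        by_vm.setdefault(snap.vm, []).append(snap)

    messages = []
    for vm in sorted(by_vm):
        chain = sorted(by_vm[vm], key=lambda s: s.created)  # oldest first
        if len(chain) > max_chain:
            messages.append(f"{vm}: {len(chain)} snapshots in chain (limit {max_chain})")
        for snap in chain:
            age = now - snap.created
            if age > max_age:
                messages.append(
                    f"{vm}: snapshot '{snap.name}' is {age.days} days old "
                    f"(limit {max_age.days})")
    return messages


if __name__ == "__main__":
    now = datetime(2024, 1, 10)
    snaps = [  # made-up example data
        Snapshot("db01", "pre-upgrade", datetime(2024, 1, 1)),
        Snapshot("db01", "s2", datetime(2024, 1, 9)),
        Snapshot("db01", "s3", datetime(2024, 1, 9, 12)),
        Snapshot("db01", "s4", datetime(2024, 1, 9, 18)),
        Snapshot("web01", "fresh", datetime(2024, 1, 9)),
    ]
    for line in snapshot_violations(snaps, now):
        print(line)
```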

In conclusion, a useful tip
To simplify the task of dealing with VM sprawl, I advise creating a special folder for the above-mentioned reports in Veeam ONE Reporter, naming it, say, VM Sprawl Control, and putting the whole "magnificent seven" in it. When you generate a report for the first time, remember to specify the required parameters and threshold values (where they exist). After that you can set up automatic generation of the entire report folder on a schedule, with delivery by mail (see the first picture at the beginning of this article).
Additional links: