Continuing the topic of host optimization for interoperating with NetApp FAS storage systems, this article covers optimizing VMware ESXi performance; previous articles were devoted to tuning Linux and Windows OS in a SAN environment. NetApp has been working closely with VMware for a long time, as confirmed by the fact that the much-discussed vVOL technology was among the first implemented in the Clustered Data ONTAP 8.2.1 release (August 2014), while vSphere 6.0 had not even been released yet. NetApp FAS storage systems are therefore extremely popular in this environment. The Disk Alignment part will be useful not only to NetApp owners.
VMware ESXi settings can be divided into the following parts:
- Hypervisor Optimization
- Guest OS Optimization (GOS)
- Optimal SAN Settings (FC / FCoE and iSCSI)
- NAS Settings (NFS)
- Check compatibility of equipment, firmware and software

To find a bottleneck, a technique of sequential elimination is usually used. I suggest starting with the storage system and then working along the chain: storage system -> network (Ethernet / FC) -> host (Windows / Linux / VMware ESXi 5.x and 6.x) -> application.
There are a couple of basic documents you need to rely on when configuring VMware + NetApp:
- TR-4068: VMware vSphere 5 on NetApp Clustered Data ONTAP
- TR-3839: Using NFS in VMware (7-Mode)
- TR-3749: A Guide to Best Practices for Using NetApp Systems with VMware vSphere (7-Mode)
- TR-3802: Ethernet for Storage: Best Practices (7-Mode)

You don't have to give the guest OS all of the server's resources: first, the hypervisor needs to keep at least 4GB of RAM for itself, and second, the opposite effect is sometimes observed when adding guest OS resources; this needs to be determined empirically.
I will cover this section in a separate post.
Tuning is needed for two purposes:
- Optimizing guest OS performance
- Correct operation of the HA pair during the failure of one controller (takeover) and the resumption of its work (giveback)
To optimize performance, you may need to eliminate disk misalignment. Misalignment can occur in two cases:
- due to incorrectly chosen LUN geometry when the LUN was created on the storage system. Such an error can only arise in a SAN environment;
- inside the virtual disks of virtual machines. This can occur in both SAN and NAS environments.
Let's look at these cases.
First, consider the case of fully aligned blocks at the VMFS datastore and storage boundaries.

The first case is a misalignment of the VMFS datastore with respect to the storage. To eliminate this type of problem, create a LUN with the correct geometry and move the virtual machines there.

The second situation, with file system partitions shifted inside the guest OS with respect to the WAFL file structure, can occur in older Linux distributions and in Windows 2003 and older. As a rule, this is caused by a non-optimal layout of the MBR partition table or by machines that were converted from physical to virtual. You can check this in a Windows guest OS using the dmdiag.exe -v utility (the value of the Rel Sec field must be a multiple of 4KB to match WAFL). Read more about diagnosing misalignment for Windows machines. The location of the file system on the disk can also be verified using the mbralign utility for the ESXi host, included in NetApp Host Utilities version 5.x and in VSC. Details on how to eliminate such situations are described in TR-3747 Best Practices for File Alignment in Virtual Environments.
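As an illustrative sketch (not part of the original text), partition alignment inside a Linux guest can be checked by looking at the starting sector of each partition; with 512-byte sectors, a start sector divisible by 8 means the partition begins on a 4KB boundary, matching the WAFL block size:

```
# Show partition start sectors for the first virtual disk (in 512-byte units)
fdisk -lu /dev/sda

# /dev/sda1 is aligned to 4KB if its start sector is divisible by 8
awk '{ if ($1 % 8 == 0) print "aligned"; else print "misaligned" }' /sys/block/sda/sda1/start
```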

And of course, you can get misalignment at two levels at once: both at the VMFS datastore level and at the guest OS file system level. Learn more about finding misalignment from the NetApp FAS storage side.

The example above is for the VMFS3 file system. In a newly created VMFS5 (not an upgrade from VMFS3), the block size is 1MB with 8KB sub-blocks.
For takeover / giveback in an HA pair to complete correctly, you need to configure the correct guest OS timeouts:

| OS | OS tuning for SAN: ESXi 3.x / 4.x and Data ONTAP 7.3 / 8.0 (SAN) | Updated guest OS tuning for SAN: ESXi 5 and later, or Data ONTAP 8.1 and later (SAN) |
|---|---|---|
| Windows | disk timeout = 190 | disk timeout = 60 |
| Linux | disk timeout = 190 | disk timeout = 60 |
| Solaris | disk timeout = 190; busy retry = 300; not ready retry = 300; reset retry = 30; max. throttle = 32; min. throttle = 8 | disk timeout = 60; busy retry = 300; not ready retry = 300; reset retry = 30; max. throttle = 32; min. throttle = 8; corrected VID / PID specification |
The OS default values for NFS are satisfactory, and the guest OS settings do not need to be changed.
These values are set manually or using the scripts available in VSC.
Windows: Set the disk timeout to 60 seconds using the registry (the value is in seconds, in hexadecimal form).
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\Disk] "TimeOutValue"=dword:0000003c
Linux: Set the disk timeout to 60 seconds by creating a udev rule (the value is in seconds):
DRIVERS=="sd", SYSFS{TYPE}=="0|7|14", RUN+="/bin/sh -c 'echo 60 > /sys$$DEVPATH/timeout'"
(Linux distributions may store udev rules in a different location.) VMware Tools for a Linux guest OS automatically installs a udev rule setting the virtual disk timeout to 180 seconds. You can grep for the "VMware" vendor ID in the udev rules folder to find the script that sets this value and change it if necessary, as shown below. Remember to check this value.
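For example (a sketch; the rules directory may differ between distributions):

```
# Find the VMware Tools udev rule that sets the virtual disk timeout
grep -ri "vmware" /etc/udev/rules.d/
```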
Solaris: Set a 60-second disk timeout (the value is in seconds, in hexadecimal form) in the /etc/system file:
set sd:sd_io_time=0x3c
Additional settings can be made in the /kernel/drv/sd.conf file:
Solaris 10.0 GA - Solaris 10u6:
sd-config-list="NETAPP LUN","netapp-sd-config", "VMware Virtual","netapp-sd-config"; netapp-sd-config=1,0x9c01,32,0,0,0,0,0,0,0,0,0,300,300,30,0,0,8,0,0;
Solaris 10u7 and newer and Solaris 11:
sd-config-list= "NETAPP LUN","physical-block-size:4096,retries-busy:300,retries-timeout:16,retries-notready:300,retries-reset:30,throttle-max:32,throttle-min:8", "VMware Virtual","physical-block-size:4096,retries-busy:300,retries-timeout:16,retries-notready:300,retries-reset:30,throttle-max:32,throttle-min:8";
Please note: there are two spaces between the vendor ID "NETAPP" and the product ID "LUN", as well as between the words "VMware" and "Virtual" in the configs above.
Learn more about zoning recommendations for NetApp in pictures.
For NetApp FAS systems with 7-Mode, ALUA is recommended for FC / FCoE. For NetApp FAS systems with cDOT, ALUA is recommended for all block protocols: iSCSI / FC / FCoE.
ESXi detects whether ALUA is enabled. If ALUA is enabled, the Storage Array Type plug-in will be VMW_SATP_ALUA; if ALUA is disabled, it is recommended to use the Fixed path balancing policy, in which case you must manually specify the optimal / preferred paths. If ALUA is used, either the Most Recently Used or the Round Robin algorithm may be applied. Round Robin will be more productive if there is more than one path to the controller. In the case of Microsoft Cluster with RDM disks, the Most Recently Used balancing mechanism is recommended; a sketch of setting the path selection policy follows below.
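A sketch of the corresponding esxcli commands (naa.xxxxxxxx is a placeholder device identifier, not from the original article):

```
# Make Round Robin the default path selection policy for devices claimed by VMW_SATP_ALUA
esxcli storage nmp satp set --satp VMW_SATP_ALUA --default-psp VMW_PSP_RR

# Or change the policy for a single device
esxcli storage nmp device set -d naa.xxxxxxxx --psp VMW_PSP_RR
```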
Below is a table of recommended load balancing settings. Learn more about NetApp FAS, ALUA logic and load balancing for block protocols.
| Mode | ALUA | Protocol | ESXi policy | ESXi path balancing |
|---|---|---|---|---|
| 7-Mode 7.x / 8.x | Enabled | FC / FCoE | VMW_SATP_ALUA | Most Recently Used or Round Robin |
| 7-Mode 7.x / 8.x | Disabled | FC / FCoE | AA SATP | Fixed PSP (manually choose optimal paths) |
| 7-Mode 7.x / 8.x | Disabled | iSCSI | AA SATP | Round Robin PSP |
| cDOT 8.x | Enabled | FC / FCoE / iSCSI | VMW_SATP_ALUA | Most Recently Used or Round Robin |
Check the policy applied to the LUN / datastore in question and make sure that the SATP policy applied to your LUN has the reset_on_attempted_reserve option enabled:
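For example (a sketch; naa.xxxxxxxx is a placeholder), the policy applied to a device and the SATP rule options can be checked from the ESXi 5.x CLI:

```
# Show the SATP and PSP currently applied to the LUN
esxcli storage nmp device list -d naa.xxxxxxxx

# List the SATP claim rules and look for the reset_on_attempted_reserve option
esxcli storage nmp satp rule list | grep -i netapp
```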
For an ESXi host to work optimally, you need to set the recommended advanced options for it.
| Parameter | Protocol(s) | ESXi 4.x with Data ONTAP 8.1.x | ESXi 5.x with Data ONTAP 7.3 / 8.x |
|---|---|---|---|
| Net.TcpipHeapSize | iSCSI / NFS | 30 | 32 |
| Net.TcpipHeapMax | iSCSI / NFS | 120 | 512 (for vSphere 5.0 / 5.1 set 128) |
| NFS.MaxVolumes | NFS | 64 | 256 |
| NFS41.MaxVolumes | NFS 4.1 | - | |
| NFS.HeartbeatMaxFailures | NFS | 10 | 10 |
| NFS.HeartbeatFrequency | NFS | 12 | 12 |
| NFS.HeartbeatTimeout | NFS | 5 | 5 |
| NFS.MaxQueueDepth | NFS | - | 64 |
| Disk.QFullSampleSize | iSCSI / FC / FCoE | 32 | 32 (on 5.1 configured per LUN) |
| Disk.QFullThreshold | iSCSI / FC / FCoE | 8 | 8 (on 5.1 configured per LUN) |
There are several ways to do this:
- Using the Command Line Interface (CLI) on ESXi 5.x hosts.
- Using vSphere Client / vCenter Server.
- Using the Remote CLI tool from VMware.
- Using the VMware Management Appliance (VMA).
- By applying a Host Profile taken from an already configured ESXi 5.x host to the other hosts.
The esxcfg-advcfg utility used in these examples is located in /usr/sbin on the ESXi host and is used for both setting and checking advanced parameters from the ESX/ESXi 4.x and 5.x CLI; on ESXi 5.1 the Disk.QFullSampleSize and Disk.QFullThreshold values are configured per device with esxcli. A sketch of these commands is shown below.
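A minimal sketch, assuming the parameter values from the table above (the values shown are for ESXi 5.x; naa.xxxxxxxx is a placeholder device identifier):

```
# ESX/ESXi 4.x and 5.x: set the recommended values
esxcfg-advcfg -s 32  /Net/TcpipHeapSize
esxcfg-advcfg -s 512 /Net/TcpipHeapMax     # on vSphere 5.0 / 5.1 use 128
esxcfg-advcfg -s 256 /NFS/MaxVolumes
esxcfg-advcfg -s 10  /NFS/HeartbeatMaxFailures
esxcfg-advcfg -s 12  /NFS/HeartbeatFrequency
esxcfg-advcfg -s 5   /NFS/HeartbeatTimeout
esxcfg-advcfg -s 64  /NFS/MaxQueueDepth

# Check a value
esxcfg-advcfg -g /Net/TcpipHeapSize

# ESXi 5.1: Disk.QFullSampleSize and Disk.QFullThreshold are set per device
esxcli storage core device set --device naa.xxxxxxxx --queue-full-sample-size 32 --queue-full-threshold 8
```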
NetApp typically recommends using the default HBA values set by the adapter manufacturer for FAS systems with an ESXi host. If they have been changed, you must return them to the factory settings. Also check the relevant best practices: for example, if we are talking about DB2 virtualization in a VMware environment on NetApp, it is recommended (see page 21) to increase the queue depth to 64 on ESXi (as described in VMware KB 1267).
QLogic HBA setup example on ESXi:
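A hedged sketch of changing the queue depth for a QLogic HBA per VMware KB 1267 (the module name depends on the driver version in use):

```
# ESXi 5.x with the qla2xxx driver: set the queue depth to 64
esxcli system module parameters set -m qla2xxx -p "ql2xmaxqdepth=64"

# Verify the parameter; a host reboot is required for it to take effect
esxcli system module parameters list -m qla2xxx | grep ql2xmaxqdepth
```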
The NetApp VSC plugin (free software) sets the recommended settings on the ESXi host and the HBA adapter: queue depth, delays, and others. The plugin integrates into vCenter. It saves time and eliminates human error when configuring the parameters an ESXi host needs to work effectively with NetApp, and it allows you to perform the basic storage management operations from vCenter that an administrator of virtualized environments needs. Access rights to the storage through VSC can be flexibly configured for multiple users using RBAC.

A version is available for both the “fat” (old) client and the new web client.

If iSCSI is used, it is highly recommended to use Jumbo Frames on Ethernet at speeds of 1Gb/s or higher. Read more in the article about Ethernet with NetApp FAS.
Remember to choose the right virtual network adapter: VMware recommends VMXNET3. Starting with ESXi 5.0, VMXNET3 supports Jumbo Frames. The E1000e network adapter supports 1Gb link speeds and MTU 9000; it is installed by default for all newly created VMs (except Linux). The standard "Flexible" virtual network adapter type supports only MTU 1500. Read more.

Also, do not forget that the port group used by the virtual network adapter of your virtual machine must be connected to a virtual switch with MTU 9000 set for the entire switch, as shown below.
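For example (a sketch; vSwitch1 and vmk1 are placeholder names), MTU 9000 is set on a standard vSwitch and its VMkernel port like this:

```
# Set MTU 9000 on the standard vSwitch and on the VMkernel interface
esxcli network vswitch standard set -v vSwitch1 -m 9000
esxcli network ip interface set -i vmk1 -m 9000

# Verify
esxcli network vswitch standard list -v vSwitch1 | grep MTU
```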

NetApp FAS systems support the VMware VAAI primitives, offloading some of the routine data management tasks for a datastore from the host to the storage, where it is more logical to do them. In a SAN environment with ESXi 4.1+ and NetApp FAS running Data ONTAP 8.0 and above, VAAI is supported automatically and does not require any manipulation. For the NAS environment, NetApp has released a plugin that provides similar optimization for the NFS protocol. This requires installing the NetAppNFSVAAI kernel module on each ESXi host. VSC can install the NFS VAAI plugin automatically from vCenter.
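As a quick check (a sketch; naa.xxxxxxxx is a placeholder device identifier), VAAI support and the presence of the NetApp NFS VAAI plugin can be verified from the ESXi CLI:

```
# Check the VAAI primitive status for a block device
esxcli storage core device vaai status get -d naa.xxxxxxxx

# Hardware acceleration should be enabled globally (value 1)
esxcli system settings advanced list -o /DataMover/HardwareAcceleratedMove
esxcli system settings advanced list -o /VMFS3/HardwareAcceleratedLocking

# Verify that the NetApp NFS VAAI plugin VIB is installed
esxcli software vib list | grep -i netapp
```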
VASA is free software that allows vCenter to learn about the capabilities of the storage through an API and use them more intelligently. VASA integrates into VSC and allows you to create datastore profiles with specific storage capabilities via a GUI (for example, presence / absence of Thin Provisioning, disk type: SAS / SATA / SSD, availability of a second-level cache, etc.) and to enable notifications when a threshold is reached (for example, occupancy or load). Starting from version 6.0, VASA is a mandatory component of VSC and is an important part of the VMware vSphere 6 vVOL paradigm.
Space Reservation - UNMAP
Starting with ESXi 5.1, reclamation of freed blocks from a thin LUN (datastore) is supported. It is enabled by default on ESXi 5.1 and disabled by default on all other ESXi 5.x & 6.x versions (it has to be run manually); on ESXi 6.x it works automatically with vVOL. On the ONTAP side this functionality is always disabled by default; to enable it, you need to run a few simple commands on the storage system.
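A hedged sketch of reclaiming space manually (datastore, vserver and LUN names are placeholders; the ONTAP command shown is for clustered Data ONTAP):

```
# ESXi 5.5 and newer: reclaim free blocks on a VMFS datastore
esxcli storage vmfs unmap -l datastore1

# ESXi 5.1: run from inside the datastore, reclaiming up to 60% of the free space
cd /vmfs/volumes/datastore1
vmkfstools -y 60

# On the ONTAP (cDOT) side, enable space allocation on the LUN
# (the setting may require the LUN to be offline to change it)
lun modify -vserver vs1 -path /vol/vol1/lun1 -space-allocation enabled
```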
Make extensive use of the compatibility matrix in your practice to reduce potential problems in the data center infrastructure. For troubleshooting, refer to the NetApp and VMware knowledge bases.
I am sure that over time I will have something to add to this article on ESXi host optimization, so check back from time to time.
Please send reports of errors in the text via private message. Notes and additions, on the other hand, are welcome in the comments.