Continuing the topic of optimizing an ESXi host for interoperation with NetApp ONTAP storage systems, this article covers tuning VMware ESXi 6.X performance; previous articles were devoted to tuning Linux, Windows and VMware ESXi 5.X in a SAN environment. NetApp has been working closely with VMware for a long time, as confirmed by the fact that the much-discussed vVOL technology was implemented among the first vendors back in the release of Clustered Data ONTAP 8.2.1 (August 2014), before vSphere 6.0 had even been released. NetApp was also the first to announce vVol support over NFS (perhaps NetApp is still the only one here, I haven't been following). As a result, ONTAP storage systems are extremely popular in this environment.
This article will be useful to owners of ONTAP storage systems, and the Disk Alignment part will be useful not only to NetApp owners.
VMWare ESXi 6.X settings can be divided into the following parts:
- Hypervisor Optimization
- Guest OS (GOS) Optimization
- Optimal SAN Settings (FC / FCoE and iSCSI)
- NAS Settings (NFS)
- Checking compatibility of equipment, firmware and software

To find a bottleneck, a sequential elimination technique is usually applied. I suggest starting with the storage system and then moving along the chain: storage system -> Network (Ethernet / FC) -> Host (Windows / Linux / VMware ESXi) -> Application.
There are a few basic documents you need to rely on when configuring VMware + NetApp:
- How to configure VMware vSphere 6.x on ONTAP 8.x
- Virtual Storage Console 6.0 for VMware vSphere Installation
- TR-4128: vSphere 6 on NetApp MetroCluster 8.3
You don't have to give the guest OS all of the server's resources: first, the hypervisor needs at least 4GB of RAM left for itself, and second, the opposite effect is sometimes observed when adding resources to a guest OS, so this needs to be selected empirically.
This section is in a separate post.
Tuning is needed for two purposes:
- Optimizing the speed of the guest OS
- Correct operation in an HA pair during the failure of one controller (takeover) and the resumption of its work (giveback)
To optimize performance, you may need to eliminate disk misalignment. Misalignment can occur in two cases:
- due to incorrectly chosen LUN geometry when the LUN was created on the storage system. Such an error can only arise in a SAN environment;
- inside the virtual disks of virtual machines. This can happen in both SAN and NAS environments.
Let's look at these cases.
Fully aligned blocks on a VMFS datastore
First, consider fully aligned blocks on the VMFS datastore and storage boundaries.

The first case: misalignment with VMFS
The first case is when the VMFS datastore is misaligned with respect to the storage. To eliminate this type of problem, you must create a LUN with the correct geometry and move the virtual machines to it.

The second case: misalignment inside the guest OS
The second situation, with file system partitions misaligned within the guest OS with respect to the WAFL file structure, can occur in older Linux distributions and in Windows 2003 and older. Since the problem is "inside the virtual machine", it can be observed on both NFS and VMFS datastores, as well as with RDM and vVOL. As a rule, it is caused by a non-optimal MBR partition table layout or by machines that were converted from physical to virtual. You can check this in a Windows guest OS using the dmdiag.exe -v utility (the value of the Rel Sec field must be a multiple of 4KB, per WAFL). More on diagnosing misalignment for Windows machines. For details on how to eliminate such situations, see TR-3747 Best Practices for File Alignment in Virtual Environments.

Misalignment on two levels
And of course, you can get misalignment on two levels at once: both at the VMFS datastore level and at the guest OS file system level. Read more about finding misalignment from the ONTAP storage system side.

A newly created VMFS5 datastore (not an upgrade from VMFS3) uses a 1MB block size with 8KB sub-blocks.
For takeover / giveback in an HA pair to be handled correctly, you need to configure the correct guest OS timeouts. For disk FAS systems this time is 60 seconds, and for All Flash FAS (AFF) it is 2-15 seconds. Since a cluster can contain different models of storage systems (disk, hybrid and AFF), and data can migrate between these systems, it is recommended to use the worst-case timeout value (the one for disk systems), namely 60 seconds:
OS | Updated Guest OS Tuning for SAN: ESXi 5 and later, or ONTAP 8.1 and later (SAN)
---|---
Windows | disk timeout = 60
Linux | disk timeout = 60
Solaris | disk timeout = 60; busy retry = 300; not ready retry = 300; reset retry = 30; max. throttle = 32; min. throttle = 8; corrected VID / PID specification
The OS default values for NFS are satisfactory, and the guest OS settings do not need to be changed.
These values are set manually or using scripts available in the VSC:
Windows: set the disk access timeout to 60 seconds using the registry (the value is set in seconds, in hexadecimal form).
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\Disk] "TimeOutValue"=dword:0000003c
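If you prefer not to edit the registry by hand, the same value can be applied from an elevated command prompt; a minimal sketch (60 seconds decimal is stored as hex 0x3c):
rem set the disk timeout to 60 seconds (REG_DWORD, shown as 0x3c in the registry)
reg add "HKLM\SYSTEM\CurrentControlSet\services\Disk" /v TimeOutValue /t REG_DWORD /d 60 /f
rem verify the value currently set
reg query "HKLM\SYSTEM\CurrentControlSet\services\Disk" /v TimeOutValue
A reboot is usually required for the new timeout to take effect.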
Linux: set the disk access timeout to 60 seconds by creating a udev rule (the value is specified in seconds, in decimal form).
DRIVERS=="sd", SYSFS{TYPE}=="0|7|14", RUN+="/bin/sh -c 'echo 60 > /sys$$DEVPATH/timeout'"
(Linux distributions may keep udev rules in a different location.) VMware Tools for a Linux guest OS automatically installs a udev rule that sets the virtual disk timeout to 180 seconds. You can grep for the "VMware" vendor ID in the udev rules folders to find the rule that sets this value and change it if necessary, as sketched below. Remember to check this value.
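A minimal sketch of such a check, assuming the usual rule locations (they vary between distributions):
# find the VMware Tools rule that sets the SCSI disk timeout
grep -ri "vmware" /etc/udev/rules.d/ /lib/udev/rules.d/ 2>/dev/null
# check the timeout currently applied to a specific disk, e.g. sda
cat /sys/block/sda/device/timeout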
Solaris: set a 60-second disk timeout (specified in seconds, in hexadecimal form) in the /etc/system file:
set sd:sd_io_time=0x3c
Additional settings can be made in the /kernel/drv/sd.conf file:
Solaris 10.0 GA - Solaris 10u6:
sd-config-list="NETAPP LUN","netapp-sd-config", "VMware Virtual","netapp-sd-config"; netapp-sd-config=1,0x9c01,32,0,0,0,0,0,0,0,0,0,300,300,30,0,0,8,0,0;
Solaris 10u7 and newer, and Solaris 11:
sd-config-list= "NETAPP LUN","physical-block-size:4096,retries-busy:300,retries-timeout:16,retries-notready:300,retries-reset:30,throttle-max:32,throttle-min:8", "VMware Virtual","physical-block-size:4096,retries-busy:300,retries-timeout:16,retries-notready:300,retries-reset:30,throttle-max:32,throttle-min:8";
Note: there are two spaces between the vendor ID "NETAPP" and the product ID "LUN", as well as between the words "VMware" and "Virtual", in the configs above.
Learn more about zoning recommendations for NetApp in pictures.
For ONTAP 8.X and 9.X, ALUA is always enabled for all block protocols: iSCSI / FC / FCoE.
If the host has detected ALUA correctly, the Storage Array Type plug-in will show VMW_SATP_ALUA. With ALUA, either the Most Recently Used or the Round Robin path selection policy may be used.
Round Robin is more efficient when there is more than one path to the controller. When using Microsoft Cluster with RDM disks, the Most Recently Used policy is recommended.
Below is a table of recommended load balancing settings, followed by a sketch of how to verify and change the policy from the ESXi CLI. Learn more about NetApp ONTAP, ALUA logic and load balancing for block protocols.
Mode | ALUA | Protocol | ESXi Policy | ESXi Path Balancing
---|---|---|---|---
ONTAP 9.x / 8.x (Clustered) | Enabled | FC / FCoE / iSCSI | VMW_SATP_ALUA | Most Recently Used
ONTAP 9.x / 8.x (Clustered) | Enabled | FC / FCoE / iSCSI | VMW_SATP_ALUA | Round Robin
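A minimal sketch of how to verify and, if needed, change the SATP / path selection policy from the ESXi 6.x CLI (the naa.* device identifier below is an assumption):
# list devices with their Storage Array Type (expect VMW_SATP_ALUA for ONTAP LUNs) and Path Selection Policy
esxcli storage nmp device list
# make Round Robin the default PSP for everything claimed by VMW_SATP_ALUA (affects newly claimed devices)
esxcli storage nmp satp set --satp VMW_SATP_ALUA --default-psp VMW_PSP_RR
# change the PSP for a single, already claimed device
esxcli storage nmp device set --device naa.60a98000xxxxxxxxxxxx --psp VMW_PSP_RR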
Check which policy is applied to each LUN / datastore (for example, with the commands sketched above). For the ESXi host to work optimally, it is also necessary to set the recommended advanced parameters on it:
Parameter | Protocol(s) | ESXi 6.x with Data ONTAP 8.x / 9.x
---|---|---
Net.TcpipHeapSize | iSCSI / NFS | 32
Net.TcpipHeapMax | iSCSI / NFS | 1536
NFS.MaxVolumes | NFS | 256
NFS41.MaxVolumes | NFS 4.1 | 256
NFS.HeartbeatMaxFailures | NFS | 10
NFS.HeartbeatFrequency | NFS | 12
NFS.HeartbeatTimeout | NFS | 5
NFS.MaxQueueDepth | NFS | 64 (if you only have AFF, then 128 or 256)
Disk.QFullSampleSize | iSCSI / FC / FCoE | 32
Disk.QFullThreshold | iSCSI / FC / FCoE | 8
There are several ways to do this:
- Using the Command Line Interface (CLI) on ESXi 6.x hosts.
- Using vSphere Client / vCenter Server.
- Using the Remote CLI tool from VMware.
- Using the VMware Management Appliance (VMA).
- By applying the Host Profile, deploying it from an already configured ESXi 6.x to other hosts.
An example of setting the advanced parameters from the ESXi 6.x CLI is sketched below; the esxcfg-advcfg utility used in these examples is located in /usr/sbin on the ESXi host.
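A minimal sketch covering a few of the parameters from the table above; the remaining parameters are set the same way:
# set the recommended values (-s sets a value for the given advanced option)
/usr/sbin/esxcfg-advcfg -s 32 /Net/TcpipHeapSize
/usr/sbin/esxcfg-advcfg -s 1536 /Net/TcpipHeapMax
/usr/sbin/esxcfg-advcfg -s 256 /NFS/MaxVolumes
/usr/sbin/esxcfg-advcfg -s 64 /NFS/MaxQueueDepth
# check the current values (-g reads the value back)
/usr/sbin/esxcfg-advcfg -g /Net/TcpipHeapSize
/usr/sbin/esxcfg-advcfg -g /NFS/MaxQueueDepth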
Checking the advanced settings from the ESXi 6.x CLI is done with the same utility using the -g flag (also shown in the sketch above).
NetApp typically recommends keeping the "default values" set by the adapter manufacturer for HBAs used with ONTAP systems and an ESXi host. If they have been changed, you must return them to the factory settings. Also check the best practices relevant to your workload: for example, for DB2 virtualization in a VMware environment on NetApp it is recommended (see page 21) to increase the queue depth to 64 on ESXi (as described in VMware KB 1267); a sketch of such a change for a QLogic HBA is given below.
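A minimal sketch of such a queue-depth change for a QLogic FC HBA using the native qlnativefc driver, following VMware KB 1267 (first verify which driver module your host actually uses):
# find the loaded QLogic HBA driver module
esxcli system module list | grep -i ql
# set the queue depth to 64 for the qlnativefc driver (a host reboot is required afterwards)
esxcli system module parameters set -m qlnativefc -p "ql2xmaxqdepth=64"
# verify the parameter after the reboot
esxcli system module parameters list -m qlnativefc | grep ql2xmaxqdepth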
QLogic HBA setup example on ESXi: see the sketch above.
The NetApp VSC plugin (free software) sets the recommended settings on the ESXi host and the HBA adapter: queue depth, delays, and others. The plugin itself integrates into vCenter. It saves time and removes human error when setting the parameters an ESXi host needs to work effectively with NetApp, and it allows you to perform the basic storage management operations from vCenter that an administrator of virtualized environments needs. Access rights to the storage via VSC can be flexibly configured for multiple users using RBAC. VSC is required to configure vVOL.
The plugin is available only as a version for the web client; version 6 and newer is supported.

If iSCSI is used, it is highly recommended to use Jumbo Frames on Ethernet at speeds of 1Gb or higher. Read more in the article about Ethernet with NetApp ONTAP. Do not forget the VMware recommendations for LACP, Port-channel, Spanning Tree, PortFast and Flow Control settings.
Remember to choose the right virtual network adapter type: VMware recommends using VMXNET3. Starting with ESXi 5.0, VMXNET3 supports Jumbo Frames. The E1000e network adapter supports 1Gb network speeds and MTU 9000, and it is installed by default for all newly created VMs (except Linux). The standard "Flexible" virtual network adapter type supports only MTU 1500.
More.
Also, do not forget that the port group used by the virtual network adapter of your virtual machine must be connected to a virtual switch that has MTU 9000 set for the entire switch, as sketched below.
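A minimal sketch for a standard vSwitch and a VMkernel port; the names vSwitch1 and vmk1 and the storage address 192.168.1.20 are assumptions:
# set MTU 9000 on the entire standard vSwitch
esxcli network vswitch standard set --vswitch-name=vSwitch1 --mtu=9000
# set MTU 9000 on the VMkernel interface used for iSCSI / NFS traffic
esxcli network ip interface set --interface-name=vmk1 --mtu=9000
# verify end-to-end jumbo frames to the storage (8972 = 9000 minus IP/ICMP headers)
vmkping -d -s 8972 192.168.1.20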

ONTAP systems support the VMware VAAI primitives, offloading part of the routine data management tasks for a datastore from the host to the storage system, where it is more logical to perform them. In a SAN environment with ESXi 4.1+ and ONTAP 8.0 and higher, VAAI is supported automatically and does not require any manipulation. For a NAS environment, NetApp has released a plugin that provides such offloading for the NFS protocol. This requires installing the NetAppNFSVAAI kernel module on each ESXi host. VSC can install the NFS VAAI plugin automatically from vCenter. For it to function, you need to properly configure the NFS share (export) for VAAI, which must meet several requirements:
- Configure ESXi server access (RO, RW and Superuser must be set to SYS or ANY, and both NFSv3 and NFSv4 access must be enabled). Even if NFSv4 will not be used, it must be on the access list.
- All parent entries in the junction path must allow root read access and NFSv4 access. In most cases this means that the root volume of the Storage Virtual Machine (Vserver) should have the superuser setting set to at least SYS for the corresponding client that will use VAAI access to one of the nested volumes. It is recommended to deny write access directly to the Vserver root.
- vStorage support must be enabled (see the nfs modify example below).
Example of setting VAAI on ONTAP:
cm3240c-rtp::> export-policy rule show -vserver vmware -policyname vmware_access -ruleindex 2
(vserver export-policy rule show)
Vserver: vmware
Policy Name: vmware_access
Rule Index: 1
Access Protocol: nfs3 <---- needs to be 'nfs' or 'nfs3,nfs4'
Client Match Spec: 192.168.1.7
RO Access Rule: sys
RW Access Rule: sys
User ID To Which Anonymous Users Are Mapped: 65534
Superuser Security Flavors: sys
Honor SetUID Bits In SETATTR: true

cm3240c-rtp::> export-policy rule show -vserver vmware -policyname root_policy -ruleindex 1
(vserver export-policy rule show)
Vserver: vmware
Policy Name: root_policy
Rule Index: 1
Access Protocol: nfs <--- like requirement 1, set to nfs or nfs3,nfs4
Client Match Spec: 192.168.1.5
RO Access Rule: sys
RW Access Rule: never <--- this can be never for security reasons
User ID To Which Anonymous Users Are Mapped: 65534
Superuser Security Flavors: sys <--- this is required for VAAI to be set, even in the parent volumes like vsroot
Honor SetUID Bits In SETATTR: true
Allow Creation of Devices: true
cm3240c-rtp::> nfs modify -vserver vmware -vstorage enabled
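To confirm the result, a minimal sketch of checks on both sides (the vserver name follows the example above; the exact name of the NetApp NFS VAAI VIB may differ in your installation):
# on the ONTAP side: check that vStorage (VAAI over NFS) is enabled for the SVM
cm3240c-rtp::> nfs show -vserver vmware -fields vstorage
# on the ESXi host: check that the NetApp NFS VAAI plugin VIB is installed
esxcli software vib list | grep -i netapp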
VASA is free software that allows vCenter, through an API, to learn about the storage capabilities and to use them more intelligently. VASA integrates with VSC and allows you to create datastores with specific storage capabilities through a GUI (for example, the presence or absence of Thin Provisioning, the disk type: SAS / SATA / SSD, the availability of a second-level cache, and so on) and to enable notifications when a threshold is reached (for example, space used or load). Starting from version 6.0, VASA is a mandatory component of VSC and is an important (and mandatory) part of the VMware vVOL paradigm.
Space Reservation - UNMAP
Starting with ESXi 5.0, it is possible to return freed blocks from a thin LUN (datastore) back to the storage system. On ESXi 5.X / 6.0 with VMFS, space reclamation has to be started manually; with vVOL on ESXi 6.X it works automatically, and since version 6.5 it also works automatically (with a delay) on VMFS-6 datastores. On the ONTAP side this functionality is turned off by default; to enable it, you need to execute a few simple commands on the storage system.
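A minimal sketch of both sides, assuming a thin LUN at /vol/vmware_vol/lun1 and a datastore named datastore1 (both names are assumptions; on some ONTAP releases the LUN must be taken offline before changing space-allocation):
# ONTAP: enable space allocation (UNMAP support) on the thin LUN
cm3240c-rtp::> lun modify -vserver vmware -path /vol/vmware_vol/lun1 -space-allocation enabled
# ESXi 5.5 - 6.0: reclaim free space on a VMFS datastore manually
esxcli storage vmfs unmap --volume-label=datastore1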
vVOL
This topic deserves special attention and is
included in a separate article .
Make extensive use of the compatibility matrix in your practice to reduce potential problems in the data center infrastructure. For troubleshooting, consult the NetApp and VMware knowledge bases (KB).
I am sure that over time I will have something to add to this article on optimizing the ESXi host, so look here from time to time.
Conclusions
The correct settings for the VMware virtualization environment will not only improve the performance of your infrastructure, but also increase its resiliency. Be sure to follow the VMware and NetApp recommendations when bringing up your infrastructure. During commissioning, be sure to create a test plan that includes both load testing and fault-tolerance testing, in order to eliminate possible configuration errors and to get an idea of the capabilities and behavior of your infrastructure both in normal operation and during failures.
English translation: ONTAP & ESXi 6.x tuning. This article may contain links to Habr articles that will be published later. Please report errors in the text via private message (PM). Notes and additions, on the other hand, are welcome in the comments.