
Testing KVM-based VPS with volumes attached over iSCSI from a NexentaStor filer

We recently launched a new NexentaStor-based network filer for virtualization at HOSTKEY, and on Saturday evening it was time to test how KVM virtual machines run on it.
We will check CPU, network and disk speed, with pictures and commentary, knowing what infrastructure stands behind the numbers.


Our KVM test node is built on an Intel SR1690WBR server with two Xeon E5607 2.26GHz processors, 64GB of memory, an SSD boot disk for the hypervisor and a hard disk where KVM likes to swap out rarely used memory blocks. All of this is connected via 1Gbps Ethernet to our NexentaStor filer, whose parameters are described in detail in this post . The processors were not specially selected; they were simply available as spare parts. For these tasks we usually use nodes three to four times larger: a typical node has 196GB of memory and two 6-core E5645 or E5-2630 processors. The second gigabit port faces the Internet.

The node runs good old KVM, managed by SolusVM.
Let's deploy a large virtual machine:


Note the thin 7TB base volume: it is a 7TB virtual volume mounted from Nexenta over iSCSI, with an LVM volume for storing the virtual machines created on top of it. If something goes wrong with the node, we remount the volume on a spare server of the same size and bring all the virtual machines back up with minimal downtime.

So, we have 1GB of memory, 4 cores and a 1TB disk (sic!).
Let's check:

    [root@testio ~]# free -m
                 total       used       free     shared    buffers     cached
    Mem:           996        712        283          0        125        460
    -/+ buffers/cache:        126        870
    Swap:           99          0         99
    [root@testio ~]# df -h
    Filesystem            Size  Used Avail Use% Mounted on
    /dev/vda1             985G  901G   34G  97% /
    tmpfs                 499M     0  499M   0% /dev/shm


Now let's see how it looks on the Nexenta side:


That is, we allocated a terabyte, but it does not actually occupy that space on disk. Nexenta does not have to reserve the space up front; it takes it only when it is really needed.

We measure disk speed with fio, as described here by the esteemed amarao .
    [root@testio ~]# cat fio.ini
    [readtest]
    blocksize=4k
    filename=/dev/vda
    rw=randread
    direct=1
    buffered=0
    ioengine=libaio
    iodepth=16

    [writetest]
    blocksize=4k
    filename=/tmp/xxx
    size=900G
    rw=randwrite
    direct=1
    buffered=0
    ioengine=libaio
    iodepth=16


Please note that we have a 1TB disk and a 900GB test file for writing. There is no way it fits into the Nexenta cache. With a smaller file the numbers would be higher.

We get an amazing result (some of the output has been trimmed):
    readtest: (groupid=0, jobs=1): err= 0: pid=4075: Sun Jan 2 01:52:14 2000
      read : io=1307.1MB, bw=28797KB/s, iops=7199 , runt= 46509msec
        clat (usec): min=58 , max=474382 , avg=2202.60, stdev=9679.77
        bw (KB/s) : min= 1280, max=40624, per=100.00%, avg=29170.78, stdev=10972.45
        lat (usec) : 100=0.01%, 250=0.01%, 500=0.04%, 750=0.28%, 1000=0.73%
        lat (msec) : 2=89.14%, 4=6.86%, 10=1.20%, 20=1.10%, 50=0.51%
        lat (msec) : 100=0.03%, 250=0.03%, 500=0.05%
      cpu : usr=4.79%, sys=18.70%, ctx=146562, majf=0, minf=41
    writetest: (groupid=0, jobs=1): err= 0: pid=4076: Sun Jan 2 01:52:14 2000
      write: io=473612KB, bw=10190KB/s, iops=2547 , runt= 46478msec
        clat (usec): min=247 , max=1000.7K, avg=6225.70, stdev=20751.20
        lat (msec): min=1 , max=1000 , avg= 6.28, stdev=20.89
        bw (KB/s) : min= 13, max=14792, per=100.00%, avg=10431.00, stdev=4163.53
        lat (usec) : 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
        lat (msec) : 2=0.07%, 4=20.12%, 10=76.81%, 20=0.93%, 50=1.05%
        lat (msec) : 100=0.69%, 250=0.15%, 500=0.12%, 750=0.02%, 1000=0.01%
        lat (msec) : 2000=0.01%
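
As a sanity check on the fio output, IOPS should equal bandwidth divided by block size, and with the queue kept full the average latency should be close to queue depth divided by IOPS (Little's law). Here is a minimal sketch in Python that uses only the numbers from the job file and the output; the function names are illustrative and not part of fio:

```python
def iops_from_bandwidth(bw_kb_s, block_kb):
    """IOPS implied by a bandwidth figure at a fixed block size."""
    return bw_kb_s / block_kb


def latency_ms(iodepth, iops):
    """Average latency in ms implied by Little's law: depth = IOPS * latency."""
    return iodepth / iops * 1000.0


# Figures taken from the fio output: bw in KB/s, blocksize=4k, iodepth=16.
read_iops = iops_from_bandwidth(28797, 4)
write_iops = iops_from_bandwidth(10190, 4)

print(f"read:  {read_iops:.1f} IOPS, about {latency_ms(16, read_iops):.2f} ms")
print(f"write: {write_iops:.1f} IOPS, about {latency_ms(16, write_iops):.2f} ms")
```

The computed values agree closely with what fio reports (iops=7199 with clat avg of about 2.2 ms for reads, iops=2547 with lat avg of 6.28 ms for writes), which suggests the run really did keep 16 requests in flight.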


The results by queue depth (IOPS and average latency):
Queue depth 64: read 6200 IOPS at 10 ms, write 1200 IOPS at 44 ms
Queue depth 16: read 7200 IOPS at 2.4 ms, write 2500 IOPS at 6 ms
Queue depth 1: read 1260 IOPS at 4 ms, write 727 IOPS at 8 ms
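
To repeat the comparison at several queue depths, the job file can be generated from one template. The sketch below is in Python and assumes that only `iodepth` changed between the three runs, which the article does not state explicitly; `make_fio_job` is a hypothetical helper, and the default paths simply mirror the fio.ini shown earlier:

```python
import configparser
import io


def make_fio_job(iodepth, read_target="/dev/vda",
                 write_file="/tmp/xxx", write_size="900G"):
    """Return the text of a fio job file with the given queue depth."""
    common = {
        "blocksize": "4k",
        "direct": "1",
        "buffered": "0",
        "ioengine": "libaio",
        "iodepth": str(iodepth),
    }
    job = configparser.ConfigParser()
    job["readtest"] = {"filename": read_target, "rw": "randread", **common}
    job["writetest"] = {"filename": write_file, "size": write_size,
                        "rw": "randwrite", **common}
    out = io.StringIO()
    # fio job files are conventionally written as key=value without spaces.
    job.write(out, space_around_delimiters=False)
    return out.getvalue()


# Print one job definition per queue depth mentioned in the results.
for depth in (1, 16, 64):
    print(f"# ---- queue depth {depth} ----")
    print(make_fio_job(depth))
```

Each printed block can be saved to its own file and passed to fio. Be aware that the write test writes a very large file to the target path.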

Let's not forget that our disks are mounted over iSCSI via regular Gigabit Ethernet, which adds a solid 1 ms in each direction.

Downloading a file from Yandex:
    [root@testio ~]# wget ftp://ftp.yandex.ru/centos/6.3/isos/x86_64/CentOS-6.3-x86_64-bin-DVD1.iso
    --2000-01-02 02:18:54-- ftp://ftp.yandex.ru/centos/6.3/isos/x86_64/CentOS-6.3-x86_64-bin-DVD1.iso
    Connecting to ftp.yandex.ru|213.180.204.183|:21... connected.
    Logging in as anonymous ... Logged in!
    ==> PASV ... done. ==> RETR CentOS-6.3-x86_64-bin-DVD1.iso ... done.
    Length: 4289386496 (4.0G) (unauthoritative)
    1% [> ] 69,638,574 10.7M/s eta 6m 8s
    ^C

Our interface is limited to 100 Mbps, and the download runs right at that ceiling.
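
To see how close 10.7M/s is to the 100 Mbps limit, the rate can be converted to megabits per second. This small Python sketch assumes that wget's "M" means 2^20 bytes, which is an assumption about wget's units rather than something stated in the article; the function name is illustrative:

```python
def mib_per_s_to_mbit(rate_mib_s):
    """Convert MiB/s (2**20 bytes per second) to Mbit/s (10**6 bits per second)."""
    return rate_mib_s * 1024 * 1024 * 8 / 1_000_000


payload_mbit = mib_per_s_to_mbit(10.7)
print(f"{payload_mbit:.1f} Mbit/s of payload on a 100 Mbit/s interface")
```

The payload rate comes out a little under 90 Mbit/s. The rest of the 100 Mbit/s is plausibly taken up by Ethernet, IP and TCP headers, so the link is effectively saturated.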

As for multi-core CPU speed on Linux, it is hard to pick an unambiguous frame of reference for comparison; if anyone knows how to do this, tell me. On Windows I usually rely on the PassMark PerformanceTest, since it has benchmark results for almost every processor of this century.

Output of /proc/cpuinfo:
    processor       : 0
    vendor_id       : GenuineIntel
    cpu family      : 6
    model           : 13
    model name      : QEMU Virtual CPU version (cpu64-rhel6)
    stepping        : 3
    cpu MHz         : 2266.638
    cache size      : 4096 KB
    physical id     : 0
    siblings        : 4
    core id         : 0
    cpu cores       : 4
    apicid          : 0
    initial apicid  : 0
    fpu             : yes
    fpu_exception   : yes
    cpuid level     : 4
    wp              : yes
    flags           : fpu de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 ht syscall nx lm unfair_spinlock pni cx16 hypervisor lahf_lm
    bogomips        : 4533.27
    clflush size    : 64
    cache_alignment : 64
    address sizes   : 40 bits physical, 48 bits virtual
    power management:

and so on three more times. A virtual machine can be given up to 8 virtual cores.
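
Rather than scrolling through four near-identical blocks, the number of virtual cores can be counted from the same file. Here is a minimal Python sketch; `count_processors` is an illustrative helper, and the sample text below is a shortened stand-in for the real /proc/cpuinfo contents:

```python
def count_processors(cpuinfo_text):
    """Count the 'processor : N' entries in /proc/cpuinfo-style text."""
    count = 0
    for line in cpuinfo_text.splitlines():
        key, sep, _value = line.partition(":")
        if sep and key.strip() == "processor":
            count += 1
    return count


# Shortened sample: one block per virtual core, as in the listing above.
sample = "".join(
    f"processor : {n}\nmodel name : QEMU Virtual CPU version (cpu64-rhel6)\n\n"
    for n in range(4)
)
print(count_processors(sample))
```

On the virtual machine itself, the text would come from reading `/proc/cpuinfo` instead of the sample string.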

So, the bottom line:

1) The disk subsystem of a virtual machine on Nexenta iSCSI is roughly 70 or more times faster than a pair of SATA disks in RAID1. A popular approach is to rent a server on an X3440 or E3-1230, fit it with 32GB of memory and two 2TB disks, slice it finely into 30-50 virtual machines and sell them cheap. Such a setup cannot deliver more than about 200 IOPS in total, so beware of imitations. Ask for a test period and check with fio.

2) Thin provisioning and deduplication: modern filers let us save a great deal on disk arrays, which has the best possible effect on the price for the user.

3) Using KVM rules out dirty overcommit. Whatever memory is allocated to a virtual machine is what it actually gets. Memory is inexpensive these days and a node can hold a lot of it, so there is no point in skimping.

4) The power and speed of modern processors let us put up to 200 machines on one node: 6-core, 12-thread processors clear the task queue so quickly that a sustained CPU load above 50-60% is rare. This also affects the price in the most direct way.

5) With iostat on the node you can see at a glance which virtual machine is hammering its LVM volume. Violators and abusers that create a parasitic load for a long time can be identified quickly and moved to a special pen where they do not harm their neighbors.

6) More IOPS means less per-operation load on the filer and the node. Applications run faster and more CPU cycles remain free.
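
Point 5 relies on iostat. On Linux the same per-device counters are also exposed in /proc/diskstats, so the idea can be sketched in Python: take two snapshots some seconds apart and rank devices by completed I/O operations per second. This is only an illustration and not the tooling used on the node; the device names and counter values in the sample snapshots are made up, and mapping dm-N devices back to particular virtual machines is left out:

```python
def parse_diskstats(text):
    """Map device name -> (reads completed, writes completed)."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # not a full diskstats line
        # Columns: major, minor, name, reads completed, ..., writes completed, ...
        stats[fields[2]] = (int(fields[3]), int(fields[7]))
    return stats


def busiest(before, after, interval_s, top=3):
    """Rank devices by IOPS between two snapshots taken interval_s apart."""
    rates = []
    for name, (reads1, writes1) in after.items():
        reads0, writes0 = before.get(name, (reads1, writes1))
        ops = (reads1 - reads0) + (writes1 - writes0)
        rates.append((ops / interval_s, name))
    rates.sort(reverse=True)
    return [(name, iops) for iops, name in rates[:top]]


# Hypothetical snapshots taken 10 seconds apart (all values invented).
snapshot_a = """\
253 0 dm-0 1000 0 8000 500 2000 0 16000 900 0 700 1400
253 1 dm-1 5000 0 40000 2500 7000 0 56000 3100 0 2800 5600
253 2 dm-2 300 0 2400 150 400 0 3200 180 0 160 330
"""
snapshot_b = """\
253 0 dm-0 1100 0 8800 550 2100 0 16800 950 0 750 1500
253 1 dm-1 25000 0 200000 9000 37000 0 296000 15000 0 9900 24000
253 2 dm-2 310 0 2480 155 420 0 3360 190 0 170 345
"""
ranking = busiest(parse_diskstats(snapshot_a), parse_diskstats(snapshot_b), 10)
for name, iops in ranking:
    print(f"{name}: {iops:.0f} IOPS")
```

In the sample data dm-1 stands out by orders of magnitude, which is the kind of pattern that marks a candidate for the special pen.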

I hope this will be useful to readers who need to pick up a few virtual machines for themselves. Want to test it yourself? Get in touch: a one-week test drive is free.

Source: https://habr.com/ru/post/171397/

