
ARM processors? Practice. Marvell Armada XP

First of all, I would like to thank Rikor, and Oleg personally, for providing the test platforms. As before, you can borrow a Marvell Armada XP system to test for yourself and later, as a tester, keep it for a nominal fee. As promised in the previous article, "Server on ARM? Made in Russia!", here are performance tests of a server built on ARM processors. Since the ARM part is a system-on-chip (SoC), we focus on processor performance.
There are many test results on the Web, but for us they are all like a fantastic trip to Mars: it is not clear how they were carried out, which processor revision (or even which manufacturer) was used, or what software was run. Here the server is right in front of us, and all that remains is to test it.



Test benches:
1) Core2 Quad:

OS: CentOS release 5.7
Processor: 4-core Intel® Core (TM) 2 Quad CPU Q9450 @ 2.66GHz
RAM: MemTotal 3939800kB

2) Atom D510:

OS: Ubuntu 12.04.3
Processor: 2-core Intel® Atom (TM) CPU D510 @ 1.66GHz, 4-threads
RAM: MemTotal 4032060kB

3) Marvell Armada XP:

OS: Ubuntu 12.04.3
Processor: 4-core Marvell PJ4Bv7 Processor rev 2 (v7l)
RAM: MemTotal 8019640 kB

All three test benches use WD Black hard drives. The Core 2 Quad and Atom D510 machines have the maximum amount of RAM installed.

sysbench - processor test



The test was run with 4 threads and a limit of 10,000 requests.

Core2 quad
Test execution summary:
total time: 2.5107s
total number of events: 10,000
total time taken by event execution: 10.0303
per-request statistics:
min: 0.95ms
avg: 1.00ms
max: 487.26ms
approx. 95 percentile: 0.95ms



Atom d510
Test execution summary:
total time: 37.6966s
total number of events: 10,000
total time taken by event execution: 150.7424
per-request statistics:
min: 9.18ms
avg: 15.07ms
max: 39.05ms
approx. 95 percentile: 15.09ms


Marvell Armada XP
Test execution summary:
total time: 67.4705s
total number of events: 10,000
total time taken by event execution: 269.7890
per-request statistics:
min: 26.66ms
avg: 26.98ms
max: 57.05ms
approx. 95 percentile: 27.10ms
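
To put the three runs side by side, the total times can be normalized against the fastest machine. The following is a minimal Python sketch; the numbers are copied from the sysbench output above, and the variable names are purely illustrative.

```python
# Total wall-clock time in seconds for 10,000 events on 4 threads,
# copied from the sysbench output above.
total_time_s = {
    "Core 2 Quad": 2.5107,
    "Atom D510": 37.6966,
    "Armada XP": 67.4705,
}

fastest = min(total_time_s.values())

# Slowdown factor of each platform relative to the fastest one.
slowdown = {name: t / fastest for name, t in total_time_s.items()}

for name, factor in slowdown.items():
    print(f"{name}: {total_time_s[name]:.2f} s, {factor:.1f}x the fastest time")
```

On these figures the Armada XP needs roughly 1.8 times as long as the Atom D510 for the same number of events.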





OLTP testing - MySQL performance with sysbench



First we create an InnoDB table with 10,000 records, using the command

sysbench --test=oltp --mysql-table-engine=innodb --oltp-table-size=10000 --mysql-user=root --mysql-password=root --db-driver=mysql prepare

Then the command

sysbench --num-threads=8 --max-requests=500 --oltp-table-size=10000 --mysql-user=root --mysql-password=root --db-driver=mysql --test=oltp run

runs a test with 8 clients (and a maximum of 500 requests) against the table created in the previous step.

Test output:

Atom d510
Running the test with the following options:
Number of threads: 8
Random number generator seed is 0 and will be ignored

Threads started!

OLTP test statistics:
queries performed:
read: 7028
write: 2008
other: 1004
total: 10040
transactions: 502 (93.89 per sec.)
deadlocks: 0 (0.00 per sec.)
read / write requests: 9036 (1690.02 per sec.)
other operations: 1004 (187.78 per sec.)

General statistics:
total time: 5.3467s
total number of events: 502
total time taken by event execution: 42.2692s
response time:
min: 57.34ms
avg: 84.20ms
max: 129.67ms
approx. 95 percentile: 100.21ms

Threads fairness:
events (avg / stddev): 62.7500 / 0.66
execution time (avg / stddev): 5.2836 / 0.03


Marvell Armada XP
Running the test with the following options:
Number of threads: 8

Doing OLTP test.
Running mixed OLTP test
Using Special distribution (12 iterations, 1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Using auto_inc on the id column
Maximum number of requests for OLTP test is limited to 500
Threads started!
Done.

OLTP test statistics:
queries performed:
read: 7000
write: 2500
other: 1000
total: 10500
transactions: 500 (361.28 per sec.)
deadlocks: 0 (0.00 per sec.)
read / write requests: 9500 (6864.24 per sec.)
other operations: 1000 (722.55 per sec.)

Test execution summary:
total time: 1.3840s
total number of events: 500
total time taken by event execution: 11.0083
per-request statistics:
min: 8.47ms
avg: 22.02ms
max: 55.15ms
approx. 95 percentile: 39.44ms

Threads fairness:
events (avg / stddev): 62.5000 / 1.87
execution time (avg / stddev): 1.3760 / 0.00


For the comparison we take three indicators: transactions, read/write requests, and other operations.
The deadlocks value is left out because it is zero on both platforms.
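
As a rough illustration of that comparison, the three per-second rates can be turned into ratios. This is a small Python sketch using the figures from the two outputs above; the dictionary names are hypothetical, and the ratios are only as meaningful as the two runs are comparable.

```python
# Per-second rates copied from the "OLTP test statistics" sections above.
atom = {
    "transactions": 93.89,
    "read/write requests": 1690.02,
    "other operations": 187.78,
}
armada = {
    "transactions": 361.28,
    "read/write requests": 6864.24,
    "other operations": 722.55,
}

# A ratio above 1 means the Armada XP run completed more operations per second.
ratios = {key: armada[key] / atom[key] for key in atom}

for key, ratio in ratios.items():
    print(f"{key}: Atom {atom[key]:.2f}/s, Armada XP {armada[key]:.2f}/s, ratio {ratio:.2f}")
```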



Archiving information with 7zip



We will archive 1 GB of randomly generated data.

du -sh /tmp/ramfs/file
1.0G /tmp/ramfs/file
time 7za a dummy -mmt=4 -txz -so /tmp/ramfs/file | dd of=/dev/null


Core 2 quad
7-Zip (A) [64] 9.20 Copyright © 1999-2010 Igor Pavlov 2010-11-18
p7zip Version 9.20 (locale = C, Utf16 = off, HugeFiles = on, 4 CPUs)
Scanning

Creating archive stdout

Everything is ok
2093146+15 records in
2093159+1 records out
1071697500 bytes (1.1 GB) copied, 160.375 seconds, 6.7 MB/s

real 2m40.376s
user 8m11.635s
sys 0m5.290s


Atom d510
7-Zip (A) [64] 9.20 Copyright © 1999-2010 Igor Pavlov 2010-11-18
p7zip Version 9.20 (locale = C, Utf16 = off, HugeFiles = on, 4 CPUs)
Scanning

Creating archive stdout

Everything is ok
2097252+15 records in
2097263+1 records out
1073798860 bytes (1.1 GB) copied, 557.429 s, 1.9 MB/s

real 9m17.434s
user 34m46.120s
sys 0m26.012s


Marvell Armada XP
7-Zip (A) 9.20 Copyright © 1999-2010 Igor Pavlov 2010-11-18
p7zip Version 9.20 (locale = C, Utf16 = off, HugeFiles = on, 4 CPUs)
Scanning

Creating archive stdout

Everything is ok
2097249+17 records in
2097263+1 records out
1073798860 bytes (1.1 GB) copied, 578.709 s, 1.9 MB/s

real 9m38.713s
user 32m26.630s
sys 0m21.290s
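
The "real" times above are easier to compare once converted to seconds. The following Python sketch does that and recomputes the throughput from the byte counts reported by dd; `parse_time` is a helper written for this illustration, not part of any of the tools used.

```python
import re


def parse_time(value: str) -> float:
    """Convert a `time`-style duration such as '9m38.713s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unrecognized duration: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# (bytes passed through dd, "real" time) copied from the three runs above.
runs = {
    "Core 2 Quad": (1071697500, "2m40.376s"),
    "Atom D510": (1073798860, "9m17.434s"),
    "Armada XP": (1073798860, "9m38.713s"),
}

for name, (size_bytes, real) in runs.items():
    seconds = parse_time(real)
    # dd reports decimal megabytes (1 MB = 1,000,000 bytes).
    print(f"{name}: {seconds:.1f} s, {size_bytes / seconds / 1e6:.1f} MB/s")
```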




Compression and decompression tests in 7zip



Core 2 quad
7-Zip (A) [64] 9.20 Copyright © 1999-2010 Igor Pavlov 2010-11-18
p7zip Version 9.20 (locale = en_US.UTF-8, Utf16 = on, HugeFiles = on, 4 CPUs)

RAM size: 3847 MB, # CPU hardware threads: 4
RAM usage: 850 MB, # Benchmark threads: 4

Dict        Compressing          |        Decompressing
      Speed Usage    R/U Rating  |    Speed Usage    R/U Rating
       KB/s     %   MIPS   MIPS  |     KB/s     %   MIPS   MIPS

22:    8950   339   2569   8706  |   117613   397   2675  10611
23:    7510   294   2604   7651  |   115900   398   2666  10606
24:    8044   326   2653   8649  |   113672   398   2651  10546
25:    7424   311   2728   8477  |   112064   399   2642  10538
---------------------------------------------------------------
Avr:          317   2638   8371               398   2659  10575
Tot:          358   2648   9473


Atom d510
7-Zip (A) [64] 9.20 Copyright © 1999-2010 Igor Pavlov 2010-11-18
p7zip Version 9.20 (locale = en_US.UTF-8, Utf16 = on, HugeFiles = on, 4 CPUs)

RAM size: 3937 MB, # CPU hardware threads: 4
RAM usage: 850 MB, # Benchmark threads: 4

Dict        Compressing          |        Decompressing
      Speed Usage    R/U Rating  |    Speed Usage    R/U Rating
       KB/s     %   MIPS   MIPS  |     KB/s     %   MIPS   MIPS

22:    1895   312    591   1843  |    34517   398    782   3114
23:    1871   317    602   1906  |    34142   399    783   3124
24:    1845   325    610   1984  |    33684   399    783   3125
25:    1794   331    618   2048  |    30397   354    806   2858
---------------------------------------------------------------
Avr:          321    605   1945               388    788   3055
Tot:          354    697   2500


Marvell Armada XP
7-Zip (A) 9.20 Copyright © 1999-2010 Igor Pavlov 2010-11-18
p7zip Version 9.20 (locale = en_US.UTF-8, Utf16 = on, HugeFiles = on, 4 CPUs)

RAM size: 7831 MB, # CPU hardware threads: 4
RAM usage: 850 MB, # Benchmark threads: 4

Dict        Compressing          |        Decompressing
      Speed Usage    R/U Rating  |    Speed Usage    R/U Rating
       KB/s     %   MIPS   MIPS  |     KB/s     %   MIPS   MIPS

22:    1662   282    573   1616  |    45116   394   1034   4070
23:    1645   286    587   1676  |    44412   393   1033   4064
24:    1636   291    604   1759  |    43816   394   1032   4065
25:    1626   296    628   1856  |    43331   396   1029   4074
---------------------------------------------------------------
Avr:          288    598   1727               394   1032   4068
Tot:          341    815   2898


For the graph, we take the average compression speed and the average decompression speed.
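
The 7-Zip summary rows average usage and rating but not the Speed column, so the speed averages have to be computed separately. A minimal Python sketch, assuming a plain arithmetic mean over the four dictionary sizes (22 to 25), with the values copied from the tables above:

```python
from statistics import mean

# Speed column (KB/s) for dictionary sizes 22-25, copied from the tables above.
speeds_kb_s = {
    "Core 2 Quad": {
        "compress": [8950, 7510, 8044, 7424],
        "decompress": [117613, 115900, 113672, 112064],
    },
    "Atom D510": {
        "compress": [1895, 1871, 1845, 1794],
        "decompress": [34517, 34142, 33684, 30397],
    },
    "Armada XP": {
        "compress": [1662, 1645, 1636, 1626],
        "decompress": [45116, 44412, 43816, 43331],
    },
}

# Arithmetic mean per platform and per operation.
averages = {
    name: {op: mean(values) for op, values in ops.items()}
    for name, ops in speeds_kb_s.items()
}

for name, ops in averages.items():
    print(f"{name}: compress {ops['compress']:.2f} KB/s, "
          f"decompress {ops['decompress']:.2f} KB/s")
```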



Performance testing with Openssl speed



Because of multiprocessing (4 processes running at the same time), the benchmark output is not readable, but in essence the benchmark measures the speed of various cryptographic algorithms. Here we need only the total run time and the last part of the program output.

Core 2 quad
OpenSSL 0.9.8e-fips-rhel5 01 Jul 2008
built on: Tue Feb 7 05:45:53 EST 2012
options: bn (64,64) md2 (int) rc4 (ptr, int) des (idx, cisc, 16, int) aes (partial) blowfish (ptr2)
compiler: gcc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_D-45-a-d or so-so-so A-45-a-so-45 = 2 -fexceptions -fstack-protector --param = ssp-buffer-size = 4 -m64 -mtune = generic -Wa, - noexect DAES_ASM
available timing options: TIMES TIMEB HZ = 100 [sysconf value]
timing function used:

real 7m21.644s
user 0m0.002s
sys 0m0.001s


Atom d510
OpenSSL 1.0.1 Mar 14 2012
built on: Tue Jun 4 07:26:06 UTC 2013
options: bn (64,64) rc4 (16x, int) des (idx, cisc, 16, int) aes (partial) blowfish (idx)
compiler: cc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -a -D_REENTRANT -DDSO_DLFCN -DHAVE_D-_--64 -DL_ENDIAN -DTERMIO -g -O2 -fstack-protector -param = ssp -p-protei -D64MI = format-security -D_FORTIFY_SOURCE = 2 -Wl, -Bsymbolic-functions -Wl, -z, relro -Wa, - noexecstack -Wall -DOPENSSL_NO_TLS1_2_CLIENT -DOPENSSL_MAX_TLS1_2_CIPHER_LENGTH = 50 -DMD32_REG_T = int -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM

real 16m37.958s
user 0m0.016s
sys 0m0.000s


Marvell Armada XP
OpenSSL 1.0.1 Mar 14 2012
built on: Tue Jun 4 07:43:19 UTC 2013
options: bn (64,32) rc4 (ptr, char) des (idx, cisc, 16, long) aes (partial) blowfish (ptr)
compiler: cc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -DTERMIO -g -O2 -fstack-protector --param = ssp-ch-h--ph-it-h -MI -security -D_FORTIFY_SOURCE = 2 -Wl, -Bsymbolic-functions -Wl, -z, relro -Wa, - noexecstack -Wall -DOPENSSL_NO_TLS1_2_CLIENT -DOPENSSL_MAX_TLS1_2_CIPHER_LENGTH = 50 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DGHASH_ASM

real 16m39.221s
user 0m0.010s
sys 0m0.010s




Processor testing with the Phoronix Test Suite



Atom d510
Phoronix Test Suite v3.6.1

Installed: pts/polybench-c-1.0.2

PolyBench-C Test Configuration

Test:

1: 3 Matrix Multiplications
2: Correlation Computation
3: Covariance Computation
4: Test All Options

Enter Your Choice:
System information

Hardware:
Processor: Intel Atom D510 @ 1.66GHz (4 Cores), Motherboard: Intel D510MO, Chipset: Intel N10 Family DMI, Memory: 2 x 2048 MB DDR2-800MHz, Disk: 320GB Hitachi HTS54323, Graphics: Intel N10 Family IGP, Audio: Realtek ALC662 rev1, Network: Realtek RTL8111/8168B

Software:
OS: Ubuntu 12.04, Kernel: 3.8.0-29-generic (x86_64), Display Driver: intel, Compiler: GCC 4.6, File-System: ext4, Screen Resolution: 1280x800

Would you like to save these test results (Y/n):
Estimated Run-Time: 10 Minutes

PolyBench-C 3.2:
pts/polybench-c-1.0.2 [Test: 3 Matrix Multiplications]
Test 1 of 3
Estimated Trial Run Count: 3
Estimated Test Run-Time: 4 Minutes
Estimated Time To Completion: 10 Minutes
Started Run 1 @ 15:58:43
Started Run 2 @ 16:06:02
Started Run 3 @ 16:13:16 [Std. Dev: 0.08%]

Test Results:
432.50632214546
431.82482385635
432.36951303482

Average: 432.23 Seconds

PolyBench-C 3.2:
pts/polybench-c-1.0.2 [Test: Correlation Computation]
Test 2 of 3
Estimated Trial Run Count: 3
Estimated Test Run-Time: 4 Minutes
Estimated Time To Completion: 7 Minutes
Started Run 1 @ 16:20:34
Started Run 2 @ 16:21:03
Started Run 3 @ 16:21:32 [Std. Dev: 0.80%]

Test Results:
27.11058306694
26.767813205719
26.717456102371

Average: 26.87 Seconds

PolyBench-C 3.2:
pts/polybench-c-1.0.2 [Test: Covariance Computation]
Test 3 of 3
Estimated Trial Run Count: 3
Estimated Time To Completion: 4 Minutes
Started Run 1 @ 16:22:05
Started Run 2 @ 16:22:36
Started Run 3 @ 16:23:05 [Std. Dev: 4.04%]
Started Run 4 @ 16:23:34 [Std. Dev: 3.70%]
Started Run 5 @ 16:24:05 [Std. Dev: 3.66%]
Started Run 6 @ 16:24:37 [Std. Dev: 4.24%]

Test Results:
29.076143026352
26.863905906677
27.619282960892
28.831228017807
29.238312959671
26.504108190536

Average: 28.02 Seconds

real 55m28.781s
user 25m44.476s
sys 0m1.728s


Marvell Armada XP
Phoronix Test Suite v3.6.1

Installed: pts/polybench-c-1.0.2

PolyBench-C Test Configuration

Test:

1: 3 Matrix Multiplications
2: Correlation Computation
3: Covariance Computation
4: Test All Options

Enter Your Choice:
System information

Hardware:
Processor: Marvell PJ4Bv7 rev 2 (4 Cores), Motherboard: Marvell Armada XP GP Board, Memory: 8192MB, Disk: 640GB JMicron H/W RAID

Software:
OS: Ubuntu 12.04, Kernel: 3.2.40-1-armadaxp (armv7l), Compiler: GCC 4.6, File-System: ext4

Would you like to save these test results (Y/n):
Estimated Run-Time: 7 Minutes

PolyBench-C 3.2:
pts/polybench-c-1.0.2 [Test: 3 Matrix Multiplications]
Test 1 of 3
Estimated Trial Run Count: 3
Estimated Test Run-Time: 3 Minutes
Estimated Time To Completion: 7 Minutes
Started Run 1 @ 15:58:14
Started Run 2 @ 16:05:08
Started Run 3 @ 16:11:51 [Std. Dev: 1.06%]

Test Results:
409.0100030899
400.97810292244
407.59055900574

Average: 405.86 Seconds

PolyBench-C 3.2:
pts/polybench-c-1.0.2 [Test: Correlation Computation]
Test 2 of 3
Estimated Trial Run Count: 3
Estimated Test Run-Time: 3 Minutes
Estimated Time To Completion: 5 Minutes
Started Run 1 @ 16:18:44
Started Run 2 @ 16:19:26
Started Run 3 @ 16:20:08 [Std. Dev: 0.10%]

Test Results:
39.603915929794
39.637764930725
39.679361104965

Average: 39.64 Seconds

PolyBench-C 3.2:
pts/polybench-c-1.0.2 [Test: Covariance Computation]
Test 3 of 3
Estimated Trial Run Count: 3
Estimated Time To Completion: 3 Minutes
Started Run 1 @ 16:20:53
Started Run 2 @ 16:21:35
Started Run 3 @ 16:22:17 [Std. Dev: 0.03%]

Test Results:
39.610389947891
39.589015960693
39.614406108856

Average: 39.60 Seconds

real 53m46.279s
user 24m14.830s
sys 0m1.350s
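
The per-test averages reported above can be rechecked from the individual run times. The sketch below recomputes the mean and a relative spread for each test in Python; it uses the sample standard deviation, which is an assumption and may not be exactly how Phoronix Test Suite derives its "Std. Dev" figure.

```python
from statistics import mean, stdev

# Run times in seconds, copied from the "Test Results" sections above.
results_s = {
    ("Atom D510", "3 Matrix Multiplications"): [
        432.50632214546, 431.82482385635, 432.36951303482],
    ("Atom D510", "Correlation Computation"): [
        27.11058306694, 26.767813205719, 26.717456102371],
    ("Atom D510", "Covariance Computation"): [
        29.076143026352, 26.863905906677, 27.619282960892,
        28.831228017807, 29.238312959671, 26.504108190536],
    ("Armada XP", "3 Matrix Multiplications"): [
        409.0100030899, 400.97810292244, 407.59055900574],
    ("Armada XP", "Correlation Computation"): [
        39.603915929794, 39.637764930725, 39.679361104965],
    ("Armada XP", "Covariance Computation"): [
        39.610389947891, 39.589015960693, 39.614406108856],
}

# Map each (platform, test) pair to (mean seconds, relative std. dev. in %).
summary = {}
for key, runs in results_s.items():
    avg = mean(runs)
    summary[key] = (avg, stdev(runs) / avg * 100)

for (platform, test), (avg, rel_sd) in summary.items():
    print(f"{platform} / {test}: {avg:.2f} s (std. dev. {rel_sd:.2f}%)")
```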




Everyone can draw their own conclusions from the tests. Our finding is that the ARM platform performs at roughly the level of the Atom D510. At the same time, ARM consumes significantly less energy and has good potential, since software adaptation to the architecture is only just beginning, and we are all looking forward to AArch64 (ARM64).

In the next article we plan to present the results of testing various applications on the ARM platform; various hosting control panels and CMSs are planned. If you would like something tested, write to zbg@globatel.ru

Source: https://habr.com/ru/post/213819/

