UnixBench 5.1.2: a study measuring performance across different VPS sizes

Linode

The idea was pretty simple: see how my VPS benchmarked using the UnixBench script.

Then I took the idea further by upgrading my VPS to different sizes to see how performance tracked against different classes of VPS.

Linode has a simple way to reconfigure VPS instances with more memory and disk space.

Upgrading is completely symmetric: bandwidth, memory, disk size and price all scale linearly.

In the symmetric Linode world, which relies on the Xen virtualisation platform, only a certain number of ‘nodes’ can reside on each box: say 40 nodes for a 512MB VPS offering, and therefore 20 nodes for a 1024MB VPS. The larger the VPS, the fewer users share the box and therefore, potentially, the more CPU time and the better the disk I/O each one gets.
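The node-count reasoning can be sketched in a few lines of Python. This is a back-of-the-envelope illustration only: the host RAM figure is an assumption worked backwards from the "40 nodes at 512MB" example, not a number published by Linode, and `nodes_per_host` is a hypothetical helper name.

```python
# Back-of-the-envelope sketch of how many guests fit on one host.
# HOST_RAM_MB is an assumption (40 x 512MB, from the example in the text),
# not a figure published by Linode.
HOST_RAM_MB = 40 * 512


def nodes_per_host(plan_mb, host_ram_mb=HOST_RAM_MB):
    """Return how many VPS instances of a given size fit in the host's RAM."""
    return host_ram_mb // plan_mb


for plan_mb in (512, 1024, 2048, 4096):
    print(f"{plan_mb}MB plan: {nodes_per_host(plan_mb)} nodes per host")
```

Under that assumption, doubling the plan size halves the number of neighbours competing for the same CPU and disk.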

UnixBench should help us understand the potential benefits of upgrading our VPS in terms of disk I/O and CPU. Let’s take a look at the results.

First off, we started with a clean 512MB VPS running CentOS 5.5 (32-bit), yum updated and with gcc and make installed so that we could run UnixBench. It is important to note that the Linode CPU was identical across all 4 machines (L5630 2.13GHz), with 4 virtual cores enabled.

The latest UnixBench was then downloaded from the Google Code repository and set to work on our 512MB instance. Here are the results:

512MB Linode

4 CPUs in system; running 1 parallel copy of tests
Dhrystone 2 using register variables        9880922.0 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     1932.9 MWIPS (10.3 s, 7 samples)
Execl Throughput                               1524.7 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        323047.5 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           84232.5 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        864676.5 KBps  (30.0 s, 2 samples)
Pipe Throughput                              449392.8 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  22118.8 lps   (10.0 s, 7 samples)
Process Creation                               2462.9 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   3618.6 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   1145.2 lpm   (60.0 s, 2 samples)
System Call Overhead                         452957.5 lps   (10.0 s, 7 samples)
 
System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0    9880922.0    846.7
Double-Precision Whetstone                       55.0       1932.9    351.4
Execl Throughput                                 43.0       1524.7    354.6
File Copy 1024 bufsize 2000 maxblocks          3960.0     323047.5    815.8
File Copy 256 bufsize 500 maxblocks            1655.0      84232.5    509.0
File Copy 4096 bufsize 8000 maxblocks          5800.0     864676.5   1490.8
Pipe Throughput                               12440.0     449392.8    361.2
Pipe-based Context Switching                   4000.0      22118.8     55.3
Process Creation                                126.0       2462.9    195.5
Shell Scripts (1 concurrent)                     42.4       3618.6    853.4
Shell Scripts (8 concurrent)                      6.0       1145.2   1908.7
System Call Overhead                          15000.0     452957.5    302.0
                                                                   ========
System Benchmarks Index Score                                         473.0
 
------------------------------------------------------------------------
Benchmark Run: Fri Nov 19 2010 01:31:40 - 02:00:07
4 CPUs in system; running 4 parallel copies of tests
 
Dhrystone 2 using register variables       39295363.6 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     7670.9 MWIPS (10.3 s, 7 samples)
Execl Throughput                               5676.3 lps   (29.4 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        311812.9 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           82966.6 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1048007.7 KBps  (30.0 s, 2 samples)
Pipe Throughput                             1795144.2 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 212468.4 lps   (10.0 s, 7 samples)
Process Creation                               8793.6 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   9378.8 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   1283.3 lpm   (60.1 s, 2 samples)
System Call Overhead                        1620850.7 lps   (10.0 s, 7 samples)
 
System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   39295363.6   3367.2
Double-Precision Whetstone                       55.0       7670.9   1394.7
Execl Throughput                                 43.0       5676.3   1320.1
File Copy 1024 bufsize 2000 maxblocks          3960.0     311812.9    787.4
File Copy 256 bufsize 500 maxblocks            1655.0      82966.6    501.3
File Copy 4096 bufsize 8000 maxblocks          5800.0    1048007.7   1806.9
Pipe Throughput                               12440.0    1795144.2   1443.0
Pipe-based Context Switching                   4000.0     212468.4    531.2
Process Creation                                126.0       8793.6    697.9
Shell Scripts (1 concurrent)                     42.4       9378.8   2212.0
Shell Scripts (8 concurrent)                      6.0       1283.3   2138.9
System Call Overhead                          15000.0    1620850.7   1080.6
                                                                   ========
System Benchmarks Index Score                                        1230.9
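For anyone wondering where the INDEX column and the final score come from: each index above is consistent with RESULT divided by BASELINE, times 10, and the overall score is consistent with the geometric mean of the twelve indexes. Here is a minimal Python sketch using the single-copy numbers from the 512MB run. The function names `index_value` and `overall_score` are my own, not UnixBench's.

```python
import math

# (baseline, result) pairs copied from the single-copy 512MB run above.
SINGLE_COPY = {
    "Dhrystone 2 using register variables": (116700.0, 9880922.0),
    "Double-Precision Whetstone": (55.0, 1932.9),
    "Execl Throughput": (43.0, 1524.7),
    "File Copy 1024 bufsize 2000 maxblocks": (3960.0, 323047.5),
    "File Copy 256 bufsize 500 maxblocks": (1655.0, 84232.5),
    "File Copy 4096 bufsize 8000 maxblocks": (5800.0, 864676.5),
    "Pipe Throughput": (12440.0, 449392.8),
    "Pipe-based Context Switching": (4000.0, 22118.8),
    "Process Creation": (126.0, 2462.9),
    "Shell Scripts (1 concurrent)": (42.4, 3618.6),
    "Shell Scripts (8 concurrent)": (6.0, 1145.2),
    "System Call Overhead": (15000.0, 452957.5),
}


def index_value(baseline, result):
    """Scale a raw result so that the baseline machine scores 10."""
    return result / baseline * 10.0


def overall_score(pairs):
    """Geometric mean of the per-test index values."""
    indexes = [index_value(b, r) for b, r in pairs.values()]
    return math.exp(sum(math.log(i) for i in indexes) / len(indexes))


for name, (baseline, result) in SINGLE_COPY.items():
    print(f"{name:40s} {index_value(baseline, result):8.1f}")
print(f"{'Overall score':40s} {overall_score(SINGLE_COPY):8.1f}")
```

Because the score is a geometric mean, one very poor test (such as context switching here) drags the total down more than an arithmetic average would.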

I’ll summarise the important numbers:

1 parallel copy of test
File Copy 1024 bufsize 2000 maxblocks        323047.5 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           84232.5 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        864676.5 KBps  (30.0 s, 2 samples)
Pipe Throughput                              449392.8 lps   (10.0 s, 7 samples)
 
1 parallel copy of test Score                                         473.0
4 parallel copies of test Score                                        1230.9

473 is a pretty awesome score. The big point to note here is the wicked disk I/O: 323MB/s for the 1024 buffer and 865MB/s for 4096! That is some pretty amazing performance. Those numbers would indicate that Linode are packing SSDs to cope with the load generated by 40-odd users.

Let’s have a look at the numbers from a 1GB Linode:

4 CPUs in system; running 1 parallel copy of tests
 
Dhrystone 2 using register variables        9625775.6 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     1912.9 MWIPS (10.2 s, 7 samples)
Execl Throughput                               1246.5 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks         76893.8 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           19415.0 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        276594.9 KBps  (30.0 s, 2 samples)
Pipe Throughput                               86488.9 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  16362.5 lps   (10.0 s, 7 samples)
Process Creation                               2301.2 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   3109.2 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    953.0 lpm   (60.0 s, 2 samples)
System Call Overhead                         446224.4 lps   (10.1 s, 7 samples)
 
System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0    9625775.6    824.8
Double-Precision Whetstone                       55.0       1912.9    347.8
Execl Throughput                                 43.0       1246.5    289.9
File Copy 1024 bufsize 2000 maxblocks          3960.0      76893.8    194.2
File Copy 256 bufsize 500 maxblocks            1655.0      19415.0    117.3
File Copy 4096 bufsize 8000 maxblocks          5800.0     276594.9    476.9
Pipe Throughput                               12440.0      86488.9     69.5
Pipe-based Context Switching                   4000.0      16362.5     40.9
Process Creation                                126.0       2301.2    182.6
Shell Scripts (1 concurrent)                     42.4       3109.2    733.3
Shell Scripts (8 concurrent)                      6.0        953.0   1588.3
System Call Overhead                          15000.0     446224.4    297.5
                                                                   ========
System Benchmarks Index Score                                         271.8
 
------------------------------------------------------------------------
Benchmark Run: Thu Nov 18 2010 20:41:48 - 21:09:45
4 CPUs in system; running 4 parallel copies of tests
 
Dhrystone 2 using register variables       38290705.2 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     7572.1 MWIPS (10.3 s, 7 samples)
Execl Throughput                               4521.7 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        103936.6 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           26407.4 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        395275.9 KBps  (30.0 s, 2 samples)
Pipe Throughput                              158949.9 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  76362.5 lps   (10.0 s, 7 samples)
Process Creation                               7095.2 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   7789.2 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   1065.2 lpm   (60.1 s, 2 samples)
System Call Overhead                        1605531.3 lps   (10.0 s, 7 samples)
 
System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   38290705.2   3281.1
Double-Precision Whetstone                       55.0       7572.1   1376.7
Execl Throughput                                 43.0       4521.7   1051.6
File Copy 1024 bufsize 2000 maxblocks          3960.0     103936.6    262.5
File Copy 256 bufsize 500 maxblocks            1655.0      26407.4    159.6
File Copy 4096 bufsize 8000 maxblocks          5800.0     395275.9    681.5
Pipe Throughput                               12440.0     158949.9    127.8
Pipe-based Context Switching                   4000.0      76362.5    190.9
Process Creation                                126.0       7095.2    563.1
Shell Scripts (1 concurrent)                     42.4       7789.2   1837.1
Shell Scripts (8 concurrent)                      6.0       1065.2   1775.4
System Call Overhead                          15000.0    1605531.3   1070.4
                                                                   ========
System Benchmarks Index Score                                         657.3

Summary

1 parallel copy of test
File Copy 1024 bufsize 2000 maxblocks         76893.8 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           19415.0 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        276594.9 KBps  (30.0 s, 2 samples)
Pipe Throughput                               86488.9 lps   (10.0 s, 7 samples)
 
 
1 parallel copy of test Score                                         271.8
4 parallel copies of test Score                                        657.3

WOW! Disk speed on the 1GB Linode is not as impressive as on the 512MB system: file copy throughput is roughly a quarter to a third of what the smaller plan managed.
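To put a number on that gap, here is a small Python sketch that divides the 1GB single-copy results by the 512MB ones. The figures are copied from the two single-copy runs in the full output above; `relative_throughput` is just an illustrative helper name.

```python
# Single-copy results from the full output above
# (KBps for the file copies, lps for pipe throughput).
LINODE_512MB = {
    "File Copy 1024 bufsize 2000 maxblocks": 323047.5,
    "File Copy 256 bufsize 500 maxblocks": 84232.5,
    "File Copy 4096 bufsize 8000 maxblocks": 864676.5,
    "Pipe Throughput": 449392.8,
}
LINODE_1GB = {
    "File Copy 1024 bufsize 2000 maxblocks": 76893.8,
    "File Copy 256 bufsize 500 maxblocks": 19415.0,
    "File Copy 4096 bufsize 8000 maxblocks": 276594.9,
    "Pipe Throughput": 86488.9,
}


def relative_throughput(reference, other):
    """Return each result in `other` as a fraction of the same test in `reference`."""
    return {name: other[name] / reference[name] for name in reference}


for name, ratio in relative_throughput(LINODE_512MB, LINODE_1GB).items():
    print(f"{name:40s} {ratio:.0%} of the 512MB result")
```

This only compares one run per plan, so treat the percentages as a rough guide rather than a measurement of the plans themselves.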

The tests from the 2GB and 4GB systems were almost mirror images of the 1GB system. I’ll summarise them:

2GB

File Copy 1024 bufsize 2000 maxblocks         77849.3 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           19654.5 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        279506.5 KBps  (30.0 s, 2 samples)
Pipe Throughput                               87483.1 lps   (10.0 s, 7 samples)
 
1 parallel copy of test Score                                         273.0
4 parallel copies of test Score                                        667.9

4GB

File Copy 1024 bufsize 2000 maxblocks         68375.9 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           17805.5 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        252472.2 KBps  (30.0 s, 2 samples)
Pipe Throughput                               79473.8 lps   (10.0 s, 7 samples)
 
1 parallel copy of test Score                                         241.0
4 parallel copies of test Score                                        569.
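One more way to read the scores is to ask how much the 4-copy run gains over the 1-copy run on each plan. The Python sketch below does that with the index scores from the full output and summaries above. The 4GB plan is left out because its 4-copy score is incomplete in the summary, and `parallel_speedup` is a name of my own choosing.

```python
# (1-copy score, 4-copy score) taken from the UnixBench index scores above.
# The 4GB plan is omitted because its 4-copy score is incomplete in the text.
SCORES = {
    "512MB": (473.0, 1230.9),
    "1GB": (271.8, 657.3),
    "2GB": (273.0, 667.9),
}
CORES = 4  # virtual cores enabled on every plan


def parallel_speedup(single, multi):
    """Return how many times higher the multi-copy score is than the single-copy one."""
    return multi / single


for plan, (single, multi) in SCORES.items():
    speedup = parallel_speedup(single, multi)
    print(f"{plan}: {speedup:.2f}x speedup, {speedup / CORES:.0%} of ideal {CORES}x scaling")
```

All three plans land in the same region, which fits the observation that the CPU was identical across the machines.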

Conclusion:
Linode is not slacking off when it comes to providing a top-level service. My current ‘node only benches 273, and I never notice a problem with disk I/O. Those lucky users who get a Linode that is probably RAID 1+0 SSD based certainly won’t be complaining about other users chewing up all the disk I/O.

It would be interesting to know how many active users were on the box when I was running that test, and to see how those numbers change with a full house. I get the impression that performance would still be as good as, if not better than, the other Linode services.

Hats off to Linode!
