Virtualization Benchmark Test: Introduction

1
Virtualization Benchmark Test: Introduction

Virtualization Benchmark Test - Parallels Desktop for Mac, VMWare Fusion, and Sun VirtualBox
Don't try this at home. Parallels, Fusion, and VirtualBox running simultaneously on the Mac Pro host.

Virtualization environments have been hot commodities for the Mac user ever since Apple started using Intel processors in its computers. Even before Intel arrived, emulation software was available that allowed Mac users to run Windows and Linux.

But emulation was slow. It used an abstraction layer to translate x86 code into code for the PowerPC architecture of earlier Macs. This layer had to translate not only for the CPU type but also for all of the hardware components. In essence, it had to create software equivalents of video cards, hard drives, serial ports, and so on. The result was an emulation environment that could run Windows or Linux, but was severely restricted in both performance and the operating systems that could be used.

Apple’s decision to use Intel processors swept away the need for emulation. In its place came the ability to run other OSes directly on an Intel Mac. In fact, if you want to run Windows directly on a Mac as an option at bootup, you can use Boot Camp, an application that Apple provides as a handy way to install Windows in a multi-boot environment.

But many users need a way to run the Mac OS and a second OS simultaneously. Parallels, and later VMWare and Sun, brought this capability to the Mac with virtualization technology. Virtualization is similar in concept to emulation, but because Intel-based Macs use the same hardware as standard PCs, there’s no need to create a hardware abstraction layer in software. Instead, the Windows or Linux software can run directly on the hardware, producing speeds that can be nearly as fast as if the guest OS were running natively on a PC.

And that’s the question our benchmark tests seek to answer. Do the three major players in virtualization on the Mac - Parallels Desktop for Mac, VMWare Fusion, and Sun VirtualBox - live up to the promise of near-native performance?

We say ‘near native’ because all virtualization environments have some overhead that can’t be avoided. Since the virtual environment is running at the same time as the native OS (OS X), there has to be sharing of hardware resources. In addition, OS X has to provide some services to the virtualization environment, such as windowing and core services. The combination of these services and resource sharing tends to limit how well the virtualized OS can run.

To answer the question, we are going to perform benchmark tests to see how well the three major virtualization environments fare running Windows.

2
Virtualization Benchmark Test: Testing Method

GeekBench 2.1.4 and CineBench R10 are the benchmark applications we will use in our tests.

We’re going to use two popular, cross-platform benchmark test suites. The first, CineBench R10, performs a real-world test of a computer’s CPU and of its graphics card’s ability to render images. The first CineBench test uses the CPU to render a photorealistic image, using CPU-intensive computations to render reflections, ambient occlusion, area lighting and shading, and more. The test is performed with a single CPU or core, and then repeated using all available CPUs and cores. The result produces a reference performance grade for the computer using a single processor, a grade for all CPUs and cores, and an indication of how well multiple cores or CPUs are utilized.

The second CineBench test evaluates the performance of the computer’s graphics card using OpenGL to render a 3D scene while a camera moves within the scene. This test determines how fast the graphics card can perform while still accurately rendering the scene.

The second test suite is GeekBench 2.1.4, which tests the processor’s integer and floating-point performance, tests memory using a simple read/write performance test, and performs a streams test that measures sustained memory bandwidth. The results of the set of tests are combined to produce a single GeekBench score. We will also break out the four basic test sets (Integer Performance, Floating-Point Performance, Memory Performance, and Stream Performance), so we can see the strengths and weaknesses of each virtual environment.

GeekBench uses a reference system based on a Power Mac G5 running at 1.6 GHz. GeekBench scores for the reference system are normalized to 1000. Any score higher than 1000 indicates a computer that performs better than the reference system.

Since the results of both benchmark suites are somewhat abstract, we will start by defining a reference system. In this case, the reference system will be the host Mac being used to run the three virtual environments (Parallels Desktop for Mac, VMWare Fusion, and Sun VirtualBox). We’ll run both benchmark suites on the reference system and use those figures to compare how well the virtual environments perform.

All testing will be performed after a fresh startup of both the host system and the virtual environment. Both the host and the virtual environments will have all anti-malware and antivirus applications disabled. All virtual environments will be run within a standard OS X window, since this is the most common method used in all three environments. In the case of the virtual environments, no user applications will be running other than the benchmarks. On the host system, with the exception of the virtual environment, no user applications will be running other than a text editor to take notes before and after testing, but never during the actual test process.

3
Virtualization Benchmark Test: Benchmark Results for Host System Mac Pro

The results of the benchmark test on the host system can serve as a reference when comparing the performance of a virtual environment.

The system that will host the three virtual environments (Parallels Desktop for Mac, VMWare Fusion, and Sun VirtualBox) is a 2006 edition of a Mac Pro:

Mac Pro (2006)

Two dual-core Xeon 5160 processors (4 cores total) @ 3.00 GHz

4 MB of shared L2 cache per processor (8 MB total)

6 GB RAM consisting of four 1 GB modules and four 512 MB modules. All modules are matched pairs.

A 1.33 GHz front side bus

An NVIDIA GeForce 7300 GT graphics card

Two 500 GB Samsung F1 Series hard drives. OS X and the virtualization software are resident on the startup drive; the guest OSes are stored on the second drive. Each drive has its own independent SATA 2 channel.

The results of the GeekBench and CineBench tests on the host Mac Pro should provide the practical upper limit of performance we should see from any of the virtual environments. That being said, we want to point out that it’s possible for a virtual environment to exceed the performance of the host in any single test. The virtual environment may be able to access the underlying hardware and bypass some of OS X’s OS layers. It’s also possible for the benchmark test suites to be fooled by the performance caching system built into the virtual environments, and produce results that are wildly beyond the performance that’s actually possible.

Benchmark Scores

GeekBench 2.1.4

GeekBench Score: 6830

Integer: 6799

Floating Point: 10786

Memory: 2349

Stream: 2057

CineBench R10

Rendering, Single CPU: 3248

Rendering, 4 CPU: 10470

Effective speed up from single to all processors: 3.22

Shading (OpenGL): 3249
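The speed-up figure above is the multi-CPU rendering score divided by the single-CPU score. As a minimal sketch, the following reproduces it from the host scores. The function names are our own, and the per-core efficiency value is an extra metric we are assuming is useful here; it is not something CineBench reports on this page.

```python
def speedup(single_score: float, multi_score: float) -> float:
    """Ratio of the multi-CPU rendering score to the single-CPU score."""
    return multi_score / single_score


def efficiency(single_score: float, multi_score: float, cores: int) -> float:
    """Fraction of ideal linear scaling achieved across `cores` cores."""
    return speedup(single_score, multi_score) / cores


# Host Mac Pro CineBench R10 rendering scores listed above.
HOST_SINGLE = 3248
HOST_MULTI = 10470
HOST_CORES = 4

host_speedup = speedup(HOST_SINGLE, HOST_MULTI)
host_efficiency = efficiency(HOST_SINGLE, HOST_MULTI, HOST_CORES)

print(f"speed-up: {host_speedup:.2f}x")      # speed-up: 3.22x
print(f"efficiency: {host_efficiency:.0%}")  # efficiency: 81%
```

An efficiency below 100% is expected even on the host, so the virtual environments should be judged against the host's 3.22 rather than against a perfect 4.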

Detailed results of the benchmark tests are available in the Virtualization Benchmark Test gallery.

4
Virtualization Benchmark Test: Benchmark Results for Parallels Desktop for Mac 5

Parallels Desktop for Mac 5.0 was able to run all of our benchmark tests without a hiccup.

We used the latest version of Parallels (Parallels Desktop for Mac 5.0). We installed fresh copies of Parallels, Windows XP SP3, and Windows 7. We chose these two Windows OSes for testing because we think Windows XP represents the vast majority of current Windows installations on OS X, and that in the future, Windows 7 will be the most common guest OS running on the Mac.

Before testing began, we checked for and installed all available updates for both the virtual environment and the two Windows operating systems. Once everything was up to date, we configured the Windows virtual machines to use a single processor and 1 GB of memory. We shut down Parallels, and disabled Time Machine and any startup items on the Mac Pro not needed for the testing. We then restarted the Mac Pro, launched Parallels, started one of the Windows environments, and performed the two sets of benchmark tests. Once the tests were complete, we copied the results to the Mac for later reference.

We then repeated the restart and launch of Parallels for the benchmark tests of the second Windows OS.

Finally, we repeated the above sequence with the guest OS set to use 2 and then 4 CPUs.
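The sequence above amounts to a small test matrix. The sketch below enumerates it in the order described: CPU count on the outside, guest OS on the inside, with a host restart before each run. The dictionary keys are hypothetical names chosen for illustration, and keeping guest memory at 1 GB for the 2 and 4 CPU runs is an assumption, since only the initial configuration is stated.

```python
from itertools import product

GUEST_OSES = ["Windows XP SP3", "Windows 7"]
CPU_COUNTS = [1, 2, 4]
BENCHMARKS = ["GeekBench 2.1.4", "CineBench R10"]
GUEST_MEMORY_GB = 1  # assumed unchanged across CPU counts


def build_test_plan():
    """Return one entry per (CPU count, guest OS) run, in execution order.

    Each run starts with a host restart and executes both benchmark suites.
    """
    plan = []
    for cpus, guest in product(CPU_COUNTS, GUEST_OSES):
        plan.append({
            "guest": guest,
            "cpus": cpus,
            "memory_gb": GUEST_MEMORY_GB,
            "restart_host_first": True,
            "benchmarks": list(BENCHMARKS),
        })
    return plan


plan = build_test_plan()
for run in plan:
    print(f'{run["cpus"]} CPU, {run["guest"]}: {", ".join(run["benchmarks"])}')
```

That gives six runs and twelve benchmark executions per virtual environment.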

Benchmark Scores

GeekBench 2.1.4

Windows XP SP3 (1,2,4 CPU): 2185, 3072, 4377

Windows 7 (1,2,4 CPU): 2223, 2980, 4560

CineBench R10

Windows XP SP3

Rendering (1,2,4 CPU): 2724, 5441, 9644

Shading (OpenGL) (1,2,4 CPU): 1317, 1317, 1320

CineBench R10

Windows 7

Rendering (1,2,4 CPU): 2835, 5389, 9508

Shading (OpenGL) (1,2,4 CPU): 1335, 1333, 1375

Parallels Desktop for Mac 5.0 successfully completed all benchmark tests. GeekBench saw only minor differences in performance between Windows XP and Windows 7, which is what we expected. GeekBench concentrates on testing processor and memory performance, so we expect it to be a good indicator of the underlying performance of the virtual environment and how well it makes the host Mac Pro’s hardware available to the guest OSes.

CineBench’s rendering test likewise showed consistency across the two Windows OSes. Once again, this is to be expected since the rendering test makes extensive use of the processors and memory bandwidth as seen by the guest OSes. The shading test is a good indicator of how well each virtual environment has implemented its video driver. Unlike the rest of the Mac’s hardware, the graphics card isn’t made available directly to the virtual environments. This is because the graphics card must continuously take care of the display for the host environment, and can’t be diverted to display only the guest environment. This is true even if the virtual environment offers a full-screen display option.


5
Virtualization Benchmark Test: Benchmark Results for VMWare Fusion 3.0

We marked Fusion's single-processor Windows XP GeekBench result as invalid after the memory and stream results scored 25 times better than the host.

We used the latest version of VMWare Fusion (Fusion 3.0). We installed fresh copies of Fusion, Windows XP SP3, and Windows 7. We chose these two Windows OSes for testing because we think Windows XP represents the vast majority of current Windows installations on OS X, and that in the future, Windows 7 will be the most common guest OS running on the Mac.

Before testing began, we checked for and installed any available updates for both the virtual environment and the two Windows operating systems. Once everything was up to date, we configured the Windows virtual machines to use a single processor and 1 GB of memory. We shut down Fusion, and disabled Time Machine and any startup items on the Mac Pro not needed for the testing. We then restarted the Mac Pro, launched Fusion, started one of the Windows environments, and performed the two sets of benchmark tests. Once the tests were complete, we copied the results to the Mac for later use.

We then repeated the restart and launch of Fusion for the benchmark tests of the second Windows OS.

Finally, we repeated the above sequence with the guest OS set to use 2 and then 4 CPUs.

Benchmark Scores

GeekBench 2.1.4

Windows XP SP3 (1,2,4 CPU): *, 3252, 4406

Windows 7 (1,2,4 CPU): 2388, 3174, 4679

CineBench R10

Windows XP SP3

Rendering (1,2,4 CPU): 2825, 5449, 9941

Shading (OpenGL) (1,2,4 CPU): 821, 821, 827

CineBench R10

Windows 7

Rendering (1,2,4 CPU): 2843, 5408, 9657

Shading (OpenGL) (1,2,4 CPU): 130, 130, 124

We ran into problems with Fusion and the benchmark tests. In the case of Windows XP with a single processor, GeekBench reported memory stream performance at a rate better than 25 times the rate of the host Mac Pro. This unusual memory result bumped the GeekBench score for the single CPU version of Windows XP to 8148. After repeating the test many times and getting similar results, we decided to mark the test as invalid and consider it an interaction issue between the benchmark test, Fusion, and Windows XP. As best as we can tell, for the single CPU configuration, Fusion was not reporting the correct hardware configuration to the GeekBench application. However, GeekBench and Windows XP performed flawlessly with two or more CPUs selected.
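A simple way to catch this kind of result is to compare each guest score against the host's score for the same benchmark. The sketch below applies that check to Fusion's Windows XP GeekBench scores, using 8148 for the single-CPU run. The `tolerance` parameter is a hypothetical knob of our own, and a flagged result is only a candidate for manual review, since a guest can legitimately edge past the host in an individual test.

```python
HOST_GEEKBENCH = 6830  # host Mac Pro overall GeekBench 2.1.4 score


def is_suspect(guest_score: float, host_score: float, tolerance: float = 1.0) -> bool:
    """Flag a guest result that exceeds the host score times `tolerance`."""
    return guest_score > host_score * tolerance


# Fusion, Windows XP SP3, overall GeekBench score by CPU count.
# The 1-CPU value is the 8148 reported before the run was marked invalid.
fusion_xp = {1: 8148, 2: 3252, 4: 4406}

suspect_configs = [
    cpus for cpus, score in fusion_xp.items()
    if is_suspect(score, HOST_GEEKBENCH)
]
print(suspect_configs)  # [1]
```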

We also had a problem with Fusion, Windows 7, and CineBench. When we ran CineBench under Windows 7, it reported a generic video card as the only available graphics hardware. While the generic graphics card was able to run OpenGL, it did so at a very poor rate. This may have been the result of the host Mac Pro having an old NVIDIA GeForce 7300 graphics card. Fusion’s system requirements suggest a more modern graphics card. We found it interesting, however, that under Windows XP, the CineBench shading test ran without any issues.

Other than the two quirks mentioned above, Fusion’s performance was on par with what we expected from a well-designed virtual environment.


6
Virtualization Benchmark Test: Benchmark Results For Sun VirtualBox

VirtualBox was unable to detect more than a single CPU when running Windows XP.

We used the latest version of Sun VirtualBox (VirtualBox 3.0). We installed fresh copies of VirtualBox, Windows XP SP3, and Windows 7. We chose these two Windows OSes for testing because we think Windows XP represents the vast majority of current Windows installations on OS X, and that in the future, Windows 7 will be the most common guest OS running on the Mac.

Before testing began, we checked for and installed any available updates for both the virtual environment and the two Windows operating systems. Once everything was up to date, we configured the Windows virtual machines to use a single processor and 1 GB of memory. We shut down VirtualBox, and disabled Time Machine and any startup items on the Mac Pro not needed for the testing. We then restarted the Mac Pro, launched VirtualBox, started one of the Windows environments, and performed the two sets of benchmark tests. Once the tests were complete, we copied the results to the Mac for later use.

We then repeated the restart and launch of VirtualBox for the benchmark tests of the second Windows OS.

Finally, we repeated the above sequence with the guest OS set to use 2 and then 4 CPUs.

Benchmark Scores

GeekBench 2.1.4

Windows XP SP3 (1,2,4 CPU): 2345, *, *

Windows 7 (1,2,4 CPU): 2255, 2936, 3926

CineBench R10

Windows XP SP3

Rendering (1,2,4 CPU): 7001, *, *

Shading (OpenGL) (1,2,4 CPU): 1025, *, *

CineBench R10

Windows 7

Rendering (1,2,4 CPU): 2570, 6863, 13344

Shading (OpenGL) (1,2,4 CPU): 711, 710, 1034

Sun VirtualBox and our benchmark applications ran into a problem with Windows XP. Specifically, both GeekBench and CineBench were unable to see more than a single CPU, regardless of how we configured the guest OS.

When we tested Windows 7 with GeekBench, we noticed that multi-processor utilization was poor, resulting in the lowest scores for 2 and 4 CPU configurations. Single-processor performance seemed to be on par with the other virtual environments.
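To put a number on multi-processor utilization, each environment's 2 and 4 CPU GeekBench score can be divided by its own single-CPU score. The sketch below does this with the Windows 7 scores reported for the three environments; the `scaling` helper is our own name, and the ratio is only a rough indicator because the overall GeekBench score also includes memory tests that do not scale with CPU count.

```python
# Windows 7 overall GeekBench 2.1.4 scores by CPU count, from the results pages.
WINDOWS7_GEEKBENCH = {
    "Parallels": {1: 2223, 2: 2980, 4: 4560},
    "Fusion": {1: 2388, 2: 3174, 4: 4679},
    "VirtualBox": {1: 2255, 2: 2936, 4: 3926},
}


def scaling(scores: dict, cpus: int) -> float:
    """Score at `cpus` CPUs relative to the same environment's 1-CPU score."""
    return scores[cpus] / scores[1]


for name, scores in WINDOWS7_GEEKBENCH.items():
    print(f"{name}: 2 CPUs {scaling(scores, 2):.2f}x, 4 CPUs {scaling(scores, 4):.2f}x")

weakest_at_4 = min(
    WINDOWS7_GEEKBENCH, key=lambda name: scaling(WINDOWS7_GEEKBENCH[name], 4)
)
print(weakest_at_4)  # VirtualBox
```

By this measure VirtualBox gains about 1.74x from four CPUs, against roughly 2.05x for Parallels and 1.96x for Fusion.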

CineBench was also unable to see more than a single processor when running Windows XP. In addition, the rendering test for the single-CPU version of Windows XP produced one of the fastest results, exceeding even the Mac Pro itself. We tried rerunning the test a few times; all results were within the same range. We think it’s safe to chalk up the Windows XP single-CPU rendering results to a problem with VirtualBox and how it makes use of CPUs.

We also saw a strange bump in rendering results for the 2 and 4 CPU tests with Windows 7. Rendering speed more than doubled when going from 1 to 2 CPUs (2570 to 6863) and nearly doubled again from 2 to 4 CPUs (6863 to 13344), leaving the 4 CPU score well above the host Mac Pro’s 10470. This type of performance increase is unlikely, and once again we will chalk it up to VirtualBox’s implementation of multiple CPU support.
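Doubling the CPU count should at best double the rendering score, so any step with a ratio above 2.0 is a warning sign. The sketch below computes the step ratios for the VirtualBox Windows 7 rendering scores listed above. The helper name and the 2.0 threshold are our own choices, and the function assumes each step in the data doubles the CPU count.

```python
HOST_RENDER_4CPU = 10470  # host Mac Pro CineBench R10 rendering, 4 CPUs


def doubling_ratios(scores: dict) -> dict:
    """Ratio of each score to the score at the previous CPU count.

    Assumes consecutive CPU counts differ by a factor of two (1, 2, 4).
    """
    counts = sorted(scores)
    return {(lo, hi): scores[hi] / scores[lo] for lo, hi in zip(counts, counts[1:])}


vbox_win7_render = {1: 2570, 2: 6863, 4: 13344}

ratios = doubling_ratios(vbox_win7_render)
superlinear = [step for step, ratio in ratios.items() if ratio > 2.0]
exceeds_host = vbox_win7_render[4] > HOST_RENDER_4CPU

for (lo, hi), ratio in ratios.items():
    print(f"{lo} -> {hi} CPUs: {ratio:.2f}x")
print(superlinear, exceeds_host)  # [(1, 2)] True
```

With these scores the 1 to 2 CPU step works out to about 2.67x and the 2 to 4 CPU step to about 1.94x, so only the first step is strictly superlinear, while the 4 CPU total still exceeds the host.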

With all the problems with VirtualBox benchmark testing, the only valid test results may be the ones for a single CPU under Windows 7.


7
Virtualization Benchmark Test: The Results

With all the benchmark tests done, it’s time to revisit our original question.

Do the three major players in virtualization on the Mac (Parallels Desktop for Mac, VMWare Fusion, and Sun VirtualBox) live up to the promise of near-native performance?

The answer is a mixed bag. None of the virtualization candidates in our GeekBench tests were able to measure up to the performance of the host Mac Pro. The best result was recorded by Fusion, which was able to achieve nearly 68.5% of the host’s performance. Parallels was close behind at 66.7%. Bringing up the rear was VirtualBox, at 57.4%.

When we looked at the results of CineBench, which uses a more real-world test for rendering images, they were very close to the host’s score. Once again, Fusion was at the top of the rendering tests, achieving 94.9% of the host’s performance. Parallels followed at 92.1%. VirtualBox couldn’t reliably complete the rendering test, knocking it out of contention. In one iteration of the rendering test, VirtualBox reported a score equal to 127.4% of the host’s, while in others, it was unable to start or finish.

The shading test, which looks at how well the graphics card performs using OpenGL, produced the weakest results across all of the virtual environments. The best performer was Parallels, which reached 42.3% of the capabilities of the host. VirtualBox was second at 31.5%; Fusion came in third at 25.4%.
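The percentages above can be reproduced from the scores on the results pages. The text does not say which individual scores were used, so the ones in the sketch below are an inference: they are the values that reproduce the published figures when the result is truncated (not rounded) to one decimal place. For VirtualBox's shading figure that is the Windows XP single-CPU score of 1025 rather than its highest score of 1034. The helper names are our own.

```python
def tenths_of_percent(guest_score: int, host_score: int) -> int:
    """Guest score as a share of the host score, in tenths of a percent.

    Integer floor division truncates instead of rounding.
    """
    return guest_score * 1000 // host_score


def format_tenths(tenths: int) -> str:
    """Render a tenths-of-a-percent integer such as 685 as '68.5%'."""
    return f"{tenths // 10}.{tenths % 10}%"


HOST = {"geekbench": 6830, "rendering": 10470, "shading": 3249}

# Scores inferred to be the basis of the published percentages.
GUEST = {
    "geekbench": {"Fusion": 4679, "Parallels": 4560, "VirtualBox": 3926},
    "rendering": {"Fusion": 9941, "Parallels": 9644},
    "shading": {"Parallels": 1375, "VirtualBox": 1025, "Fusion": 827},
}

shares = {
    test: {
        name: format_tenths(tenths_of_percent(score, HOST[test]))
        for name, score in scores.items()
    }
    for test, scores in GUEST.items()
}

for test, by_product in shares.items():
    print(test, by_product)
```

Rounding instead of truncating would shift a few of these by a tenth of a point (for example, Parallels' GeekBench share would read 66.8%), which does not change the ordering.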

Picking an overall winner is something we will leave to the end user. Each product has its pluses and minuses, and in many cases, the benchmark numbers are so close that repeating the tests could change the standings.

What the benchmark test scores do show is that, universally, limited access to the native graphics card is what holds the virtual environment back from being a full replacement for a dedicated PC. That being said, a more modern graphics card than the one we have here could produce higher performance figures in the shading test, especially for Fusion, whose developer suggests higher-performance graphics cards for best results.

You will notice that some test combinations (virtual environment, Windows version, and benchmark test) displayed problems, either unrealistic results or failure to complete a test. These types of results should not be used as indicators of problems with a virtual environment. Benchmark tests are unusual applications to try to run in a virtual environment. They are designed to measure the performance of physical devices, which the virtual environment may not allow them to access. This is not a failure of the virtual environment, and in real-world use, we have not experienced problems with the vast majority of Windows applications running under a virtual system.

All of the virtual environments we tested (Parallels Desktop for Mac 5.0, VMWare Fusion 3.0, and Sun VirtualBox 3.0) provide good performance and stability in daily use, and should be able to serve as your primary Windows environment for most day-to-day applications.