After investing in the most powerful components, PC owners often use benchmarking tools to demonstrate their system’s “ranking.” But it’s not that simple!
In general, benchmark scores are used for two main purposes: objective comparisons between systems and measuring optimal performance levels. You might not be competing for top honors, but you definitely want to know how “powerful” your PC is after spending a significant amount of money on high-end components or laboring over “overclocking.”
Preparation
The first step is to select a benchmarking tool according to the component you wish to evaluate; these tools are often specifically designed for certain components. In addition to tools that specialize in measuring CPU (Central Processing Unit) and GPU (Graphics Processing Unit) performance, there are tools that help evaluate multiple components simultaneously.
Generally, benchmarking tools fall into two groups: simulation and real application. Simulation benchmarking tools run purpose-built workloads that mimic how applications behave in order to produce detailed scores. Real applications, such as games, give results based on actual performance, but they often do not provide the detailed scoring that simulation tools do. Each type has its own role in performance evaluation, and sometimes you may need to use both to obtain accurate results.
To achieve stable results that can be compared, refer to “Basic Benchmarking Techniques” before installing benchmarking software. The fundamental principle of benchmarking is to establish a “baseline” that allows for comparisons across components and architectures.
MOBILE DEVICE SCORING
Just like PC users, mobile device users want to know how powerful their PDA or mobile phone is: how well it handles 2D/3D graphics, how long the battery lasts, and so on. If you have a smartphone running the Symbian operating system version 6.1 or 7, you can use Futuremark’s SPMark04 or SiSoftware’s Sandra 2005 (which has more limited features) to benchmark your device. Futuremark also offers MobileMark 2005 for evaluating laptops, the mobile devices most closely “related” to PCs.
Graphics Card Evaluation
Advanced graphics architectures drive the strongest competition in benchmark scores. There are many ways to evaluate graphics card performance. The easiest method is to use timed demos in games. To follow this method, the game must have a mechanism to record gameplay and replay it at the maximum speed allowed by the graphics card, providing results in fps (frames per second). Some games, like Doom 3, provide a demo for evaluation, while others do not, requiring you to create your own gameplay script for benchmarking.
However, not many games feature timed demos. Often, measuring the speed of a replay or a specific scene in the game is easier. IL-2 Sturmovik: Forgotten Battles from Ubisoft is an example of a game using the replay method. The only issue is that this game lacks a speed monitoring feature, so you will need to download a utility to record frame rate such as FRAPS (www.fraps.com). You can also play the game yourself and use FRAPS to measure the speed, but this method can yield unstable results.
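The arithmetic behind a frame rate reading can be sketched in a few lines of Python. This is only an illustration: it assumes you already have a list of cumulative frame timestamps in milliseconds, and both that input format and the function name `summarize_fps` are hypothetical rather than anything FRAPS itself defines.

```python
def summarize_fps(timestamps_ms):
    """Return (average_fps, min_fps, max_fps) from cumulative frame timestamps.

    timestamps_ms: time at which each frame was shown, in milliseconds,
    in increasing order. This input format is an assumption for illustration.
    """
    if len(timestamps_ms) < 2:
        raise ValueError("need at least two frames")
    # Time spent on each frame is the gap between consecutive timestamps.
    frame_times = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    if any(ft <= 0 for ft in frame_times):
        raise ValueError("timestamps must be strictly increasing")
    total_ms = timestamps_ms[-1] - timestamps_ms[0]
    average_fps = 1000.0 * len(frame_times) / total_ms
    # Instantaneous fps of the slowest and the fastest single frame.
    min_fps = 1000.0 / max(frame_times)
    max_fps = 1000.0 / min(frame_times)
    return average_fps, min_fps, max_fps


# Example: four frames, one of which took 40 ms instead of 20 ms.
avg, low, high = summarize_fps([0, 20, 40, 80, 100])
print(f"avg={avg:.1f} fps, min={low:.1f} fps, max={high:.1f} fps")
# prints: avg=40.0 fps, min=25.0 fps, max=50.0 fps
```

The average alone hides the slow frame; reporting the minimum alongside it is one reason manual play-throughs, where the scene changes from run to run, give less stable numbers than a fixed replay.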
According to Greg Ellis, head of performance evaluation at ATI, graphics card benchmarks must use realistic game scenarios and produce stable results across multiple runs. The most important point is to maximize the graphics workload so that the results expose the limits of the graphics hardware rather than the CPU. One way to do this is to run the benchmark at a resolution of 1600×1200 with both filtering and anti-aliasing enabled.
CPU Evaluation
To evaluate CPU performance, you need to run various application software. While each CPU architecture suits a different application model, it is entirely possible to create comparative measurements for CPUs. To achieve accurate, unbiased benchmark scores, you need to diversify your measurements and benchmarking tools.
Office applications and content creation tools are commonly used to evaluate CPU performance; however, this method often does not accurately reflect performance. For instance, Microsoft Outlook or Excel alone cannot fully utilize the power of a 3GHz Pentium processor. BAPCo (www.bapco.com) has developed SYSmark 2004 (distributed by Futuremark, http://www.futuremark.com/products/sysmark2004/), which simulates everyday user tasks and measures the time taken to complete them. However, installing SYSmark 2004 can be time-consuming, and the commercial version costs nearly $500. Recently, BAPCo released SYSmark 2004 SE, which adds the ability to evaluate 64-bit and multi-core systems.
Games are a more suitable benchmarking tool and can also be used for entertainment. When running a benchmark using games, to focus the results on CPU capability, you need to reduce the screen resolution and graphics effects. Note that some games are not very dependent on CPU speed. Commonly used games for CPU evaluation include: Doom 3, Unreal Tournament 2004, Far Cry…
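The reasoning here can be turned into a quick check: if lowering the resolution barely raises the frame rate, the graphics card was not the limiting factor and the score mostly reflects the CPU. The Python sketch below illustrates that idea only; the function name `likely_bottleneck` and the 10% tolerance are arbitrary assumptions, not values taken from any benchmarking tool.

```python
def likely_bottleneck(fps_low_res, fps_high_res, tolerance=0.10):
    """Guess whether a game benchmark is CPU- or GPU-limited.

    fps_low_res / fps_high_res: average fps of the same timed demo run at a
    low and at a high resolution. tolerance is an illustrative threshold.
    """
    if fps_low_res <= 0 or fps_high_res <= 0:
        raise ValueError("fps values must be positive")
    # Relative fps gain obtained by lowering the resolution.
    gain = (fps_low_res - fps_high_res) / fps_high_res
    # Little or no gain means the graphics card was already waiting on the CPU.
    return "CPU-limited" if gain <= tolerance else "GPU-limited"


print(likely_bottleneck(92, 90))   # prints: CPU-limited
print(likely_bottleneck(140, 60))  # prints: GPU-limited
```

A game that reports "CPU-limited" in this check is a reasonable candidate for comparing processors; one that reports "GPU-limited" is better kept for graphics card tests.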
Several professional benchmarking tools help evaluate systems built for graphics-intensive work (such as film editing, video encoding, and image processing) and are particularly aimed at multi-threaded systems. The professional 3D benchmarking tool SPECviewperf from SPEC (Standard Performance Evaluation Corporation, www.spec.org) offers various measurements to obtain benchmark scores and assess 3D rendering performance. Typical measurements in SPECviewperf 8 (version 9 is now available) include: 3dsmax, Catia, EnSight, Lightscape, Maya, Pro/Engineer, SolidWorks, and Unigraphics. If you are interested in video encoding performance, try encoding a video clip with DivX, XviD, or Windows Media Encoder 9.
You should also pay attention to multi-threaded benchmarking tools. Intel’s Hyper-Threading (HT) technology and dual-core CPUs can significantly boost benchmark scores. Multi-threaded CPU evaluation tools include Futuremark 3DMark05, PCMark 05, SPECviewperf 9, and SYSmark 2004 SE.
BASIC BENCHMARKING TECHNIQUES
There are some basic techniques for benchmarking to ensure stable and comparable results:
Evaluating Motherboards and Other Components
If you want outstanding scores to impress others, benchmarking motherboards may disappoint you, simply because benchmark scores do not differ significantly between motherboards with the same chipset. However, the scores of individual functional components can vary quite distinctly.
Most CPU benchmarking tools can also be used to evaluate chipsets. Results from real applications (games, video encoding programs, etc.) naturally provide the most valuable scores, but simulation tools can also help identify standout products among very similar ones. One of the most popular tools today is SiSoftware’s Sandra (www.sisoftware.co.uk), which evaluates the performance of the CPU, memory bandwidth, network bandwidth, graphics, and hard drive. Notably, SiSoftware offers a free standard version of Sandra 2004 (the 2005 version is now available).
Are you interested in audio quality? You can use RightMark Audio Analyzer (audio.rightmark.org) to compare the quality of the integrated audio controller with that of a dedicated sound card (if available). Next, run tests with games, toggling audio on and off to see how much sound processing affects the results.
Hard drive performance is also worth considering; although current SATA drives are designed with speeds up to 150MBps, actual speeds are often lower. Benchmark tools like Iometer (www.iometer.org) can determine the actual speeds of these hard drives. You can use Iometer to check the difference between single drive configurations and RAID 0 configurations, or the impact of enabling NCQ (Native Command Queuing).
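To show what “actual speed” means in practice, here is a rough Python sketch that times a sequential read of a file and reports megabytes per second. It is not how Iometer works, and the function name `sequential_read_mbps` is hypothetical. It also makes a large simplifying assumption: the operating system’s file cache is ignored, so a file that was just written will usually be read from memory and appear far faster than the drive really is.

```python
import os
import tempfile
import time


def sequential_read_mbps(path, block_size=1024 * 1024):
    """Read a file front to back and return throughput in MB/s (10**6 bytes/s)."""
    total_bytes = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            total_bytes += len(block)
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        raise RuntimeError("timer resolution too coarse for this file")
    return total_bytes / elapsed / 1_000_000


# Demo on a throwaway 8 MB file; a real test needs a file much larger than RAM.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(8 * 1024 * 1024))
    demo_path = tmp.name
try:
    speed = sequential_read_mbps(demo_path)
    print(f"{speed:.0f} MB/s ({speed / 150:.1f}x the 150MBps interface rating)")
finally:
    os.remove(demo_path)
```

A result above the 150MBps interface rating is a sign that the cache, not the drive, was measured. Dedicated tools such as Iometer exist precisely to control for effects like this.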
If you find yourself “overwhelmed” by numerous benchmark scores of component parts, you may use PC World’s WorldBench 5 (www.worldbench.com) to evaluate overall performance. WorldBench 5 utilizes a suite of real applications (ACDSee, 3dsmax, Adobe Photoshop, Microsoft Office, etc.) and runs actual tasks to obtain benchmark scores.
USER-FRIENDLY BENCHMARK
Alongside existing methods, Intel has recently proposed two new evaluation methods, one for gaming and one for digital home entertainment, at the Intel Capabilities Forum (www.intelcapabilitiesforum.com), where you can learn about the methods, download the tools, and provide feedback for their development.
The Intel Gaming Capabilities Assessment Tool (GCAT) incorporates two statistical analysis methods, Threshold and Bayesian, to provide understandable results that closely reflect player perceptions. Based on the fps recorded over 3 minutes of actual gameplay, GCAT produces a user-perceived score under both analysis methods. Scores range from 1 to 5, corresponding to ratings from poor to excellent. Currently, GCAT supports 3 games: Doom 3, Half-Life 2, and Unreal Tournament 2004. In addition to developing tools to support more games, Intel has created the game Ice Storm Fighters, built on a multi-threaded engine, to fully assess the capabilities of multi-core processors. The game is designed to support at least two threads and is optimized for 4 processing threads (3 of which are allocated to artificial intelligence processing), and it uses the D* Lite pathfinding algorithm (currently used in Futuremark’s 3DMark05) to make full use of the processor.
For evaluating digital home capabilities, Intel has developed a scenario closely aligned with real-world usage: simulating console interfaces, simulating a multimedia content server, increasing the number of background applications, increasing the number of multimedia content transmission tasks, and adding functionality groups based on high-definition (HD) standards. To keep the method hardware-independent, Intel has developed two software tools that simulate TV cards and Digital Media Adapters for multimedia content delivery services. Test results are presented visually and simply, clearly indicating pass/fail status for each specific function.
Because opinions can vary significantly when video quality is compared by eye, the Intel Digital Home Capabilities Assessment Tool instead bases its analysis on collected data, including response times, video quality, and compatibility levels, to produce a clear result. The evaluation method was developed by Intel’s User-Centered Design Group and Psytechnics (UK), which has expertise in perceptual modeling.
Duy Khánh
Conclusion
Most of the benchmarking software introduced in this article is commercial; however, there are also many useful free options (see the “Free Benchmark Software” table). Of course, you do not have to use all of these programs; choose benchmarking software based on the component you want to evaluate.
You should conduct benchmarking at least 2 or 3 times and take the average or highest score. Note that many factors can unpredictably affect benchmark scores, such as differences in driver versions, BIOS, background services, and hardware. Therefore, be careful to document your configuration details for future reference and comparison.
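As a minimal sketch of this advice, the Python snippet below averages several runs, reports the best score and the run-to-run spread, and stores the result next to a record of the configuration. The function name `summarize_runs`, the field names, and the sample scores are all hypothetical; adapt them to whatever your benchmark reports.

```python
import json
import statistics


def summarize_runs(scores):
    """Summarize repeated benchmark runs: mean, best, and spread in percent."""
    if len(scores) < 2:
        raise ValueError("run the benchmark at least twice")
    mean = statistics.mean(scores)
    # Sample standard deviation as a percentage of the mean.
    spread_pct = 100.0 * statistics.stdev(scores) / mean
    return {"mean": mean, "best": max(scores), "spread_pct": spread_pct}


# Illustrative scores from three runs of the same test.
summary = summarize_runs([98.0, 100.0, 102.0])

# Record the configuration with the scores (placeholder values to fill in).
record = {
    "driver_version": "fill in",
    "bios_version": "fill in",
    "background_services": "fill in",
    "summary": summary,
}
print(json.dumps(record, indent=2))
```

For the sample scores above, the mean is 100.0, the best is 102.0, and the spread is 2.0%. A large spread suggests that something in the background is interfering and that the runs should be repeated before the numbers are compared with anyone else’s.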
Benchmark scores are sometimes used by some users to “show off,” but for knowledgeable individuals, they represent very useful information when making purchasing or upgrade decisions. Benchmarking requires adherence to strict procedures; it demands time and experience to recognize true values. However, this is also an interesting task, and the results can bring you pride (in both your system and yourself).
Phương Uyên
(compiled)