OS Preparation and Benchmark Installation

Windows 10 Pro

Since we started using Windows 10 Pro in our last update, there has been ample opportunity for something to come in and disrupt our testing. Windows 10 is known to kick in and check for updates at any hour of the day (and we’re testing 24 hours a day), so anything that can interrupt or take CPU time away from benchmarking is a bit of a hassle. There’s also the added element of Windows silently adjusting the update schedule and moving settings around in the registry without warning.

While we were building this latest suite, Microsoft launched Windows 10 version 2004. There is always a question as to what we should do in this regard: move to the absolute latest, or take a step back to something that is more stable and has fewer bugs, but might not be as relevant. In order to not create any level of programming debt, by which lots of work is needed to fix the smallest issues that might arise, we often choose the latter. In this regard, we are using Windows 10 version 1909 (18363.900). It has since transpired, from talking to peers, that 2004 has a number of issues that would affect benchmarking consistency, which validates our concerns.

Naturally, the first thing an OS wants to do when it starts up is connect to the internet and update. We install the OS without the internet connected, and our install image automatically sets the update period to the maximum period possible. The scripts we run are continuously updated to ensure that when the benchmark starts, the ‘don’t restart’ period for the OS is resynchronized to the latest possible time. There’s nothing worse than waking up in the morning to find that the system rebooted at 1am, in the middle of a scripted run.
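As a rough illustration of the resynchronization step, the sketch below computes a ‘don’t restart’ window that begins at the current hour and prints the commands that would write it. It is not our actual script: the registry key, the ActiveHoursStart/ActiveHoursEnd value names and the 18-hour maximum window are assumptions that should be verified against the Windows build in use.

```python
import datetime

# Assumed registry location for Windows 10 "active hours"; verify on your build.
UX_KEY = r"HKLM\SOFTWARE\Microsoft\WindowsUpdate\UX\Settings"
MAX_WINDOW_HOURS = 18  # assumed maximum span Windows 10 accepts


def active_hours(now, window=MAX_WINDOW_HOURS):
    """Return (start, end) hours for a window that begins at the current hour."""
    start = now.hour
    end = (start + window) % 24
    return start, end


def reg_commands(now):
    """Build the reg.exe command lines that would write the window."""
    start, end = active_hours(now)
    return [
        ["reg", "add", UX_KEY, "/v", "ActiveHoursStart",
         "/t", "REG_DWORD", "/d", str(start), "/f"],
        ["reg", "add", UX_KEY, "/v", "ActiveHoursEnd",
         "/t", "REG_DWORD", "/d", str(end), "/f"],
    ]


if __name__ == "__main__":
    # Only prints the commands; running them needs an elevated prompt.
    for cmd in reg_commands(datetime.datetime.now()):
        print(" ".join(cmd))
```

Calling this at the start of every benchmark run keeps pushing the end of the window as far into the future as the OS allows.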

The OS is installed manually with most of the default settings, while disabling all the extra monitoring features offered on install. On entering the OS, our default strategy has multiple parts: disable the ability to update as much as possible in the registry, disable Windows Defender, uninstall OneDrive, disable Cortana as much as possible, enable the high performance mode in the power options, and stop the platform from turning off the display. We also pull the latest version of CPU-Z from network storage, in case we are testing a very new system. Another script runs when the OS loads to check that the CPU and GPU are what we expect, and that the GPU drivers we need are in place, as Windows has a habit of updating those without saying anything. Windows Defender is disabled because, in my personal experience, it has historically seemed to eat CPU time whenever the network changes, for no apparent reason, even when the system is in use.
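A check of this kind could look something like the sketch below, which asks PowerShell for the CPU name, GPU name and GPU driver version and compares them against what the test plan expects. The expected values are placeholders and the function names are made up for illustration; only the Get-CimInstance classes and properties are standard.

```python
import json
import subprocess

# Placeholder values; a real run would list the exact hardware under test.
EXPECTED = {
    "cpu": "Example CPU Model",
    "gpu": "Example GPU Model",
    "gpu_driver": "0.0.0.0",
}


def query_cim(class_name, properties):
    """Ask PowerShell for selected properties of the first instance of a CIM class."""
    script = (
        f"Get-CimInstance {class_name} | "
        f"Select-Object -First 1 -Property {','.join(properties)} | "
        "ConvertTo-Json"
    )
    out = subprocess.run(
        ["powershell", "-NoProfile", "-Command", script],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(out)


def observed_config():
    """Collect the CPU name, GPU name and GPU driver version (Windows only)."""
    cpu = query_cim("Win32_Processor", ["Name"])
    gpu = query_cim("Win32_VideoController", ["Name", "DriverVersion"])
    return {
        "cpu": cpu["Name"].strip(),
        "gpu": gpu["Name"].strip(),
        "gpu_driver": gpu["DriverVersion"],
    }


def mismatches(expected, observed):
    """Return {key: (expected, observed)} for every value that differs."""
    return {
        key: (value, observed.get(key))
        for key, value in expected.items()
        if observed.get(key) != value
    }
```

If mismatches(EXPECTED, observed_config()) comes back non-empty, the run should stop before any numbers are recorded, as a silently updated driver would otherwise only be noticed after the fact.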

Some of these strategies are designed to be redundant. The goal here is to attack the option needed in as many different ways as possible. There’s nothing lost by being thorough at this point and hammering the point home. This means executing registry files that adjust settings, executing batch files which do the same while installing files, and reiterating these commands before every benchmark run in order to be crystal clear. Simply put, do not implicitly trust Windows to leave the settings alone. Something invariably changes (or moves somewhere else) if it is not monitored. Some of the commands in place are also old/legacy, but are kept as they don’t otherwise adjust the system (and can take effect again if options that are continually moved around suddenly move back).
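The ‘reiterate before every run’ idea can be sketched as a list of settings commands that is replayed in full each time, where a failing legacy command is noted but never stops the rest. The command list is illustrative rather than our real one: the two powercfg calls select the high performance plan and disable the display timeout on AC power, and benchmark_settings.reg is a hypothetical file name.

```python
import subprocess

# Illustrative list only; benchmark_settings.reg is a hypothetical file name.
SETTINGS_COMMANDS = [
    ["powercfg", "/setactive", "SCHEME_MIN"],            # high performance plan
    ["powercfg", "/change", "monitor-timeout-ac", "0"],  # never turn off display
    ["reg", "import", "benchmark_settings.reg"],
]


def reapply(commands, runner=subprocess.run):
    """Run every command without aborting early; return the ones that failed."""
    failed = []
    for cmd in commands:
        try:
            result = runner(cmd, capture_output=True, text=True)
            if result.returncode != 0:
                failed.append(cmd)
        except OSError:
            # The tool or file is missing, which can happen with legacy commands.
            failed.append(cmd)
    return failed
```

Returning the failures instead of raising keeps the old commands harmless, while still leaving a record if a setting that used to apply suddenly stops working.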

It is worth noting that some of the options, when run through a batch file, require the file to be run as Administrator. Recent versions of Windows 10 make it frustrating to do so manually without going through user access elevation. The best way to ensure that the batch file always runs in admin mode seems to be to create a shortcut to the batch file, and then adjust the properties of the shortcut to always enable the ‘run as admin’ mode. It is an interesting kludge for that to work, and it is frustrating that I cannot just adjust the batch file properties directly to run as admin every time.
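For scripts that are not batch files, a different approach from the shortcut trick is to have the script check its own privileges and ask to be relaunched elevated. The sketch below is that alternative, not what our suite does, and it still goes through the elevation prompt; it relies on the shell32 functions IsUserAnAdmin and ShellExecuteW.

```python
import ctypes
import os
import sys


def is_admin():
    """True when the current process has administrative rights."""
    if os.name == "nt":
        return bool(ctypes.windll.shell32.IsUserAnAdmin())
    # Non-Windows fallback so the sketch stays importable elsewhere.
    return os.geteuid() == 0


def relaunch_elevated():
    """Re-run this script via the 'runas' verb, which triggers elevation (Windows only)."""
    params = " ".join(f'"{arg}"' for arg in sys.argv)
    ctypes.windll.shell32.ShellExecuteW(None, "runas", sys.executable, params, None, 1)
```

A script would typically call is_admin() first, and only call relaunch_elevated() followed by sys.exit() when the check fails.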

Benchmark Installs

Benchmarks often fall under two headers: standalone, such that they can be run as is, or ones that need installation. Those that need installation are subdivided further into those with silent installers and those that have to be installed manually.

Installing benchmarks can either be done before running the main script, or be integrated directly into the main testing script. As time has progressed, we have moved from the former to the latter, so we can wrap uninstall commands into the script if we only get limited access to a system. For the manually installed benchmarks this isn’t possible. Technically, calling an install/uninstall from the script does make total testing time longer, but it also reduces requirements for SSD capacity by not having everything installed at once. Experience of doing this scripting over the past few years, and of making the benchmark scripts as portable as possible, has pointed to making the install/uninstall part of the benchmark run.
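One way to picture wrapping the install and uninstall around each test is a small context manager, shown below as a minimal sketch. The install and uninstall callables are stand-ins for whatever commands a given benchmark needs; the point is only that the uninstall runs even if the benchmark itself fails.

```python
from contextlib import contextmanager


@contextmanager
def installed(name, install, uninstall):
    """Install a benchmark, hand control to the test, then always uninstall."""
    install(name)
    try:
        yield name
    finally:
        # Runs even if the benchmark raised, so disk space is always reclaimed.
        uninstall(name)


if __name__ == "__main__":
    steps = []
    with installed("demo", lambda n: steps.append("install " + n),
                   lambda n: steps.append("uninstall " + n)):
        steps.append("run demo")
    print(steps)
```

With this structure only one installable benchmark occupies the SSD at a time, at the cost of the extra install time on every run.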

Benchmarks that can be run without installing, known as ‘standalone’ benchmarks, are the holy grail. Cinebench and others are great for this. The rest are probed for silent install methods. Certain benchmarks in the past, such as PCMark8, also have additional features to perform the online registration for DRM through the command line. Other installers, such as .msi files, seem unable to be installed if they are not in the directory from which the batch file was called, unless the right commands are used. When scripting successive installs, it becomes important to check that the previous one has finished before another one starts, otherwise the script might jump straight to the next installer before the previous ones have finished, which makes it tricky as well.

For .msi files, our install code relies heavily on the following command to ensure that each install has finished before tackling the next one:

cmd /c start /wait msiexec /qb /i <file>
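For readers scripting in something other than batch, the same wait-until-finished behavior can be approximated in Python, because subprocess.run blocks until the msiexec process exits. This is a sketch under that assumption rather than a drop-in replacement: it keeps the /qb and /i flags from the command above, resolves the .msi to an absolute path so that the calling directory does not matter, and treats exit code 3010 (success, reboot required) as acceptable.

```python
import os
import subprocess

# 0 is plain success; 3010 means success with a reboot required.
OK_EXIT_CODES = {0, 3010}


def msi_command(msi_path):
    """Build the msiexec command line, using an absolute path to the package."""
    return ["msiexec", "/qb", "/i", os.path.abspath(msi_path)]


def install_msi(msi_path):
    """Install one package and block until msiexec has exited (Windows only)."""
    result = subprocess.run(msi_command(msi_path))
    if result.returncode not in OK_EXIT_CODES:
        raise RuntimeError(f"{msi_path} failed with exit code {result.returncode}")
    return result.returncode
```

Calling install_msi once per package in a loop gives strictly sequential installs, since each call only returns after the previous installer has finished.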

Most .msi files have the same flags for silent installs, however install executables can vary significantly and require probing the vendor documentation. For the most part, a ‘/S’ flag is the silent install flag, while others require /norestart to ensure the system doesn’t restart immediately, or /quiet, to get going in a silent fashion. Some installations use none of these and rely on their own definitions of what constitutes a silent install flag. I’m looking at you, Adobe. Ultimately, however, most software packages can be installed silently, with some requiring additional commands to enable licenses, and are then ready to be called for their respective tests.
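Because the flags differ from one installer to the next, it helps to keep them in a single lookup table and to fail loudly when an installer has no known entry. The sketch below shows the idea; the installer file names are hypothetical, and the flags are just the common ones mentioned above, not verified values for any particular vendor.

```python
import ntpath
import subprocess

# Hypothetical installer names mapped to the silent flags they are assumed to accept.
SILENT_FLAGS = {
    "benchmark_a_setup.exe": ["/S"],
    "benchmark_b_setup.exe": ["/quiet", "/norestart"],
}


def silent_command(installer, flags_table=SILENT_FLAGS):
    """Return the command line for a silent install, or raise if flags are unknown."""
    name = ntpath.basename(installer).lower()
    if name not in flags_table:
        raise LookupError(
            f"no known silent flags for {name}; check the vendor documentation"
        )
    return [installer] + flags_table[name]


def install_all(installers, runner=subprocess.run):
    """Install one after another; each call blocks until that installer exits."""
    for installer in installers:
        runner(silent_command(installer), check=True)
```

An unknown installer raises before anything is launched, which is preferable to an interactive setup window sitting unanswered in the middle of an overnight run.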

One benchmark is a special case: Chrome. Chrome has the amazing ability to update itself as soon as it is installed, even without being opened or when the system is booted. Stopping this from happening takes more than a simple software adjustment, purely because Google no longer offers an option to delay updates. We initially found an undocumented way to stop it from updating, which requires the install script to gut some of the files after installing the software. However, the quick update cycle of Chrome means that our v56 version from last year is now out of date. To get around this, we are using a standalone version of Chromium.

The final benchmark in our install is Steam, which is a fully manual install. Valve has created Steam with a really odd interface interaction mechanism, even for AHK scripting, which makes installing Steam a bit of a hassle. Valve does not offer a complete standalone installer here, so the base program opens after installation to download ~200MB of updates on a fresh system. We install the software over the Steam directory already present on the benchmark partition from a previous OS install, so the games do not need to be re-downloaded. (When an OS is installed, it’s installed on a specific OS partition, and all benchmarks are kept on a second partition.)

One other point to be aware of is when software checks for updates. Loading AIDA, for example, means that it will probe online for the latest version and leave a hanging message box to be answered before a script can continue. There are often two ways to deal with this, and the best is if the program allows the user to set ‘no updates’ in the configuration files. The fallback tactic that works is to disable internet connectivity (often by disabling all network adaptors through PowerShell) while the application is running.
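The fallback tactic can be sketched as a context manager that disables every network adapter, runs the application, and re-enables the adapters afterwards even if the application fails. It uses the PowerShell cmdlets Disable-NetAdapter and Enable-NetAdapter and needs an elevated session; the wildcard adapter name and the offline helper itself are illustrative choices, not our exact script.

```python
import subprocess
from contextlib import contextmanager

POWERSHELL = ["powershell", "-NoProfile", "-Command"]
DISABLE = POWERSHELL + ["Disable-NetAdapter -Name '*' -Confirm:$false"]
ENABLE = POWERSHELL + ["Enable-NetAdapter -Name '*' -Confirm:$false"]


@contextmanager
def offline(runner=subprocess.run):
    """Keep all network adapters disabled for the duration of the with-block."""
    runner(DISABLE, check=True)
    try:
        yield
    finally:
        # Always restore connectivity, even if the benchmark crashed.
        runner(ENABLE, check=True)
```

The application that insists on checking for updates is then launched inside a with offline(): block, so it never gets the chance to raise its message box.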

110 Comments

  • 29a - Monday, July 20, 2020

    Please remove Egomark from the benchmark list.
  • Meteor2 - Monday, August 3, 2020

    Why?
  • Mr Perfect - Monday, July 20, 2020

    Reading through the OS preparation section, I kind of wonder if setting up a domain would be helpful?

    Joining a test PC to a domain would allow all of those settings to be configured through GPO instead of running tons of batch files and scripts. You'd also gain the ability to point Windows Update at a WSUS server, where you control what updates are even shown to the PC (in your case, probably none). Throw in the ability to remotely run scripts with Domain Administrator accounts, and you could probably skip around those UAC prompts too.

    It would be a lot of setup the first time around, but it does point to that automation-eventually-pays-off thing.
  • Icehawk - Monday, July 20, 2020

    Very cool!

    Would like to see your handbrake HEVC encoding done via software with no vendor encoder - it’s the only way you guys can be getting those crazy fps numbers. I don’t want to see how a vendor encoder runs, I want to see how the CPU runs - and those hardware ones are still worse than software so I do not use them even though it is a massive speed boost.
  • extide - Monday, July 20, 2020

    Using vector instructions like AVX is still "software" encoding. It's fully CPU, and not at all a lower quality hardware encoder.
  • faizoff - Monday, July 20, 2020

    Until I upgraded from an HD 6870 to an RX 580 recently I had no idea GPUs had dedicated encoders. I've tried them and they are definitely faster than the CPU, the same file that I tried got well over 40 fps compared to the 5 fps when choosing the CPU encoder.

    The caveat was that the GPU encoded files were much larger in size with comparable quality.
  • lmcd - Tuesday, July 21, 2020

    There's ways to push file size back down afaik.
  • Meteor2 - Monday, August 3, 2020

    Not with hardware encoding.
  • jaminvi - Monday, July 20, 2020

    Looks great from here. Good cross section of test. Looking forward to it.
  • catavalon21 - Monday, July 20, 2020

    This is outstanding. Very much like the stuff on this site back in this site's early days, like comparing Pentium performance with and without MMX. Comparing the performance between VX and HX chipsets. Tip of the hat, old man.
