CPU Encoding Tests

One of the interesting elements on modern processors is encoding performance. This includes encryption/decryption, as well as video transcoding from one video format to another. In the encrypt/decrypt scenario, this remains pertinent to on-the-fly encryption of sensitive data - a process by which more modern devices are leaning to for software security. Video transcoding as a tool to adjust the quality, file size and resolution of a video file has boomed in recent years, such as providing the optimum video for devices before consumption, or for game streamers who are wanting to upload the output from their video camera in real-time. As we move into live 3D video, this task will only get more strenuous, and it turns out that the performance of certain algorithms is a function of the input/output of the content.

All of our benchmark results can also be found in our benchmark engine, Bench.

7-Zip 9.2: link

One of the freeware compression tools that offers good scaling performance between processors is 7-Zip. It runs under an open-source licence, is fast, and easy to use tool for power users. We run the benchmark mode via the command line for four loops and take the output score.

Encoding: 7-Zip Combined Score

Encoding: 7-Zip CompressionEncoding: 7-Zip Decompression

At the request of a few users, we've gone back through our saved benchmark data and pulled out compression/decompression numbers for 7-zip. AMD clearly makes a win here in decompression by a long way with all the threads, and the 1800X beats the 1950X in Game Mode due to frequency.

WinRAR 5.40: link

For the 2017 test suite, we move to the latest version of WinRAR in our compression test. WinRAR in some quarters is more user friendly that 7-Zip, hence its inclusion. Rather than use a benchmark mode as we did with 7-Zip, here we take a set of files representative of a generic stack (33 video files in 1.37 GB, 2834 smaller website files in 370 folders in 150 MB) of compressible and incompressible formats. The results shown are the time taken to encode the file. Due to DRAM caching, we run the test 10 times and take the average of the last five runs when the benchmark is in a steady state.

Encoding: WinRAR 5.40

WinRAR encoding is another test that doesn't scale up especially well with thread counts. After only a few threads, most of its MT performance gains have been achieved. The balance here is with memory and frequency, to which the 1800X wins. The 1800X takes a sizeable gain over the 1950X in Game Mode too, likely due to far memory latency.

AES Encoding

Algorithms using AES coding have spread far and wide as a ubiquitous tool for encryption. Again, this is another CPU limited test, and modern CPUs have special AES pathways to accelerate their performance. We often see scaling in both frequency and cores with this benchmark. We use the latest version of TrueCrypt and run its benchmark mode over 1GB of in-DRAM data. Results shown are the GB/s average of encryption and decryption.

Encoding: AES

HandBrake v1.0.2 H264 and HEVC: link

As mentioned above, video transcoding (both encode and decode) is a hot topic in performance metrics as more and more content is being created. First consideration is the standard in which the video is encoded, which can be lossless or lossy, trade performance for file-size, trade quality for file-size, or all of the above can increase encoding rates to help accelerate decoding rates. Alongside Google's favorite codec, VP9, there are two others that are taking hold: H264, the older codec, is practically everywhere and is designed to be optimized for 1080p video, and HEVC (or H265) that is aimed to provide the same quality as H264 but at a lower file-size (or better quality for the same size). HEVC is important as 4K is streamed over the air, meaning less bits need to be transferred for the same quality content.

Handbrake is a favored tool for transcoding, and so our test regime takes care of three areas.

Low Quality/Resolution H264: Here we transcode a 640x266 H264 rip of a 2 hour film, and change the encoding from Main profile to High profile, using the very-fast preset.

Encoding: Handbrake H264 (LQ)

High Quality/Resolution H264: A similar test, but this time we take a ten-minute double 4K (3840x4320) file running at 60 Hz and transcode from Main to High, using the very-fast preset.

Encoding: Handbrake H264 (HQ)

HEVC Test: Using the same video in HQ, we change the resolution and codec of the original video from 4K60 in H264 into 4K60 HEVC.

Encoding: Handbrake HEVC (4K)

 

Benchmarking Performance: CPU Web Tests Benchmarking Performance: CPU Office Tests
Comments Locked

104 Comments

View All Comments

  • peevee - Friday, August 18, 2017 - link

    Compilation scales even on multi-CPU machines. With much higher communication latencies.
    In general, compilers running in parallel on MSVC (with MSBuild) run in different processes, they don't write into each other's address spaces and so do not need to communicate at all.

    Quit making excuses. You are doing something wrong. I am doing development for multi-CPU machines and ON multi-CPU machines for a very long time. YOU are doing something wrong.
  • peevee - Friday, August 18, 2017 - link

    BTW, when you enable NUMA on TR, does Windows 10 recognize it as one CPU group or 2?
  • gzunk - Saturday, August 19, 2017 - link

    It recognizes it as two NUMA nodes.
  • Alexey291 - Saturday, September 2, 2017 - link

    They aren't going to do anything.

    All their 'scientific benchmarking' is running the same macro again and again on different hardware setups.

    What you are suggesting requires actual work and thought.
  • Arbie - Thursday, August 17, 2017 - link

    As noted by edzieba, the correct phrase (and I'm sure it has a very British heritage) is "The proof of the pudding is in the eating".

    Another phrase needing repair: "multithreaded tests were almost halved to the 1950X". Was this meant to be something like "multithreaded tests were almost half of those in Creator mode" (?).

    Technically, of course, your articles are really well-done; thanks for all of them.
  • fanofanand - Thursday, August 17, 2017 - link

    Thank you for listening to the readers and re-testing this, Ian!
  • ddriver - Thursday, August 17, 2017 - link

    To sum it up - "game mode" is moronic. It is moronic for amd to push it, and to push TR as a gaming platform, which is clearly neither its peak, nor even its strong point. It is even more moronic for people to spend more than double the money just to have half of the CPU disabled, and still get worse performance than a ryzen chip.

    TR is great for prosumers, and represents a tremendous value and performance at a whole new level of affordability. It will do for games if you are a prosumer who occasionally games, but if you are a gamer it makes zero sense. Having AMD push it as a gaming platform only gives "people" the excuse to whine how bad it is at it.

    Also, I cannot shake the feeling there should be a better way to limit scheduling to half the chip for games without having to disable the rest, so it is still usable to the rest of the system.
  • Gothmoth - Thursday, August 17, 2017 - link

    first coders should do their job.. that is the main problem today. lazy and uncompetent coders.
  • eriohl - Thursday, August 17, 2017 - link

    Of course you could limit thread scheduling on software level. But it seems to me that there is a perfectly reasonable explanation why Microsoft and the game developers haven't been spending much time optimizing for running games on systems with NUMA.
  • HomeworldFound - Thursday, August 17, 2017 - link

    You can't call a coder that doesn't anticipate a 16 core 32 thread CPU lazy. The word is incompetent btw. I'd like to see you make a game worth millions of dollars and account for this processor, heck any processor with more than six cores.

Log in

Don't have an account? Sign up now