Power Management Features

Real-world client storage workloads leave SSDs idle most of the time, so the active power measurements presented earlier in this review only account for a small part of what determines a drive's suitability for battery-powered use. Especially under light use, an SSD's power efficiency is determined mostly by how well it can save power when idle.

For many NVMe SSDs, the closely related matter of thermal management can also be important. M.2 SSDs can concentrate a lot of power in a very small space. They may also be used in locations with high ambient temperatures and poor cooling, such as tucked under a GPU on a desktop motherboard, or in a poorly-ventilated notebook.

Toshiba XG6
NVMe Power and Thermal Management Features
Controller: Toshiba TC58NCP090GSB
Firmware: AGXA4001

NVMe Version | Feature | Status
1.0 | Number of operational (active) power states | 3
1.1 | Number of non-operational (idle) power states | 3
    | Autonomous Power State Transition (APST) | Supported
1.2 | Warning Temperature | 78°C
    | Critical Temperature | 82°C
1.3 | Host Controlled Thermal Management | Supported
    | Non-Operational Power State Permissive Mode | Not Supported

The Toshiba XG6 supports most of the NVMe power and thermal management features, save for the relatively obscure and recent Non-Operational Power State Permissive Mode, which controls background processing during idle time. The XG6 is a bit unusual in providing three idle states instead of just two, but the deepest state is probably not worth using very often due to its higher transition latency and minimal power savings relative to the intermediate PS4 idle state.
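For reference, the temperature thresholds in the table are reported by the drive itself in its Identify Controller data (the WCTEMP and CCTEMP fields, in Kelvin), and tools like nvme-cli can read them back. Below is a minimal sketch, assuming a Linux system with nvme-cli installed and a version whose JSON output uses the wctemp/cctemp field names; it is illustrative rather than a description of how our testbed gathers this data.

```python
# Minimal sketch: read the drive's reported thermal thresholds with nvme-cli.
# Assumes Linux, nvme-cli installed, and a recent nvme-cli whose JSON Identify
# Controller output includes "wctemp"/"cctemp" (field names can vary between
# versions). Values are reported in Kelvin per the NVMe specification.
import json
import subprocess

def thermal_thresholds(dev="/dev/nvme0"):
    raw = subprocess.check_output(["nvme", "id-ctrl", dev, "-o", "json"])
    ctrl = json.loads(raw)
    to_c = lambda kelvin: kelvin - 273  # convert the spec's Kelvin values to Celsius
    return {
        "warning_C": to_c(ctrl["wctemp"]),
        "critical_C": to_c(ctrl["cctemp"]),
    }

if __name__ == "__main__":
    print(thermal_thresholds())  # e.g. {'warning_C': 78, 'critical_C': 82}
```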

Toshiba XG6
NVMe Power States
Controller: Toshiba TC58NCP090GSB
Firmware: AGXA4001

Power State | Maximum Power | Active/Idle | Entry Latency | Exit Latency
PS 0 | 6.0 W | Active | - | -
PS 1 | 2.4 W | Active | - | -
PS 2 | 1.9 W | Active | - | -
PS 3 | 50 mW | Idle | 1.5 ms | 1.5 ms
PS 4 | 5 mW | Idle | 6 ms | 14 ms
PS 5 | 3 mW | Idle | 50 ms | 80 ms

Note that the above tables reflect only the information provided by the drive to the OS. The power and latency numbers are often very conservative estimates, but they are what the OS uses to determine which idle states to use and how long to wait before dropping to a deeper idle state.
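For the curious, the power state table above can be read back from the drive the same way the OS receives it, via the power state descriptors in the Identify Controller data. A minimal sketch follows, again assuming Linux with nvme-cli and that its JSON output exposes a psds array with max_power, entry_lat, and exit_lat fields (field names and units vary between nvme-cli versions, so treat this purely as an illustration).

```python
# Minimal sketch: dump the NVMe power state descriptor table the drive reports.
# Assumes Linux + nvme-cli, and that the JSON Identify Controller output has a
# "psds" array with "max_power", "entry_lat" and "exit_lat" fields (names may
# differ between nvme-cli versions).
import json
import subprocess

def power_states(dev="/dev/nvme0"):
    ctrl = json.loads(subprocess.check_output(["nvme", "id-ctrl", dev, "-o", "json"]))
    for idx, psd in enumerate(ctrl.get("psds", [])):
        # MP is typically reported in centiwatts; ENLAT/EXLAT are in microseconds.
        print(f"PS{idx}: max {psd['max_power'] / 100:.2f} W, "
              f"entry {psd['entry_lat']} us, exit {psd['exit_lat']} us")

if __name__ == "__main__":
    power_states()
```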

Idle Power Measurement

SATA SSDs are tested with SATA link power management disabled to measure their active idle power draw, and with it enabled for the deeper idle power consumption score and the idle wake-up latency test. Our testbed, like any ordinary desktop system, cannot trigger the deepest DevSleep idle state.
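For readers experimenting on their own hardware, Linux exposes an equivalent knob as the per-host link_power_management_policy sysfs attribute; the sketch below shows how the two configurations described above could be switched, assuming a kernel with the standard /sys/class/scsi_host interface and root privileges. This is not a description of our testbed's setup, just an illustration.

```python
# Minimal sketch: switch SATA link power management between the two
# configurations described above. Assumes Linux with the standard
# /sys/class/scsi_host interface and root privileges.
import glob

def set_sata_lpm(policy):
    # Commonly accepted values: "max_performance" (LPM disabled),
    # "medium_power", "med_power_with_dipm", "min_power".
    for path in glob.glob("/sys/class/scsi_host/host*/link_power_management_policy"):
        with open(path, "w") as f:
            f.write(policy)

# set_sata_lpm("max_performance")  # active idle measurement (no LPM)
# set_sata_lpm("min_power")        # deeper idle / wake-up latency measurement
```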

Idle power management for NVMe SSDs is far more complicated than for SATA SSDs. NVMe SSDs can support several different idle power states, and through the Autonomous Power State Transition (APST) feature the operating system can set a drive's policy for when to drop down to a lower power state. There is typically a tradeoff in that lower-power states take longer to enter and wake up from, so the choice of which power states to use may differ between desktops and notebooks.
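To make that tradeoff concrete, the sketch below mimics the general shape of an APST policy: only non-operational states whose combined entry and exit latency fits within a latency budget are used, and the host waits longer before dropping into deeper states. The budget mirrors the idea behind Linux's nvme_core.default_ps_max_latency_us module parameter; the dwell-time multiplier is purely illustrative and not the exact rule any particular driver uses.

```python
# Rough sketch of an APST-style policy: only use non-operational states whose
# entry + exit latency fits within a latency budget, and dwell longer before
# entering deeper (slower to wake) states. The 50x dwell multiplier is
# illustrative, not the exact heuristic of any real driver.
MAX_LATENCY_US = 100_000  # e.g. nvme_core.default_ps_max_latency_us on Linux

# Toshiba XG6 idle states from the table above: (state, entry_us, exit_us)
IDLE_STATES = [("PS3", 1_500, 1_500), ("PS4", 6_000, 14_000), ("PS5", 50_000, 80_000)]

def apst_plan(max_latency_us=MAX_LATENCY_US, dwell_multiplier=50):
    plan = []
    for name, entry_us, exit_us in IDLE_STATES:
        total = entry_us + exit_us
        if total <= max_latency_us:
            # Wait proportionally longer before paying a bigger transition cost.
            plan.append((name, total * dwell_multiplier / 1000))  # idle dwell time in ms
    return plan

print(apst_plan())         # a 100 ms budget admits PS3 and PS4 but excludes PS5
print(apst_plan(200_000))  # a larger budget also admits the deepest PS5 state
```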

We report two idle power measurements. Active idle is representative of a typical desktop, where none of the advanced PCIe link or NVMe power saving features are enabled and the drive is immediately ready to process new commands. The idle power consumption metric is measured with PCIe Active State Power Management L1.2 state enabled and NVMe APST enabled if supported.
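For anyone trying to approximate this configuration on their own Linux system, a quick sanity check is to look at the platform-wide ASPM policy and query the drive's APST feature (NVMe feature ID 0x0c). A minimal sketch, assuming Linux with nvme-cli and root privileges:

```python
# Minimal sketch: sanity-check the power management setup described above.
# Assumes Linux; reads the global PCIe ASPM policy and queries the NVMe APST
# feature (feature ID 0x0c) with nvme-cli. Requires root privileges.
import subprocess

def aspm_policy():
    # The active policy is shown in brackets, e.g. "default performance [powersave]".
    with open("/sys/module/pcie_aspm/parameters/policy") as f:
        return f.read().strip()

def apst_feature(dev="/dev/nvme0"):
    # Feature ID 0x0c (12) is Autonomous Power State Transition; bit 0 of the
    # reported current value indicates whether APST is enabled.
    out = subprocess.check_output(["nvme", "get-feature", dev, "-f", "0x0c"], text=True)
    return out.strip()

print(aspm_policy())
print(apst_feature())
```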

[Chart: Active Idle Power Consumption (No LPM)]
[Chart: Idle Power Consumption]

The active idle power consumption of the Toshiba XG6 is slightly higher than that of the XG5, but still in the normal range for high-end NVMe drives. However, Silicon Motion and Phison have both managed to reach active idle power levels well under 1W rather than slightly above it, so Toshiba does have some room for improvement in power efficiency. The 110mW idle power we measured is decent for an NVMe drive, but substantially higher than that of the Silicon Motion-based drives, which manage the rare feat of successfully making use of their deepest idle state even on our desktop testbed.

Idle Wake-Up Latency

The idle wake-up latency of the Toshiba XG6 is substantially higher than the XG5's, and is comparable to that of the Silicon Motion-based drives that lead in our idle power consumption measurements. As long as the XG6 is no slower to wake from its deepest power state in a notebook system that can actually reach that state, this latency shouldn't be a problem. Desktop systems would probably not be using the deepest idle state very often, so the wake-up latency should usually be much lower than the 52ms measured here.


31 Comments


  • Valantar - Friday, September 7, 2018 - link

    AFAIK they're very careful which patches are applied to test beds, and if they affect performance, older drives are retested to account for this. Benchmarks like this are never really applicable outside of the system they're tested in, but the system is designed to provide a level playing field and repeatable results. That's really the best you can hope for. Unless the test bed has a consistent >10% performance deficit to most other systems out there, there's no reason to change it unless it's becoming outdated in other significant areas.
  • iwod - Thursday, September 6, 2018 - link

    So we are limited by the PCI-e interface again. Since the birth of the SSD, we pushed past SATA 3Gbps / 6Gbps, then PCI-E 2.0 x4 at 2GB/s, and now PCI-E 3.0 at 4GB/s.

    When are we going to get PCI-E 4.0? Or since 5.0 is only just around the corner, we may as well wait for it. That is 16GB/s, plenty of room for SSD makers to figure out how to get there.
  • MrSpadge - Thursday, September 6, 2018 - link

    There's no need to rush there. If you need higher performance, use multiple drives. Maybe on a HEDT or Enterprise platform if you need extreme performance.

    But don't be surprised if that won't help your PC as much as you thought. The ultimate limit currently is a RAMdisk. Launch a game from there or install some software - it's still surprisingly slow, because the CPU becomes the bottleneck. And that already applies to modern SSDs, which is obvious in benchmarks which test copying, installing or application launching etc.
  • abufrejoval - Friday, September 7, 2018 - link

    Could also be the OS or the RAMdisk driver. When I finished building my 128GB 18-Core system with a FusionIO 2.4 TB leftover and 10Gbit Ethernet, I obviously wanted to bench it on Windows and Linux. I was rather shocked to see how slow things generally remained and how pretty much all these 36 HT-"CPU"s were just yawning.

    In the end I never found out if it was the last free version (3.4.8) of SoftPerfect's RAM disk that didn't seem to make use of all four Xeon E5 memory channels, or some bottleneck in Windows (I've never seen Windows Update use more than a single core), but I never got anywhere near the 70GB/s Johan had me dream of (https://www.anandtech.com/show/8423/intel-xeon-e5-... ). I don't think I even saturated the 10Gbase-T network, if I recall correctly.

    It was quite different in many cases on Linux, but I do remember running an entire Oracle database on tmpfs once, and then an OLTP benchmark on that... again earning myself a totally bored system under the most intensive benchmark hammering I could orchestrate.

    There are so many serialization points in all parts of that stack, you never really get the performance you pay for until someone has gone all the way and rewritten the entire software stack from scratch for parallel and in-memory.

    Latency is the killer for performance in storage, not bandwidth. You can saturate all bandwidth capacities with HDDs, even tape. Thing is, with dozens (modern CPUs) or thousands (modern GPGPUs) of cores, SSDs *become tape* because of the latencies incurred on non-linear access patterns.

    That's why after NVMe, NV-DIMMs or true non-volatile RAM is becoming so important. You might argue that a cache line read from main memory still looks like a tape library change against the register file of an xPU, but it's still way better than PCIe-5-10 with a kernel based block layer abstraction could ever be.

    Linear speed and loops are dead: If you cannot unroll, you'll have to crawl.
  • halcyon - Monday, September 10, 2018 - link

    Thank you for writing this.
  • Quantum Mechanix - Monday, September 10, 2018 - link

    Awesome write-up, and my favorite kind of comment, where I walk away just a *tiny* bit less ignorant. Thank you! :)
  • DanNeely - Thursday, September 6, 2018 - link

    We've been 3.0 x4 bottlenecked for a few years.

    From what I've read about implementing 4.0/5.0 on a mobo, I'm not convinced we'll see them on consumer boards, at least not in their current form. The maximum PCB trace length without expensive boosters is too short: AIUI 4.0 is marginal to the top PCIe slot/chipset, and 5.0 would need signal boosters even to go that far. Estimates I've seen were $50-100 (I think for an x16 slot) to make a 4.0 slot, and several times that for 5.0. Cables can apparently go several times longer than PCB traces while maintaining signal quality, but I'm skeptical about them getting snaked around consumer mobos.

    And as MrSpadge pointed out, in many applications scaling out wider is an option, and from what I've read that's what enterprise storage is looking at. Instead of x4 slots with 2-4x the bandwidth of current ones, that market is more interested in 5.0 x1 connections that have the same bandwidth as current devices but would allow them to connect four times as many drives. That seems plausible to me, since enterprise drive firmware is generally tuned for steady-state performance rather than bursts, and most of those drives don't come as close to saturating their buses as high-end consumer drives do on shorter, more intermittent workloads.
  • abufrejoval - Friday, September 7, 2018 - link

    I guess that's why they are working on silicon photonics: PCB voltage levels, densities, layers, trace lengths... Wherever you look, there are walls of physics rising into mountains. If only PCBs weren't so much cheaper than silicon interposers, photonics and other new and rare things!
  • darwiniandude - Sunday, September 9, 2018 - link

    Any testing under Windows on current MacBook Pro hardware? Those SSDs, I would have thought, are much, much faster, but I'd love to see the same test run on them.
  • halcyon - Monday, September 10, 2018 - link

    Thanks for the review. For the future, could you consider segregating the drives into different tiers based on results, e.g. video editing, DB, generic OS/boot/app drive, compilation, whatnot.

    Now it seems that one drive is better in one thing, and another drive in another scenario. But not having your in-depth knowledge makes it harder to assess which drive would be closest to optimal in which scenario.
