One of the more surprising facts we learned from ARM when we sat down with them last year to discuss their Mali GPU architecture was just how successful the Mali-400 family has been, not just in the past but in the present as well. Although ARM has been selling Mali T600/700/800 designs based on their Midgard architecture for a few years now, the Utgard-based Mali-400/450 have continued to sell well due to Utgard’s “no-frills” OpenGL ES 2.0 design. This has made the Mali-400/450 popular parts in low cost and low power environments, where a complete Mali design takes up a minimal amount of die space and draws a similarly small amount of power.

Though chasing the low-end of the market doesn’t come with the glamour of the high-end (and for that there’s Midgard), for ARM it has in many ways been the textbook low-price/high-volume success story. The company’s estimates put the family in plenty of low-end tablets and smartphones, but more unexpectedly they have the lion’s share of the smart TV market, with 75% of TVs implementing a Mali design. With TVs only requiring enough graphics power for basic UI functions and video decoding at 1080p – few smart TVs play games or do fancy transparency/animation effects – even after having been eclipsed by more powerful GPUs at the high-end, the Mali-400 family has found a second home in these sorts of simple devices that need little more than a no-frills GPU, be it for cost reasons, power reasons, or both.

Given the continued success of the Mali-400 family, it should come as little surprise that ARM has continued to develop the family, as the need for OpenGL ES 2.0 won’t be coming to an end any time soon. To that end, today ARM is announcing the release of the next iteration of the Mali-400 family IP, the Mali-470. Mali-470 serves as something of a descendant of both Mali-400 and Mali-450, with ARM intending to make it a direct successor to Mali-400 while also integrating some of the improvements that originally went into Mali-450.

With the Mali-470 ARM is specifically targeting the IoT and embedded market, even more so than they were already going after it with the Mali-400 and Mali-450. In recent times the Mali-450 has been ARM’s more powerful Mali-400 family GPU, specifically going after low-end devices and the aforementioned TV market. Meanwhile, thanks in part to the arrival of smaller manufacturing processes since its introduction in 2008, the Mali-400 has already been something of a darling for low-power applications, appearing in wearable devices such as Samsung’s Gear S2. ARM and their licensees continue to be bullish on wearables and IoT in general, and this has led ARM to develop a more direct update to the Mali-400 to further improve on its energy efficiency.

Overall ARM is aiming to double Mali-400’s energy efficiency with Mali-470, in accordance with their goals to reach further into the wearables and IoT market. To do so, ARM has taken some of Mali-450’s energy optimizations and further built on them; chief among these changes are far more extensive use of power gating to shut off transistors when they’re not needed, and improved use of clock gating when power gating isn’t an option. To accomplish this ARM has implemented separate power domains on Mali-470 – each pixel/fragment block can be on its own domain – and an on-GPU power controller manages these domains, working with the system power controller to better manage the GPU.

Improving the use of power and clock gating can greatly impact overall energy efficiency by cutting down on the amount of energy wasted while the GPU waits for its next job, but it’s especially potent for bringing down idle power consumption. In turn, given that idle power consumption continues to be one of the chief issues dogging wearables vis-a-vis battery life, the Mali-470 should improve battery life to a degree thanks to these changes.
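To put the idle-power argument in concrete terms, here is a back-of-the-envelope sketch of time-weighted average power. Every number in it is hypothetical and chosen purely for illustration; ARM has not published these figures, and the function name `average_power_mw` is ours.

```python
def average_power_mw(active_mw: float, idle_mw: float, duty_cycle: float) -> float:
    """Time-weighted average power for a GPU that is busy for
    `duty_cycle` (a fraction from 0 to 1) of the time and idle otherwise."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return duty_cycle * active_mw + (1.0 - duty_cycle) * idle_mw


# Hypothetical figures, NOT ARM data: a wearable GPU that is busy 2% of the time.
ACTIVE_MW = 100.0           # assumed power while rendering
IDLE_CLOCK_GATED_MW = 5.0   # assumed idle power with clock gating only (leakage remains)
IDLE_POWER_GATED_MW = 0.5   # assumed idle power with the block power gated
DUTY_CYCLE = 0.02

before = average_power_mw(ACTIVE_MW, IDLE_CLOCK_GATED_MW, DUTY_CYCLE)
after = average_power_mw(ACTIVE_MW, IDLE_POWER_GATED_MW, DUTY_CYCLE)

print(f"clock gating only: {before:.2f} mW average")
print(f"with power gating: {after:.2f} mW average")
```

With these assumed inputs the average falls from roughly 6.9 mW to roughly 2.5 mW even though active power is unchanged, which is why idle behavior matters so much for a device that spends nearly all of its time waiting.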

Along with these immediate power optimizations, ARM has also made some smaller optimizations to the Mali-400 family architecture to cut down on wasted power at a workload level. Mali-470’s vertex processor has been optimized to better handle fixed-point arithmetic, a use case that is again very common in simple UIs and can be even more power efficient than floating-point arithmetic. The pixel/fragment processor has not gone untouched either, with ARM working to cut down on the amount of time the fragment processors spend changing between states.
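For readers unfamiliar with fixed-point arithmetic, the following minimal sketch shows the general idea using a Q16.16 format, where a value is stored as an integer with 16 fractional bits. It illustrates the technique in general and says nothing about how Mali-470’s vertex processor is actually implemented; the format choice and all function names are our own assumptions.

```python
FRAC_BITS = 16          # assumed Q16.16 format: 16 fractional bits
ONE = 1 << FRAC_BITS    # the fixed-point representation of 1.0


def to_fixed(x: float) -> int:
    """Convert a float to Q16.16, rounding to the nearest representable step."""
    return int(round(x * ONE))


def to_float(q: int) -> float:
    """Convert a Q16.16 value back to a float."""
    return q / ONE


def fixed_mul(a: int, b: int) -> int:
    """Multiply two Q16.16 values using only integer operations.

    The raw product carries 32 fractional bits, so shift right by 16 to
    return to Q16.16. Python's >> floors, so results round toward -infinity.
    """
    return (a * b) >> FRAC_BITS


def scale_coordinate(pos_q: int, scale_q: int, offset_q: int) -> int:
    """A UI-style transform, pos * scale + offset, entirely in fixed point."""
    return fixed_mul(pos_q, scale_q) + offset_q


# Hypothetical UI case: place an element at x = 10.0, scaled by 0.5, offset by 3.0.
x_q = scale_coordinate(to_fixed(10.0), to_fixed(0.5), to_fixed(3.0))
print(to_float(x_q))
```

The point is that the multiply and add are plain integer operations, which generally need simpler hardware than floating-point units; coordinates in simple UIs tend to fit comfortably within such a limited range and precision.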

ARM Mali-400 Family

|   | Mali-400 | Mali-450 | Mali-470 |
|---|---|---|---|
| Fragment Shader Blocks (MP#) | 1-4 | 1-8 | 1-4 |
| Vertex Shader Blocks | 1 | 1 | 1 |
| Architecture | Utgard | Utgard | Utgard |
| OpenGL ES | ES 2.0 | ES 2.0 | ES 2.0 |
| Year Released (IP) | 2008 | 2012 | 2015 |

Moving on, in terms of scalability Mali-470 retains the multi-processor capabilities of the previous Mali designs, allowing Mali-470 to be scaled up from 1 fragment processor (MP1) to as many as 4 (MP4). Mali-470MP1 will in turn be the intended design for wearables given its spot as the lowest-power design, while the more powerful designs will allow Mali-470 to be used in other environments where more performance is needed. That ARM would retain their scalability for Mali-470 is a very intentional choice; as we mentioned earlier Mali-470 is intended to be the successor to Mali-400, with ARM going so far as to keep Mali-400’s MMU and bus interfaces and holding Mali-470’s die area closer to Mali-400’s. This in turn means that Mali-470 needs to be able to carry on from where Mali-400 has left off, not only in the wearables/IoT market but in the embedded market as well.

Wrapping things up, as with past ARM IP launches, today’s announcement coincides with the general release of new IP to ARM’s customers. ARM tells us that they expect chips integrating Mali-470 designs to start leaving the fabs in the second half of 2016, which would in turn have Mali-470 start appearing in consumer devices in the first half of 2017. Ultimately ARM doesn’t control which products use which IP, but given their intentions with Mali-470, there’s a good chance that those consumer devices will include Android/Tizen wearables launching early that year.

11 Comments

  • DanNeely - Tuesday, October 20, 2015

    Is "no-fills" supposed to be "no-frills"? (You used the former both times.) If not, what does it mean?
  • Ryan Smith - Tuesday, October 20, 2015

    Yes, no-frills. Thanks for pointing that out. Apparently Word decided it needed replaced without telling me...
  • Pissedoffyouth - Tuesday, October 20, 2015

    Soon to be released as the GPU in the next Mediatek flagship SoC!
  • jjj - Tuesday, October 20, 2015

    A bit odd to announce this for 2017 without a new small core for wearables. A7 is rather old and A53 is not that efficient.
  • DanNeely - Tuesday, October 20, 2015

    ARM v7 is an old instruction set at this point; they're probably well past the point of tweaking hardware to get better power consumption levels outside of process improvements. At some point ARM v8 will trickle down to smaller parts, but its benefits only apply when doing things. Wearables/IoT devices spend virtually all of their time at idle, where a better instruction set won't help.
  • name99 - Tuesday, October 20, 2015

    To take an example, what little we know about the Apple Watch is that it is running a single core much like (identical to?) the A5 core (so 32-bit, ARMv7), severely underclocked, and probably most of the code as Thumb.
    Relevant to this topic, the watch GPU is apparently a PowerVR SGX543 (single core, the iOS A5's had an SGX543MP2).
    So basically the Apple Watch is about the equivalent of an A4/iPhone4 (slightly better CPU, but running quite a bit slower; better GPU but again running a lot slower).
  • MikhailT - Tuesday, October 20, 2015

    I think this is just the first basic GPU ARM is announcing now for 2017, but it is not meant for expensive watches, mostly for cheap embedded stuff like TVs and low/mid-end watches. For these, having 2x perf/W and low power is all ARM plans to do, and it would be using the same process node as today; it will be the cheapest SoC in 2017 on a mature 2x nm process.

    I suspect they have a design already in planning for 2017+ that will be more efficient than Mali-470 and cost more to produce on a newer 1x nm process.
  • jjj - Tuesday, October 20, 2015

    Never mentioned v7 or v8, that's not the point; the point was perf, power and area. Wearables are not just watches, glasses are that too, and those are far more difficult than watches. Been waiting for years to see if they properly address the market, apparently not just yet, and that might just give others (Intel, Imagination) a window of opportunity.
  • extide - Tuesday, October 20, 2015

    Well, the Cortex A5 is actually the most power efficient Cortex core, so I am surprised we haven't seen it in wearables. Although I don't think it can be used in a multi-processor config, which may or may not matter to some manufacturers, and could be a reason we see them utilizing A7's.
  • mczak - Tuesday, October 20, 2015

    Cortex-A5 can be easily used in multiprocessor configs - there are actually Snapdragons out there which do exactly that (4 Cortex-A5 in msm8225q). All Cortex-A chips can, with the exception of the very first one (Cortex-A8).
    Albeit I'm not sure it's really the most energy efficient one, at least once you factor in the necessary external (memory etc.) stuff Cortex-A7 might well be just as good.
