AMD's 2010/2011 Roadmap from the IT Professional’s Perspective
by Johan De Gelas on November 23, 2009 12:00 AM EST
Posted in IT Computing
Server CPUs in 2010
AMD’s best core in 2010 is a slightly improved revision of the current six-core Opteron “Istanbul” with the following additions:
• Finally a “real” C1E state, which reduces power for each core that is idling
• Support for DDR-3
In theory, DDR-3 1333 offers 66% higher bandwidth, but in practice the Stream benchmark does not measure more than a 25% boost in bandwidth. The latency of going off-die is about the same. That means that the performance increase in most server applications will not be tangible. Only the most bandwidth intensive HPC applications will get a boost of 10 to 20%.
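The 66% figure can be reproduced with a quick back-of-the-envelope calculation. The sketch below assumes the baseline is DDR2-800 (the article does not name it) and a standard 64-bit memory channel; the function and variable names are purely illustrative.

```python
# Peak theoretical bandwidth of a single 64-bit DDR memory channel.
# Assumption (not stated in the article): the comparison baseline is DDR2-800.

BUS_WIDTH_BYTES = 8  # a standard DDR channel is 64 bits wide


def peak_bandwidth_gbs(mega_transfers_per_s: float) -> float:
    """Return peak bandwidth in GB/s for a given transfer rate in MT/s."""
    return mega_transfers_per_s * 1e6 * BUS_WIDTH_BYTES / 1e9


ddr2_800 = peak_bandwidth_gbs(800)
ddr3_1333 = peak_bandwidth_gbs(1333)

# Relative gain on paper versus the gain the article reports from Stream.
theoretical_gain = ddr3_1333 / ddr2_800 - 1
measured_gain = 0.25  # "not more than a 25% boost", per the article

print(f"DDR2-800:  {ddr2_800:.2f} GB/s per channel")
print(f"DDR3-1333: {ddr3_1333:.2f} GB/s per channel")
print(f"Theoretical gain: {theoretical_gain:.1%}")
print(f"Share of theoretical gain seen in Stream: {measured_gain / theoretical_gain:.0%}")
```

Under these assumptions, well under half of the paper gain shows up in measured bandwidth, which is consistent with the modest application-level gains expected above.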
Currently, AMD's six-core Opteron can match the performance of Intel’s quadcore Xeon 5500 at the same clockspeed in some important server applications: OLAP databases, virtualization and web applications. Intel’s best Xeon wins by a significant margin in OLTP, ERP and rendering. A large part of the HPC market is a lost cause: a quadcore Intel Xeon 5570 at 2.93 GHz is about twice as fast as an AMD Opteron 2389 at 2.9 GHz. The fact that we could not find any Opteron 2435 results in LS-Dyna is another indication of what to expect: the 10-20% higher performance in HPC applications will not be a large step forward.
Intel is going to increase performance by 20-30% per CPU (50% more cores), while AMD’s CPUs will see only marginal increases. So basically, Intel’s performance advantage is going to grow by 20 to 30%, except in HPC workloads where it is already running circles around the competition. Not an enviable position to be in for AMD.
Suppose that you are the strategic brain behind AMD. The competition offers better “per chip” and “per core” performance. The last thing you want to do is to offer the same kind of server platform. If a six-core Opteron (“Lisbon”) goes head to head with a six-core Xeon (“Westmere EP”), it will not be pretty: the Intel chip will beat the AMD chip in performance and performance/watt (remember, Westmere EP is a 32 nm CPU). Despite this, AMD found some clever ways to make their server platforms interesting…
Cheaper 4-Socket Servers
“Know your enemies and know yourself”.
In which usage scenarios are Intel’s offerings less compelling? The Nehalem-EX is a powerful platform, but it is also a completely different one from the “Westmere EP” platform. The Nehalem-EX's most important market is the 4-socket/8-socket x86 market, where about 400,000 servers are sold per year, or about 5% of the total x86 server market. It is also a pretty complex platform, with two I/O hubs and 16 (!) memory buffer chips on a 4-socket board. The Nehalem-EX platform does not only want to conquer the high-end 4- and 8-socket x86 server market; it also wants to convince the more paranoid RISC and Itanium buyers.
Our first impression is that AMD will find it hard to win the high-end database and ERP market. The quadcore Nehalem 5500 already outperforms the six-core Opteron “Istanbul” by a large margin (30-50%). The Opteron 6100 also has 50% more cores, but it is likely that a “native octalcore” will scale a bit better than a 2 x 6-core design. For the virtualization market, the higher number of DIMM slots is an advantage for the Nehalem-EX. At first sight, it looks like it will be pretty tough for AMD to regain market share in this part of the server market.
107 Comments
Zool - Thursday, November 26, 2009
The desktop Phenom II X4 925 in 1000-unit quantities is 145 USD on AMD's site. In the Opteron 8300 series (similar cache and die area to the Phenom II), the lowest-priced model is 523 USD and the highest-priced model, the Quad-Core AMD Opteron 8393 SE, costs 2649 USD. The wafer cost for the 145 USD CPU is the same as for the 2649 USD CPU. If the die areas are similar, then the actual manufacturing costs (same machine usage, same workforce, etc.) should be almost identical.
So now they are selling the Phenom II X4 925 for 145 USD, and I assume that they have some margin even on these models. So let's say 25 USD is the margin and 120 USD the cost.
So for the Quad-Core AMD Opteron 8393 SE the margin will be 2529 USD. Now wait a moment biatch. THAT'S 101 TIMES more than for the almost identical Phenom II X4 925. For an average Opteron they get around 50 times more money than for the same low-end desktop part. The same story goes for Intel server CPUs.
No wonder they can SHIT on low-cost desktop CPUs. The whole roadmap is a mess of cores and manufacturing processes for server CPU derivatives.
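The arithmetic in the comment above can be retraced in a few lines. This is only a sketch of the commenter's own reasoning: the 120 USD per-chip cost is the commenter's guess, not a published figure, and the variable names are illustrative.

```python
# List prices quoted in the comment (USD, 1000-unit quantities).
PHENOM_II_X4_925_PRICE = 145
OPTERON_8393_SE_PRICE = 2649

# Commenter's guess, not a known figure: both chips cost the same to make.
ASSUMED_COST_PER_CHIP = 120

phenom_margin = PHENOM_II_X4_925_PRICE - ASSUMED_COST_PER_CHIP
opteron_margin = OPTERON_8393_SE_PRICE - ASSUMED_COST_PER_CHIP
margin_ratio = opteron_margin / phenom_margin

print(f"Phenom II X4 925 margin: {phenom_margin} USD")
print(f"Opteron 8393 SE margin:  {opteron_margin} USD")
print(f"Margin ratio: about {margin_ratio:.0f}x")
```

The ratio is very sensitive to the guessed cost: because the Phenom margin is so small, moving the assumed cost by a few dollars changes the result considerably.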
vsary6968 - Thursday, November 26, 2009
Show me the benchmark where the Nehalem-EX beats Magny-Cours. So don't state something that is not out yet. This is hurting other forum thread
james775 - Tuesday, November 24, 2009
is now up and available at: http://www.amdzone.com/phpbb3/viewtopic.php?f=52&a...
Chlorus - Tuesday, November 24, 2009
I'm sure a website titled "AMDZone" will be objective and nonbiased.
james775 - Tuesday, November 24, 2009
sure, it's unbiased just like this article. http://bit.ly/8BX9UG
happy? =))
james775 - Tuesday, November 24, 2009
http://bit.ly/6Id6y0
Zool - Tuesday, November 24, 2009
Huh, "The extra integer core (schedulers, D-cache and pipelines) adds only 5% die space". They finally found out that the amount of overhead they add to each execution core, which actually does the real work (something like 1/5 of the core logic die size), is not worth duplicating x times with each core.
Maybe if they made the pipelines much shorter and added only very basic prefetch, decode and branch prediction logic, the amount of performance for the transistor budget would be quite shocking.
I mean, how much slower would an AMD Thunderbird core at 4 GHz be than a current single Nehalem core?
If you download this CPU test program with the results (link: http://testcpu.webz.cz/index.htm) you can compare your result with old CPUs. The program is quite old, but that means it's quite fair too.
A single core Wolfdale 3.2 GHz:
Dhry=10712575
Whet=2372478
Mips=7160629
Mflops=995667
AMD Athlon 1100 MHz (22 million transistors):
Dhry=2220351
Whet=692956
Mips=2382066
Mflops=300902
That's a Wolfdale around 60% faster at the same clocks than the 22 million transistor CPU (need to note that the L2 cache was on the CPU board :) )
Just want to say that the several times more complex logic and die size increase gives you quite disappointing results.
So someone out there could finally make real low-power, high-frequency CPUs and not chase CPU cores.
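The "same clocks" comparison in the comment above can be checked by dividing each score by the clock speed. The sketch below uses only the scores posted in the comment and assumes that simple linear scaling with frequency is a fair normalization, which is itself a simplification; the names are illustrative.

```python
# Scores as posted in the comment, with clock speed in MHz.
scores = {
    "wolfdale": {"mhz": 3200, "Dhry": 10712575, "Whet": 2372478,
                 "Mips": 7160629, "Mflops": 995667},
    "athlon": {"mhz": 1100, "Dhry": 2220351, "Whet": 692956,
               "Mips": 2382066, "Mflops": 300902},
}

TESTS = ["Dhry", "Whet", "Mips", "Mflops"]


def per_mhz(cpu: str, test: str) -> float:
    """Score divided by clock speed (assumes linear scaling with frequency)."""
    return scores[cpu][test] / scores[cpu]["mhz"]


# Per-clock advantage of the Wolfdale over the Athlon, as a fraction.
advantage = {
    test: per_mhz("wolfdale", test) / per_mhz("athlon", test) - 1
    for test in TESTS
}

for test in TESTS:
    print(f"{test:7s} per-clock advantage: {advantage[test]:+.1%}")
```

On these numbers, only the Dhrystone score lands in the neighborhood of the quoted 60%; the Whetstone, Mips and Mflops scores show a much smaller per-clock gap.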
freezervv - Wednesday, November 25, 2009
"The program is quite old but that means its quite fair too."
"Just want to say that the several times more complex logic and die size increase gives you quite disapointing results."
Umm, isn't that why people in the real world use efficient ISA extensions?
Zool - Wednesday, November 25, 2009
"Umm, isn't that why people in the real world use efficient ISA extensions?"
Pentium 3 already had SSE with 128-bit registers. Upgrading to SSE3 wouldn't be a big deal. Intel Atom supports everything up to SSSE3.
Zool - Wednesday, November 25, 2009
"The program is quite old but that means its quite fair too."
The problem is that comparing old CPUs to current ones only works in old programs that have minimal external bandwidth requirements, or in some minimal command prompt tests. If you tested the AMD 1100 MHz and the Core 2 Duo Wolfdale in, for example, Cinebench 10, the difference would be much bigger. The AMD 1100 can't keep up with the 10+ times higher external memory bandwidth of the Core 2 Duo. The situation would be the same in real-world applications: with such slow external bandwidth the old CPUs are very slow, but that doesn't mean the IPC is that much lower.
I just want to say that AMD and Intel had several years to release a normal low-power CPU without the insane die overhead of current CPUs. And they did a big nothing. It could reach 70-80 percent of core performance for a fraction of the current die area (the rest could be gained through a 30% frequency increase :) ).
The current CPU designs increase IPC by 20-30 percent through an insane amount of complications and die size, when they could just increase frequency by that amount with the right cheap design.