AMD PHENOM 9550 x4 vs Intel Core 2 Quad Q6600 Review

This article compares the AMD Phenom 9550 to the Intel Core 2 Quad Q6600. These are first-generation, entry-level quad-core processors from AMD and Intel.

It’s kind of hard to believe how old these CPUs are now, at least in computer time. Intel’s “Conroe” architecture hit the market around August of 2006, taking the performance crown that AMD had held for quite a long time. Although Intel was the first to release a quad-core (four-core) CPU via the MCM approach, AMD was the first to release a quad-core based on a native design. Both approaches have advantages and disadvantages, some of which are covered below, but in the end performance and price are what mattered to the end user.

AMD Phenom “Agena” Multicore Architecture

AMD decided its first-generation quad-core would be a native design wrapped around an updated core architecture. AMD codenamed the consumer version “Agena” and the server version “Barcelona”. This was the first native quad-core design for the x86 platform. A big advantage of the native design is that all the cores can communicate on die rather than through some other, usually less optimal, method such as a shared bus.

Overview of the Phenom “Agena/Barcelona” Architecture with a new to x86 shared 3rd level cache

Phenom introduced a new concept to x86: a shared last-level victim cache for multicore CPUs. Each core has smaller, private primary caches, and all the cores share a large pool of last-level cache. Intel would adopt the same general layout in its “Nehalem” and “Lynnfield” designs one to two years later. A big weakness of AMD’s first quad-core is that the 65nm process allowed very little Level 3 cache: just 2 MB for all four cores to share. This can cause contention any time the cores need to reach beyond their private caches and grab data from the relatively small L3. To give an idea, the Phenom’s last-level cache, shared by all four cores, is half the size of Conroe’s Level 2 cache, which only two cores have to share.

Another very important side effect for AMD itself was that each die has 450 million transistors. More transistors mean a larger die, and a larger die has a greater chance of containing a defect, so yields, costs, and profit margins all suffer. That’s a big reason why AMD released multiple tri-core and dual-core processors: it could simply disable the faulty portions of a die and sell it as a triple- or dual-core part instead of trashing it.

While history shows that the Phenom had the basic overall architectural design correct from the start and was ahead of its time, the 65nm process simply would not allow AMD to outfit the new design the way it needed.
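To put the cache comparison above in per-core terms, here is a small sketch using only the sizes quoted in this article. The even split between cores is a simplifying assumption; a real shared cache is allocated dynamically, so treat the result as a rough yardstick rather than a measurement. The function name `cache_per_core_mb` is made up for this example.

```python
def cache_per_core_mb(total_cache_mb: float, sharing_cores: int) -> float:
    """Return the even per-core share of a shared cache, in MB.

    Assumption: the cache is divided equally among the cores that share it.
    """
    return total_cache_mb / sharing_cores


# Figures quoted in the article.
phenom_l3_share = cache_per_core_mb(2.0, 4)  # 2 MB L3 shared by 4 cores
conroe_l2_share = cache_per_core_mb(4.0, 2)  # 4 MB L2 shared by 2 cores

print(f"Phenom L3 share per core: {phenom_l3_share} MB")
print(f"Conroe L2 share per core: {conroe_l2_share} MB")
print(f"Conroe share is {conroe_l2_share / phenom_l3_share:.0f}x larger")
```

Under that assumption, each Phenom core gets a quarter of the shared last-level cache that each Conroe core gets, even though the total is only half as large.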


Intel Core 2 Quad “Kentsfield” Multicore Architecture

As the image shows, Kentsfield uses two Conroe dies with 4 MB of L2 cache each. (“Clovertown” is the codename for the Xeon workstation/server version.)

Instead of starting from scratch and creating a native quad-core architecture from the ground up like AMD did, Intel decided to take a more conservative approach. It basically took two Conroe dual-core dice (dies) and placed them on a single processor package called an MCM (Multi-Chip Module). If you remember, a couple of years or so earlier Intel took a similar approach with the Socket 775, Pentium 4 based Pentium D dual-core processors. “Presler” placed two single-core 65nm “Cedar Mill” dice on a single processor package, while the earlier “Smithfield” paired two “Prescott” cores.

Both the quad-core and dual-core processors using this MCM design have advantages and disadvantages. A big advantage is time to market: designing a package that joins a second die takes far less time than designing a new native die. Another advantage is yields. A design built from two smaller dies, each with fewer transistors, yields much better than a single large die.
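The yield argument can be illustrated with a simple Poisson yield model, where the fraction of defect-free dies falls off exponentially with die area. This is only a sketch: the defect density and die areas below are hypothetical placeholders, not figures from AMD or Intel, and the model ignores salvaging partly defective dies as tri-core or dual-core parts.

```python
import math


def poisson_yield(die_area_cm2: float, defects_per_cm2: float) -> float:
    """Fraction of defect-free dies under a simple Poisson yield model."""
    return math.exp(-die_area_cm2 * defects_per_cm2)


# Hypothetical inputs, chosen only for illustration.
DEFECTS_PER_CM2 = 0.5                      # assumed defect density
QUAD_DIE_AREA_CM2 = 2.8                    # assumed area of a native quad-core die
DUAL_DIE_AREA_CM2 = QUAD_DIE_AREA_CM2 / 2  # assume a dual-core die is half that

native_yield = poisson_yield(QUAD_DIE_AREA_CM2, DEFECTS_PER_CM2)
dual_yield = poisson_yield(DUAL_DIE_AREA_CM2, DEFECTS_PER_CM2)

# Silicon area spent, on average, per working quad-core product.
# The MCM needs two good dual-core dies, assumed to be tested before packaging.
native_area_per_good = QUAD_DIE_AREA_CM2 / native_yield
mcm_area_per_good = 2 * DUAL_DIE_AREA_CM2 / dual_yield

print(f"Native quad-core die yield: {native_yield:.1%}")
print(f"Dual-core die yield:        {dual_yield:.1%}")
print(f"Area per good native quad:  {native_area_per_good:.2f} cm^2")
print(f"Area per good MCM quad:     {mcm_area_per_good:.2f} cm^2")
```

Whatever values are plugged in, the smaller die always yields better in this model, which is the effect the MCM approach takes advantage of.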

On the other hand, this design has some weaknesses as well. Since the two dies cannot communicate on die the way the cores in AMD’s design can, overall performance and performance scaling could suffer. To communicate with each other, the dies use the comparatively weak Front Side Bus. Intel’s FSB (Front Side Bus) is a shared bus that connects the processor to the chipset’s Northbridge and, through it, to the rest of the components. The Northbridge itself houses the AGP/PCI Express hub and the memory controller. The bus frequency for the original Kentsfield CPUs is 266 MHz, and the quad-pumped design transfers data four times per clock, for an effective rate equivalent to 1066 MHz, which is the number Intel uses in its marketing. So the FSB, which already has to handle memory, AGP/PCI Express, and other traffic, now also has to carry the communication between the two dies.

Another possible limitation of routing core-to-core traffic over the FSB is latency. Not only the clock speed of the FSB but also the distance the messages have to travel could limit the quad-core’s performance and scaling. The longer it takes for the data a core needs to arrive, the longer that core is stalled.
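The FSB numbers above work out as follows. This sketch assumes the nominal base clock of 266.67 MHz (rounded to 266 MHz in the text) and a 64-bit (8-byte) data bus; the bus width is not stated in this article and is included here as an assumption about Intel’s FSB. The result is a theoretical peak, not the throughput actually achieved.

```python
BASE_CLOCK_MHZ = 266.67   # nominal FSB base clock (rounded to 266 in the text)
TRANSFERS_PER_CLOCK = 4   # "quad pumped": four data transfers per clock
BUS_WIDTH_BYTES = 8       # assumed 64-bit data bus (not stated in the article)

# Effective transfer rate in millions of transfers per second (MT/s),
# the figure marketed as "1066 MHz".
effective_mt_s = BASE_CLOCK_MHZ * TRANSFERS_PER_CLOCK

# Theoretical peak bandwidth in GB/s (decimal gigabytes).
peak_gb_s = effective_mt_s * BUS_WIDTH_BYTES / 1000

print(f"Effective rate: about {effective_mt_s:.0f} MT/s")
print(f"Peak bandwidth: about {peak_gb_s:.1f} GB/s")
```

Under these assumptions the peak comes to roughly 8.5 GB/s, and that single budget has to cover memory, AGP/PCI Express, and die-to-die traffic combined.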
