The Great Race, Glitches in the Matrix, and Tiny Bubbles
07/25/03

We ended last time asking the burning question: How can PowerPC “G4” processors in Macintosh computers top out at only 1.4 GHz, yet provide performance equal to their Pentium counterparts having twice the clock speed? And, in some applications, even outperform the Pentium? As I hinted, it’s all about the pipeline.

Remember we said that data and instructions flow through the processor, much as water flows through a pipe. Pentiums and Athlons have narrow, long pipelines. For simplicity’s sake, let’s say their pipelines have twenty stages, a stage being a spot for one piece of data as it flows through. The G4 processor has a wider pipeline with fewer stages; let’s say seven. For this analogy, the data has to go all the way through the pipeline, one stage at a time, to come out the other end as results we as users can see: a resized photo, search results from a database, and so on. Also, picture the data moving to the next pipeline stage with every tick of the processor clock.

Let’s follow the two processors as they do their thing, in a side-by-side drag race. For the race, we’ll assume that both the G4 and Pentium are running at the same clock speed of 1 GHz.

Ready, set, go! Data and instructions start flowing down each processor’s pipeline. The G4, with only seven pipeline stages, begins delivering usable data while the Pentium is not quite halfway through filling its twenty-stage pipeline. So, if you’re Intel or AMD, how do you compensate? You increase the clock speed. If your Pentium or Athlon is now running at 2 GHz, the data moves through the pipeline faster, and the processors deliver usable data at roughly the same time. And, if you bump the clock speed up to 3 GHz, the longer pipeline, once it gets going, should win the race handily.
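
For readers who like to check the arithmetic, here is a minimal Python sketch of the drag race. It assumes the simplified model above: one new piece of data enters the pipeline on every clock tick, each piece advances one stage per tick, and nothing ever goes wrong. The function names are made up for this illustration, and the stage counts (7 and 20) are the round numbers used above, not exact hardware figures.

```python
def first_result_ns(stages, clock_ghz):
    """Nanoseconds until the first result leaves an empty pipeline.

    At 1 GHz one tick lasts 1 ns, so ticks / GHz gives nanoseconds.
    """
    return stages / clock_ghz


def time_for_items_ns(stages, clock_ghz, items):
    """Nanoseconds to push `items` pieces of data through with no bubbles.

    The first item needs `stages` ticks; each later item follows one tick behind.
    """
    return (stages + items - 1) / clock_ghz


if __name__ == "__main__":
    # Assumed round numbers from the analogy: G4 = 7 stages, Pentium = 20 stages.
    print("G4      @ 1 GHz:", first_result_ns(7, 1.0), "ns")   # 7.0 ns
    print("Pentium @ 1 GHz:", first_result_ns(20, 1.0), "ns")  # 20.0 ns
    print("Pentium @ 2 GHz:", first_result_ns(20, 2.0), "ns")  # 10.0 ns
    print("Pentium @ 3 GHz:", round(first_result_ns(20, 3.0), 1), "ns")

    # Once the pipelines are full, clock speed decides the race.
    print("100 items, G4 @ 1 GHz:     ", time_for_items_ns(7, 1.0, 100), "ns")
    print("100 items, Pentium @ 3 GHz:", round(time_for_items_ns(20, 3.0, 100), 1), "ns")
```

In this toy model the 2 GHz Pentium delivers its first result at 10 ns against the G4’s 7 ns, and at 3 GHz it gets there slightly ahead (about 6.7 ns) and then turns out three results for every one from the G4.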

Theoretically. In a perfect world. However, in processing data, every processor runs into “glitches in the Matrix,” or “bubbles” if you will, that disrupt the data in the pipeline. Some bubbles just create an empty space that moves through the pipeline. Others wipe out all data in the pipeline, requiring the pipeline to fill up completely again before delivering usable data out the other end. You can see that with a shorter pipeline, these bubbles will have less effect, because the shorter pipeline fills up and recovers faster. Tellingly, Intel’s “next generation” Itanium processor (already a generation behind the Mac’s new G5 processor, but I digress) has roughly half the pipeline stages of the Pentium 4.
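
The cost of a pipeline-wiping bubble can be sketched in a few lines of Python. This is a toy model, not a measurement: it assumes every flush empties the whole pipeline, so the next result arrives a full `stages` ticks later instead of one tick later, an extra cost of `stages - 1` ticks. The function name and the workload numbers in the example are hypothetical, and the stage counts are the round figures from the analogy.

```python
def run_time_ns(stages, clock_ghz, items, flushes):
    """Nanoseconds to process `items` pieces of data with `flushes` full wipes.

    Toy model (an assumption, not real hardware behavior):
      - filling the empty pipeline and draining `items` takes stages + items - 1 ticks
      - each flush adds stages - 1 wasted ticks while the pipeline refills
    """
    ticks = stages + items - 1 + flushes * (stages - 1)
    return ticks / clock_ghz


if __name__ == "__main__":
    items, flushes = 1000, 100  # made-up workload: one flush per ten items

    # No bubbles: the 2 GHz, 20-stage pipeline is nearly twice as fast.
    print("G4      @ 1 GHz, clean:", run_time_ns(7, 1.0, items, 0), "ns")
    print("Pentium @ 2 GHz, clean:", run_time_ns(20, 2.0, items, 0), "ns")

    # With bubbles: the long pipeline pays far more ticks per flush.
    print("G4      @ 1 GHz, bubbles:", run_time_ns(7, 1.0, items, flushes), "ns")
    print("Pentium @ 2 GHz, bubbles:", run_time_ns(20, 2.0, items, flushes), "ns")
```

Each flush costs the seven-stage pipeline 6 extra ticks and the twenty-stage pipeline 19. With the made-up workload above, that is 600 wasted ticks against 1,900, which shows how quickly bubbles eat into the long pipeline’s clock-speed advantage.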

Increasing clock speed creates two problems for processors: heat and power consumption. In a desktop, all that heat requires noisy, power-hungry fans to prevent overheating. Not a problem when you’re plugged into an electrical outlet. In a laptop, however, a fan draws lots of precious battery power, not to mention the effect all that hot air has on your lap. Higher clock speeds also require more electricity, again draining the battery faster. In fact, the Pentium M processors in Intel’s new Centrino laptops all run at slower clock speeds than their desktop siblings to prolong battery life and decrease heat.

Historically, buses, which handle data flow into and out of the processor, have been a data “bottleneck”. Processor clock speed means little if the data is slow getting into and out of it. Now, bus speeds are up to one-half the processor clock speed, and the bottleneck is widening if not disappearing. RAM that can receive and deliver data faster also speeds things up. As in life itself, all the players on the CPU team make a difference.

© 2003 Peter F. Zimowski