Why Are Newer Generations of Processors Faster at the Same Clock Speed?

You might be curious how newer generations of processors manage to be faster than older processors running at the same clock speed. Is it just changes in physical architecture, or is it something more? Today’s SuperUser Q&A post has the answers to a curious reader’s questions.

Today’s Question & Answer session comes to us courtesy of SuperUser—a subdivision of Stack Exchange, a community-driven grouping of Q&A web sites.

Photo courtesy of Rodrigo Senna (Flickr).

The Question

SuperUser reader agz wants to know why newer generations of processors are faster at the same clock speed:

Why, for example, would a 2.66 GHz dual-core Core i5 be faster than a 2.66 GHz Core 2 Duo, which is also dual-core?

Is this because of newer instructions that can process information in fewer clock cycles? What other architectural changes are involved?

The Answer

SuperUser contributors David Schwartz and Breakthrough have the answer for us. First up, David Schwartz:

Usually, it is not because of newer instructions. The processor simply needs fewer clock cycles to execute the same instructions. There can be many reasons for this:

  1. Larger caches mean less time wasted waiting for memory.
  2. More execution units mean less time waiting to start operating on an instruction.
  3. Better branch prediction means less time wasted speculatively executing instructions that never actually need to be executed.
  4. Execution unit improvements mean less time waiting for instructions to complete.
  5. Shorter pipelines mean the pipeline fills up faster.

And so on.
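
The common thread in the list above is that each improvement lowers the average number of clock cycles spent per instruction (CPI). A minimal sketch of the standard relation, execution time = instruction count × CPI ÷ clock frequency, shows how two chips at the same clock speed can differ in speed. The CPI values and workload size below are hypothetical, chosen only for illustration; they are not measurements of any real Core 2 Duo or Core i5.

```python
def execution_time_seconds(instructions: int,
                           cycles_per_instruction: float,
                           clock_hz: float) -> float:
    """CPU time = instruction count * average CPI / clock frequency."""
    return instructions * cycles_per_instruction / clock_hz


CLOCK_HZ = 2.66e9                # same clock speed for both chips, as in the question
INSTRUCTIONS = 10_000_000_000    # hypothetical workload size

# Hypothetical average CPI values (assumptions, not measured figures).
older_cpi = 1.5   # more cycles lost to cache misses, stalls, mispredictions
newer_cpi = 1.0   # fewer wasted cycles for the same instruction stream

older_time = execution_time_seconds(INSTRUCTIONS, older_cpi, CLOCK_HZ)
newer_time = execution_time_seconds(INSTRUCTIONS, newer_cpi, CLOCK_HZ)
speedup = older_time / newer_time

print(f"older chip: {older_time:.2f} s")
print(f"newer chip: {newer_time:.2f} s")
print(f"speedup at identical clock speed: {speedup:.2f}x")
```

Because the clock frequency and the instruction count are identical in both calls, the whole speedup comes from the lower CPI, which is the point the answer makes.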

Followed by the answer from Breakthrough:

The definitive reference is the Intel 64 and IA-32 Architectures Software Developer's Manuals. They detail the changes between architectures and are a great resource for understanding the x86 architecture.

I would recommend that you download the combined volumes 1 through 3C (first download link on the page linked above). Volume 1, Chapter 2.2 has the information you want.

Some general differences listed in that chapter, going from the Core to the Nehalem/Sandy Bridge microarchitectures, are:

  • Improved branch prediction, quicker recovery from misprediction
  • Hyper-Threading Technology
  • Integrated memory controller, new cache hierarchy
  • Faster floating-point exception handling (Sandy Bridge only)
  • LEA bandwidth improvement (Sandy Bridge only)
  • AVX instruction extensions (Sandy Bridge only)

The complete list can be found in the link provided above (Volume 1, Chapter 2.2).

Have something to add to the explanation? Sound off in the comments. Want to read more answers from other tech-savvy Stack Exchange users? Check out the full discussion thread here.