The advent of economical consumer-grade multi-core processors raises a question for many users: how do you effectively calculate the real speed of a multi-core system? Is a 4-core 3GHz system really 12GHz? Read on as we investigate.

Today’s Question & Answer session comes to us courtesy of SuperUser, a subdivision of Stack Exchange, a community-driven grouping of Q&A web sites.

The Question

SuperUser reader NReilingh was curious how the processor speed for a multi-core system is actually calculated:

Is it correct to say, for example, that a processor with four cores each running at 3GHz is in fact a processor running at 12GHz?

I once got into a "Mac vs. PC" argument (which by the way is NOT the focus of this topic... that was back in middle school) with an acquaintance who insisted that Macs were only being advertised as 1GHz machines because they were dual-processor G4s each running at 500MHz.

At the time I knew this to be hogwash for reasons I think are apparent to most people, but I just saw a comment on this website to the effect of "6 cores x 0.2GHz = 1.2GHz" and that got me thinking again about whether there's a real answer to this.

So, this is a more-or-less philosophical/deep technical question about the semantics of clock speed calculation. I see two possibilities:

  1. Each core is in fact doing x calculations per second, thus the total number of calculations is x(cores).
  2. Clock speed is rather a count of the number of cycles the processor goes through in the space of a second, so as long as all cores are running at the same speed, the speed of each clock cycle stays the same no matter how many cores exist. In other words, Hz = (core1Hz+core2Hz+...)/cores.

So what is the appropriate way to denote the total clock speed and, more importantly, is it even possible to use single-core speed nomenclature on a multi-core system?

The Answer

SuperUser contributor Mokubai helps clear things up. He writes:

The main reason why a quad-core 3GHz processor is never as fast as a 12GHz single core has to do with how the task running on that processor works, i.e. whether it is single-threaded or multi-threaded. Amdahl's Law is important when considering the types of tasks you are running.

If you have a task that is inherently linear and has to be done precisely step by step, such as this grossly simple program:

        10: a = a + 1
        20: goto 10

Then the task depends highly on the result of the previous pass and cannot run multiple copies of itself without corrupting the value of 'a', as each copy would read 'a' at a different time and write it back differently. This restricts the task to a single thread, so it can only ever run on a single core at any given time; if it ran on multiple cores, the synchronisation corruption would happen. This limits it to 1/2 of the CPU power of a dual-core system, or 1/4 of a quad-core system.
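To make that corruption concrete, here is a minimal Python sketch that replays, deterministically, one way two copies of the loop body could interleave. The function names and the exact ordering of reads and writes are assumptions made for illustration; real hardware interleavings are far less tidy.

```python
def interleaved_increments(a):
    """Simulate two copies of `a = a + 1` whose reads and writes interleave."""
    # Both copies read the shared value before either writes it back.
    copy1_read = a
    copy2_read = a
    # Each copy adds one to the stale value it read and writes the result.
    a = copy1_read + 1
    a = copy2_read + 1  # overwrites the first copy's update
    return a


def sequential_increments(a):
    """Run the same two increments strictly one after the other."""
    a = a + 1
    a = a + 1
    return a


lost = interleaved_increments(0)
correct = sequential_increments(0)
print(lost, correct)
```

Starting from 0, the interleaved version ends at 1 while the sequential version ends at 2: one of the two increments is silently lost.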

Now take a task such as:

        10: a = a + 1
        20: b = b + 1
        30: c = c + 1
        40: d = d + 1
        50: goto 10

All of these lines are independent and could be split into 4 separate programs like the first and run at the same time, each one able to make effective use of the full power of one of the cores without any synchronisation problem. This is where Amdahl's Law comes into it.
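As a rough sketch of that split, the Python below gives each counter its own worker so that no worker ever touches another's value. The names `count_up` and `run_split`, the starting values, and the fixed number of passes are all assumptions for illustration (the original loop never ends). In CPython, threads do not run CPU-bound code in parallel because of the global interpreter lock, so this shows that the split is safe rather than that it is faster; `ProcessPoolExecutor` would be the usual choice for real parallelism.

```python
from concurrent.futures import ThreadPoolExecutor


def count_up(start, passes):
    """One independent 'program': increment a private counter `passes` times."""
    value = start
    for _ in range(passes):
        value = value + 1
    return value


def run_split(starts, passes):
    """Give each counter its own worker; results come back in input order."""
    with ThreadPoolExecutor(max_workers=len(starts)) as pool:
        return list(pool.map(lambda s: count_up(s, passes), starts))


# a, b, c and d start at 0, 10, 20 and 30 and each takes 1000 passes.
a, b, c, d = run_split([0, 10, 20, 30], 1000)
print(a, b, c, d)
```

Because each worker owns its counter outright, the results match what a single sequential loop would have produced.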

So if you have a single-threaded application doing brute-force calculations, the single 12GHz processor would win hands down. If you can somehow split the task into separate parts and make it multi-threaded, then the 4 cores could come close to, but not quite reach, the same performance, as per Amdahl's Law.
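Amdahl's Law puts a number on this: if a fraction p of a task can be spread over n cores, the best possible speed-up over one core is 1 / ((1 - p) + p / n). The sketch below applies it to the 4 x 3GHz case. The helper names are hypothetical, and the "equivalent GHz" figure assumes every core does the same work per clock cycle as the imagined single core, which ignores real differences between processor designs.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Best-case speed-up over one core when only part of the work can be split."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def effective_ghz(core_ghz, cores, parallel_fraction):
    """Single-core clock that would match the multi-core chip on this workload."""
    return core_ghz * amdahl_speedup(parallel_fraction, cores)


for p in (0.0, 0.5, 0.95, 1.0):
    ghz = effective_ghz(3.0, 4, p)
    print(f"parallel fraction {p:.2f}: about {ghz:.2f} GHz equivalent")
```

Under these assumptions the quad-core 3GHz chip behaves like a single core of about 3GHz, 4.8GHz, 10.43GHz and 12GHz respectively, so it only matches the 12GHz processor when the task is perfectly parallel.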

The main thing that a multi-CPU system gives you is responsiveness. On a single-core machine that is working hard, the system can seem sluggish or juddery, because most of the time may be taken up by one task while the other tasks only run in short bursts in between. On a multi-core system the heavy task gets one core and all the other tasks play on the other cores, doing their jobs quickly and efficiently.

The argument of "6 cores x 0.2GHz = 1.2GHz" is rubbish in every situation except where tasks are perfectly parallel and independent. There are a good number of tasks that are highly parallel, but they still require some form of synchronisation. Handbrake is a video transcoder that is very good at using all the CPUs available, but it does require a core process to keep the other threads filled with data and collect the data that they are done with.
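The general shape of that arrangement, one coordinator handing out work and gathering results, can be sketched in Python as follows. This is not Handbrake's actual implementation; the `worker` and `coordinate` names, the queue-based design, and the sum-of-squares stand-in for real work are all assumptions chosen to keep the example short.

```python
import queue
import threading


def worker(jobs, results):
    """Pull chunks until the coordinator sends the None stop marker."""
    while True:
        item = jobs.get()
        if item is None:
            break
        index, chunk = item
        results.put((index, sum(x * x for x in chunk)))  # stand-in for real work


def coordinate(chunks, num_workers):
    """Feed chunks to the workers, then collect results back in input order."""
    jobs, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(jobs, results))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for index, chunk in enumerate(chunks):
        jobs.put((index, chunk))
    for _ in threads:
        jobs.put(None)  # one stop marker per worker
    for t in threads:
        t.join()
    collected = dict(results.get() for _ in chunks)
    return [collected[i] for i in range(len(chunks))]


totals = coordinate([[1, 2], [3, 4], [5, 6]], num_workers=2)
print(totals)
```

The time the coordinator spends handing out and reassembling chunks is exactly the serial portion that stops the cores' clock speeds from simply adding up.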

  1. Each core is in fact doing x calculations per second, thus the total number of calculations is x(cores).

Each core is capable of doing x calculations per second, assuming the workload is suitably parallel; on a linear program all you have is 1 core.

  2. Clock speed is rather a count of the number of cycles the processor goes through in the space of a second, so as long as all cores are running at the same speed, the speed of each clock cycle stays the same no matter how many cores exist. In other words, Hz = (core1Hz+core2Hz+...)/cores.

I think it is a fallacy to think that 4 x 3GHz = 12GHz. Granted, the maths works, but you're comparing apples to oranges and the sums just aren't right: GHz can't simply be added together for every situation. I would change it to 4 x 3GHz = 4 x 3GHz.

Have something to add to the explanation? Sound off in the comments. Want to read more answers from other tech-savvy Stack Exchange users? Check out the full discussion thread here.