How Do You Calculate Processor Speed on Multi-core Processors?

The advent of economical consumer-grade multi-core processors raises the question for many users: how do you effectively calculate the real speed of a multi-core system? Is a 4-core 3GHz system really 12GHz? Read on as we investigate.

Today’s Question & Answer session comes to us courtesy of SuperUser, a subdivision of Stack Exchange, a community-driven grouping of Q&A web sites.

The Question

SuperUser reader NReilingh was curious how the processor speed for a multi-core system is actually calculated:

Is it correct to say, for example, that a processor with four cores each running at 3GHz is in fact a processor running at 12GHz?

I once got into a “Mac vs. PC” argument (which by the way is NOT the focus of this topic… that was back in middle school) with an acquaintance who insisted that Macs were only being advertised as 1GHz machines because they were dual-processor G4s each running at 500MHz.

At the time I knew this to be hogwash for reasons I think are apparent to most people, but I just saw a comment on this website to the effect of “6 cores x 0.2GHz = 1.2GHz” and that got me thinking again about whether there’s a real answer to this.

So, this is a more-or-less philosophical/deep technical question about the semantics of clock speed calculation. I see two possibilities:

1. Each core is in fact doing x calculations per second, thus the total number of calculations is x(cores).
2. Clock speed is rather a count of the number of cycles the processor goes through in the space of a second, so as long as all cores are running at the same speed, the speed of each clock cycle stays the same no matter how many cores exist. In other words, Hz = (core1Hz+core2Hz+…)/cores.

So what is the appropriate way to denote the total clock speed and, more importantly, is it even possible to use single-core speed nomenclature on a multi-core system?

The Answer

SuperUser contributor Mokubai helps clear things up. He writes:

The main reason why a quad-core 3GHz processor is never as fast as a 12GHz single core is to do with how the task running on that processor works, i.e. single-threaded or multi-threaded. Amdahl’s Law is important when considering the types of tasks you are running.

If you have a task that is inherently linear and has to be done precisely step-by-step such as (a grossly simple program)

``10: a = a + 1``
``20: goto 10 ``

Then the task depends heavily on the result of the previous pass and cannot run multiple copies of itself without corrupting the value of `'a'`, as each copy would read `'a'` at a different time and write it back differently. This restricts the task to a single thread, so it can only ever run on a single core at any given time; if it were spread across multiple cores, the synchronisation corruption would occur. This limits it to 1/2 of the CPU power of a dual-core system, or 1/4 of a quad-core system.
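The corruption described above can be illustrated with a small Python sketch. This is a hand-written simulation rather than real multi-threading: the function names and the `shared` dictionary are made up for illustration, and the interleaving of the two copies is spelled out explicitly so the outcome is always the same.

```python
# Deterministic simulation of two copies of "a = a + 1" sharing one variable.
# No real threads are used; the unlucky interleaving is written out by hand.

def run_interleaved():
    shared = {"a": 0}

    # Both copies read the current value before either writes it back.
    copy1_read = shared["a"]      # copy 1 sees 0
    copy2_read = shared["a"]      # copy 2 also sees 0

    shared["a"] = copy1_read + 1  # copy 1 writes 1
    shared["a"] = copy2_read + 1  # copy 2 overwrites it with 1 again

    return shared["a"]


def run_sequential():
    a = 0
    for _ in range(2):
        a = a + 1                 # each pass depends on the previous result
    return a


print(run_interleaved())  # 1: one of the two increments was lost
print(run_sequential())   # 2
```

Two increments were requested in both cases, but the interleaved version only records one, which is why a task like this has to stay on a single thread unless it is given some form of locking.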

Now take a task such as:

``10: a = a + 1``
``20: b = b + 1``
``30: c = c + 1``
``40: d = d + 1``
``50: goto 10 ``

All of these lines are independent and could be split into 4 separate programs like the first and run at the same time, each one able to make effective use of the full power of one of the cores without any synchronisation problem. This is where Amdahl’s Law comes into it.
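As a rough sketch of that split, the four counters can be handed to four workers that share nothing. The names `count_up` and `run_independent` are hypothetical, and there is one important caveat: in standard CPython the global interpreter lock keeps threads from executing Python code on several cores at once, so this shows the independence of the tasks rather than a real four-core speedup. Swapping in `ProcessPoolExecutor` would be one way to get separate cores involved.

```python
from concurrent.futures import ThreadPoolExecutor


def count_up(passes):
    # Each worker owns its own counter; nothing is shared between workers,
    # so no locking or other synchronisation is needed.
    value = 0
    for _ in range(passes):
        value = value + 1
    return value


def run_independent(passes=100_000, names=("a", "b", "c", "d")):
    # One worker per counter, mirroring the four independent lines above.
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        results = list(pool.map(count_up, [passes] * len(names)))
    return dict(zip(names, results))


print(run_independent())
```

Because no worker touches another worker's counter, every counter ends at exactly `passes` regardless of how the workers are scheduled.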

So if you have a single-threaded application doing brute-force calculations, the single 12GHz processor would win hands down. If you can somehow split the task into separate parts and make it multi-threaded, then the 4 cores could come close to, but not quite reach, the same performance, as per Amdahl’s Law.
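Amdahl’s Law puts a number on this: if a fraction p of the work can be parallelised across n cores, the speedup over a single core is 1 / ((1 - p) + p / n). The Python sketch below applies it to the 4 x 3GHz example. The helper names are made up for illustration, and the "equivalent GHz" figure assumes each core does the same amount of work per cycle as the hypothetical single fast core, which real processors do not guarantee.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Speedup over one core when `parallel_fraction` of the work can be split."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def equivalent_single_core_ghz(core_ghz, cores, parallel_fraction):
    # Assumption: every core does the same work per cycle as the
    # hypothetical single fast core it is being compared against.
    return core_ghz * amdahl_speedup(parallel_fraction, cores)


# Four 3GHz cores, for tasks ranging from fully serial to fully parallel.
for p in (0.0, 0.5, 0.95, 1.0):
    print(p, round(equivalent_single_core_ghz(3.0, 4, p), 2))
```

Only the perfectly parallel case (p = 1.0) reaches the full 12GHz equivalent; a fully serial task (p = 0.0) stays at 3GHz, and a task that is 95% parallel lands at roughly 10.4GHz.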

The main thing that a multi-CPU system gives you is responsiveness. On a single-core machine that is working hard, most of the time may be used by one task while the other tasks only run in short bursts in between, resulting in a system that seems sluggish or juddery. On a multi-core system the heavy task gets one core and all the other tasks play on the other cores, doing their jobs quickly and efficiently.

The argument of “6 cores x 0.2GHz = 1.2GHz” is rubbish in every situation except where tasks are perfectly parallel and independent. There are a good number of tasks that are highly parallel, but they still require some form of synchronisation. Handbrake is a video transcoder that is very good at using all the CPUs available, but it does require a core process to keep the other threads filled with data and collect the data that they are done with.

1. Each core is in fact doing x calculations per second, thus the total number of calculations is x(cores).

Each core is capable of doing x calculations per second, assuming the workload is suitably parallel; on a linear program all you have is 1 core.

2. Clock speed is rather a count of the number of cycles the processor goes through in the space of a second, so as long as all cores are running at the same speed, the speed of each clock cycle stays the same no matter how many cores exist. In other words, Hz = (core1Hz+core2Hz+…)/cores.

I think it is a fallacy to think that 4 x 3GHz = 12GHz. Granted, the maths works, but you’re comparing apples to oranges and the sums just aren’t right: GHz can’t simply be added together for every situation. I would change it to 4 x 3GHz = 4 x 3GHz.

Have something to add to the explanation? Sound off in the comments. Want to read more answers from other tech-savvy Stack Exchange users? Check out the full discussion thread here.

1. mitcoes

This approach is still old. Since the 64-bit CPUs arrived, almost every CPU is multi-core, and almost every OS and program is programmed to use all of them.

If you put all your cores in Conky or any utility that shows their use, the % of use and even the frequency is balanced.

But as you say, 4 x 3 GHz is not 12 GHz, and now with ARM, Intel and AMD CPUs you must be careful reading the benchmarks and hierarchy in order to know which CPU is faster.

MIPS and PI benchmarks are normally good enough to compare one CPU against another, and of course the 3DMark ones and others, but when it comes to frequency you can only say it will be faster if it is THE SAME processor at a higher frequency; if not, it means almost nothing.

2. Justin

So if I see a processor that has 4 cores and is rated at 3.5GHz, how do I interpret that? Does it mean that each core is doing roughly 900MHz? Or will a fully linear application run at the full 3.5GHz?

3. Mark

While it’s true the operating system can see all of the cores and use them, most applications can’t take full advantage of them.

@Justin if it’s rated at 3.5 GHz, then all the cores are doing that speed. Rarely is it reported otherwise.

4. Techminator

What is the difference between Core 2 Duo and Dual Core?

Furthermore, what is Core 3 Quad?

5. Alistair

@Techminator
Dual Core is the generic term for a single physical CPU package (the actual chip that you buy from the store) that has 2 separate processing cores in one ‘die’ (single piece of silicon).

Core 2 Duo was Intel’s branding for their second generation of dual-core CPUs (Pentium D was their first, essentially 2 Pentium 4s shoved back to back), ‘Core 2’ being the architecture brand and ‘Duo’ indicating dual core. There was also Core Solo (a laptop-only precursor to Core 2) and Core 2 Quad, which is the quad-core (4 processing cores in one package) version of the same thing. Although in reality a C2Q was essentially 2 C2Duos slapped together in one package.

Core i3 (and i5 and i7) is Intel’s next line of architecture branding (branding because it’s had 2 distinct architectures within its lifetime, the most talked about on tech sites being known as “Sandy Bridge”).

There has never been a Core i3 Quad; i3s are all dual core. Some i3s use “Hyperthreading”, which is a way of pretending that there are double the number of processing threads by utilising unused portions of the chip. One ‘virtual’ core is approx. half the processing power of a proper one, so in reality it’s more like 2 + 2 halves.
Core i5s are generally (but not always) true quad cores, and i7s are generally (again, not always) quad cores WITH Hyperthreading, giving a total of 8 processing threads (again, 4 at around 50% processing power).

6. Al

I feel like this has been posted on the site before.

7. Zac

My AMD FX-8150 is running at 4.3GHz on all 8 cores, which does not mean each core is running at 4.3GHz / 8.

8. rKiller

I always knew that

9. Nickm

What about Windows Hyper-V? It tells me I have x processors, and they are available to each virtual server?

10. john3347

Zac, Mokubai explains, in different wording, that while your processor is capable of running at 4.3 GHz on each of 8 cores (assuming your numbers are correct) it depends on all the right circumstances to actually perform 8 calculations at once. As he states, if the result from one calculation is needed to perform a second, or subsequent, calculation all 8 cores cannot be utilized simultaneously on that particular function. In actual practice, there are few, if any, circumstances when the full capability of an 8 core processor can be used to produce the result of a series of calculations. Thus an 8 core processor will almost universally be faster than a single core processor running at the same clock speed but very seldom, if ever, 8 times faster.

11. Yu

@Alistair: For Notebooks all Core i5 I know are dual-core and only Core i7 processors with the postfix “QM” are quadcore.

As for the general topic:

TL;DR: More cores mean less effective performance per “summed up GHz” but, as stated in “The Answer”, better responsiveness.

During my diploma thesis I used scientific software on computation servers running on up to 32 cores, sometimes with several jobs in parallel running on up to 128 cores in total. Interestingly enough, most of my calculations were actually faster running on 2 server nodes (16 cores) than on 3 nodes (24 cores), because the overhead of synchronization became too big.

And this is a case where, except for synchronization penalties, each of the processes was working on equal chunks of calculations at equal speed. In typical user scenarios (desktop, smartphone…) you have many more or less independent tasks running (e.g. synchronization client, internet browser, Skype, background services) that can hinder each other by requiring the same resources (such as network bandwidth, disk access…). This makes scheduling something of a rocket science and causes the average CPU utilization under full load to drop compared to a system with fewer cores. The positive side, though, is that even under full load there’s probably some core running at sufficiently low load to ensure a timely reaction to your mouse click (“responsiveness”).

Just as a simple example, take World of Warcraft. It gives you the option to decouple the interface rendering from 3D rendering. This means a slight drop of overall fps, but significantly better usability of the interface when your system can’t provide 30+ fps. Having walked through Orgrimmar at 2-3 fps I can tell you, this makes one hell of a difference ;)

PS: Is there some way to subscribe to comments for a blog post? I don’t want to keep the page open waiting for answers and sometimes I want to read follow-up comments for a post, where I didn’t have to say anything myself (i.e. requiring a post for subscribing wouldn’t be a solution). Many blogs I frequent use separate RSS feeds for each blog-post’s comment and/or email-subscription to the comments. I can’t find either here.

12. john3347

Yu, all you have to do is right click on a blank section of the web page you want to return to and click “Create Shortcut”. This places a shortcut on your desktop. This only works if you are using a real browser. If you are using a play browser like Firefox or Chrome, you have to drag the icon on the left end of the title bar to the desktop. This is one of several reasons I don’t use any browser but Internet Explorer.

13. Chris

john3347 – I’m using Firefox and it has no problem placing a shortcut to a web page on my desktop using the method you employ in IE. On the issue of multiple cores – I have a single-core Sempron 145 PC and an Intel Core 2 Duo one. Regardless of what the benchmarks might say, for the sort of tasks that ordinary people perform most of the time I can’t see any great difference in observed speed. Even the Sempron manages to run several programs at once. Perhaps it’s all down to the speed of one’s brain.

14. Chin