Scalability — multi-core performance

Last updated 2009-06-01
Related: CPU cores, Mac Pro, macOS, memory, Photoshop

This page discusses what it means to “scale” in the context of multiple CPU cores.

In a perfect world, 8 cores would complete a task in exactly half the time that 4 cores require.

In the real world that almost never happens: every useful task accesses memory and/or a hard drive or the network. There is also the overhead of coordinating multiple “threads” (workers), generally one per CPU core. Yet there are well-written programs that can approach perfect scaling.
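
To make the gap between the perfect world and the real world concrete, here is a minimal sketch of a simple timing model. It assumes that some fixed fraction of the work cannot be divided among cores; the `serial_fraction` parameter and the 100-second figure are hypothetical illustrations, not measurements from this page.

```python
def ideal_time(t1, cores):
    """Time with perfect scaling: the work divides evenly across cores."""
    return t1 / cores


def modeled_time(t1, cores, serial_fraction):
    """Time when a fraction of the work cannot be split across cores.

    serial_fraction is a hypothetical stand-in for disk waits, memory
    contention and thread coordination overhead.
    """
    serial = t1 * serial_fraction
    parallel = t1 * (1.0 - serial_fraction)
    return serial + parallel / cores


if __name__ == "__main__":
    t1 = 100.0  # hypothetical single-core time in seconds
    for cores in (1, 2, 4, 8):
        ideal = ideal_time(t1, cores)
        modeled = modeled_time(t1, cores, 0.05)
        print(f"{cores} cores: ideal {ideal:.2f}s, modeled {modeled:.2f}s")
```

Under this model, with only 5% of the work unsplittable, going from 4 to 8 cores takes the job from 28.75 seconds to about 16.9 seconds rather than halving it to about 14.4 seconds.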

For examples of scalability or non-scalability, see:

What full CPU usage looks like

We will use Genuine Fractals 6.0 as the “poster child” for very good scalability from a commercial software application.

Genuine Fractals 6.0 CPU usage on 8-core Mac Pro

On Mac OS X, Activity Monitor can be used to view CPU usage.

This graph shows almost total usage of the CPU cores (800% at 100% per core), but understand that full use doesn’t mean full efficiency.

That’s right: even though the CPU cores show as busy, they could actually be getting very little done, twiddling their thumbs (so to speak) while competing for access to the same memory.

Genuine Fractals scalability

Fortunately, Genuine Fractals 6.0 does mostly computing (calculation), with minimal disk access and moderate memory access requirements, so the CPU cores are actually being used with high efficiency. The black inverse spikes indicate times when the CPU cores are idle; monitoring shows that these are times when the disk is being used.

Genuine Fractals 6.0 CPU usage on 8-core Mac Pro

To determine the scalability of Genuine Fractals, I timed the same task with 1, 2, 4 and 8 active cores, which is easy to do with Apple’s developer tools (CHUD) via the menu or the Processor Palette. This is not a perfect test; the program still thinks there are 8 cores, even if some of them are disabled, so it’s going to create 8 “threads” instead of one thread per active core available to do work.

The job was to scale an image to 40X30" at 360 dpi. Times were recorded on a 2.8GHz 8-core Mac Pro (2008) with 32GB memory on Mac OS X 10.5.6. Scalability is very good, but not perfect; disk I/O causes brief pauses on a regular basis (the black areas in the CPU usage graph above).

With some engineering effort, the Genuine Fractals engineers might be able to overlap the disk I/O with computation so as to eliminate the regular (though short) periods of idle CPU usage; visually, that disk I/O appears to be responsible for a significant part of the less-than-perfect scalability.

Bottom line: with Genuine Fractals 6.0, an 8-core Mac Pro is effectively a 7.4-core machine in actual results. That’s not perfect, but it’s very very good.

| # Cores | Seconds | Speedup | Comments |
|---|---|---|---|
| 1 | 1474 | 1.0X | With a single core, inefficiencies do creep in; disk I/O potentially stops all computing activity until done. Also, any background activity (Mac OS X itself, other programs, etc.) takes away from the single active core’s ability to get its job done. |
| 2 | 729 | 2.02X | With two cores, one core might continue computing while the other is idle doing disk I/O, and contention for memory access is still relatively low. |
| 4 | 385 | 3.82X | Best possible is 4.0X. With 4 cores, contention for memory access rises. |
| 8 | 200 | 7.37X | Best possible is 8.0X. With 8 cores, contention for memory access rises further. |
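
The speedup figures in the table are simply the single-core time divided by the multi-core time. The short sketch below reproduces that arithmetic and adds an efficiency figure (speedup divided by core count); the function names are illustrative only.

```python
# Timings in seconds, copied from the table above.
timings = {1: 1474, 2: 729, 4: 385, 8: 200}


def speedup(t_single, t_multi):
    """How many times faster the multi-core run is than the single-core run."""
    return t_single / t_multi


def efficiency(t_single, t_multi, cores):
    """Fraction of perfect scaling achieved (1.0 would be perfect)."""
    return speedup(t_single, t_multi) / cores


if __name__ == "__main__":
    base = timings[1]
    for cores, seconds in sorted(timings.items()):
        s = speedup(base, seconds)
        e = efficiency(base, seconds, cores)
        print(f"{cores} cores: {s:.2f}X speedup, {e:.0%} efficiency")
```

For 8 cores this gives 1474 / 200 = 7.37X, or roughly 92% of perfect scaling. The 4-core figure computes to 3.83X from the whole-second times shown, slightly different from the 3.82X in the table, presumably because the recorded times were rounded.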

MemoryTester test-compute-speed

The MemoryTester test-compute-speed command computes a SHA1 hash, using all active CPU cores, with a mix of pure computation and moderate memory access (most of which can be cached by the CPU on-chip cache). MemoryTester is smart enough to recognize how many active CPU cores are present.

Let’s see how it scales, keeping in mind that with a single core, background system activity scarfs up a wee bit of processing power, which is larger as a percentage for a single core than with more than one core.

We can see here perfect scalability, the ideal situation. Almost certainly test-compute-speed would scale very well to 16 cores. Most applications have no hope of scaling this well, either from poor engineering, or simply the nature of the work to be done.

| # Cores | MB/sec | Speedup |
|---|---|---|
| 1 | 201.2 | 1.0X |
| 2 | 403 | 2.00X |
| 4 | 809 | 4.02X |
| 8 | 1615 | 8.03X |
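
A comparable measurement can be sketched in Python. This is not how MemoryTester is implemented; it is a rough illustration that hashes a buffer with SHA-1 on several threads at once and reports aggregate throughput. It assumes that Python’s `hashlib` releases the interpreter lock while hashing large buffers, so the threads can run on separate cores. The buffer size and round count are arbitrary choices.

```python
import hashlib
import os
import time
from concurrent.futures import ThreadPoolExecutor


def hash_buffer(buf, rounds):
    """Hash the same buffer repeatedly; return the number of bytes hashed."""
    for _ in range(rounds):
        hashlib.sha1(buf).digest()
    return len(buf) * rounds


def measure_mb_per_sec(workers, buf_size=1 << 20, rounds=50):
    """Run `workers` hashing threads at once and return aggregate MB/sec."""
    buf = bytes(buf_size)  # zero-filled buffer, small enough to stay mostly in cache
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        totals = list(pool.map(lambda _: hash_buffer(buf, rounds), range(workers)))
    elapsed = time.perf_counter() - start
    return sum(totals) / elapsed / 1e6


if __name__ == "__main__":
    cores = os.cpu_count() or 1  # logical CPUs reported by the OS
    base = None
    n = 1
    while n <= cores:
        rate = measure_mb_per_sec(n)
        base = base or rate
        print(f"{n} workers: {rate:.0f} MB/sec, {rate / base:.2f}X")
        n *= 2
```

Results will vary with the machine and with background activity, so the numbers should be read as relative scaling rather than as a benchmark.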

Limitations on scalability

The limiting factor for scalability is usually software. It requires top-notch engineers to design and build correct and efficient applications; it’s hard stuff.

Specialty programs like Genuine Fractals 6.0 have been engineered to use 8 cores efficiently. Other programs use multiple CPU cores to some degree, but might not consider the “low hanging fruit”. For example, Adobe Photoshop CS4 is single-threaded while opening and saving files, and won’t even allow the user to work on something else while an open or save is in progress; 7 of 8 cores on a Mac Pro sit idle during the process.

Memory bandwidth

When disk I/O is involved, there are techniques to overlap disk I/O with simultaneous computation, so even programs that use the disk heavily can exploit multiple cores efficiently, at least so that the disk alone becomes the limiting factor (CPU speed having little effect).
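
One common way to overlap disk I/O with computation is to read ahead on a background thread while the main thread works on data already in memory. The sketch below illustrates the idea with a bounded queue; the chunk size, the queue depth and the use of SHA-1 as the “computation” are assumptions made for the example, not a description of any particular application.

```python
import hashlib
import io
import queue
import threading

CHUNK = 1 << 20  # hypothetical chunk size: 1 MiB


def reader(stream, q):
    """Read chunks ahead of the consumer; None marks the end of the data."""
    while True:
        chunk = stream.read(CHUNK)
        q.put(chunk if chunk else None)
        if not chunk:
            return


def process_overlapped(stream, depth=4):
    """Hash a stream while a background thread keeps reading ahead."""
    q = queue.Queue(maxsize=depth)  # bounded, so read-ahead cannot exhaust memory
    t = threading.Thread(target=reader, args=(stream, q), daemon=True)
    t.start()
    digest = hashlib.sha1()
    while True:
        chunk = q.get()
        if chunk is None:
            break
        digest.update(chunk)  # compute while the reader fetches the next chunk
    t.join()
    return digest.hexdigest()


if __name__ == "__main__":
    data = bytes(5 * CHUNK + 123)
    # An in-memory stream stands in for a file opened with open(path, "rb").
    print(process_overlapped(io.BytesIO(data)) == hashlib.sha1(data).hexdigest())
```

With a real file in place of the in-memory stream, the computing thread only waits when the disk cannot keep the queue filled, which is the point at which the disk alone becomes the limiting factor.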

For compute-bound applications, the main bottleneck is memory access. This is why we keep seeing faster and faster memory (and larger on-chip caches) in each generation of computer; fast CPUs aren’t so fast when they’re forced to compete for access to memory. The 2008 Mac Pro improved memory speed to 800MHz from 667MHz (20% faster), and a 2009 Mac Pro will almost certainly move to 1066MHz or faster memory.

Copyright © 2019 diglloyd Inc, all rights reserved.