Photoshop CS5 Scalability and Low-hanging Fruit
Clem Cole and Russell Williams discuss Photoshop's long history with parallelism in an article at queue.acm.org (thanks to reader Fazal Majid for pointing me at this article). It’s a long read, and gets somewhat technical, but should be of interest to those with a technical bent.
I'm startled by the discussion of the I/O bug that took a decade to find. I have a background in computer science, including some experience with formal mathematical proofs of program correctness; all that needed to happen was to apply that methodology to the I/O code, and the bug would have popped right out. Clearly they needed another set of eyes on the problem, sort of like the same person proofreading the same copy over and over, yet never seeing the obvious typo. Or (as in my personal experience), walking someone else through the code, stating why it is correct, and then finding the contradiction.
Decade-old code base
I have paid for CS 1/2/3/4/5 so far. With CS5, I hoped for (but did not expect) a revamp of the multi-threading capabilities, or at least a fix to the hideously long save and open times (more on that below).
That the Photoshop code base is a decade old is acknowledged, and that it works “pretty well” on 2-4 processors is also plain. This confirms my testing, which shows minimal gains with 6 cores, and a degradation with 12 cores.
Excerpt from the article:
“Photoshop has well over a decade of using multiple processors to perform the fundamental image-processing work it does on pixels. That has scaled pretty well over the number of CPUs people have typically been able to obtain in a system over the past decade—which is to say either two or four processors.”
The forest is not seen for the veining of the leaves on the branches of the trees. The “main challenge” has been overlooked; the main challenge is a failure to see what the real everyday issues are: usability and performance with everyday frequent operations.
Low-hanging fruit for massive improvements
Most striking is how experts can be oblivious to simple changes that could provide a massive performance improvement with near-zero complexity; perhaps it’s not intellectually challenging enough to do something simple and effective.
The right algorithm can sometimes avoid the complexity of synchronization and threading with CPU cores. But solutions unlooked-for go un-found; defining the problem incorrectly is the logical flaw which leads down the path of “we can’t make it better because X, Y and Z apply”. The problem being that X/Y/Z presume A, but A is false.
Who cares if making Smart Sharpen faster saves 0.1 second on a 2 second operation when 380 seconds are wasted every time I save? Incremental software optimization of already fast operations is foolish when large-scale productivity losses are not considered first. Leaves and forest.
The save-speed slowdown occurs because Photoshop requires PSB format for files over 4GB, and Photoshop insists on compressing everything using a single CPU core (PSD format has the same issue regardless of file size). With my main volume running at 500MB/sec, the save ought to take about 20 seconds, instead of 400 seconds or so. For files under 4GB, use uncompressed TIF for very fast file saves.
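The arithmetic behind that estimate can be checked in a few lines. This is only a back-of-envelope sketch: the 10,000MB file size is an assumption inferred from the "20 seconds at 500MB/sec" figure rather than a measured value, and the function names are made up for illustration.

```python
# Back-of-envelope save-time arithmetic.
# ASSUMPTION: a file of about 10,000 MB, inferred from "20 seconds at 500MB/sec".

def save_seconds(file_mb: float, throughput_mb_per_s: float) -> float:
    """Seconds needed to write file_mb at a sustained throughput."""
    return file_mb / throughput_mb_per_s


def effective_throughput(file_mb: float, seconds: float) -> float:
    """Throughput in MB/sec actually achieved by a save that took `seconds`."""
    return file_mb / seconds


FILE_MB = 10_000.0  # assumed file size, not a measurement

ideal_seconds = save_seconds(FILE_MB, 500.0)           # disk-limited save time
observed_rate = effective_throughput(FILE_MB, 400.0)   # rate of the real save
slowdown = 400.0 / ideal_seconds                       # observed vs disk-limited

print(f"ideal: {ideal_seconds:.0f} s, observed rate: {observed_rate:.0f} MB/sec, "
      f"slowdown: {slowdown:.0f}x")
```

Under that assumption the single-threaded compressed save is moving data at roughly 25MB/sec on a volume capable of 500MB/sec, about a 20x gap.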
There are at least four algorithmic anti-optimizations in how Photoshop saves (and opens) files:
- The save is modal: the save could proceed while the user continues working on other files; instead the user is locked out until the save has finished.
- The save operation gives me no choice about compression: that’s easily solved by offering an option to not compress, thus allowing the save to run at the speed of my 500MB/sec volume.
- The save operation is single-threaded: multiple CPU cores should whack away at the save, especially with multiple layers. But that’s still marginal, since an option to not compress is the real solution for the fastest speed.
- One can’t simply run a second copy of Photoshop at the same time on the same machine, which would deal with the above issues as well (proceed with work in another copy of Photoshop). This is not technically hard; Adobe just won’t allow it as part of its copy protection scheme.
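The first three points can be sketched in a few dozen lines. This is a toy illustration, not how Photoshop works internally and not the PSD/PSB format: the layer data, the length-prefixed file layout, and the names `compress_layers` and `save_in_background` are all assumptions made up for the example. It relies on CPython's `zlib` releasing the interpreter lock while compressing, so a thread pool can use several cores.

```python
import threading
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional


def compress_layers(layers: List[bytes], workers: int = 4,
                    level: Optional[int] = 6) -> List[bytes]:
    """Compress each layer independently, several at a time.

    level=None is the hypothetical "do not compress" option: the layers
    are passed through untouched so the save runs at disk speed.
    """
    if level is None:
        return list(layers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves layer order even though work finishes out of order.
        return list(pool.map(lambda data: zlib.compress(data, level), layers))


def save_in_background(layers: List[bytes], path: str, workers: int = 4,
                       level: Optional[int] = 6) -> threading.Thread:
    """Start a non-modal save and return immediately with the worker thread."""
    def job() -> None:
        chunks = compress_layers(layers, workers, level)
        with open(path, "wb") as f:
            for chunk in chunks:
                # Toy layout (assumption): 8-byte length prefix, then the data.
                f.write(len(chunk).to_bytes(8, "big"))
                f.write(chunk)

    thread = threading.Thread(target=job)
    thread.start()
    return thread
```

The caller keeps working and calls `join()` on the returned thread only when it needs to know the file is complete. A real implementation would also have to snapshot the document so that edits made during the save do not corrupt it.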
Too many threads
A basic flaw in Photoshop is how it allocates a very large number of threads which then slow it down; if the contention of Adobe’s scientists in that article is correct, then why doesn’t CS5 allow the user to limit the number of CPU threads via a preference? My testing shows that disabling CPU cores has a beneficial effect on the 12-core Mac Pro. So instead of bemoaning Amdahl’s law, and then programming CS5 to have its shins kicked by that law (if actually true), add a preference. Or test, and hard-code a limit into the code itself.
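For reference, Amdahl's law and the kind of thread-limit preference suggested above can be sketched as follows. The 90% parallel fraction and the `worker_count` helper are assumptions for illustration only; nothing here reflects Photoshop's actual code or settings.

```python
import os
from typing import Optional


def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Amdahl's law: speedup = 1 / ((1 - p) + p / n)."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)


def worker_count(user_limit: Optional[int] = None) -> int:
    """Threads to use: all available cores unless a (hypothetical) user
    preference caps the number. Always returns at least 1."""
    available = os.cpu_count() or 1
    if user_limit is None:
        return available
    return max(1, min(user_limit, available))


if __name__ == "__main__":
    # ASSUMPTION: 90% of the work parallelizes; chosen only as an example.
    for cores in (2, 4, 6, 12):
        print(cores, "cores ->", round(amdahl_speedup(0.9, cores), 2), "x")
    print("threads with a limit of 4:", worker_count(4))
```

Note that Amdahl's law by itself only predicts diminishing returns: the speedup never goes down as cores are added. A slowdown at 12 cores therefore points to extra overhead such as contention between threads, which is exactly what a thread cap would sidestep.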
But I do not accept the idea that a better job cannot be done with at least some functions; see PTGui vs Photoshop pano. Many other programs scale very well with image processing, such as PhotoZoom Pro. The real problem is an organizational one.