Introduction to Storage Performance
The storage landscape is changing in 2008/2009. Whereas the term “hard drive” had captured the idea for all practical purposes, today we have hard drives, solid state (SSD) drives composed of memory chips, network-attached storage (NAS), and perhaps soon, omnipresent “cloud” storage. This Guide concerns itself primarily with hard drives, with SSD drives to follow.
Solid state drives
Solid state disks contain memory chips with no moving parts. SSD is the future of performance and low power consumption, a magic combination.
Currently-available SSDs suffer from the handicaps of limited capacity and pricing that is far higher per gigabyte than a conventional hard drive. This area is evolving rapidly, and 2009 will undoubtedly bring further price drops along with performance and capacity improvements. SSDs will become a tsunami for mainstream products once the price drops by 70%; this might happen in as little as 18 months.
Caution is advised with SSD: it is not a panacea for all uses, and lower priced models may involve reduced performance or other glitches.
Unlike hard drives, most SSDs offer consistent performance across their entire capacity. There is no seek time and no spin latency with an SSD, since there are no moving parts to wait for. SSD speed is still limited by the interface speed. In particular, SATA II (3 gigabits per second) can be a limiting factor for SSD speed!
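To put a rough number on that interface limit: SATA uses 8b/10b encoding, so every byte of data costs ten bits on the wire. The sketch below (the function name is just for illustration) converts a line rate into a theoretical payload ceiling; real-world throughput will be somewhat lower.

```python
def sata_payload_mb_per_sec(line_rate_gbit: float) -> float:
    """Theoretical payload ceiling of a SATA link, in decimal MB/s."""
    bits_per_sec = line_rate_gbit * 1_000_000_000
    # 8b/10b encoding: ten bits on the wire carry one byte of data.
    bytes_per_sec = bits_per_sec / 10
    return bytes_per_sec / 1_000_000


# SATA II runs at 3 gigabits per second.
print(sata_payload_mb_per_sec(3.0))
```

An SSD whose memory chips can deliver more than this ceiling gains nothing from the extra speed until the interface gets faster.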
Read performance is most important for most applications, and top-quality SSDs appear to offer approximately twice the speed of the fastest conventional 7200 rpm SATA desktop hard drives.
Write performance varies depending on brand and model, and can be inferior to conventional hard drives, though that is changing in 2010.
SSDs and RAID
SSDs can be striped or mirrored into a RAID. If you are seeking the highest possible performance at any cost, choosing SSD (4-drive stripe) might be the way to go. You’d want the OWC Mercury Elite SSD.
Striped hard drives (RAID) offer far higher transfer rates than a single SSD. SSDs can be striped also, but the cost is tremendous and the capacity far too limited for most uses.
Hard drives
A hard drive is the tangible electronic brick that you can hold in your hand. You plug it into your computer. A hard drive has one or more circular platters inside it on which the data is stored, and all sorts of electronics that make it work. Hard drives come in 3 sizes: 3.5" (desktop computers), 2.5" (laptops), and 1.8" (iPod-like devices). The 1.8" and 2.5" variety might soon disappear with the advent of SSDs.
A partition is a concentric ring on the platter (or a stack of concentric rings on the platters). Think of it as a disk drive donut! Shown below is a schematic representing an outer partition and an inner partition (we will ignore other special and tiny partitions used by the operating system).
With most hard drives, data density is constant across the platter. So at the outside of the platter (the Outer Partition), more data passes by the disk read-heads with each revolution, roughly twice as much at the outer edge as at the inner edge. What this means is that high performance is best achieved by limiting stored data to the Outer Partition. For more on a great performance trick, see Why you need more space than you need.
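The "roughly twice" figure follows from simple geometry. As a minimal sketch, assuming constant data density along the track and a constant rotation speed, the data passing under the head per revolution is proportional to the track's circumference. The radii used below are illustrative assumptions, not measurements of any particular drive.

```python
import math


def relative_transfer_rate(radius_mm: float, inner_radius_mm: float) -> float:
    """Transfer rate at radius_mm relative to the innermost track.

    Assumes constant data density along the track and constant rpm, so
    data per revolution scales with circumference (2 * pi * r).
    """
    circumference = 2 * math.pi * radius_mm
    inner_circumference = 2 * math.pi * inner_radius_mm
    return circumference / inner_circumference


# Hypothetical platter: outermost track at 46 mm, innermost at 23 mm.
print(relative_transfer_rate(46.0, 23.0))
```

The same function shows how quickly speed falls off as data spills toward the inner tracks: halfway in, at 34.5 mm on this hypothetical platter, the rate is already down to 1.5 times the innermost track.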
Building on top of the partition concept, a volume is the operating system representation of the storage space provided by one or more partitions, each on its own hard drive (in practice, volumes are never constructed from partitions on the same hard drive). When more than one partition/hard drive is used, we can achieve higher performance and/or higher reliability.
On Mac OS X, volumes appear on the desktop looking something like this:
The volume consists of the partition(s), along with all the record keeping the operating system needs to keep track of your files. Each volume has its own set of these records, and this is why segregating system/applications from your data is a good idea: corruption of one does not corrupt the other.
Hard drive speed factors
Hard drive speed involves several factors:
- seek time, which is the time for the disk “heads” to physically move across the disk before they can read or write data;
- transfer rate: how much data can be moved per unit time eg megabytes per second. Do not confuse this with interface speed;
- interface speed: the theoretical speed at which data can move from the hard drive to the computer. SATA can actually reach it (in bursts), but USB 2 and FireWire never come close to their rated figures in practice;
- workload: hard drives implement caching algorithms that can match up well (or badly) to any particular usage scenario. A drive that is fastest in one scenario might not be fastest in another.
Note that the new solid state “drives” (SSD) have no moving parts; their seek time is zero. Their performance depends on interface speed and internal electronics and memory speed (highly variable by brand/model).
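One way to see how these factors combine is a simple per-request time model: seek time, plus the average wait for the platter to rotate into position (half a revolution), plus the time to transfer the data. This is only a sketch; it ignores caching and workload effects entirely, and the 8 ms seek time and 100 MB/sec transfer rate in the example are assumed round numbers, not specs of any drive.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average wait for the data to rotate under the head: half a revolution."""
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2


def access_time_ms(seek_ms: float, rpm: float,
                   size_mb: float, transfer_mb_per_sec: float) -> float:
    """Rough time to service one request: seek + rotation + transfer."""
    transfer_ms = size_mb / transfer_mb_per_sec * 1000
    return seek_ms + avg_rotational_latency_ms(rpm) + transfer_ms


# Hypothetical 7200 rpm drive reading 1 MB: assumed 8 ms seek, 100 MB/sec.
print(round(access_time_ms(8.0, 7200, 1.0, 100.0), 1))
# The same read with no seek at all (data already under the heads).
print(round(access_time_ms(0.0, 7200, 1.0, 100.0), 1))
```

For small requests the seek and rotation terms dominate; for large sequential transfers the transfer rate term dominates, which is why the right drive depends on the workload.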
Why a fast boot drive is unimportant
See Boot Drive Dogma.
When drive speed matters
Aside from the Photoshop scratch disk and similar special cases, here are a few other scenarios to consider—
Batch processing 500 RAW files from your digital camera? The wrong drive can be a bottleneck (eg a USB or Firewire 400 drive).
Ensure that the disk speed keeps the CPU cores busy as much as possible (use Apple’s Activity Monitor to spy on things). High transfer rates can be important here: processing a single Canon or Nikon image can move between 25 and 150 megabytes of data for each image, depending on file type. However, some RAW converters are so CPU-bound that the disk speed is a minor issue; it all depends on particulars.
Video capture cannot tolerate seeks (dropped frames), so seek time is irrelevant—seeks are not allowed! Instead a minimum transfer rate is required, so it boils down to sustained transfer rate across the volume. This is a specialty area to which video users will need to pay close attention. See Why you need more space than you need.
Backup speed is often overlooked. If your data set is large (mine is over 1TB), it’s far too time-consuming with slow drives. Simply reading the data can take hours (ignoring the need to write it elsewhere).
If the drive can average 70MB/sec for a 930GB data set (about what a 1TB drive can hold), that’s 3.8 hours just to read the data (it has to be written somewhere also). A 4-way striped RAID can cut the read time down to under one hour, which is much more viable for backups. See Why you need more space than you need.
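The arithmetic is easy to redo for your own data set. The sketch below assumes the roughly 930 gigabytes that a 1TB drive holds are counted in binary units (1024 MB per GB), and that a 4-way stripe scales perfectly to four times the single-drive rate, which is an optimistic assumption.

```python
def read_time_hours(data_gb: float, mb_per_sec: float,
                    binary_units: bool = True) -> float:
    """Hours needed to read data_gb at a sustained average of mb_per_sec."""
    mb_per_gb = 1024 if binary_units else 1000
    seconds = data_gb * mb_per_gb / mb_per_sec
    return seconds / 3600


single_drive = read_time_hours(930, 70)
# Assumption: a 4-way stripe delivers exactly 4x the single-drive rate.
four_way_stripe = read_time_hours(930, 70 * 4)

print(round(single_drive, 1))     # hours for one drive
print(round(four_way_stripe, 2))  # hours for the stripe
```

Remember that a backup also has to write the data somewhere, so the destination must sustain the same rate or it becomes the bottleneck.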
Are the very fastest hard drives best?
Like everything else, there is a sweet spot of price and performance. The best performance means much higher cost (5-8 times higher per gigabyte or even 30X for solid state drives) and/or much lower capacity.
Intrigued by the 10,000 RPM Western Digital 300GB VelociRaptor hard drives, which offer blazing seek times and fast data transfer rates?
Spinning at 10000 RPM versus the usual 7200 RPM, the VelociRaptor theoretically provides a 39% increase in data transfer rate. But that ignores data density, which has increased with the advent of 1.5TB hard drives, as well as the fact that its smaller size (2.5") means a smaller circumference as compared with 3.5" hard drives. It’s not clear that the VelociRaptor is very much faster than the fastest 1TB hard drives today in terms of data transfer: a future test is planned. Still, it’s at the top of the heap as of late 2008.
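The theoretical figure is just the ratio of the rotation speeds, and it comes out a little under 39%. As a sketch, it holds only if both drives pass the same amount of data under the head per revolution, which is exactly the assumption that the density and platter-size caveats undermine.

```python
def rpm_speedup_percent(fast_rpm: float, slow_rpm: float) -> float:
    """Theoretical transfer-rate gain from rotation speed alone.

    Assumes identical data per revolution on both drives, which is
    not true when platter size or data density differ.
    """
    return (fast_rpm / slow_rpm - 1) * 100


print(round(rpm_speedup_percent(10_000, 7_200), 1))
```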
The VelociRaptor drives offer unusually low seek times (the time it takes for the read heads to move over the drive platters). Comparative DiskTester results at barefeats.com show that the Velociraptor is indeed very fast for random access. But seek times can be all but eliminated by appropriate partitioning for specialized uses, such as a Photoshop scratch disk.
The Velociraptor is also particularly good at handling large numbers of I/O requests quickly. But again, this is a modest gain over the fastest 1TB hard drives; whether this matters depends on the specific workload.
The Velociraptor drives are exceptionally fast, but in terms of bang-for-the-buck they’re a losing proposition. First, the mounting solution inside a Mac Pro is a hassle and extra expense. Second, capacity is unrealistically small for many photographers.
To provide the storage of a single 1TB drive, you would need three (3) Velociraptor drives, and you’d spend around 5 times the money (about $300X3 = $900 vs $189 for a 1TB Seagate enterprise-class drive). Use them only if your storage needs are modest, and you want the ultimate in performance. Superior performance for most uses can be achieved with three 1TB drives versus two Velociraptors (3TB vs 0.6TB!).
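Using the prices quoted above (about $300 for a 300GB Velociraptor, $189 for a 1TB enterprise-class drive), the comparison can be worked out per gigabyte as well as in total. The helper name is illustrative only, and the prices are the late-2008 figures from the text.

```python
def cost_per_gb(price_usd: float, capacity_gb: float) -> float:
    """Price per gigabyte of a single drive."""
    return price_usd / capacity_gb


velociraptor = cost_per_gb(300, 300)  # 300GB drive at about $300
one_tb = cost_per_gb(189, 1000)       # 1TB enterprise-class drive at $189

print(round(velociraptor / one_tb, 1))  # per-gigabyte cost ratio
print(round((300 * 3) / 189, 1))        # total cost ratio for roughly 1TB
```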
The Velociraptor is also subject to the same slowdown across the drive as any other hard drive! So don’t confuse speed when empty with speed half-full! Use them as special-purpose drives, say a striped and partitioned pair of them for a scratch volume.
You can get the same performance with the 1TB hard drives at far lower cost and much higher capacity. You might need 3 drives instead of two, but the total cost is much lower and the capacity far higher.
Enterprise-grade drives vs regular drives
What makes an “enterprise grade” drive worth paying for, when very similar drives can cost substantially less?
Enterprise-grade drives — considerations:
- Typical cost differences might be $167 for an enterprise-grade 1TB drive vs $130 for a standard drive (as of May 2009). That’s a 28% premium, but even if you buy 4 drives for a Mac Pro, the extra cost is still only $148 total, which is a small fraction of the cost of a Mac Pro. If you really need 4 drives and a $4000 Mac Pro, the premium starts to look pretty minor in terms of total system cost.
- There’s no reason to use enterprise grade drives for offline backup (eg in a drawer). For that purpose, buy the cheapest drives. There’s nearly always a brand that’s on sale at OWC at a great price. On the other hand, if the backup drives spin 24X7 and/or spin up/down a lot, enterprise grade might be worthwhile.
- Performance and compatibility
- Enterprise grade drives might be faster, or might perform about the same. Or they might be more or less compatible (weird glitches). It’s really a drive-specific issue, and there are no hard and fast rules.
- Resistance to vibration
- RAID enclosures typically place multiple hard drives in close proximity to each other. Place your hand on a spinning 7200 rpm drive and you’ll feel just how much vibration they create. These vibrations can interact with each other to create resonances that can cause soft errors, potentially slowing down a RAID setup (or even non-RAID). Enterprise-grade drives are designed to tolerate vibration to avoid soft errors. Whether this is actually an issue depends on the number of drives, their proximity to each other, etc. For uses where guaranteed speed is required (like HD video capture), hiccups might be unacceptable. For other uses, hiccups are probably something you can live with.
- Higher MTBF
- Enterprise-grade drives typically have longer MTBF ratings (mean time between failure). What’s the cost of downtime should a drive fail? What if your backup is a day old or a week old? There is real value in keeping the probability of failure down, especially if you’re striping 2/3/4 or more hard drives, where failure of one drive means total loss of the volume. For some users, any downtime is very expensive, and the choice is clear. On the other hand, a robust backup strategy (to the hour) can cover your backside quite well. Still, the downtime is a disruptive event and no fun. Let your own circumstances be the guide.
- Longer warranty
- A longer warranty is a plus, sort of. A year or two after purchase, it’s of little value, since you can buy better and faster drives for a lot less one year later. In short, a five year warranty sounds good, but is fairly pointless. And if you’re running a RAID, you need a replacement immediately, not 30-45 days later when it arrives from Taiwan (personal experience). So the warranty is probably a red herring. A worthwhile warranty, however, is one that gives immediate replacement for the first 90 days, one good reason to order any hard drive from OWC.
In short, an enterprise-grade drive buys you a better probability of avoiding the headache of a failed drive or sporadic performance glitches, but it’s no guarantee.
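To see why the probability of failure matters more as drives are added to a stripe, here is a minimal sketch. It assumes drives fail independently and uses a made-up 3% chance of failure per drive per year; that figure is purely illustrative and is not derived from any MTBF rating.

```python
def stripe_loss_probability(per_drive_failure: float, drive_count: int) -> float:
    """Chance that at least one drive in a stripe fails, losing the volume.

    Assumes each drive fails independently with the same probability.
    """
    all_survive = (1 - per_drive_failure) ** drive_count
    return 1 - all_survive


# Hypothetical 3% per-drive failure probability over one year.
for drives in (1, 2, 3, 4):
    print(drives, round(stripe_loss_probability(0.03, drives) * 100, 1))
```

The risk grows almost in proportion to the number of drives, so a modest improvement in per-drive reliability is worth more on a 4-drive stripe than on a single drive.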
Ask yourself how much data loss is tolerable, and what it would cost in terms of time or money or data if a drive failed. Then consider the increased cost for enterprise-grade drives within the total system cost, and choose for your own scenario.
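To weigh the premium against total system cost with your own numbers, a small sketch like the following can help. The example call uses the May 2009 figures quoted earlier ($167 vs $130 per drive, four drives, a $4000 Mac Pro); the function name is illustrative only.

```python
def premium_summary(enterprise_price: float, standard_price: float,
                    drive_count: int, system_price: float):
    """Return (premium %, total extra cost, extra cost as % of system price)."""
    extra_per_drive = enterprise_price - standard_price
    premium_percent = extra_per_drive / standard_price * 100
    extra_total = extra_per_drive * drive_count
    share_of_system = extra_total / system_price * 100
    return premium_percent, extra_total, share_of_system


premium, extra, share = premium_summary(167, 130, 4, 4000)
print(round(premium, 1), extra, round(share, 1))
```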
All that said, some high-grade drives are not much different from the enterprise-grade ones. Hitachi drives in particular are generally spec'd quite similarly as of 2009.
My own usage
Update September 2009: the new Hitachi 7K2000 2TB drive is also a great choice.