A larger and costlier hard drive is more expensive than a smaller and cheaper one. Or is it?
A solid metric of storage cost is cost per GB. That ignores performance, suitability for RAID, etc., but it’s a starting point.
At current pricing, a 20TB hard drive is 15% to 20% more expensive than an 18TB drive for only 11% more capacity, or 51% more expensive per GB than a 16TB drive for only 25% more capacity.
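To put the 20TB versus 18TB premium on a per-TB footing, divide the price ratio by the capacity ratio. The following is a minimal Python sketch that uses only the percentages quoted above; the function name `per_tb_premium` is made up for illustration.

```python
def per_tb_premium(price_premium: float, small_tb: float, large_tb: float) -> float:
    """Fractional increase in cost per TB when moving from the smaller drive
    to the larger one, given the larger drive's price premium
    (0.15 means the larger drive costs 15% more)."""
    price_ratio = 1.0 + price_premium
    capacity_ratio = large_tb / small_tb
    return price_ratio / capacity_ratio - 1.0


# 20TB vs 18TB at the quoted 15% and 20% price premiums
for premium in (0.15, 0.20):
    extra = per_tb_premium(premium, small_tb=18, large_tb=20)
    print(f"{premium:.0%} price premium -> {extra:.1%} more per TB")
```

On those figures the 20TB drive works out to roughly 3.5% to 8% more per TB than the 18TB drive, which is the "slight cost premium" at issue.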
But... a larger hard drive with a slight cost premium can end up being far less expensive.
Anticipated data storage capacity needs
The key concept here is anticipated capacity usage over the lifespan of the drives. Get it wrong, and total cost surges because the too-small drives must be replaced too soon. But there is more to it:
Anticipated capacity usage over the lifespan of the drives that delivers acceptable performance along with maximum reliability.
Not just capacity, but performance and reliability.
More drives for the same capacity means more complexity and reduced reliability
A drive filled past 80% delivers far lower performance than when empty. For that reason alone you might need to replace the drives sooner than planned. Good advice here is to buy 25% more nominal capacity than your needs otherwise suggest.
Twice as many drives for the same capacity cuts the reliability in half*. Worse, you will need twice as many enclosures and drive bays and maybe cables and power cords too, and those things can also fail.
Twice as many drives means splitting your data across volumes, along with twice as many backup drives and twice the backup hassle.
RAID? To make two drives into a large single volume means RAID-0 striping. But failure of one drive is the same as failure of both drives in a striped pair. RAID-4/5 or RAID-1+0 can address that but now you are buying more drives and more enclosure space! There are good reasons to use RAID-4/5, but doing so to fix a problem created by buying too-small drives is not one of them.
* This can be partially mitigated by using fault-tolerant RAID such as RAID-4/RAID-5 or RAID-1+0. But in each case you use more drives to mitigate the issue. Those same drives could instead be used for frequent backups, including for versioning. If you wipe out a file on a RAID, it’s gone: RAID is not a backup.
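The "twice as many drives, half the reliability" rule of thumb can be checked with a little probability. This is a rough Python sketch that assumes drives fail independently and uses a hypothetical 2% per-drive failure probability over some fixed period; that figure is a placeholder, not a measured rate.

```python
def p_any_failure(p_single: float, drives: int) -> float:
    """Probability that at least one of `drives` independent drives fails,
    when each fails with probability `p_single` over the same period."""
    return 1.0 - (1.0 - p_single) ** drives


ASSUMED_P = 0.02  # hypothetical per-drive failure probability (placeholder)

one = p_any_failure(ASSUMED_P, 1)
two = p_any_failure(ASSUMED_P, 2)
print(f"1 drive: {one:.2%}   2 drives: {two:.2%}   ratio: {two / one:.2f}x")
```

For small failure probabilities the chance of at least one failure is just under double with two drives. In a RAID-0 striped pair, that "at least one failure" figure is the chance of losing the whole volume.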
Suppose you have 10TB of data expected to grow towards 18-20TB over the 3-4 year lifespan of the hard drives. Substitute your own numbers; the concept remains the same.
Buying 12TB drives makes it a near certainty that you will have to buy higher capacity drives in very short order at considerable added expense (and time and hassle). And with very poor performance right out of the gate, as those drives are already at the 80% mark.
Buying 16TB drives buys you a little more time, but the same issues apply.
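To see how quickly each drive size runs out of comfortable room, here is a small Python sketch. It assumes linear growth from 10TB to 20TB over 4 years (the upper end of the ranges above) and treats 80% full as the limit; both the growth model and the function name are assumptions for illustration.

```python
def years_until_fill_limit(drive_tb, start_tb, end_tb, years, fill_limit=0.80):
    """Years until the data reaches `fill_limit` of the drive, assuming the
    data grows linearly from start_tb to end_tb over `years`.
    Returns 0.0 if the drive is already at or past the limit."""
    growth_per_year = (end_tb - start_tb) / years
    headroom_tb = drive_tb * fill_limit - start_tb
    if headroom_tb <= 0:
        return 0.0
    if growth_per_year <= 0:
        return float("inf")  # data is not growing, so the limit is never reached
    return headroom_tb / growth_per_year


for size_tb in (12, 16, 20):
    t = years_until_fill_limit(size_tb, start_tb=10, end_tb=20, years=4)
    print(f"{size_tb}TB drive: {t:.1f} years until 80% full")
```

Under these assumptions the 12TB drive is past the 80% mark on day one, the 16TB drive gets there in a little over a year, and even a 20TB drive reaches it well before the 4 years are up.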
Small initial savings can be dwarfed by a MUCH higher lifespan cost of replacing too-small drives prematurely. And you might need to buy a second enclosure just so data can be transferred from old to new, further raising the cost.
Perhaps worse, the downtime (a real cost!), hassles, and risks of replacing multiple hard drives are not to be underestimated.
I’ve done this to myself all too often, not because of poor planning, but rather because I was buying the largest drives I could, which were not large enough. That problem seems constant but most folks are luckier in having much less data than the largest capacity hard drive. I would be keen on 32TB hard drives to carry me through a 4-5 year cycle.
A mix can be OK
There are limited cases where buying smaller drives makes sense.
For example, you might choose 4 × 12TB drives in a RAID-4/5 in order to gain fault tolerance for your main storage volume. That 4-drive setup offers 36TB of usable storage in RAID-4/RAID-5 mode. Partitioned into two 18TB volumes (each fault-tolerant RAID-4), backup can then be made to 18TB or larger single hard drives.
For RAID with your main volume stores, the key thing is to choose a backup drive size, and then ensure that your main volume(s) do not exceed that size. Otherwise you will be backing up by splitting a single volume across multiple backup drives, or having to use another multi-drive enclosure, also in a RAID configuration. Either way, the complexity rises and the reliability drops.
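Usable capacity for a 4 × 12TB set under common RAID levels can be worked out as below. This is a simplified Python sketch with a made-up helper name; it assumes equal-size drives and ignores filesystem overhead and TB-versus-TiB differences.

```python
def usable_tb(level: str, drives: int, drive_tb: float) -> float:
    """Usable capacity for a few common RAID levels with equal-size drives."""
    parity_drives = {"raid0": 0, "raid4": 1, "raid5": 1, "raid6": 2}
    if level == "raid10":
        if drives < 2 or drives % 2:
            raise ValueError("RAID-1+0 needs an even number of drives")
        return (drives // 2) * drive_tb  # half the drives hold mirror copies
    if level not in parity_drives:
        raise ValueError(f"unsupported level: {level}")
    data_drives = drives - parity_drives[level]
    if data_drives < 1:
        raise ValueError("not enough drives for this level")
    return data_drives * drive_tb


for level in ("raid0", "raid4", "raid5", "raid6", "raid10"):
    print(f"{level}: {usable_tb(level, drives=4, drive_tb=12):.0f}TB usable")
```

Single-parity RAID-4 or RAID-5 yields the 36TB figure; dual-parity RAID-6 or mirrored RAID-1+0 on the same four drives would yield 24TB.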
These lessons will be rammed home sooner or later to those who buy too-small drives:
- Two drives require two enclosures or two drive bays, ultimately doubling the number of enclosures needed. You might evade this at first, but you’ll replicate the requirement for all your backups. A single drive in a simple external case like the OWC Mercury Elite Pro is a lot cheaper than 2-bay or 4-bay or 8-bay enclosures.
- To fit your backups onto lower-capacity drives, you will be doubling the number of enclosures or drive bays for those drives also, further raising costs.
- What happens if your data does not fit onto a single volume? You can make two volumes on two drives, or use RAID-0 striping or similar. Now you have increased complexity along with decreased reliability, since twice as many drives means twice the potential failures.
- The cost of an enclosure is significant, particularly Thunderbolt. Paying 15% more for 11% more capacity is a rounding error in the context of a multi-bay unit which might itself cost $500 or $800.
Anticipate total storage requirements 4 years out, then buy hard drives having 25% more capacity. For example, if you anticipate 16TB of data in 4 years or so, buy 20TB drives. That extra space gives you some buffer room for unexpected increases in storage needs, and will also keep performance much higher than, say, 16TB drives.
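The sizing rule can be expressed as a short Python sketch: multiply the anticipated data by 1.25 and pick the smallest drive that meets it. The list of drive sizes is illustrative only, and the function name is hypothetical.

```python
def pick_drive_tb(anticipated_tb, available_tb, headroom=0.25):
    """Smallest available drive size with at least `headroom` extra capacity
    over the anticipated data, or None if no listed size is large enough."""
    target_tb = anticipated_tb * (1 + headroom)
    candidates = [size for size in sorted(available_tb) if size >= target_tb]
    return candidates[0] if candidates else None


SIZES_TB = [12, 16, 18, 20, 22, 24]  # illustrative sizes, not a product list

choice = pick_drive_tb(16, SIZES_TB)
print(f"16TB anticipated -> buy {choice}TB ({16 / choice:.0%} full at year 4)")
```

With 25% headroom, the anticipated data fills the drive to 80%, which matches the fill level beyond which performance drops off. If the function returns None, no single listed drive is large enough and the plan needs rethinking.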
In addition, it is usually best to standardize on hard drive size, so that your primary volumes match the size of backup drives, e.g. a 20TB main storage drive should correspond to 20TB backup drives.
Finally, if using any kind of RAID offering higher capacity, strictly avoid any single volume that is larger than the standard hard drive capacity. Otherwise, it will not be possible to back up that volume to a single drive.
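A quick check of this rule, as a Python sketch: list any volume that would not fit on one backup drive. The volume names and sizes are hypothetical examples.

```python
def oversized_volumes(volumes_tb: dict, backup_drive_tb: float) -> list:
    """Names of volumes too large to back up onto a single backup drive."""
    return [name for name, size_tb in volumes_tb.items() if size_tb > backup_drive_tb]


# Hypothetical layouts for a 36TB RAID set backed up to 18TB single drives
split_layout = {"main-a": 18, "main-b": 18}
single_layout = {"main": 36}

print(oversized_volumes(split_layout, backup_drive_tb=18))   # nothing oversized
print(oversized_volumes(single_layout, backup_drive_tb=18))  # "main" is too large
```

An empty result means every volume fits on one backup drive; anything listed would have to be split across backup drives or backed up to another multi-drive unit.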