RAID Fault Tolerance: Keep a Cold Spare
See the discussion of fault tolerance.
A cold spare is a drive that matches the other drives in the RAID, set aside as a replacement to be used when needed. Some hardware RAID systems also support a hot spare (a drive that is already installed in the RAID, but not used until a failure occurs).
Keeping a cold spare on hand lets the RAID begin restoring (“rebuilding”) its fault-tolerant state right away; the rebuild itself can be a lengthy process (all day or longer). Without a cold spare, it might take some days to obtain a replacement before that process can even begin, if a matching drive can be found at all (the model may have been discontinued).
For example, the failure of one drive in a RAID-5 leaves it with no more fault tolerance than a RAID-0: the failure of another drive means total loss of the volume. It’s all a matter of odds, and perhaps Murphy’s Law: when something bad happens, something worse might follow! Starting the rebuild process ASAP minimizes the time during which the volume is at risk.
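To put rough numbers on the "matter of odds", here is a minimal Python sketch that estimates the chance of a second drive failing while the RAID is degraded. Everything in it is an assumption made for illustration: the function name, the 2% annualized failure rate, the 4-drive RAID-5, the 3-day wait for a replacement, and the simplification that drives fail independently at a constant rate.

```python
import math

HOURS_PER_YEAR = 24 * 365


def second_failure_probability(remaining_drives, afr, exposure_hours):
    """Chance that at least one remaining drive fails during the exposure window.

    Assumes independent drives with a constant failure rate derived from the
    annualized failure rate (afr). This is a simplification, not a vendor formula.
    """
    # Convert the annual failure probability into a constant hourly rate.
    rate_per_hour = -math.log(1.0 - afr) / HOURS_PER_YEAR
    # Probability that every remaining drive survives the whole window.
    survive_all = math.exp(-rate_per_hour * remaining_drives * exposure_hours)
    return 1.0 - survive_all


# Hypothetical 4-drive RAID-5 with one failed drive: 3 drives remain.
REMAINING = 3
AFR = 0.02               # assumed 2% annualized failure rate per drive
REBUILD_HOURS = 24       # "all day" rebuild
WAIT_HOURS = 3 * 24      # assumed wait for a replacement drive to arrive

with_spare = second_failure_probability(REMAINING, AFR, REBUILD_HOURS)
without_spare = second_failure_probability(REMAINING, AFR, WAIT_HOURS + REBUILD_HOURS)

print(f"With cold spare:    {with_spare:.4%}")
print(f"Without cold spare: {without_spare:.4%}")
print(f"Risk ratio:         {without_spare / with_spare:.2f}x")
```

Under this model the risk grows roughly in proportion to the time spent degraded, so waiting on a replacement multiplies the exposure. Real drives may not fail independently (same batch, same age, added stress during the rebuild), in which case this sketch would understate the risk.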
The ideal cold spare is one already mounted in a spare drive carrier, pre-tested and pre-initialized, and ready for immediate use.