 2003-10-03, 13:36 #1 nomadicus     Jan 2003 North Carolina 246 Posts Mean time between failures In discussing hardware reliability, I was looking at disk drives. Suppose a disk has a 300,000-hour MTBF (Mean Time Between Failures). If we have 100 disks on a system (assume equal usage), does that mean the next failure of any single disk is expected in 300,000/100, or 3,000, hours? What if the 100 disks are spread equally between 4 systems? Does that change the probabilities? Do probability mathematics come into play (of which I know very little)?
 2003-10-03, 17:51 #2 xtreme2k     Aug 2002 2×3×29 Posts http://www.storagereview.com/guide20.../specMTBF.html This should help you understand MTBF. After you have read it, you will see that a lot of us have misconceptions about it.
Yes they do, and they were possibly the most tortuous mathematics I met when I was taking those courses. Because it is a "mean" time between failures, you must account for the variance and the sample size. When you add more units, you have not only more chances of failure but also a higher probability of a premature failure. Consider building a 1000-disk array: its MTBF is probably close to zero, because one of those suckers is likely to be DOA, and even once it is running, it wouldn't last long before something fails.
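
For the 300,000-hour, 100-disk question above, here is a rough sketch of the arithmetic in Python. It rests on a big assumption that nobody in this thread has confirmed for real drives: every disk fails independently and at a constant rate of 1/MTBF (an exponential lifetime). The function names are made up for illustration. Under that assumption the failure rates simply add, so the expected wait for the first failure is MTBF divided by the number of disks, and how the disks are split across systems makes no difference to the fleet as a whole.

```python
import math

def expected_hours_to_first_failure(mtbf_hours, n_disks):
    """Mean time until the first failure among n independent disks,
    ASSUMING each disk has a constant failure rate of 1/MTBF."""
    return mtbf_hours / n_disks

def prob_at_least_one_failure(mtbf_hours, n_disks, hours):
    """Chance that at least one of n disks fails within `hours`,
    under the same constant-rate assumption."""
    return 1.0 - math.exp(-n_disks * hours / mtbf_hours)

MTBF = 300_000.0  # hours, the figure from the question

# All 100 disks on one system.
one_system = expected_hours_to_first_failure(MTBF, 100)

# 4 systems of 25 disks each: every system on its own waits longer,
# but the 4 failure rates add back up to the same fleet-wide rate.
per_system = expected_hours_to_first_failure(MTBF, 25)
four_systems = 1.0 / (4 * (1.0 / per_system))

print(one_system)    # 3000.0
print(per_system)    # 12000.0
print(four_systems)  # matches one_system up to floating-point rounding
print(prob_at_least_one_failure(MTBF, 100, 3000))  # about 0.632
```

Note that 3,000 hours is only the mean: under this model there is roughly a 63% chance that the first failure comes earlier than that. The constant-rate assumption also ignores DOA units and early failures, so treat the numbers as a guideline rather than a prediction.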

 2003-10-06, 16:35 #4 nomadicus     Jan 2003 North Carolina 2×3×41 Posts So now I should consider the infant mortality of a group of disks I get, the service life of the disks within that group, and the operational MTBF (if I can get it), balanced along with the MTBF as a guideline toward understanding when a group of disks would become more prone to failure. Things are never as simple as they seem. Great pointer. Thanks!

