Fri, 09 Mar 2007
Disk Delving - 2 Good Papers and a Blog
"The Google team found that 36% of the failed drives did not exhibit a single
SMART-monitored failure. They concluded that SMART data is almost useless
for predicting the failure of a single drive."
-- StorageMojo - Google's Disk Failure Experience
Two excellent papers on disk drive failures have been released recently: the Dugg and Dotted Google paper, "Failure trends in a large disk drive population" (warning: PDF), and the equally excellent but less hyped "Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you?"
Both papers make very interesting reading, and the comparisons of SCSI to SATA disks alone should turn some heads. They are a little dry, though, so once you've worked your way through them it's worth looking at the summarised highlights over at StorageMojo, a top-notch blog that was recommended to me by Kim Hawtin. StorageMojo covered both papers, and I've linked to its write-ups in the quotes above and below.
"Further, these results validate the Google File System's central redundancy
concept: forget RAID, just replicate the data three times. If I'm an IT
architect, the idea that I can spend less money and get higher reliability
from simple cluster storage file replication should be very
attractive."
-- StorageMojo - Everything You Know About Disks Is Wrong
Posted: 2007/03/09 09:21 | /sysadmin

