Re: [Hampshire] [OT] MTBF

Author: James Courtier-Dutton
Date:  
To: Hampshire LUG Discussion List
Subject: Re: [Hampshire] [OT] MTBF
2009/7/20 Adrian Bridgett <adrian@???>:
> On Sat, Jul 18, 2009 at 21:58:36 +0000 (+0000), Andy Smith wrote:
> [snip]
>> MTBF figures for hard drives (and possibly other things) are quoted
>> in terms of analysis of large numbers of that model.  The item is
>> rated for 2 years of use.
>
> They also tend to:
> a) be in very controlled environments
> b) like not actually being used
> c) ignore the first few months (infant syndrome)
> d) discount anything after a few years (which is fair enough)
>
> I looked a while ago and found:
> - disks ~ 2m hours
> - most other stuff (PSUs, memory, motherboards) for 100,000 hours
>
> So how come I have had _way_ more disks fail than anything else (even
> taking into account that each box has 2-4 disks I have).
>
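
To put the quoted figures side by side, here is a rough sketch that
turns an MTBF into the chance of at least one failure per year. It
assumes a constant failure rate (the exponential model) and a box that
is powered on 24x7; the function names are made up for illustration and
the two MTBF values are simply the ones quoted above.

```python
import math

# Assumption: the machine is powered on all year round.
HOURS_PER_YEAR = 24 * 365


def annual_failure_probability(mtbf_hours, hours_per_year=HOURS_PER_YEAR):
    """Chance of at least one failure in a year for one component.

    Assumes a constant failure rate, so the time to failure is
    exponentially distributed with mean equal to the MTBF.
    """
    return 1.0 - math.exp(-hours_per_year / mtbf_hours)


def any_of_n_fails(p_single, n):
    """Chance that at least one of n independent components fails."""
    return 1.0 - (1.0 - p_single) ** n


# MTBF values as quoted in this thread, not measured data.
disk = annual_failure_probability(2_000_000)
other = annual_failure_probability(100_000)
four_disks = any_of_n_fails(disk, 4)

print(f"one disk:        {disk:.2%} per year")
print(f"four disks:      {four_disks:.2%} per year")
print(f"other component: {other:.2%} per year")
```

On those numbers a single disk comes out at well under 1% per year and
a 100,000 hour component at roughly 8%, so even a box with four disks
should lose a PSU or motherboard far more often than a disk. That is
the opposite of what is being reported, which is the puzzle.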


Oh, that is easy: the figures for HDs are faulty!
I am sure that the conditions for the HD MTBF figures do not include
someone picking up the laptop while it is switched on and carrying it
to a meeting. This sort of activity is much more damaging to a HD than
to the motherboard.
I once had a nice case of this. The laptop was switched on at home.
I put it into standby, put it in my car and drove to work, which is
about 1 hour away.
This was a Windows laptop. What I did not know is that after a
certain period of time, I think it is 15 minutes, Windows will take
the laptop out of standby, power it on, and then place it into
hibernate mode. Now, if I happened to go over a bump while this was
happening, guess what happened. My laptop did not work when I got to
work that day.
I now have to either hibernate or completely shut down my laptop
between home and work.

Now, SSDs have an advantage over HDs in that picking up a laptop while
it is switched on and taking it to a meeting will have no damaging
effect on the SSD. So the MTBF figures are more likely to be
reasonable.

I think people don't seem to realize that HDs have very low resistance
to shock while switched on, and that this is the main cause of HD
failures. On the HDs that failed on you, were you able to determine the
max G force that they received during their powered-on life? I don't
know how to get that information, but I am sure it would provide some
interesting stats. I think it would also be useful to have a study
that looks into why HDs fail early in life, and thus tries to
recognize which ones are more likely to fail early. We might
eventually get to the HAL 9000 prediction that a device is certain to
fail in 72 hours, but until that time it will function normally.