I've got a machine here that consistently locks up solid after 1.5
to 2 hours of uptime and I'm looking for possible explanations.

Shortly after the last installfest, I threw a wireless NIC into it and
ran it for a month before shutting it down because I hadn't gotten
around to actually setting anything up on it.  Last weekend, I decided
that I had the time to get back to it, so I brought it back up and it
died after an hour and a half.

I've tried going back to the two previous kernels that I'd run on it
pre-wireless and these also lock up in the same timeframe.  Given that
one of those kernels has previously run for 31 days and another for 5
days without incident, I don't think it's a kernel issue.

To rule out memory problems, I installed memtest86 and that ran just
fine for approximately 30 hours without detecting any errors before I
shut it down.

The hardware is an Abit KT7A-RAID motherboard with a 900 MHz Athlon,
256 MB RAM, dual 40 GB Maxtor drives (connected to the mobo's RAID
controller, but using the kernel's md RAID), 2 eepro100 and 1 dwl-500
NICs, a 40x Acer CDROM, and a random ISA video card that I had lying
around.

When it locks up, it just stops dead - no errors are logged at any time,
the keyboard and NICs are totally non-responsive (it won't even unblank
the screen or toggle caps/num/scroll lock).  It basically acts like the
CPU overheated and shut down, but I would expect heat problems to have
also affected memtest86.  Just in case, I've tried underclocking it,
but the CPU stubbornly runs at 900 MHz no matter what I tell the BIOS.
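
One way to test the heat theory under the normal OS (rather than under
memtest86) might be a heartbeat logger along these lines.  This is only
a sketch: log_heartbeat and run are made-up names, and sensor_path is an
assumption, since whether a temperature reading is available as a file
at all depends on the board's sensor chip and its driver.  Each line is
fsync'ed so the last entry should survive a hard lockup.

```python
import os
import time


def log_heartbeat(log_path, sensor_path=None):
    """Append one timestamped line to log_path and force it to disk.

    sensor_path is a hypothetical file that holds a temperature
    reading; if it is None or cannot be read, a placeholder is logged.
    """
    reading = "n/a"
    if sensor_path is not None:
        try:
            with open(sensor_path) as sensor:
                reading = sensor.read().strip()
        except OSError:
            reading = "unreadable"

    line = "%s temp=%s\n" % (time.strftime("%Y-%m-%d %H:%M:%S"), reading)
    with open(log_path, "a") as log:
        log.write(line)
        log.flush()
        # Push the line through the OS cache so it survives a freeze.
        os.fsync(log.fileno())
    return line


def run(log_path, sensor_path=None, interval=10):
    """Log a heartbeat every `interval` seconds until the box dies."""
    while True:
        log_heartbeat(log_path, sensor_path)
        time.sleep(interval)
```

After a lockup, the last line of the log would give the time of death
to within the interval and, if a sensor file was readable, the last
reading before it.  A steadily climbing value would point at heat; a
flat one would point elsewhere.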

Any suggestions for what, other than heat, may have killed this box's
reliability?  Is it probably heat even though memtest86 is fine?  (And,
if so, why is it overheating now but not a month ago?  The room it's in
is still around the same temperature.)

-- 
When we reduce our own liberties to stop terrorism, the terrorists
have already won. - reverius

Innocence is no protection when governments go bad. - Tom Swiss