Blog entries tagged with "failure"

Even more failed fans

Friday, October 19th, 2007 at 7:46 pm

After the failure of the rear 120mm fan in preston I ordered two new fans: one as a replacement and another as a spare.

But where did I get them? Initially I was looking at the site for a Melbourne-based supplier of PC modding gear. The cheapest ‘low noise’ fan they had was AU$15.90, but postage was another AU$10. On the site for a popular Sydney-based PC parts dealer a similar fan was at a similar price of AU$15.40, but that included postage. It should be obvious which one I went with.

Now, it was when I was installing the new fan that I noticed that the two fans in the power supply had also failed. For most of the week the only fans operating in the case were the front 120mm (low flow) and the CPU fan. I had been wondering where the hot component smell was still coming from.

So what to do? With a working rear 120mm fan enough air is drawn through the power supply to keep it cool. But this could be an opportunity to replace it with one that has removable cables that would reduce the clutter inside the case.


More fan failures

Sunday, October 14th, 2007 at 5:38 pm

Following on from the failure of the fan in my UPS I discovered that the 120mm fan in the back of preston has also failed. Considering that I got this fan along with a bunch of other random parts a long time ago (i.e. I didn’t get it new) it is actually surprising that it has lasted this long.

Digging through the cupboard I found another fan, and it was a simple matter to swap it with the failed one. I also took the opportunity to clean the 120mm fan and filter on the front of the case. The filter was almost solid with dust that had accumulated over the past year.

One of the many things that has been on my list (only in my head, I should really give Hiveminder another go) was to give the system a once over with regard to dust and cooling. Right now I’m going to order some new (quiet) fans and possibly a fanbus for greater control.

Related to this is that I also disconnected my systems from the UPS. A week ago there was a brief power flicker when I was out. Instead of keeping the systems up the UPS shut down. Then this afternoon the UPS decided to shut down for no (apparent) reason. This is where I step up my research into what UPS to get, in particular the linux support.


The smell of hot metal

Tuesday, October 2nd, 2007 at 10:18 pm

This morning the fan in my UPS (that I got for free) failed.

When I got up there was a strong ‘hot metal’ smell near my computer room, but it wasn’t coming from preston, my linux server. This evening while I was emptying the bin I noticed that the UPS was unusually warm as its internal fan appeared to have failed. A spare 92mm fan, some double sided tape and a bunch of molex connectors later and the temperature is dropping, albeit in a fairly noisy manner.

For some time I have been planning on buying a new UPS and it looks like I need to move that up. Unfortunately the 259 day (8 and a half months) uptime of preston (since the big power outage) is under threat.
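As a quick sanity check on that figure, here is a minimal sketch of the date arithmetic. It assumes the uptime is counted from the day of the big power outage (16 January 2007) to the day of this post; the variable names are just for illustration.

```python
from datetime import date

# Assumption: uptime counted from the day of the outage to the day of this post.
outage_day = date(2007, 1, 16)
post_day = date(2007, 10, 2)

uptime_days = (post_day - outage_day).days
# Rough conversion to months using an average month length of 30.44 days.
uptime_months = uptime_days / 30.44

print(f"{uptime_days} days, about {uptime_months:.1f} months")
```

That works out to 259 days, or roughly eight and a half months.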

Although that uptime was also under threat from my other long standing plan to rearrange my computers, which included finally getting rid of gromit, which has been shut down for almost a year.


Shedding power

Tuesday, January 16th, 2007 at 9:29 pm

This afternoon there was a mass power outage that affected most of the state.

So how did this affect me?

Firstly it made the ride home interesting as I pass through a couple of major intersections that turned into a free-for-all as there was no power for the traffic lights. I had one near miss while crossing Ferntree Gully Road when a car decided to floor it through two moving lanes of trucks…

Secondly I was able to see how long the UPS I got for free would last which I can determine from the MRTG logs on preston (my linux box that was the only device running from the UPS):

  • My router last responded at 4:10PM. This gives me a five minute window (until 4:15PM) for when the power went out.
  • The last uptime entry (I graph the system uptime in seconds) was at 4:55PM, again with the five minute window.

This means that the UPS powered preston for between 40 and 50 minutes until the batteries ran flat. Once I move the router and cable modem onto the UPS I should still easily get 30 minutes of runtime, which is more than ample to handle most of the blackouts I have experienced. If shaun, my windows desktop, was also running I would be surprised if I got 10 minutes…
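The 40 to 50 minute range falls out of the two five minute polling windows. A minimal sketch of that reasoning, assuming the power failed somewhere in the window after the router's last response and the box died somewhere in the window after the last uptime sample (the function name is my own):

```python
from datetime import datetime, timedelta

POLL = timedelta(minutes=5)  # polling interval, as described in the post

def runtime_bounds(last_router_ok, last_uptime_sample, poll=POLL):
    """Return (shortest, longest) time the UPS could have carried the load."""
    # Power failed somewhere between last_router_ok and last_router_ok + poll.
    # The box died somewhere between last_uptime_sample and last_uptime_sample + poll.
    shortest = last_uptime_sample - (last_router_ok + poll)
    longest = (last_uptime_sample + poll) - last_router_ok
    return shortest, longest

router_last_seen = datetime(2007, 1, 16, 16, 10)
uptime_last_seen = datetime(2007, 1, 16, 16, 55)

shortest, longest = runtime_bounds(router_last_seen, uptime_last_seen)
print(f"between {shortest} and {longest}")
```

The shortest case pairs the latest possible outage (4:15PM) with the earliest possible shutdown (4:55PM), and the longest case pairs 4:10PM with 5:00PM.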

And what was the uptime? 8,209,932 seconds which is just over 95 days. I think this is the third longest uptime for preston (it wouldn’t have been so high without the UPS riding me through some small outages) with the longer times being 115 and 105 days. In comparison I never exceeded 100 days on gromit, my old linux box.
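For the curious, converting the raw uptime counter is simple integer arithmetic. A small sketch (the helper name is hypothetical):

```python
def split_uptime(seconds):
    """Break an uptime in seconds into (days, hours, minutes, seconds)."""
    days, rest = divmod(seconds, 86_400)   # 86,400 seconds in a day
    hours, rest = divmod(rest, 3_600)
    minutes, secs = divmod(rest, 60)
    return days, hours, minutes, secs

days, hours, minutes, secs = split_uptime(8_209_932)
print(f"{days} days, {hours}h {minutes}m {secs}s")
```

For 8,209,932 seconds this gives 95 days plus a little over half an hour.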


Webpages that automatically refresh are bad for download limits

Monday, September 12th, 2005 at 5:55 pm

As soon as I got up this morning I checked the radar image of the rain around Melbourne courtesy of the Bureau of Meteorology in order to determine if I was going to ride my bike or catch the bus to work today. As I wanted to know how it was changing I checked the looped version which led me to the decision to catch the bus.

I did this on one of my Linux boxes which is on 24/7 and did not close the window which has been ok in the past as it only downloads new images every 15 minutes. Unfortunately around half an hour after I did this something went wrong and it started to fetch the images multiple times a second and I did not notice this until I arrived home. An average of 40 kilobytes a second for almost nine hours is over a gigabyte that needlessly comes out of my 12 gigabyte (on-peak) per month cap :(
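A rough sketch of that arithmetic, assuming decimal units (1 kilobyte = 1,000 bytes, 1 gigabyte = 1,000,000,000 bytes) and rounding "almost nine hours" up to nine:

```python
KB = 1_000          # assumption: decimal kilobytes
GB = 1_000 ** 3     # assumption: decimal gigabytes

def wasted_gb(rate_kb_per_s, hours):
    """Total data transferred at a steady rate, in gigabytes."""
    return rate_kb_per_s * KB * hours * 3_600 / GB

used = wasted_gb(40, 9)
share_of_cap = used / 12   # 12 GB on-peak monthly cap

print(f"{used:.2f} GB, {share_of_cap:.0%} of the monthly cap")
```

That comes to roughly 1.3 GB, or about a tenth of the on-peak cap gone in a single day.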


It’s official…

Wednesday, November 3rd, 2004 at 10:14 pm

Earlier tonight I downloaded the SeaTools Diagnostic Suite and I used it to perform two full tests of all three of the Seagate drives in Preston.

And hdc, the odd drive out, was reported as having quite a number of bad sectors. I decided not to let the utility attempt to fix the errors as I then won’t be able to tell which files are corrupt without testing them all one by one.

Time to organise some new drives although I believe I will lose the ability to monitor the temperature of the drives directly if I go for Western Digitals rather than Seagates…


Hard drive errors

Tuesday, November 2nd, 2004 at 6:15 pm

Doing duty as my media storage and backup box is Preston, one of the two boxes that is always on. In it are four 80GB drives, one Western Digital WD800JB and three Seagate Barracuda IV’s. The three seagates are combined into a single logical volume using LVM.

On the weekend as I was copying music across to Shaun in order to back it all up on to DVD’s I noticed it pause a couple of times during copying. Further investigation led me to discover a large number of instances of the following two lines in the kernel log:

hdc: dma_intr: error=0x40 { UncorrectableError }, LBAsect=42315658, sector=42315480
hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }

I was able to tell from the transfer log which files had paused and I was able to verify that they were corrupt and the errors appeared in the kernel log.
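With a large number of these lines it helps to boil them down to a list of distinct bad sectors. Here is a minimal sketch of how that could be done, assuming the raw log uses the usual `dma_intr` spelling and the field layout shown above; the regular expression and the function name are my own, not part of any tool.

```python
import re
from collections import Counter

# Matches the "error=" line of the pair; any timestamp or hostname prefix is ignored.
ERROR_RE = re.compile(
    r"(?P<dev>hd[a-z]): dma_intr: error=(?P<code>0x[0-9a-fA-F]+) "
    r"\{ (?P<flags>[^}]*)\}, LBAsect=(?P<lba>\d+), sector=(?P<sector>\d+)"
)

def bad_sectors(lines):
    """Count how often each (device, LBA sector) pair shows up in the log."""
    hits = Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            hits[(match.group("dev"), int(match.group("lba")))] += 1
    return hits

sample = [
    "hdc: dma_intr: error=0x40 { UncorrectableError }, LBAsect=42315658, sector=42315480",
    "hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }",
]

hits = bad_sectors(sample)
for (dev, lba), count in sorted(hits.items()):
    print(f"{dev} LBA {lba}: {count} error(s)")
```

Feeding it the whole kernel log instead of `sample` would show whether the errors cluster on a handful of sectors or are spread across the drive.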

While the three drives in the logical volume are the same model one does have a couple of differences:

  • It is mounted differently in the case which means it runs hotter as there is not as much airflow as over the other two
  • It is connected via the motherboard IDE controller rather than the Promise card
  • It is around about 4 months older than the other drives
  • It was used as the boot drive in Shaun before moving into Preston

Another thing the kernel log shows is that the previous occurrences of these errors were back at the start of October, the same day that I transferred a copy of most of the data across to a friend’s box. This indicates that the error has been there for a while but it hasn’t been evident as I rarely access those particular files.

Fortunately only one of the three drives has experienced the errors, the one that is different as explained above. I seriously doubt that the issue is with the onboard IDE controller as the boot drive is operating fine. Googling around brings me to the conclusion that this drive has developed a fault and I should replace it…

But with what? Do I spend AU$99 on a replacement 80GB? What about spending AU$278 on a pair of new 160GB drives? This last option does have the advantage of giving me an additional 80GB capacity with one fewer hard drive to mount and keep cool. Then what do I do with the two perfectly good 80GB drives?

There is even the option of replacing the faulty drive with a 160GB. This gives me the additional capacity but has the much more tricky task of rearranging the data on the drives rather than copying from the old volume to the new volume.
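To line the three options up side by side, here is a small sketch. The prices are the two quoted above; I have not priced a single 160GB drive, so that cost is left as unknown rather than guessed. The class and option names are just for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

CURRENT_GB = 3 * 80  # the existing three-drive logical volume

@dataclass
class Option:
    name: str
    drives_gb: List[int]
    cost_aud: Optional[int]  # None means no price quoted in the post

    @property
    def capacity_gb(self) -> int:
        return sum(self.drives_gb)

options = [
    Option("replace faulty drive with an 80GB", [80, 80, 80], 99),
    Option("new pair of 160GB drives", [160, 160], 278),
    Option("replace faulty drive with a 160GB", [80, 80, 160], None),
]

for opt in options:
    extra = opt.capacity_gb - CURRENT_GB
    cost = f"AU${opt.cost_aud}" if opt.cost_aud is not None else "price unknown"
    print(f"{opt.name}: {opt.capacity_gb}GB ({extra:+d}GB), "
          f"{len(opt.drives_gb)} drives, {cost}")
```

Both 160GB options end up at 320GB; the difference is whether that is spread over two drives or three, and how painful the data shuffle is.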


Almost one hundred days…

Saturday, August 21st, 2004 at 8:27 pm

If the power hadn’t gone out earlier tonight then with an extra one day and eleven hours Gromit would have been up for one hundred days. As it is, ninety-eight days is the longest Gromit has been up for anyway, surpassing the previous record of eighty-two days.

Anyway it gave me an opportunity to add the additional 128MB, that has been sitting around since I tested it almost a month ago, into Gromit. Not that it really needs it but it can’t hurt… Unless it is dodgy like the last time but there was a reason I tested it…

Now to find a reason to finally rebuild Preston and Shaun…
