Crucial Paradigm Public Forums Forum Index
Public Announcements  ~  Server4 Outage
Aaron
Posted: Fri Aug 19, 2005 8:27 pm



Joined: 05 Feb 2004
Posts: 474

Server4 had a short outage at approximately 5:00am CDT, caused by high load on the server. The server required a reboot and an fsck.

We are still determining the cause of the high load; we suspect that the daily backups had a resource leak. The server normally runs at about 20-35% capacity.
Aaron
Posted: Sat Aug 20, 2005 4:01 pm



Joined: 05 Feb 2004
Posts: 474

We were not able to find the exact cause of the outage, but upon investigation we did find 5600+ emails in the mail queue, which may have caused the load to spike and the server to become inaccessible. We also noticed cpanellog consuming a large amount of resources. We are monitoring the server to check for further outages.
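
For anyone who wants to watch for this kind of load spike on their own Linux box, here is a minimal sketch. It assumes /proc/loadavg is available; the function names and the threshold of 1.0 per CPU are hypothetical choices for illustration, not the monitoring we actually run on server4.

```python
import os


def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg text."""
    fields = text.split()
    return tuple(float(f) for f in fields[:3])


def load_is_high(one_minute_load, cpu_count, threshold=1.0):
    """True if the 1-minute load per CPU exceeds the threshold.

    The default threshold is an assumed value; tune it for your own server.
    """
    return one_minute_load / cpu_count > threshold


def check_host(path="/proc/loadavg"):
    """Read the load average file and report (1-minute load, is_high)."""
    with open(path) as f:
        one, _five, _fifteen = parse_loadavg(f.read())
    cpus = os.cpu_count() or 1
    return one, load_is_high(one, cpus)
```

Calling check_host() from a cron job every minute and logging the result would give a rough timeline of when the load starts to climb.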
Aaron
Posted: Sat Aug 20, 2005 4:01 pm



Joined: 05 Feb 2004
Posts: 474

Server4 had another brief Apache outage today; we are currently investigating the cause.
Aaron
Posted: Sun Aug 21, 2005 11:54 pm



Joined: 05 Feb 2004
Posts: 474

It appears that server4 had another outage today; we suspect it was caused by high load again. The outage occurred at the time of day when all the server crons run, so we will be working our way through the daily crons to try to find the one causing the server to crash.
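
Working through the crons one at a time could look something like the sketch below. It is only an illustration under stated assumptions: the time_jobs helper and the EXAMPLE_JOBS list are hypothetical, the real job list would come from the server's own cron configuration, and elapsed wall-clock time is used as a rough proxy for how heavy a job is.

```python
import subprocess
import sys
import time

# Placeholder job list (an assumption): each entry is one command as an argv list.
# Replace with the real daily cron commands.
EXAMPLE_JOBS = [[sys.executable, "-c", "pass"]]


def time_jobs(jobs):
    """Run each job in turn and return (command, seconds, returncode) tuples,
    slowest first."""
    results = []
    for cmd in jobs:
        start = time.monotonic()
        proc = subprocess.run(cmd)
        elapsed = time.monotonic() - start
        results.append((cmd, elapsed, proc.returncode))
    # Sort so the longest-running job is listed first.
    return sorted(results, key=lambda r: r[1], reverse=True)
```

Running the jobs one by one, rather than all together as cron does, makes it easier to see which single job sends the load up.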

We apologise for the inconvenience caused.

Regards,
Aaron
Aaron
Posted: Tue Aug 23, 2005 3:28 pm



Joined: 05 Feb 2004
Posts: 474

There seem to be ongoing stability problems with the server, and we are not yet able to pinpoint the exact cause of the random lockups. We will be performing a few tests and upgrades on the server, which will result in a few short outages.

We suspect one of the following could be the cause of the massive increases in load that are crashing the server:

1. An iowait bug with Linux kernel 2.4 + Dual Xeon + RHEL, which was previously prevalent (6+ months ago). This was resolved some time ago, but it appears the problem may have resurfaced.

2. A DoS attack of some sort, although we have no evidence to show this.

3. A script (client or server run) is consuming a large amount of server resources, causing the server to crash.

4. Faulty hardware. There are no signs of this, but it remains possible. After some upgrades and tests, we may perform hardware replacements to rule out this option.

If you have any questions regarding this, please feel free to contact us.
All times are GMT + 10 Hours
