Lightning :: Server reboot on April 9, 2005

Started by Jason, April 09, 2005, 03:25:24 PM

Jason

Service monitors emailed me that httpd went down at 8:27am EST this morning.  Charlottezweb support was alerted minutes later and determined that a reboot was required; the reboot request was then sent to the datacenter.
The server itself was up, but httpd had become unresponsive, so it was rebooted at 8:53am EST -- about 26 minutes from start to finish.

Support was aware of the issue a few minutes after it initially occurred, but the datacenter (EV1) is apparently having issues with their normal reboot request system today.  That kept our reboot request from being handled as quickly as usual (typically within 5-10 minutes), and we had to open a ticket with them to get it done.

There is no immediate explanation for the httpd outage, so we believe it to be related to either a brief spike in CPU load or something hardware related.  Hardware problems typically become apparent when we notice repeated outages without an obvious cause.  To this end, we'll be monitoring the server closely over the next few days in case any patterns arise.  Otherwise, the daily offsite backups start before the time of the outage, and they occasionally run long and raise loads.  This could've been the cause if coupled with some other processes that exceeded normal levels.

I will post any updates here as they come up.

Regards,
Jason