March 20, 2006 :: Cyclone reboot

Started by Jason, March 20, 2006, 07:27:00 AM


Jason

Cyclone is presently waiting for a reboot due to a loss of services.

I expect this to be completed within the next few minutes.

Jason

Service has been restored.

I'll reply more on this later.

Jason

I'm still looking into this from an incident management perspective; however, I do know that the server functioned fine throughout this whole issue.  In other words, it was network related (which is a great relief, although it still needs to be addressed).

I just wanted to provide what update I have at the moment so there aren't any fears over the server itself.  It was up and functioning fine throughout the network issues.


DVR

jason,

fyi -

move this post if it's in the wrong spot.

cyclone is not responding. none of my sites are responding and cpanel is down. also, ftp isn't connecting.

ping is timing out and tracert isn't making it to your servers.

i noticed this at 1:30am est on tuesday, march 21, 2006. it's now 1:36am and there is still no functionality.

thanks!
DirecTV Tivo HD DVR Receivers & Hard Drive Upgrade Kits: HR22 HR21 HR20 HR10-250 www.DVRPOWER.com

Jason

Thank you very much for the info.  I've reported it as some sort of network issue, since the server itself wasn't down -- Alertra recorded no outages other than the initial one described yesterday morning.

Still researching.

Please let me know here or by email if you are still having problems.  I'm reaching everything fine as of this moment, though I am requesting additional investigation.


DVR

jason,

as of 3:00am this morning i was still unable to connect to my sites, cpanel or ftp. ping was still timing out and tracert was still failing. at that time all other websites and connections elsewhere were working fine. also, it was cyclone only, as i tested thunder and the other server as well.

i tried again at 8:40am and connections are fine now.

your server status screens were always showing green on your website. not sure why that was if cyclone was down.

thanks!

Jason

Quote from: restino on March 21, 2006, 08:43:53 AM
your server status screens were always showing green on your website. not sure why that was if cyclone was down.

They're green because the server was never down.  Other than the *network* outage yesterday morning, I've not seen any other outages.

If you did see one (or do again), please post or email me the exact tracert you get.  I need that in order to have the datacenter check network connectivity.  It's likely that a network issue existed between the server and your particular ISP, geographic location, etc.  However, the server was reachable elsewhere and was never down.  Hence the green status light.
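For anyone who wants to grab that trace quickly while a problem is still happening, here is a rough sketch of one way to capture it with a timestamp attached.  It assumes Python is installed and that `tracert` (Windows) or `traceroute` (Linux/Mac) is available on your machine; the hostname `cyclone.example.com` is only a placeholder, so substitute one of your own domains.  The helper names are made up for this example.

```python
import platform
import subprocess
from datetime import datetime, timezone


def build_trace_command(host, system=None):
    """Return the route-tracing command for the given OS name."""
    system = system or platform.system()
    if system == "Windows":
        return ["tracert", "-d", host]   # -d: skip reverse DNS lookups
    return ["traceroute", "-n", host]    # -n: numeric output only


def format_report(host, output, when):
    """Prefix the raw trace output with the target and a UTC timestamp."""
    header = f"Trace to {host} at {when.strftime('%Y-%m-%d %H:%M:%S')} UTC"
    return header + "\n" + output.strip() + "\n"


if __name__ == "__main__":
    # Placeholder hostname (an assumption): replace with a domain hosted on the server.
    host = "cyclone.example.com"
    result = subprocess.run(build_trace_command(host), capture_output=True, text=True)
    print(format_report(host, result.stdout or result.stderr,
                        datetime.now(timezone.utc)))
```

Running it while the problem is occurring and pasting the full output into a post or email should give the datacenter the hop-by-hop detail and the time of the failure in one piece.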

Regards,
Jason

Jason

Here's what caused the outage yesterday morning:

Quote
at approximately 5:00 am, for some reason, our external fans that draw air through the dry cooler room both shut down. they did not lose power that we know of - but they just stopped. they are on automatic to run at a certain temperature. well, this caused the dry cooler water loop to slowly heat up to the point that we started losing ac units on high head pressure.

once that happened - our power room had several main breakers trip due to heating.

we have turned the fans on manual - so they are on with no electronics intervention - so they will just run 24x7 at this point.

some of the aggregation switches were on the breakers that went out so part of the dc was out. that is why some of your servers were out and others were not.

we are going to leave the fans on at this point rather than risk going to auto again since we are so close to opening the new dc anyway.

Note:  Our server was one of the ones that never lost power.

Restino, the datacenter has verified the existence of the routing/network issue as well.  It's unrelated to the fan issue above and deals with their core routers.  It should be resolved as you stated, though if you witness it again, please immediately send me a tracert and I'll call them asap.  Again, the outage was not global, just isolated to some IP blocks.

Thanks,
Jason